Optimizing Apache Spark & Tuning Best Practices

Location: On-site, Online

Xebia Academy
Provider rating: 8.6. Xebia Academy has an average rating of 8.6 (from 105 reviews).

Start dates and locations

Online: Virtual
22 Aug 2024 to 23 Aug 2024
  • 22 August 2024, 09:00-17:00, Virtual, Day 1
  • 23 August 2024, 09:00-17:00, Virtual, Day 2

Wibautstraat 200, Amsterdam
7 Oct 2024 to 8 Oct 2024
  • 7 October 2024, 09:00-17:00, Wibautstraat 200, Amsterdam, Day 1
  • 8 October 2024, 09:00-17:00, Wibautstraat 200, Amsterdam, Day 2

Wibautstraat 200, Amsterdam
25 Nov 2024 to 26 Nov 2024
  • 25 November 2024, 09:00-17:00, Wibautstraat 200, Amsterdam, Day 1
  • 26 November 2024, 09:00-17:00, Wibautstraat 200, Amsterdam, Day 2

Description

This live-virtual course is perfect for

Data and Machine Learning Engineers who deal with transformation of large volumes of data and need production-quality code. Expert Data Scientists can also participate: they will learn how to get the most performance out of Spark and how simple tweaks can increase the performance dramatically.

What will you learn during Optimizing Apache Spark & Tuning Best Practices?

After this training, you will understand how Apache Spark works internally, know the best practices for writing performant code, and have the essential skills to debug and tune your Spark applications.

Program

Fundamentals

  • Spark execution model: Driver/Executors
  • Spark resource managers (YARN, Mesos, Kubernetes)
  • Understanding the RDD and DataFrame APIs and their language bindings
  • Difference between actions and transformations
  • How to read the query plan (logical and physical)

Spark internals

  • Spark Memory model
  • Understanding persistence (caching)
  • Catalyst optimizer and Tungsten project
  • Shuffle service and how a shuffle operation is executed
  • Concept of fair scheduling and pools
  • Java and Kryo serializers
  • A step into the JVM world: what you need to know about garbage collection (GC) when running Spark applications

Spark optimization: main problems and issues

  • The most common memory problems
  • Benefits of early filtering
  • Understanding partition and predicate filtering
  • Join optimization
  • Combating data skew (preprocessing, broadcasting, salting)
  • Understanding shuffle partitions: how to tackle memory/disk spill
  • Downsides of using UDFs
  • Executor idle timeout
  • Data formats examples
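
To give a flavour of one topic from the list above, here is a minimal sketch of key salting in plain Python (no Spark required). It is an illustration only, not course material: the partition count, the number of salts, the `crc32`-based partitioner and the sample keys are all assumptions chosen to keep the example small and deterministic.

```python
import zlib
from collections import Counter

NUM_PARTITIONS = 8   # hypothetical number of shuffle partitions
NUM_SALTS = 16       # hypothetical number of salt values


def partition_of(key, num_partitions=NUM_PARTITIONS):
    """Deterministic stand-in for a hash partitioner."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def partition_sizes(keys):
    """Count how many records land in each partition."""
    counts = Counter(partition_of(k) for k in keys)
    return [counts.get(p, 0) for p in range(NUM_PARTITIONS)]


def salt_keys(keys, num_salts=NUM_SALTS):
    """Append a salt suffix so that one hot key becomes several distinct keys.

    A round-robin salt keeps this sketch deterministic; a random salt
    would serve the same purpose.
    """
    return [f"{k}_{i % num_salts}" for i, k in enumerate(keys)]


# Skewed input: a single hot key accounts for 80% of the records.
keys = ["hot"] * 8000 + [f"user{i}" for i in range(2000)]

plain = partition_sizes(keys)
salted = partition_sizes(salt_keys(keys))

print("largest partition without salting:", max(plain))
print("largest partition with salting:   ", max(salted))
```

Without salting, every record with the hot key hashes to the same partition, so one task does most of the work. With salting, the hot key is split into several distinct keys that the partitioner can spread out. In a real join, the other side of the join has to be replicated once per salt value so that the salted keys still match.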

Moving to production

  • Debugging / troubleshooting
  • Productionizing your Spark application
  • Dynamic allocation and dynamic partitioning
  • Profiling your Spark application (Sparklint)
  • JVM profiler
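
Several program topics (Kryo serialization, fair scheduling, shuffle partitions, dynamic allocation, executor idle timeout) correspond to Spark configuration properties. The sketch below, again an illustration rather than course material, renders a settings dictionary as `spark-submit --conf` arguments. The property names are standard Spark settings, but the values are placeholders rather than recommendations, and the helper `to_spark_submit_args` and the script name `my_job.py` are hypothetical.

```python
def to_spark_submit_args(conf):
    """Render a dict of Spark properties as repeated --conf key=value arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


# Placeholder values for illustration; tune them per workload.
example_conf = {
    "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
    "spark.scheduler.mode": "FAIR",
    "spark.sql.shuffle.partitions": "200",
    "spark.shuffle.service.enabled": "true",
    "spark.dynamicAllocation.enabled": "true",
    "spark.dynamicAllocation.executorIdleTimeout": "60s",
}

# "my_job.py" is a hypothetical application script.
command = ["spark-submit", *to_spark_submit_args(example_conf), "my_job.py"]
print(" ".join(command))
```

Keeping the settings in one dictionary makes it easy to review and version them alongside the job, instead of scattering them across launch scripts.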

Data Engineering Trainers

This Data Engineering training is brought to you by Xebia Data. Xebia Data is part of Xebia, just like Xebia Academy. Xebia Data works with experts in their field who are always on the lookout for the most innovative ways to get the most out of data. Your trainer is a data guru who enjoys sharing his or her experiences to help you work with the latest tools.

Yes, I want to know more about Apache Spark!

After registering for this training, you will receive a confirmation email with practical information. If there is anything you need to prepare, we will share the literature a week before the training. See you soon!

Virtual or in-person training: This training can be delivered either in person or online. For in-person trainings, we provide lunch, snacks and drinks for the participants. Accordingly, virtual trainings are offered at a discount.

Scale up your skills
Boost your career

Get the training you need to succeed, in every IT field.
Learn from the world's leading experts with public and in-company courses at Xebia Academy.
