Optimizing Apache Spark & Tuning Best Practices
Start dates and locations
Online: Virtual, 25 Sep 2023 to 26 Sep 2023
- Day 1: 25 September 2023, 09:00-17:00, Virtual
- Day 2: 26 September 2023, 09:00-17:00, Virtual
Online: Virtual, 7 Dec 2023 to 8 Dec 2023
- Day 1: 7 December 2023, 09:00-17:00, Virtual
- Day 2: 8 December 2023, 09:00-17:00, Virtual
Description
This live virtual course is perfect for Data and Machine Learning Engineers who transform large volumes of data and need production-quality code. Expert Data Scientists can also participate: they will learn how to get the most performance out of Spark and how simple tweaks can dramatically improve performance.
What will you learn during Optimizing Apache Spark & Tuning Best Practices?
After this training, you will understand how Apache Spark works internally, know the best practices for writing performant code, and have the essential skills needed to debug and tune your Spark applications.
Program
Fundamentals
- Spark execution model: Driver/Executors
- Spark resource managers (YARN, Mesos, Kubernetes)
- Understanding RDDs/DataFrames APIs and bindings
- Difference between Actions and Transformations
- How to read the Query plan (Physical/Logical)
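The difference between transformations and actions listed above can be illustrated without a cluster. The following is a minimal sketch in plain Python, not Spark code: `LazyDataset` is a hypothetical toy class invented for this illustration, and it only mimics the idea that transformations record work while an action triggers it.

```python
from typing import Callable, Iterable, List, Optional


class LazyDataset:
    """Hypothetical toy stand-in for a lazily evaluated dataset (not a Spark class)."""

    def __init__(self, source: Iterable, steps: Optional[List[Callable]] = None):
        self._source = source
        self._steps = steps or []

    def map(self, fn: Callable) -> "LazyDataset":
        # Transformation: only records the step, no data is touched yet.
        step = lambda it: (fn(x) for x in it)
        return LazyDataset(self._source, self._steps + [step])

    def filter(self, pred: Callable) -> "LazyDataset":
        # Transformation: also just recorded.
        step = lambda it: (x for x in it if pred(x))
        return LazyDataset(self._source, self._steps + [step])

    def collect(self) -> list:
        # Action: runs the whole recorded pipeline and materializes the result.
        it = iter(self._source)
        for step in self._steps:
            it = step(it)
        return list(it)


calls = []  # records every element the mapped function actually sees


def double(x: int) -> int:
    calls.append(x)
    return x * 2


ds = LazyDataset(range(5)).map(double).filter(lambda x: x > 2)
calls_before_action = len(calls)  # still 0: nothing has executed
result = ds.collect()             # the action triggers execution

print(calls_before_action)  # 0
print(result)               # [4, 6, 8]
```

In this sketch, chaining `map` and `filter` costs nothing until `collect` is called, which is the behaviour the course bullet refers to.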
Spark internals
- Spark Memory model
- Understanding persistence (caching)
- Catalyst optimizer and Tungsten project
- Shuffle service and how the shuffle operation is executed
- Concept of fair scheduling and pools
- Java and Kryo serializer
- Step into JVM world: what you need to know about GC when running Spark applications
Spark optimization: main problems and issues
- The most common memory problems
- Benefit of using early filtering
- Understanding partition and predicate filtering
- Join optimization
- Combating Data skew (preprocessing, broadcasting, salting)
- Understanding shuffle partitions: how to tackle memory/disk spill
- Downsides of using UDFs
- Executor idle timeout
- Data formats examples
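As an illustration of the salting idea named in the data skew item above, here is a minimal sketch in plain Python rather than Spark. It assumes a simple hash partitioner simulated with `zlib.crc32`; the partition count, salt count, key names, and helper functions are all hypothetical and chosen only for this example.

```python
import zlib
from collections import Counter

NUM_PARTITIONS = 8  # hypothetical number of shuffle partitions
NUM_SALTS = 8       # hypothetical number of salt values


def partition_of(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Deterministic stand-in for a hash partitioner."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def partition_sizes(keys) -> Counter:
    """Count how many rows land in each partition."""
    return Counter(partition_of(k) for k in keys)


def salt_keys(keys, num_salts: int = NUM_SALTS) -> list:
    """Append a rotating salt so one hot key becomes num_salts distinct keys."""
    return [f"{k}_{i % num_salts}" for i, k in enumerate(keys)]


# Skewed input: a single hot key accounts for 80% of the rows.
keys = ["hot"] * 8000 + [f"user{i}" for i in range(2000)]

before = partition_sizes(keys)
after = partition_sizes(salt_keys(keys))

print("largest partition before salting:", max(before.values()))
print("largest partition after salting: ", max(after.values()))
```

Without the salt, every row with the hot key hashes to the same partition. With the salt, those rows are spread over several keys and therefore several partitions. In a real join, the other side would need its matching rows replicated once per salt value, which this sketch does not show.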
Moving to production
- Debugging / troubleshooting
- Productionizing your Spark application
- Dynamic allocation and dynamic partitioning
- Profiling your Spark application (Sparklint)
- JVM profiler
Data Engineering Trainers
This Data Engineering training is brought to you by our training partner, GoDataDriven. GoDataDriven works with experts in their field who are always on the lookout for the most innovative ways to get the most out of data. Your trainer is a data guru who enjoys sharing his or her experiences to help you work with the latest tools.
Yes, I want to know more about Apache Spark!
After registering for this training, you will receive a confirmation email with practical information. A week before the training we share literature if there's anything you need to prepare. See you soon!
Virtual or in-person training: This training can be delivered either in person or online. For in-person training, we provide lunch, snacks, and drinks for the participants. Accordingly, there is a discount for virtual training.
Scale up your skills
Boost your career
Get the training you need to succeed, in every IT field.
Learn from the world's leading experts with public and in-company courses at Xebia Academy.