Big data in practice using Spark (EN/NL/FR)
Start dates and locations
Leuven: 19 May 2025 to 20 May 2025, 09:00-17:00 on both days
Online: 19 May 2025 to 20 May 2025, 09:00-17:00 on both days
Description
This 2-day ABIS course gives you hands-on practice with Spark and its libraries on Linux. You learn how to implement robust data processing (in Scala, Python, Java or R) on Spark with an SQL-style interface.
After successfully completing the course, you will have enough basic expertise to set up a Spark development environment and use it to query your data. You will be able to write simple SparkSQL scripts and programs (with the Scala-based spark-shell or with PySpark) that use the MLlib, GraphX, and Streaming libraries.
Remark: Course description in English; Dutch and French versions are available on the ABIS website. Courses are planned in Dutch, English, and French. Consult the ABIS website for alternate course formats.
Main Topics - Content:
- Motivation for Spark & base concepts
  - The Apache Spark project and its components
  - Getting to know the Spark architecture and programming model
  - The principles of data analytics
- Data sources
  - Learn how to access data residing in Hadoop HDFS, Cassandra, AWS, or a relational database
- Interfaces
  - Working with the various programming interfaces and the web interface (specifically: spark-shell and PySpark)
  - Writing and debugging programs for simple data analytics problems
  - DataFrames and RDDs
- A short introduction to the use of the Spark libraries
  - SparkSQL
  - Machine learning (MLlib)
  - Streaming (i.e., processing "volatile" data)
  - Parallel computations in trees and graphs (GraphX)
Audience: Anyone who wants to start working with Spark.
Background: Familiarity with the concepts of data clusters and distributed processing, and minimal knowledge of SQL and Unix/Linux, are useful. Minimal experience with at least one programming language is a must.
Didactics: Classroom instruction with practical exercises.
Duration: 2 days.