Course Description
Learn the essentials of using Spark for your big data workloads.
Apache Spark is a key component of the Hadoop ecosystem, serving as a cluster computing engine for big data workloads. Built on top of Hadoop YARN and HDFS, Spark offers faster in-memory processing than MapReduce for many computing tasks. It can be programmed in Java, Scala, Python, and R, along with SQL-based front ends.
This course introduces Scala, Python, or R developers to the world of Spark programming. It begins with an overview of the ecosystem and hands-on experience with the platform, including working with the Spark shell, RDDs, and DataFrames. You'll then move on to a broader introduction to NoSQL, Spark Streaming, Spark SQL, and Spark MLlib, and see how these pieces fit together in a larger application. The short sketch below gives a flavor of the hands-on work.
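As a taste of the first day's material, here is a minimal sketch of DataFrame and RDD operations as they might be typed at the Scala Spark shell. The file and column names are illustrative placeholders, not course materials:

```scala
// Entered at the Scala Spark shell (`spark-shell`), where a SparkSession
// named `spark` and its implicits are provided automatically.
// The file and column names below are illustrative.
val people = spark.read
  .option("header", "true")
  .option("inferSchema", "true")
  .csv("people.csv")

// DataFrame API: a lazy filter-and-aggregate, executed across the cluster
people.filter($"age" > 21)
  .groupBy($"city")
  .count()
  .show()

// Dropping down to the RDD level for lower-level transformations
val cities = people.rdd.map(row => row.getAs[String]("city"))
println(cities.distinct().count())
```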
Additional information
| Format | Instructor-Led |
| --- | --- |
| Topic | Big Data |
| Length | 2 days |