Explore stock prices with Spark SQL

4.5
39 ratings
Offered by
Coursera Project Network
1,967 already enrolled
In this Guided Project, you will:

Create an application that runs on a Spark cluster

Derive knowledge from data using Spark RDD and DataFrames

Store results in Parquet tables

2 hours
Intermediate
No download needed
Split-screen video
English
Desktop only

In this 1-hour long project-based course, you will learn how to interact with a Spark cluster using a Jupyter notebook and how to start a Spark application. You will learn how to use Spark Resilient Distributed Datasets (RDDs) and Spark DataFrames to explore a dataset. We will load a dataset into our Spark program and analyze it using Actions, Transformations, the Spark DataFrame API, and Spark SQL, and you will learn how to choose the best tool for each scenario. Finally, you will learn how to save your results in Parquet tables.
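As a taste of that workflow, here is a minimal PySpark sketch. The file name `stock_prices.csv` and its columns (symbol, date, open, close, volume) are assumptions for illustration; the dataset and notebook used in the project may differ.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Start (or reuse) a Spark application; on the course cluster this would
# connect to the provided Spark master rather than run locally.
spark = SparkSession.builder.appName("StockPriceExploration").getOrCreate()

# Load a hypothetical stock-price CSV into a DataFrame (a Transformation;
# nothing is read until an Action is triggered).
prices = (spark.read
          .option("header", True)
          .option("inferSchema", True)
          .csv("stock_prices.csv"))  # assumed columns: symbol, date, open, close, volume

# DataFrame API: average closing price per symbol (Transformation + Action).
avg_close = prices.groupBy("symbol").agg(F.avg("close").alias("avg_close"))
avg_close.show()

# Spark SQL: the same question expressed as a query over a temporary view.
prices.createOrReplaceTempView("prices")
spark.sql("""
    SELECT symbol, AVG(close) AS avg_close
    FROM prices
    GROUP BY symbol
    ORDER BY avg_close DESC
""").show()

# Persist the result as a Parquet table for later use.
avg_close.write.mode("overwrite").parquet("avg_close.parquet")
```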

Skills you will develop

  • Spark SQL
  • Data Analysis
  • Big Data
  • Apache Spark
  • Distributed Computing

Learn step-by-step

Your instructor will walk you through each of these steps in a video that plays in a split screen alongside your workspace (see the sketch after this list):

  1. By the end of Task 1, you will become familiar with the Jupyter notebook environment

  2. By the end of Task 2, you will be able to initialize a Spark application

  3. By the end of Task 3, you will be able to create Spark Resilient Distributed Datasets

  4. By the end of Task 4, you will be able to create Spark DataFrames in several ways

  5. By the end of Task 5, you will be able to explore datasets with Spark SQL

  6. By the end of Task 6, you will be able to write statistical queries and compare Spark DataFrames

  7. By the end of Task 7, you will be able to store DataFrames in Parquet tables
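The sketch below illustrates the kinds of operations Tasks 3 through 7 cover: building an RDD, creating DataFrames in a couple of ways, expressing the same statistics with the DataFrame API and with Spark SQL, and saving the result to Parquet. The symbols, prices, and file names are made up for illustration and are not the project's actual data.

```python
from pyspark.sql import SparkSession, Row
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("TaskSketches").getOrCreate()
sc = spark.sparkContext

# Task 3: a Resilient Distributed Dataset built from a local collection.
raw = sc.parallelize([("AAPL", 150.0), ("AAPL", 152.5), ("MSFT", 300.0)])

# Task 4: DataFrames created in several ways --
# (a) from an RDD of Rows,
rows_df = spark.createDataFrame(raw.map(lambda r: Row(symbol=r[0], close=r[1])))
# (b) from a local list with an explicit schema.
schema = StructType([
    StructField("symbol", StringType(), False),
    StructField("close", DoubleType(), False),
])
schema_df = spark.createDataFrame([("GOOG", 2800.0), ("GOOG", 2825.5)], schema)

# Tasks 5-6: the same statistic written with the DataFrame API and with Spark SQL.
stats_api = rows_df.groupBy("symbol").agg(
    F.min("close").alias("min_close"),
    F.max("close").alias("max_close"),
)
rows_df.createOrReplaceTempView("quotes")
stats_sql = spark.sql(
    "SELECT symbol, MIN(close) AS min_close, MAX(close) AS max_close "
    "FROM quotes GROUP BY symbol"
)
stats_api.show()
stats_sql.show()

# Task 7: store the result as a Parquet table.
stats_sql.write.mode("overwrite").parquet("symbol_stats.parquet")
```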

How Guided Projects work

Your workspace is a cloud desktop right in your browser; no download is required

In a split-screen video, your instructor gives you step-by-step guidance

Frequently asked questions

More questions? Visit the Learner Help Center.