About this course
4.7
1,684 ratings
355 reviews
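Specialization

Course 4 of 5

100% online

Start instantly and learn at your own schedule.

Flexible deadlines

Reset deadlines in accordance with your schedule.

Hours to complete

Approx. 15 hours to complete

Suggested: 6 hours/week...

Available languages

English

Subtitles: English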

Skills you will gain

Scala Programming, Big Data, Apache Spark, SQL

Syllabus - What you will learn from this course

Week 1
12 hours to complete

Getting Started + Spark Basics

Get up and running with Scala on your computer. Complete an example assignment to familiarize yourself with our unique way of submitting assignments. In this week, we'll bridge the gap between data parallelism in the shared memory scenario (learned in the Parallel Programming course, prerequisite) and the distributed scenario. We'll look at important concerns that arise in distributed systems, like latency and failure. We'll go on to cover the basics of Spark, a functionally-oriented framework for big data processing in Scala. We'll end the first week by exercising what we learned about Spark by immediately getting our hands dirty analyzing a real-world data set.
7 videos (total 105 min), 5 readings, 3 quizzes

7 videos
Data-Parallel to Distributed Data-Parallel (10 min)
Latency (24 min)
RDDs, Spark's Distributed Collection (9 min)
RDDs: Transformation and Actions (16 min)
Evaluation in Spark: Unlike Scala Collections! (20 min)
Cluster Topology Matters! (8 min)

5 readings
Tools setup (10 min)
Eclipse tutorial (10 min)
IntelliJ IDEA Tutorial (10 min)
Sbt tutorial (10 min)
Submitting solutions (10 min)

Week 2
7 hours to complete

Reduction Operations & Distributed Key-Value Pairs

This week, we'll look at a special kind of RDD called pair RDDs. With this specialized kind of RDD in hand, we'll cover essential operations on large data sets, such as reductions and joins.
4 videos (total 59 min), 2 quizzes

4 videos
Pair RDDs (6 min)
Transformations and Actions on Pair RDDs (20 min)
Joins (17 min)

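The reductions and joins on key-value pairs mentioned for this week can be sketched without a cluster. The block below is only an illustration on plain Scala collections (Scala 2.13 or later), not Spark code: `PairOps` and its method bodies are hypothetical names written for this page. Spark's pair RDDs offer operations called `reduceByKey` and `join`, which the sketch mimics locally.

```scala
// Plain Scala collections stand in for pair RDDs; no Spark dependency is used.
// `PairOps` is a hypothetical helper, not part of the course material.
object PairOps {
  // Reduction by key: combine all values that share a key using `f`.
  def reduceByKey[K, V](pairs: Seq[(K, V)])(f: (V, V) => V): Map[K, V] =
    pairs.groupMapReduce(_._1)(_._2)(f)

  // Inner join: keep only keys present on both sides, pairing every match.
  def join[K, A, B](left: Seq[(K, A)], right: Seq[(K, B)]): Seq[(K, (A, B))] = {
    val rightByKey: Map[K, Seq[B]] = right.groupMap(_._1)(_._2)
    for {
      (k, a) <- left
      b <- rightByKey.getOrElse(k, Seq.empty)
    } yield (k, (a, b))
  }
}
```

For example, `PairOps.reduceByKey(Seq(("a", 1), ("b", 2), ("a", 3)))(_ + _)` yields `Map("a" -> 4, "b" -> 2)`. On a real cluster the same operations may have to move records between machines, which is the cost the following week examines.
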
Week 3
1 hour to complete

Partitioning and Shuffling

This week we'll look at some of the performance implications of using operations like joins. Is it possible to get the same result without having to pay for the overhead of moving data over the network? We'll answer this question by delving into how we can partition our data to achieve better data locality, in turn optimizing some of our Spark jobs.
4 videos (total 57 min)

4 videos
Partitioning (14 min)
Optimizing with Partitioners (11 min)
Wide vs Narrow Dependencies (16 min)

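The idea behind partitioning for data locality can be illustrated with a small local sketch. It assumes hash partitioning, where a key's hash code modulo the number of partitions decides where a record lives, so all records with the same key land together and a later per-key operation needs no further data movement. `Partitioning`, `partitionOf`, and `partitionBy` are hypothetical names for this illustration and run on plain Scala collections, not on Spark.

```scala
// A local illustration of hash partitioning; not Spark's own implementation.
object Partitioning {
  // A key always maps to the same partition index in [0, numPartitions).
  def partitionOf[K](key: K, numPartitions: Int): Int = {
    val raw = key.hashCode % numPartitions
    if (raw < 0) raw + numPartitions else raw // keep the index non-negative
  }

  // Split key-value pairs into `numPartitions` buckets by key hash.
  def partitionBy[K, V](pairs: Seq[(K, V)], numPartitions: Int): Vector[Seq[(K, V)]] = {
    val grouped = pairs.groupBy { case (k, _) => partitionOf(k, numPartitions) }
    Vector.tabulate(numPartitions)(i => grouped.getOrElse(i, Seq.empty))
  }
}
```

Two data sets split with the same function and the same partition count place equal keys at the same index, which is why a join between them could, in principle, proceed partition by partition.
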
Week 4
8 hours to complete

Structured data: SQL, DataFrames, and Datasets

With our newfound understanding of the cost of data movement in a Spark job, and some experience optimizing jobs for data locality last week, this week we'll focus on how we can more easily achieve similar optimizations. Can structured data help us? We'll look at Spark SQL and its powerful optimizer, which uses structure to apply impressive optimizations. We'll move on to cover DataFrames and Datasets, which give us a way to mix RDDs with the powerful automatic optimizations behind Spark SQL.
5 videos (total 133 min), 2 quizzes

5 videos
Spark SQL (17 min)
DataFrames (1) (26 min)
DataFrames (2) (30 min)
Datasets (43 min)

4.7
355 reviews
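Career direction: 10% started a new career after completing these courses
Career benefit: 15% got a tangible career benefit from this course
Career promotion: 12% got a pay increase or promotion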

Top reviews

By CC, Jun 8th 2017

The sessions were clearly explained and focused. Some of the exercises contained slightly confusing hints and information, but I'm sure those mistakes will be ironed out in future iterations. Thanks!

By CR, Apr 10th 2017

Great introduction to Spark. Fun assignments. Since it was the first ever session, there were quite a few kinks with the assignments. But the discussion forums rescued me any time I was stuck.

Instructor

Dr. Heather Miller

Research Scientist
EPFL

About École Polytechnique Fédérale de Lausanne

About the Functional Programming in Scala Specialization

Discover how to write elegant code that works the first time it is run. This Specialization provides a hands-on introduction to functional programming using the widespread programming language, Scala. It begins from the basic building blocks of the functional paradigm, first showing how to use these blocks to solve small problems, before building up to combining these concepts to architect larger functional programs. You'll see how the functional paradigm facilitates parallel and distributed programming, and through a series of hands-on examples and programming assignments, you'll learn how to analyze data sets small to large, from parallel programming on multicore architectures to distributed programming on a cluster using Apache Spark. A final capstone project will allow you to apply the skills you learned by building a large data-intensive application using real-world data.

Frequently asked questions

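  • Once you enroll for a Certificate, you'll have access to all videos, quizzes, and programming assignments (if applicable). Peer review assignments can only be submitted and reviewed once your session has begun. If you choose to explore the course without purchasing, you may not be able to access certain assignments.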

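  • When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete it. Your electronic Certificate will be added to your Accomplishments page, where you can print it or add it to your LinkedIn profile. If you only want to read and view the course content, you can audit the course for free.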

More questions? Visit the Learner Help Center.