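Course information
4.0
207 ratings
60 reviews
Specialization

Course 1 of 5 in the specialization

100% online

Start immediately and learn on your own schedule.
Flexible deadlines

Reset deadlines to fit your schedule.
Intermediate level

Approx. 43 hours to complete

Suggested: 6 weeks of study, 6-8 hours/week...
Available languages

English

Subtitles: English...

Skills you will gain

Python Programming, Apache Hadoop, MapReduce, Apache Spark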

Syllabus - What you will learn from this course

Week 1
14 minutes to complete

Welcome

...
8 videos (14 min total)
Videos (8)
Issues BigData can solve (1 min)
BigData Applications (1 min)
What is BigData Essentials? (2 min)
Course Structure (2 min)
Meet Emeli (1 min)
Meet Alexey (2 min)
Meet Ivan (1 min)
8 hours to complete

What are BigData and distributed file systems (e.g. HDFS)?

...
18 videos (136 min total), 10 readings, 5 quizzes
Videos (18)
File system managing (6 min)
File content exploration 1 (5 min)
File content exploration 2 (13 min)
Processes (4 min)
Scaling Distributed File System (9 min)
Block and Replica States, Recovery Process 1 (6 min)
Block and Replica States, Recovery Process 2 (7 min)
HDFS Client (9 min)
Web UI, REST API (4 min)
Namenode Architecture (8 min)
Introduction (10 min)
Text formats (9 min)
Binary formats 1 (8 min)
Binary formats 2 (8 min)
Compression (7 min)
How to submit your first assignment (3 min)
How to Install Docker on Windows 7, 8, 10 (4 min)
Readings (10)
Basic Bash Commands (10 min)
Slack Channel is the quickest way to get answers to your questions (10 min)
HDFS Lesson Introduction (10 min)
Gentle Introduction into "curl" (10 min)
File formats extra (optional) (10 min)
Grading System: Instructions and Common Problems (10 min)
Docker Installation Guide (10 min)
Programming Assignment: Instructions and Common Problems (10 min)
FAQ: How to show your code to teaching staff (10 min)
Slack channel "Bigdata-coursera" - the quickest way to solve technical problems (10 min)
Quizzes (2 practice exercises)
Distributed File Systems (16 min)
Big Data and Distributed File Systems (25 min)
Week 2
3 hours to complete

Solving Problems with MapReduce

...
17 videos (94 min total), 1 reading, 3 quizzes
Videos (17)
Unreliable Components 2 (8 min)
MapReduce (4 min)
Distributed Shell (8 min)
Fault Tolerance (7 min)
Fault Tolerance. Live Demo (3 min)
Streaming (7 min)
Streaming in Python (3 min)
WordCount in Python (5 min)
Distributed Cache (4 min)
Environment, Counters (4 min)
Testing (5 min)
Combiner (5 min)
Partitioner (7 min)
Comparator (1 min)
Speculative Execution / Backup Tasks (3 min)
Compression (4 min)
Readings (1)
Hadoop Streaming Assignments: Intro and Code Samples (10 min)
Quizzes (3 practice exercises)
Hadoop MapReduce Intro (26 min)
MapReduce Streaming (26 min)
Hadoop Streaming Final (30 min)
Week 3
4 hours to complete

Solving Problems with MapReduce (practice week)

...
1 video (3 min total), 5 readings, 5 quizzes
Readings (5)
Hadoop Streaming Assignments: Intro and Code Samples (10 min)
Hints to Debug Hadoop Streaming Applications (10 min)
Grading System and Grading System Sandbox User Guide (10 min)
Hadoop Streaming Assignments: Instructions (10 min)
Hint to the "Stop words" programming assignment (10 min)
Week 4
3 hours to complete

Introduction to Apache Spark

...
16 videos (95 min total), 2 readings, 2 quizzes
Videos (16)
Welcome (6 min)
RDDs (8 min)
Transformations 1 (6 min)
Transformations 2 (7 min)
Actions (5 min)
Resiliency (6 min)
Execution & Scheduling (6 min)
Caching & Persistence (5 min)
Broadcast variables (5 min)
Accumulator variables (5 min)
Getting started with Spark & Python (6 min)
Working with text files (6 min)
Joins (4 min)
Broadcast & Accumulator variables (5 min)
Spark UI (4 min)
Cluster mode (3 min)
Readings (2)
Spark Assignments Intro (10 min)
Instructions for Spark programming assignment (10 min)
Quizzes (2 practice exercises)
Lesson 1 Quiz (20 min)
Lesson 2 Quiz (24 min)
Top reviews

By SD, Jun 28th 2018

Absolutely essential for everyone who wants a proper introduction to HDFS, MapReduce and Spark. Brought to you by a great team of geniuses of their time ;)

By MG, Oct 31st 2018

Interesting, useful, informative, accessible (and sometimes funny!) lectures. Stimulating assignments. Fast responses from instructors/mentors.

Instructors

Ivan Puzyrevskiy

Technical Team Lead

Alexey A. Dral

Founder and Chief Executive Officer
BigData Team

About Yandex

Yandex is a technology company that builds intelligent products and services powered by machine learning. Our goal is to help consumers and businesses better navigate the online and offline world....

About the Big Data for Data Engineers Specialization

This specialization is made for people working with data (either small or big). If you are a Data Analyst, Data Scientist, Data Engineer or Data Architect (or you want to become one), don't miss the opportunity to expand your knowledge and skills in data engineering and large-scale data analysis. In four concise courses you will learn the basics of Hadoop, MapReduce, and Spark, methods of offline data processing for warehousing, real-time data processing, and large-scale machine learning. A Capstone project then lets you build and deploy your own Big Data Service (and make your portfolio even more competitive). Over the course of the specialization, you will complete progressively harder programming assignments (mostly in Python), so make sure you have some experience with it.

This course will help you master designing solutions for common Big Data tasks:
- creating batch and real-time data processing pipelines,
- doing machine learning at scale,
- deploying machine learning models into a production environment,
and much more!

Join some of the best hands-on big data professionals, who know their job inside out, and learn the basics, as well as some tricks of the trade, from them.

Special thanks to Prof. Mikhail Roytberg (APT dept., MIPT), Oleg Sukhoroslov (PhD, Senior Researcher, IITP RAS), Oleg Ivchenko (APT dept., MIPT), Pavel Akhtyamov (APT dept., MIPT), Vladimir Kuznetsov, Asya Roitberg, Eugene Baulin, Marina Sudarikova....
Big Data for Data Engineers

Frequently asked questions

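  • Once you enroll for a certificate, you will have access to all videos, quizzes, and programming assignments (if applicable). Peer-review assignments can only be submitted and reviewed once your session has begun. If you choose to explore the course without purchasing, you may not be able to access certain assignments.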

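  • When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the course. Your electronic certificate will be added to your Accomplishments page, where you can print it or add it to your LinkedIn profile. If you only want to read and view the course content, you can audit the course for free.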

More questions? Visit the Learner Help Center.