About this course
4.3
659 ratings
141 reviews
Specialization

Course 1 of 4 in the

100% online

Start instantly and learn at your own schedule.
Flexible deadlines

Reset deadlines according to your schedule.
Approx. 21 hours to complete

Suggested: 4 weeks of study, 6-8 hours/week...
Available languages

English

Subtitles: English...

Skills you will gain

Relational Algebra, Python Programming, MapReduce, SQL

Syllabus - What you will learn from this course

1
6 hours to complete

Data Science Context and Concepts

Understand the terminology and recurring principles associated with data science, and understand the structure of data science projects and emerging methodologies to approach them. Why does this emerging field exist? How does it relate to other fields? How does this course distinguish itself? What do data science projects look like, and how should they be approached? What are some examples of data science projects? ...
22 videos (total 125 min), 4 readings, 1 quiz
22 videos
Appetite Whetting: Extreme Weather (2 min)
Appetite Whetting: Digital Humanities (8 min)
Appetite Whetting: Bibliometrics (4 min)
Appetite Whetting: Food, Music, Public Health (5 min)
Appetite Whetting: Public Health cont'd, Earthquakes, Legal (4 min)
Characterizing Data Science (5 min)
Characterizing Data Science, cont'd (5 min)
Distinguishing Data Science from Related Topics (4 min)
Four Dimensions of Data Science (6 min)
Tools vs. Abstractions (7 min)
Desktop Scale vs. Cloud Scale (5 min)
Hackers vs. Analysts (2 min)
Structs vs. Stats (5 min)
Structs vs. Stats cont'd (5 min)
A Fourth Paradigm of Science (3 min)
Data-Intensive Science Examples (6 min)
Big Data and the 3 Vs (5 min)
Big Data Definitions (4 min)
Big Data Sources (6 min)
Course Logistics (7 min)
Twitter Assignment: Getting Started (14 min)
4 readings
Supplementary: Three-Course Reading List (10 min)
Supplementary: Resources for Learning Python (10 min)
Supplementary: Class Virtual Machine (10 min)
Supplementary: GitHub Instructions (10 min)
2
5 hours to complete

Relational Databases and the Relational Algebra

Relational databases are the workhorse of large-scale data management. Although originally motivated by problems in enterprise operations, they have proven remarkably capable for analytics as well. Most importantly, the principles underlying relational databases are universal in managing, manipulating, and analyzing data at scale. Even as the landscape of large-scale data systems has expanded dramatically in the last decade, relational models and languages have remained a unifying concept. For working with large-scale data, there is no more important programming model to learn....
24 videos (total 122 min), 1 quiz
24 videos
From Data Models to Databases (4 min)
Pre-Relational Databases (5 min)
Motivating Relational Databases (3 min)
Relational Databases: Key Ideas (4 min)
Algebraic Optimization Overview (6 min)
Relational Algebra Overview (4 min)
Relational Algebra Operators: Union, Difference, Selection (6 min)
Relational Algebra Operators: Projection, Cross Product (4 min)
Relational Algebra Operators: Cross Product cont'd, Join (6 min)
Relational Algebra Operators: Outer Join (4 min)
Relational Algebra Operators: Theta-Join (4 min)
From SQL to RA (6 min)
Thinking in RA: Logical Query Plans (4 min)
Practical SQL: Binning Timeseries (5 min)
Practical SQL: Genomic Intervals (6 min)
User-Defined Functions (3 min)
Support for User-Defined Functions (4 min)
Optimization: Physical Query Plans (5 min)
Optimization: Choosing Physical Plans (4 min)
Declarative Languages (5 min)
Declarative Languages: More Examples (4 min)
Views: Logical Data Independence (5 min)
Indexes (6 min)
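
To give a flavor of the relational algebra operators named in this module (selection, projection, join), here is a minimal Python sketch. It is not taken from the course materials: relations are modeled as lists of dictionaries, and the function names, the `students` and `enrolled` tables, and their contents are all hypothetical, chosen only for illustration.

```python
def select(relation, predicate):
    """Selection: keep only the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def project(relation, attributes):
    """Projection: keep only the named attributes, dropping duplicate rows."""
    result = []
    for row in relation:
        reduced = {name: row[name] for name in attributes}
        if reduced not in result:
            result.append(reduced)
    return result


def join(left, right, attribute):
    """Equi-join on a shared attribute, written as a simple nested loop."""
    return [
        {**l_row, **r_row}
        for l_row in left
        for r_row in right
        if l_row[attribute] == r_row[attribute]
    ]


# Hypothetical example data (an assumption, not from the course).
students = [{"sid": 1, "name": "Ann"}, {"sid": 2, "name": "Bob"}]
enrolled = [
    {"sid": 1, "course": "SQL"},
    {"sid": 1, "course": "MapReduce"},
    {"sid": 2, "course": "SQL"},
]

# Names of students enrolled in "SQL": project(select(join(...))).
sql_students = project(
    select(join(students, enrolled, "sid"), lambda row: row["course"] == "SQL"),
    ["name"],
)
```

Because every operator takes relations in and returns a relation, the calls compose freely. That composability is what lets a query optimizer reorder them, for example by applying the selection before the join.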
3
5 hours to complete

MapReduce and Parallel Dataflow Programming

The MapReduce programming model (as distinct from its implementations) was proposed as a simplifying abstraction for parallel manipulation of massive datasets, and remains an important concept to know when using and evaluating modern big data platforms. ...
26 videos (total 122 min), 1 quiz
26 videos
A Sketch of Algorithmic Complexity (5 min)
A Sketch of Data-Parallel Algorithms (5 min)
"Pleasingly Parallel" Algorithms (4 min)
More General Distributed Algorithms (4 min)
MapReduce Abstraction (4 min)
MapReduce Data Model (3 min)
Map and Reduce Functions (2 min)
MapReduce Simple Example (3 min)
MapReduce Simple Example cont'd (3 min)
MapReduce Example: Word Length Histogram (2 min)
MapReduce Examples: Inverted Index, Join (6 min)
Relational Join: Map Phase (4 min)
Relational Join: Reduce Phase (4 min)
Simple Social Network Analysis: Counting Friends (3 min)
Matrix Multiply Overview (5 min)
Matrix Multiply Illustrated (4 min)
Shared Nothing Computing (4 min)
MapReduce Implementation (5 min)
MapReduce Phases (6 min)
A Design Space for Large-Scale Data Systems (4 min)
Parallel and Distributed Query Processing (5 min)
Teradata Example, MR Extensions (5 min)
RDBMS vs. Hadoop: Features (6 min)
RDBMS vs. Hadoop: Grep (5 min)
RDBMS vs. Hadoop: Select, Aggregate, Join (3 min)
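
The MapReduce programming model described in this module, as distinct from any implementation, can be illustrated with a single-process Python sketch. This is an assumption-laden illustration rather than course code: `run_mapreduce`, `map_word_length`, and `reduce_count` are hypothetical names, and everything runs sequentially in memory, whereas a real platform would distribute the map, shuffle, and reduce phases across machines. The example computes a word length histogram, one of the examples named in the video list.

```python
from collections import defaultdict


def map_word_length(line):
    """Map: emit a (word length, 1) pair for every word in one input line."""
    for word in line.split():
        yield len(word), 1


def reduce_count(key, values):
    """Reduce: sum all counts collected for one key."""
    return key, sum(values)


def run_mapreduce(records, mapper, reducer):
    """Apply the mapper, group values by key (the shuffle), then reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return dict(reducer(key, values) for key, values in sorted(groups.items()))


# Hypothetical input lines, used only for illustration.
lines = ["big data at scale", "map and reduce"]
histogram = run_mapreduce(lines, map_word_length, reduce_count)
print(histogram)  # {2: 1, 3: 3, 4: 1, 5: 1, 6: 1}
```

The programmer supplies only the two small functions. Grouping by key belongs to the framework, which is what makes the abstraction straightforward to parallelize.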
4
3 hours to complete

NoSQL: Systems and Concepts

NoSQL systems are purely about scale rather than analytics, and are arguably less relevant for the practicing data scientist. However, they occupy an important place in many practical big data platform architectures, and data scientists need to understand their limitations and strengths to use them effectively....
36 videos (total 166 min)
36 videos
NoSQL Roundup (4 min)
Relaxing Consistency Guarantees (3 min)
Two-Phase Commit and Consensus Protocols (5 min)
Eventual Consistency (4 min)
CAP Theorem (4 min)
Types of NoSQL Systems (4 min)
ACID, Major Impact Systems (4 min)
Memcached: Consistent Hashing (2 min)
Consistent Hashing, cont'd (4 min)
DynamoDB: Vector Clocks (5 min)
Vector Clocks, cont'd (5 min)
CouchDB Overview (4 min)
CouchDB Views (3 min)
BigTable Overview (5 min)
BigTable Implementation (5 min)
HBase, Megastore (3 min)
Spanner (5 min)
Spanner cont'd, Google Systems (6 min)
MapReduce-based Systems (5 min)
Bringing Back Joins (4 min)
NoSQL Rebuttal (4 min)
Almost SQL: Pig (4 min)
Pig Architecture and Performance (3 min)
Data Model (3 min)
Load, Filter, Group (5 min)
Group, Distinct, Foreach, Flatten (5 min)
CoGroup, Join (3 min)
Join Algorithms (3 min)
Skew (5 min)
Other Commands (3 min)
Evaluation Walkthrough (3 min)
Review (6 min)
Context (3 min)
Spark Examples (5 min)
RDDs, Benefits (6 min)
2 hours to complete

Graph Analytics

Graph-structured data are increasingly common in data science contexts due to their ubiquity in modeling the communication between entities: people (social networks), computers (Internet communication), cities and countries (transportation networks), or corporations (financial transactions). Learn the common algorithms for extracting information from graph data and how to scale them up. ...
21 videos (total 91 min)
21 videos
Structural Analysis (4 min)
Degree Histograms, Structure of the Web (4 min)
Connectivity and Centrality (4 min)
PageRank (3 min)
PageRank in more Detail (3 min)
Traversal Tasks: Spanning Trees and Circuits (5 min)
Traversal Tasks: Maximum Flow (1 min)
Pattern Matching (6 min)
Querying Edge Tables (4 min)
Relational Algebra and Datalog for Graphs (4 min)
Querying Hybrid Graph/Relational Data (3 min)
Graph Query Example: NSA (6 min)
Graph Query Example: Recursion (4 min)
Evaluation of Recursive Programs (3 min)
Recursive Queries in MapReduce (4 min)
The End-Game Problem (3 min)
Representation: Edge Table, Adjacency List (4 min)
Representation: Adjacency Matrix (2 min)
PageRank in MapReduce (5 min)
PageRank in Pregel (5 min)
4.3
141 reviews

Top reviews

By HA, Jan 11th 2016

Great course that strikes a balance between teaching general principles and concepts, and providing hands-on technical skills and practice. The lessons are well designed and clearly conveyed.

By SL, May 28th 2016

I like the breadth of coverage of this class. Each of the exercise is a gem in that I get to learn something new also. I would highly recommend this even to experience practitioner also.

Instructor

Bill Howe

Director of Research
Scalable Data Analytics

About University of Washington

Founded in 1861, the University of Washington is one of the oldest state-supported institutions of higher education on the West Coast and is one of the preeminent research universities in the world....

About the Data Science at Scale Specialization

Learn scalable data management, evaluate big data technologies, and design effective visualizations. This Specialization covers intermediate topics in data science. You will gain hands-on experience with scalable SQL and NoSQL data management solutions, data mining algorithms, and practical statistical and machine learning concepts. You will also learn to visualize data and communicate results, and you’ll explore legal and ethical issues that arise in working with big data. In the final Capstone Project, developed in partnership with the digital internship platform Coursolve, you’ll apply your new skills to a real-world data science project....

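Frequently Asked Questions

  • Once you enroll for a Certificate, you will have access to all videos, quizzes, and programming assignments (if applicable). Peer review assignments can only be submitted and reviewed once your session has begun. If you choose to explore the course without purchasing, you may not be able to access certain assignments.

  • When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page, where you can print it or add it to your LinkedIn profile. If you only want to read and view the course content, you can audit the course for free.

More questions? Visit the Learner Help Center.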