Learner Reviews & Feedback for Distributed Computing with Spark SQL by University of California, Davis

4.5
216 ratings
59 reviews

About the Course

This course is for students who have SQL experience and now want to take the next step in gaining familiarity with distributed computing using Spark. Students will gain an understanding of when to use Spark and how Spark as an engine uniquely combines Data and AI technologies at scale. The four modules build on one another, and by the end of the course the student will understand Spark architecture, Spark DataFrames, optimizing reading and writing data, and how to build a machine learning model. The first module introduces Spark, including how Spark works with distributed computing and what Spark DataFrames are. Module 2 covers the core concepts of Spark, such as storage vs. computing, caching, partitions, and the Spark UI. The third module looks at Engineering Data Pipelines, covering connecting to databases, schemas and types, file formats, and writing good data. The final module looks at the application of Spark with Machine Learning through a business use case, a short introduction to what machine learning is, building and applying models, and a final course conclusion. By understanding when to use Spark, either scaling out when the model or data is too large to process on a single machine, or simply speeding up to get faster results, students will hone their SQL skills and become a more adept Data Scientist....

Top reviews

GT
Jun 9, 2020

I highly recommend this course for anyone in the BI and Data space interested in learning Spark. The course gives an easy-to-understand introduction to the framework and applicable hands-on examples.

KS
May 13, 2020

Amazing course that really cuts through the fundamentals of using distributed computing power to analyze and manipulate data. Well organised structure on fundamentals

1 - 25 of 59 Reviews for Distributed Computing with Spark SQL

By Steven O

Apr 5, 2020

A more appropriate title for the class would be "a brief introduction to Databricks". Very disappointing class. There are YouTube tutorials out there with more content than this class. This is one of the only classes that I have ever taken on Coursera where I could complete 2 weeks' worth of all the lectures, assignments, and quizzes in a Sunday afternoon. I think this class was hastily slapped together; there is so little content. If your organization uses Spark and is not a Databricks client (as mine is), you will learn absolutely nothing here. The lectures are extremely short and devoid of any substance. I am still looking for a good online class in Spark. It certainly is not this one.

By Sacha v W

Feb 19, 2020

Very superficial, using Databricks. The course lacks the depth to be of any use. It is more of a Databricks commercial. You execute pieces of the provided course material without sufficient practice.

By Joseph B

Jan 6, 2020

Extremely informative for those who are seeking to learn the fundamentals for distributed computing using Spark SQL.

By Daniel Y

Sep 9, 2020

Very useful.

By Zaynul A

Mar 4, 2020

I was expecting more advanced material.

By Alex C

May 27, 2020

It was an interesting course inasmuch as it has got me interested in Spark and it was doable. I think it tried to cover too much ground in not enough depth. After completing it I have gone off and am doing the DataCamp Spark courses, which are also interesting.

The implementation work in Databricks was really annoying in that the platform used a ´´, or whatever it actually was. I still don't know! I just had to copy and paste it every time. It was never mentioned that it didn't work like SQL with [], or that it wasn't an apostrophe or whatever.

The use of Jupyter notebooks itself was nice, and the exercises were also nice as a learning exercise. I got a lot out of them by having to actually find out some things and see "ah ha, that's how it works".

The presenters were very good. I could be critical of a few points, but I won't, as I am guessing it's their first MOOC or so, and my personal opinions are irrelevant in my annoyances :-)

All in all a nice course, as it has got me interested and actually up and running with Spark, so I can see where and how it fits and will look further...

Many thanks!

By Noah M

May 10, 2020

A highly polished presentation; however, I still feel I have only a superficial understanding of partitions and other Spark optimisation techniques. In Course 4 of this Specialization, I had to search for myself how best to set partition parameters (i.e. how to choose a value), which perhaps should've been covered in this course.

High-level definitions are given, but not so much in the way of actual application to clarify the concepts.

By Bryan B

Jul 5, 2020

The first module felt more like a sales pitch for Databricks than anything else, and the last module was about machine learning, not distributed computing. So, in my opinion, only 2 of the weeks attempted to focus on distributed computing, but even they failed. The course seemed to focus way more on SQL, and less on Spark and how it works. Sure, there were pieces of information on how to change the number of partitions, but the coverage of how partitions work, or how Spark actually handles distributed computing, was lackluster at best. If you have even a rudimentary understanding of data engineering, you should be able to ace this course with minimal effort, but you'll likely not take much away from it. Great course for absolute beginners though.

By Palak S

Jun 6, 2020

I did not like the flow of the content explained! I expected a lot from this course, but at the end I just have a basic idea of queries! Nothing in depth about Spark's core concepts. Also, the assignment quizzes on queries were very weird and not properly formed! The Week 3 assignment was not displaying feedback! It was a really messy course!

By Daniel C J

Sep 30, 2020

While I wish I'd learned a bit of Python before taking this course (to help with troubleshooting in the final module), overall I found the course extremely well put-together and incredibly useful for understanding SQL's role in the larger world of data science. The instructors are easy to follow, and the notebooks in Databricks create great supplements to your course notes. A few of the questions were a little confusing, but overall, I was very glad I took this course.

By Deepika S

May 20, 2020

This course is a great learning source for Distributed Computing with Spark SQL. I got started with the course and learnt the basic concepts, dos and don'ts.

Concepts are explained well, and the work notebooks provided the needed hands-on experience.

Thanks for the course.

Best,

Deepika Sharma

By Serjesh S

May 29, 2020

I wanted to quickly revisit Spark SQL on the Databricks platform after last using Spark (on premise) 3 years ago. This course provided a perfect refresher on all the important concepts. Module 4 is especially pleasant and takes it a little closer to BigQuery ML.

By Takashi T

Oct 11, 2020

The course was easy and clear to follow. The assignments and quizzes were easy to complete. Also, by checking the discussion forum, I can see that both instructors check it and provide help to people who post questions. I highly recommend this course.

By Pooja N D

Jul 27, 2020

Well-explained course by the trainers, and good assignments set by them. I learnt a lot about Spark, its architecture, and how it works, which I can use in several of my upcoming projects. Thank you for the course!

By George T

Jun 10, 2020

I highly recommend this course for anyone in the BI and Data space interested in learning Spark. The course gives an easy-to-understand introduction to the framework and applicable hands-on examples.

By Kumar S

May 14, 2020

Amazing course that really cuts through the fundamentals of using distributed computing power to analyze and manipulate data. Well organised structure on fundamentals

By Elliot T

Jul 13, 2020

Great introduction to Spark with Databricks, which seems to be an intuitive tool! Really cool to make the link between SQL and Data Science with a basic ML example!

By Dilin J K J

Feb 11, 2020

This has been an amazing course. What is worth mentioning is how the content was delivered. Nice hands on. Highly recommended for anyone who is new to Spark

By oisin d

Mar 26, 2020

Great course, really well taught and delivered. The only thing I would say is that you really need knowledge of Python to understand this course 100%.

By Nilupa R

Nov 1, 2020

I loved engaging in this course. It is a concise course that teaches more about Spark SQL and machine learning capabilities in an understandable manner.

By Isaac T

Feb 23, 2020

Great introduction to Spark SQL and MLflow. I love that they give extra resources if you want to learn more. It was a fun learning journey.

By Metodi S

Nov 14, 2020

This was one of the best courses I've taken on Coursera. It represents a perfect blend of easy to understand Spark, Python and ML.

By scott j

Aug 20, 2020

Great course to learn more in depth about Apache Spark. Good instructors and course content. Thanks UC Davis and Coursera!

By Tina M

Apr 26, 2020

The information was very beneficial, and the ability to use Databricks helped put into practice the information learned.

By GANESH H

Jul 2, 2020

Good course for understanding distributed computing and how machine learning is done in a Spark environment.