Big Data - Capstone Project

This course is part of the Big Data Specialization

Taught in English

Instructors: Ilkay Altintas

16,734 already enrolled

Course: Gain insight into a topic and learn the fundamentals

Rating: 4.4 (393 reviews)

Duration: approximately 20 hours

Schedule: flexible; learn at your own pace

Details to know

Shareable certificate

Add to your LinkedIn profile

Assessments

1 quiz

When you enroll in this course, you'll also be enrolled in this Specialization.

There are 7 modules in this course

Welcome to the Capstone Project for Big Data! In this culminating project, you will build a big data ecosystem using tools and methods from the earlier courses in this specialization. You will analyze a data set simulating big data generated by a large number of users playing our imaginary game "Catch the Pink Flamingo". During the five-week Capstone Project, you will walk through the typical big data science steps: acquiring, exploring, preparing, analyzing, and reporting. In the first two weeks, we will introduce you to the data set and guide you through some exploratory analysis using tools such as Splunk and Open Office. Then we will move on to more challenging big data problems that require the more advanced tools you have learned, including KNIME, Spark's MLlib, and Gephi. Finally, during the fifth and final week, we will show you how to bring it all together to create engaging and compelling reports and slide presentations. As a result of our collaboration with Splunk, a software company focused on analyzing machine-generated big data, learners with the top projects will be eligible to present to Splunk and meet Splunk recruiters and engineering leadership.

Simulating Big Data for an Online Game

This week we provide an overview of the Eglence, Inc. Pink Flamingo game, including the data the company has about the game and its users, and the questions we might want that data to answer.

What's included

4 videos, 4 readings

4 videos (total 17 minutes)

Welcome to the Big Data Capstone Project (2 minutes)
Welcome from Splunk: Rob Reed, World Education Evangelist (3 minutes)
A Summary of Catch the Pink Flamingo (7 minutes)
A Conceptual Schema for Catch the Pink Flamingo (4 minutes)

4 readings (total 35 minutes)

Planning, Preparation, and Review (10 minutes)
A Game by Eglence Inc.: Catch The Pink Flamingo (10 minutes)
Overview of the Catch the Pink Flamingo Data Model (10 minutes)
Overview of Final Project Design (5 minutes)

Acquiring, Exploring, and Preparing the Data

Next, we begin working with the simulated game data by exploring and preparing the data for ingestion into big data analytics applications.

What's included

6 readings, 1 quiz, 1 peer review

6 readings (total 140 minutes)

Downloading the Game Data and Associated Scripts (10 minutes)
Understanding the CSV Files Generated by the Scripts (20 minutes)
Optional Review of Splunk (0 minutes)
"Catch the Pink Flamingo" Data Exploration with Splunk (45 minutes)
Aggregate Calculations Using Splunk (45 minutes)
Filtering the Data With Splunk (20 minutes)

1 quiz (total 30 minutes)

Data Exploration With Splunk (30 minutes)

1 peer review (total 60 minutes)

Data Exploration Technical Appendix (60 minutes)

Data Classification with KNIME

This week we classify data using KNIME.

What's included

4 readings, 1 peer review

Clustering with Spark

This week we cluster the data with Spark.

What's included

2 readings, 1 peer review, 3 discussion prompts

2 readings (total 35 minutes)

Informing business strategies based on client base (5 minutes)
Practice with PySpark MLlib Clustering (30 minutes)

1 peer review (total 200 minutes)

Recommending Actions from Clustering Analysis (200 minutes)

3 discussion prompts (total 40 minutes)

Is there only "one way" to cluster a client base? (15 minutes)
How many clusters? (10 minutes)
What kind of criteria might provide actionable information for Eglence Inc.? (15 minutes)
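
The "How many clusters?" prompt can be explored with a small experiment. The following is a minimal sketch in plain Python, not the course's PySpark MLlib workflow: it runs k-means on a made-up set of two-feature player records and reports the within-cluster sum of squares (WCSS) for several values of k. The data, the meaning of the two features, and every function name here are hypothetical and chosen for illustration only.

```python
from typing import List, Tuple

Point = Tuple[float, float]

# Hypothetical player records: (feature_a, feature_b). Made-up values,
# not taken from the Catch the Pink Flamingo data set.
PLAYERS: List[Point] = [
    (1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (2.0, 2.0),
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0),
    (1.0, 10.0), (1.0, 11.0), (2.0, 10.0), (2.0, 11.0),
]


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two 2-D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def init_centroids(points: List[Point], k: int) -> List[Point]:
    """Deterministic farthest-point initialization (an assumption of this
    sketch): start at the first point, then repeatedly add the point that
    is farthest from its nearest already-chosen centroid."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points: List[Point], k: int, max_iter: int = 100):
    """Lloyd's algorithm: alternate assignment and centroid update until
    the labels stop changing. Returns (centroids, labels)."""
    centroids = init_centroids(points, k)
    labels: List[int] = []
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return centroids, labels


def wcss(points: List[Point], centroids: List[Point], labels: List[int]) -> float:
    """Within-cluster sum of squared distances to the assigned centroid."""
    return sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))


if __name__ == "__main__":
    for k in range(1, 5):
        cents, labs = kmeans(PLAYERS, k)
        print(k, wcss(PLAYERS, cents, labs))
```

A sharp drop in WCSS followed by a flattening as k grows is one common heuristic for choosing the number of clusters. It is only a starting point, and whether the resulting groups are actionable is a separate judgment.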

Graph Analytics of Simulated Chat Data With Neo4j

This week we apply what we learned in the 'Graph Analytics for Big Data' course to simulated chat data from Catch the Pink Flamingo using Neo4j. We analyze player chat behavior to find ways of improving the game.

What's included

2 readings, 1 peer review

Reporting and Presenting Your Work

What's included

1 video, 1 reading

Final Submission

What's included

1 video, 1 reading, 2 peer reviews

Instructors

Instructor rating: 4.8 (22 ratings)

Ilkay Altintas

University of California San Diego

14 courses, 482,743 learners

Amarnath Gupta

University of California San Diego

10 courses, 456,073 learners

Offered by

University of California San Diego

Learner reviews

Rating: 4.4 (393 reviews)

5 stars: 66.15%
4 stars: 21.62%
3 stars: 5.85%
2 stars: 1.78%
1 star: 4.58%

The 7 modules, in order:

1. Simulating Big Data for an Online Game
2. Acquiring, Exploring, and Preparing the Data
3. Data Classification with KNIME
4. Clustering with Spark
5. Graph Analytics of Simulated Chat Data With Neo4j
6. Reporting and Presenting Your Work
7. Final Submission
