Introduction to Parallel Programming with CUDA

This course is part of the GPU Programming Specialization

Taught in English

Some content may not be translated

Instructor: Chancellor Thomas Pascale

4,511 already enrolled

Included with Coursera Plus

Course: Gain insight into a topic and learn the fundamentals

Rating: 3.0 (28 reviews)

Level: Intermediate

Duration: approximately 21 hours

Flexible schedule: learn at your own pace

What you'll learn

Students will learn how to use the CUDA framework to write C/C++ software that runs on CPUs and Nvidia GPUs.
Students will transform sequential CPU algorithms and programs into CUDA kernels that run as hundreds to thousands of simultaneous threads on GPU hardware.
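
As a rough illustration of turning a sequential CPU loop into a CUDA kernel, the sketch below adds two arrays both ways. It is not taken from the course materials: the names addOnCpu, addKernel, and addOnGpu are hypothetical, the block size of 256 is an arbitrary choice, and CUDA error checking is left out for brevity.

```cuda
#include <cassert>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Sequential CPU version: a single loop visits every element in turn.
void addOnCpu(const float* a, const float* b, float* out, int n) {
    for (int i = 0; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}

// CUDA kernel: the loop disappears, and each thread computes the one
// element that loop iteration i would have handled.
__global__ void addKernel(const float* a, const float* b, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {  // the last block may contain threads past the end
        out[i] = a[i] + b[i];
    }
}

// Hypothetical host wrapper (an assumption for this sketch, not a course API).
// Error checking is omitted to keep the example short.
std::vector<float> addOnGpu(const std::vector<float>& a,
                            const std::vector<float>& b) {
    assert(a.size() == b.size());
    int n = static_cast<int>(a.size());
    std::vector<float> out(n);
    if (n == 0) return out;

    size_t bytes = static_cast<size_t>(n) * sizeof(float);
    float* dA = nullptr;
    float* dB = nullptr;
    float* dOut = nullptr;
    cudaMalloc(&dA, bytes);
    cudaMalloc(&dB, bytes);
    cudaMalloc(&dOut, bytes);
    cudaMemcpy(dA, a.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dB, b.data(), bytes, cudaMemcpyHostToDevice);

    int threadsPerBlock = 256;  // arbitrary choice for this sketch
    int blocks = (n + threadsPerBlock - 1) / threadsPerBlock;  // round up
    addKernel<<<blocks, threadsPerBlock>>>(dA, dB, dOut, n);

    // This blocking copy runs after the kernel has finished.
    cudaMemcpy(out.data(), dOut, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dOut);
    return out;
}

int main() {
    const int n = 10000;
    std::vector<float> a(n), b(n), expected(n);
    for (int i = 0; i < n; ++i) {
        a[i] = static_cast<float>(i);
        b[i] = static_cast<float>(2 * i);
    }
    addOnCpu(a.data(), b.data(), expected.data(), n);
    std::vector<float> actual = addOnGpu(a, b);
    for (int i = 0; i < n; ++i) {
        assert(actual[i] == expected[i]);
    }
    std::printf("CPU and GPU results match for %d elements\n", n);
    return 0;
}
```

Assuming the file is saved as add.cu, it can be built with nvcc add.cu -o add on a machine with the CUDA toolkit and an Nvidia GPU.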

Details to know

Shareable certificate (can be added to your LinkedIn profile)
Assessments: 5 quizzes

When you enroll in this course, you'll also be enrolled in the GPU Programming Specialization.

There are 5 modules in this course

This course prepares students to develop code that processes large amounts of data in parallel on Graphics Processing Units (GPUs). Students will learn how to implement software that solves complex problems on leading consumer-grade to enterprise-grade GPUs using Nvidia CUDA. They will focus on hardware and software capabilities, including the use of hundreds to thousands of threads and various forms of memory.

Course Overview

The purpose of this module is for students to understand how the course will be run, which topics it covers, how they will be assessed, and what is expected of them.

What's included

3 videos, 4 readings, 1 programming assignment, 1 discussion prompt, 1 ungraded lab

3 videos (total 10 minutes)

GPU Programming Specialization (3 minutes)
Course Expectations (2 minutes)
Coursera Lab and Assignment Overview (5 minutes)

4 readings (total 65 minutes)

Course Overview (10 minutes)
Course Outline (10 minutes)
VS Code and GitHub Resources (30 minutes)
C++ Reading Material (15 minutes)

1 programming assignment (total 60 minutes)

Simple CUDA Project Assignment (60 minutes)

1 discussion prompt (total 15 minutes)

Large Scale Data and Challenges Discussion (15 minutes)

1 ungraded lab (total 60 minutes)

Simple CUDA code Lab (60 minutes)

Threads, Blocks and Grids

The single most important concept for using GPUs to solve complex, large-scale problems is the management of threads. CUDA provides two- and three-dimensional logical abstractions of threads, blocks, and grids. Students will develop programs that use threads, blocks, and grids to process large two- and three-dimensional data sets.
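
To make the thread, block, and grid abstractions concrete, here is a minimal sketch that maps a two-dimensional grid of threads onto the pixels of an image and converts RGB to grayscale. It is an illustration only, not the course's assignment solution: the names rgbToGrayKernel and rgbToGray are hypothetical, the image is assumed to be interleaved 8-bit RGB, the 16x16 block shape is arbitrary, and a plain integer average of the three channels is assumed instead of any particular weighted formula.

```cuda
#include <cassert>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Each thread converts one pixel. The 2D block and grid indices give
// the thread its (x, y) position in the image.
__global__ void rgbToGrayKernel(const unsigned char* rgb, unsigned char* gray,
                                int width, int height) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;  // column
    int y = blockIdx.y * blockDim.y + threadIdx.y;  // row
    if (x >= width || y >= height) return;          // outside the image

    int p = y * width + x;  // row-major pixel index
    // Assumption for this sketch: unweighted integer average of R, G, B.
    int sum = rgb[3 * p] + rgb[3 * p + 1] + rgb[3 * p + 2];
    gray[p] = static_cast<unsigned char>(sum / 3);
}

// Hypothetical host wrapper; error checking omitted for brevity.
std::vector<unsigned char> rgbToGray(const std::vector<unsigned char>& rgb,
                                     int width, int height) {
    size_t pixels = static_cast<size_t>(width) * height;
    assert(rgb.size() == 3 * pixels);
    std::vector<unsigned char> gray(pixels);
    if (pixels == 0) return gray;

    unsigned char* dRgb = nullptr;
    unsigned char* dGray = nullptr;
    cudaMalloc(&dRgb, 3 * pixels);
    cudaMalloc(&dGray, pixels);
    cudaMemcpy(dRgb, rgb.data(), 3 * pixels, cudaMemcpyHostToDevice);

    dim3 block(16, 16);  // 256 threads per block, arranged in 2D
    dim3 grid((width + block.x - 1) / block.x,    // enough blocks to
              (height + block.y - 1) / block.y);  // cover every pixel
    rgbToGrayKernel<<<grid, block>>>(dRgb, dGray, width, height);

    cudaMemcpy(gray.data(), dGray, pixels, cudaMemcpyDeviceToHost);
    cudaFree(dRgb);
    cudaFree(dGray);
    return gray;
}

int main() {
    // A 2x2 test image: four pixels, three bytes each.
    std::vector<unsigned char> rgb = {
        3, 6, 9,        0, 0, 0,
        255, 255, 255,  10, 20, 30,
    };
    std::vector<unsigned char> gray = rgbToGray(rgb, 2, 2);
    assert(gray[0] == 6 && gray[1] == 0 && gray[2] == 255 && gray[3] == 20);
    std::printf("Converted %zu pixels\n", gray.size());
    return 0;
}
```

A three-dimensional data set would follow the same pattern, with a z index computed from blockIdx.z, blockDim.z, and threadIdx.z.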

What's included

8 videos, 1 reading, 2 quizzes, 2 programming assignments, 1 ungraded lab

8 videos (total 49 minutes)

Kernel Execution (6 minutes)
Divide and Conquer to GPU Algorithms (5 minutes)
Module 2 Lab Overview Video (4 minutes)
Module 2 Randomized Data Search Assignment Overview Video (4 minutes)
Threads and Blocks (7 minutes)
Threads, Blocks, and Grids (6 minutes)
Multidimensional Gaussian Blur Kernel Example (7 minutes)
Module 2 Image Processing Assignment Overview Videos (6 minutes)

1 reading (total 10 minutes)

Nvidia CUDA Software and Hardware Reading Materials (10 minutes)

2 quizzes (total 60 minutes)

Multidimensional Data and Computation on the GPU Quiz (30 minutes)
CPU to GPU Algorithm Conversion Quiz (30 minutes)

2 programming assignments (total 180 minutes)

Data Search Programming Assignment (60 minutes)
Performing RGB to Grayscale on Image Data Assignment (120 minutes)

1 ungraded lab (total 30 minutes)

CUDA Computation on Data Lab Activity (30 minutes)

Host and Global Memory

To manage the access and modification of data in physical memory effectively, students will need to load data into CPU (host) and GPU (global) general-purpose memory. Students will create software that allocates host memory and transfers its contents into global memory for use by threads. Students will also learn the capabilities and speeds of these types of memory.
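
The allocate-and-transfer pattern described here might look roughly like the following sketch. It uses the CUDA runtime calls cudaMalloc, cudaMemcpy, cudaMallocHost, and cudaFree; the helper names check, incrementKernel, and incrementOnGpu are hypothetical and exist only for this illustration. The example uses one pageable host buffer (from malloc) and one pinned host buffer (from cudaMallocHost) to show two kinds of host memory.

```cuda
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: stop with a message if a CUDA call fails.
static void check(cudaError_t err) {
    if (err != cudaSuccess) {
        std::fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        std::exit(EXIT_FAILURE);
    }
}

// Threads read and modify the data once it is in global memory.
__global__ void incrementKernel(int* data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1;
}

// Copies n ints from host memory into global memory, adds 1 to each
// on the GPU, and copies the results back into host memory.
void incrementOnGpu(const int* hostIn, int* hostOut, int n) {
    if (n <= 0) return;
    size_t bytes = static_cast<size_t>(n) * sizeof(int);

    int* deviceData = nullptr;
    check(cudaMalloc(&deviceData, bytes));  // allocate global memory
    check(cudaMemcpy(deviceData, hostIn, bytes, cudaMemcpyHostToDevice));

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    incrementKernel<<<blocks, threads>>>(deviceData, n);
    check(cudaGetLastError());  // reports launch configuration errors

    check(cudaMemcpy(hostOut, deviceData, bytes, cudaMemcpyDeviceToHost));
    check(cudaFree(deviceData));
}

int main() {
    const int n = 1000;
    const size_t bytes = n * sizeof(int);

    // Pageable host memory from the ordinary C allocator.
    int* pageable = static_cast<int*>(std::malloc(bytes));
    assert(pageable != nullptr);

    // Pinned (page-locked) host memory allocated through the CUDA runtime.
    int* pinned = nullptr;
    check(cudaMallocHost(&pinned, bytes));

    for (int i = 0; i < n; ++i) pageable[i] = i;
    incrementOnGpu(pageable, pinned, n);
    for (int i = 0; i < n; ++i) assert(pinned[i] == i + 1);
    std::printf("Round trip through global memory succeeded\n");

    std::free(pageable);
    check(cudaFreeHost(pinned));
    return 0;
}
```

Pinned memory generally allows faster host-to-device transfers than pageable memory, at the cost of reducing the memory available to the rest of the system, so it is usually reserved for buffers that are transferred often.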

What's included

8 videos, 1 quiz, 1 programming assignment, 1 discussion prompt, 2 ungraded labs

8 videos (total 22 minutes)

Nvidia GPU Device Global Memory (3 minutes)
Linux CLI GPU Device Identification (1 minute)
GPU Device Global Memory Investigation (2 minutes)
Nvidia GPU Device Investigation Commands Lab Overview Video (1 minute)
Host Memory Allocation (5 minutes)
Device Global Memory Allocation (3 minutes)
Host and Device Global Memory Allocation Lab Overview Video (3 minutes)
Allocation and Assignment of Different Types of Host and Global Memory Overview Video (2 minutes)

1 quiz (total 30 minutes)

CPU and GPU Global Memory Quiz (30 minutes)

1 programming assignment (total 120 minutes)

Allocation and Assignment of Different Types of Host and Global Memory (120 minutes)

1 discussion prompt (total 10 minutes)

Student CPU Memory and GPU Discussion (10 minutes)

2 ungraded labs (total 120 minutes)

Nvidia GPU Device Investigation Commands Lab (60 minutes)
Host and Device Global Memory Allocation Lab (60 minutes)

Shared and Constant Memory

To improve performance in GPU software, students will need to use mutable (shared) and static (constant) memory. They will use these memories to apply masks to all items of a data set, to manage communication between threads, and to cache data in complex programs.
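
One way to picture the two memory types working together is a one-dimensional mask (convolution) applied to an array: the mask sits in constant memory, and each block stages its slice of the input in shared memory. The sketch below is an illustration under stated assumptions, not course code: the names dMask, applyMaskKernel, and applyMask are hypothetical, the mask width is fixed at 3, the block size at 128, and elements outside the array are treated as zero.

```cuda
#include <cassert>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

constexpr int kRadius = 1;                 // mask reaches 1 element each side
constexpr int kMaskWidth = 2 * kRadius + 1;
constexpr int kBlockSize = 128;            // must match the launch below

// Constant memory: read-only in kernels, written from the host.
__constant__ float dMask[kMaskWidth];

__global__ void applyMaskKernel(const float* in, float* out, int n) {
    // Shared memory: one tile per block, plus halo cells on both sides.
    __shared__ float tile[kBlockSize + 2 * kRadius];

    int g = blockIdx.x * blockDim.x + threadIdx.x;  // global index
    int t = threadIdx.x + kRadius;                  // index inside the tile

    tile[t] = (g < n) ? in[g] : 0.0f;
    if (threadIdx.x < kRadius) {
        // The first kRadius threads also load the left and right halos.
        int left = g - kRadius;
        int right = g + blockDim.x;
        tile[threadIdx.x] = (left >= 0 && left < n) ? in[left] : 0.0f;
        tile[t + blockDim.x] = (right < n) ? in[right] : 0.0f;
    }
    __syncthreads();  // every thread waits until the tile is complete

    if (g < n) {
        float sum = 0.0f;
        for (int k = -kRadius; k <= kRadius; ++k) {
            sum += dMask[k + kRadius] * tile[t + k];
        }
        out[g] = sum;
    }
}

// Hypothetical host wrapper; mask must point to kMaskWidth floats.
// Error checking omitted for brevity.
std::vector<float> applyMask(const std::vector<float>& in, const float* mask) {
    int n = static_cast<int>(in.size());
    std::vector<float> out(n);
    if (n == 0) return out;

    size_t bytes = static_cast<size_t>(n) * sizeof(float);
    float* dIn = nullptr;
    float* dOut = nullptr;
    cudaMalloc(&dIn, bytes);
    cudaMalloc(&dOut, bytes);
    cudaMemcpy(dIn, in.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpyToSymbol(dMask, mask, kMaskWidth * sizeof(float));

    int blocks = (n + kBlockSize - 1) / kBlockSize;
    applyMaskKernel<<<blocks, kBlockSize>>>(dIn, dOut, n);

    cudaMemcpy(out.data(), dOut, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dIn);
    cudaFree(dOut);
    return out;
}

int main() {
    const int n = 1000;
    const float mask[kMaskWidth] = {1.0f, 2.0f, 1.0f};
    std::vector<float> in(n);
    for (int i = 0; i < n; ++i) in[i] = static_cast<float>(i % 7);

    std::vector<float> out = applyMask(in, mask);
    for (int i = 0; i < n; ++i) {  // CPU reference with zero padding
        float left = (i > 0) ? in[i - 1] : 0.0f;
        float right = (i < n - 1) ? in[i + 1] : 0.0f;
        assert(out[i] == left + 2.0f * in[i] + right);
    }
    std::printf("Mask applied to %d elements\n", n);
    return 0;
}
```

The __syncthreads() call is the thread communication step: no thread reads a neighbor's tile entry until every thread in the block has finished writing its own.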

What's included

6 videos, 1 quiz, 1 programming assignment, 1 discussion prompt, 1 ungraded lab

6 videos (total 22 minutes)

Nvidia GPU Device Shared and Constant Memory Video Lecture (3 minutes)
GPU Device Shared and Constant Memory Investigation (3 minutes)
GPU Device Shared Memory Allocation (3 minutes)
GPU Device Constant Memory Allocation (4 minutes)
CUDA Shared and Constant Memory Image Processing Lab Overview Video (3 minutes)
CUDA Shared and Constant Memory Image Manipulation Assignment Overview Video (4 minutes)

1 quiz (total 15 minutes)

CUDA Constant and Shared Memory Quiz (15 minutes)

1 programming assignment (total 120 minutes)

CUDA Shared and Constant Memory Image Manipulation Assignment (120 minutes)

1 discussion prompt (total 10 minutes)

Shared and Constant Memory Discussion (10 minutes)

1 ungraded lab (total 60 minutes)

CUDA Shared and Constant Memory Image Processing Lab (60 minutes)

Register Memory

In this module, students will learn the benefits and constraints of the GPU's most hyper-localized memory: registers. While using this type of memory will come naturally to students, gaining the largest performance boost from it, as with all forms of memory, requires thoughtful software design. Students will implement algorithms using each type of memory and analyze their performance.
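
A small experiment can show why register use matters. The sketch below computes row sums of a matrix twice: once keeping the running total in global memory, and once in a local variable that the compiler will normally place in a register. The names rowSumGlobal, rowSumRegister, and rowSums are hypothetical, the matrix size is arbitrary, and timings measured with CUDA events will vary by GPU and compiler version, so no particular speedup is claimed here.

```cuda
#include <cassert>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Version A: the running total is read from and written to global
// memory on every loop iteration.
__global__ void rowSumGlobal(const float* in, float* out, int rows, int cols) {
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row >= rows) return;
    out[row] = 0.0f;
    for (int c = 0; c < cols; ++c) out[row] += in[row * cols + c];
}

// Version B: the running total is a local scalar, which the compiler
// normally keeps in a register; global memory is written once.
__global__ void rowSumRegister(const float* in, float* out, int rows, int cols) {
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row >= rows) return;
    float sum = 0.0f;
    for (int c = 0; c < cols; ++c) sum += in[row * cols + c];
    out[row] = sum;
}

// Hypothetical host wrapper. Optionally reports the kernel time in
// milliseconds. Error checking omitted for brevity.
std::vector<float> rowSums(const std::vector<float>& in, int rows, int cols,
                           bool useRegister, float* elapsedMs = nullptr) {
    std::vector<float> out(rows > 0 ? rows : 0);
    if (rows <= 0 || cols <= 0) return out;
    assert(in.size() == static_cast<size_t>(rows) * cols);

    size_t inBytes = in.size() * sizeof(float);
    size_t outBytes = static_cast<size_t>(rows) * sizeof(float);
    float* dIn = nullptr;
    float* dOut = nullptr;
    cudaMalloc(&dIn, inBytes);
    cudaMalloc(&dOut, outBytes);
    cudaMemcpy(dIn, in.data(), inBytes, cudaMemcpyHostToDevice);

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);
    int threads = 128;
    int blocks = (rows + threads - 1) / threads;

    cudaEventRecord(start);
    if (useRegister) {
        rowSumRegister<<<blocks, threads>>>(dIn, dOut, rows, cols);
    } else {
        rowSumGlobal<<<blocks, threads>>>(dIn, dOut, rows, cols);
    }
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);  // wait until the kernel has finished

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    if (elapsedMs != nullptr) *elapsedMs = ms;

    cudaMemcpy(out.data(), dOut, outBytes, cudaMemcpyDeviceToHost);
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(dIn);
    cudaFree(dOut);
    return out;
}

int main() {
    const int rows = 4096, cols = 1024;
    std::vector<float> in(static_cast<size_t>(rows) * cols);
    for (size_t i = 0; i < in.size(); ++i) in[i] = static_cast<float>(i % 3);

    rowSums(in, rows, cols, true);  // warm-up run, absorbs one-time startup cost
    float msGlobal = 0.0f, msRegister = 0.0f;
    std::vector<float> a = rowSums(in, rows, cols, false, &msGlobal);
    std::vector<float> b = rowSums(in, rows, cols, true, &msRegister);
    for (int r = 0; r < rows; ++r) assert(a[r] == b[r]);

    std::printf("global accumulator:   %.3f ms\n", msGlobal);
    std::printf("register accumulator: %.3f ms\n", msRegister);
    return 0;
}
```

Registers are private to each thread and limited in number, so a kernel that uses too many of them per thread can reduce how many threads run concurrently or force values to spill into slower memory. This is part of the design trade-off the module description refers to.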

What's included

5 videos, 1 quiz, 1 programming assignment, 1 discussion prompt, 1 ungraded lab

5 videos (total 23 minutes)

CUDA GPU Device Register Memory (3 minutes)
CUDA GPU Device Register Memory Investigation (3 minutes)
CUDA GPU Device Memory Evaluation (6 minutes)
CUDA Device Memory Lab Overview Video (4 minutes)
CUDA Device Memory Analysis Assignment Overview Video (5 minutes)

1 quiz (total 15 minutes)

Device Memory Quiz (15 minutes)

1 programming assignment (total 120 minutes)

CUDA Device Memory Analysis Assignment (120 minutes)

1 discussion prompt (total 10 minutes)

GPU Device Memory Analysis Discussion (10 minutes)

1 ungraded lab (total 60 minutes)

CUDA Device Memory Lab (60 minutes)

Instructor

Chancellor Thomas Pascale, Johns Hopkins University
Instructor rating: 2.3 (10 ratings)
4 courses, 11,164 learners

Offered by

Johns Hopkins University

Recommended if you're interested in Software Development

CUDA at Scale for the Enterprise (Course, Johns Hopkins University)
Introduction to Concurrent Programming with GPUs (Course, Johns Hopkins University)
CUDA Advanced Libraries (Course, Johns Hopkins University)
GPU Programming (Specialization, Johns Hopkins University)

Learner reviews

3.0 (28 reviews)

5 stars: 21.42%
4 stars: 17.85%
3 stars: 25%
2 stars: 14.28%
1 star: 21.42%

Frequently asked questions

Can I program on my own desktop/laptop?

Yes, but for grading purposes you will still need to upload any software artifacts (source code, header files, etc.) into the Coursera lab environment.

When will I have access to the lectures and assignments?

Access to lectures and assignments depends on your type of enrollment. If you take a course in audit mode, you will be able to see most course materials for free. To access graded assignments and to earn a Certificate, you will need to purchase the Certificate experience, during or after your audit. If you don't see the audit option:

The course may not offer an audit option. You can try a Free Trial instead, or apply for Financial Aid.
The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade. This also means that you will not be able to purchase a Certificate experience.

What will I get if I subscribe to this Specialization?

When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page. From there, you can print your Certificate or add it to your LinkedIn profile. If you only want to read and view the course content, you can audit the course for free.
