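About this course

71,797 recent views
Shareable Certificate
Earn a Certificate upon completion
100% online
Start instantly and learn on your own schedule.
Course 3 of 4
Flexible deadlines
Reset deadlines in accordance with your schedule.
Intermediate level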

Probabilities & Expectations, basic linear algebra, basic calculus, Python 3.0 (at least 1 year), implementing algorithms from pseudocode.

Approx. 30 hours to complete
English
Subtitles: English

Skills you will gain

Artificial Intelligence (AI), Machine Learning, Reinforcement Learning, Function Approximation, Intelligent Systems

Offered by

University of Alberta

Alberta Machine Intelligence Institute

Syllabus - What you will learn from this course

Content rating: 93% thumbs up (1,246 ratings)
Week 1

Welcome to the Course!

1 hour to complete
2 videos (total 12 min), 2 readings
2 videos
Meet your instructors! (8 min)
2 readings
Read Me: Pre-requisites and Learning Objectives (10 min)
Reinforcement Learning Textbook (10 min)

On-policy Prediction with Approximation

6 hours to complete
13 videos (total 69 min), 1 reading, 2 quizzes
13 videos
Generalization and Discrimination (5 min)
Framing Value Estimation as Supervised Learning (3 min)
The Value Error Objective (4 min)
Introducing Gradient Descent (7 min)
Gradient Monte Carlo for Policy Evaluation (5 min)
State Aggregation with Monte Carlo (7 min)
Semi-Gradient TD for Policy Evaluation (3 min)
Comparing TD and Monte Carlo with State Aggregation (4 min)
Doina Precup: Building Knowledge for AI Agents with Reinforcement Learning (7 min)
The Linear TD Update (3 min)
The True Objective for TD (5 min)
Week 1 Summary (4 min)
1 reading
Weekly Reading: On-policy Prediction with Approximation (40 min)
1 practice exercise
On-policy Prediction with Approximation (30 min)

Week 2

Constructing Features for Prediction

8 hours to complete
11 videos (total 52 min), 1 reading, 2 quizzes
11 videos
Generalization Properties of Coarse Coding (5 min)
Tile Coding (3 min)
Using Tile Coding in TD (4 min)
What is a Neural Network? (3 min)
Non-linear Approximation with Neural Networks (4 min)
Deep Neural Networks (3 min)
Gradient Descent for Training Neural Networks (8 min)
Optimization Strategies for NNs (4 min)
David Silver on Deep Learning + RL = AI? (9 min)
Week 2 Review (2 min)
1 reading
Weekly Reading: On-policy Prediction with Approximation II (40 min)
1 practice exercise
Constructing Features for Prediction (28 min)

Week 3

Control with Approximation

8 hours to complete
7 videos (total 41 min), 1 reading, 2 quizzes
7 videos
Episodic Sarsa in Mountain Car (5 min)
Expected Sarsa with Function Approximation (2 min)
Exploration under Function Approximation (3 min)
Average Reward: A New Way of Formulating Control Problems (10 min)
Satinder Singh on Intrinsic Rewards (12 min)
Week 3 Review (2 min)
1 reading
Weekly Reading: On-policy Control with Approximation (40 min)
1 practice exercise
Control with Approximation (40 min)

Week 4

Policy Gradient

6 hours to complete
11 videos (total 55 min), 1 reading, 2 quizzes
11 videos
Advantages of Policy Parameterization (5 min)
The Objective for Learning Policies (5 min)
The Policy Gradient Theorem (5 min)
Estimating the Policy Gradient (4 min)
Actor-Critic Algorithm (5 min)
Actor-Critic with Softmax Policies (3 min)
Demonstration with Actor-Critic (6 min)
Gaussian Policies for Continuous Actions (7 min)
Week 4 Summary (3 min)
Congratulations! Course 4 Preview (2 min)
1 reading
Weekly Reading: Policy Gradient Methods (40 min)
1 practice exercise
Policy Gradient Methods (45 min)

Reviews

Top reviews from PREDICTION AND CONTROL WITH FUNCTION APPROXIMATION

About the Reinforcement Learning Specialization

The Reinforcement Learning Specialization consists of 4 courses exploring the power of adaptive learning systems and artificial intelligence (AI). Harnessing the full potential of AI requires adaptive learning systems. Learn how Reinforcement Learning (RL) solutions help solve real-world problems through trial-and-error interaction by implementing a complete RL solution from beginning to end. By the end of this Specialization, learners will understand the foundations of much of modern probabilistic AI and be prepared to take more advanced courses or to apply AI tools and ideas to real-world problems. The content focuses on “small-scale” problems in order to build an understanding of the foundations of Reinforcement Learning, as taught by world-renowned experts at the University of Alberta, Faculty of Science. The tools learned in this Specialization can be applied to game development (AI), customer interaction (how a website interacts with customers), smart assistants, recommender systems, supply chain, industrial control systems, finance, oil & gas pipelines, and more.

Frequently Asked Questions

  • Access to lectures and assignments depends on your type of enrollment. If you take a course in audit mode, you will be able to see most course materials for free. To access graded assignments and to earn a Certificate, you will need to purchase the Certificate experience, during or after your audit. If you don't see the audit option:

    • The course may not offer an audit option. You can try a Free Trial instead, or apply for Financial Aid.
    • The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade. This also means that you will not be able to purchase a Certificate experience.
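  • When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a Certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page, where you can print it or add it to your LinkedIn profile. If you only want to read and view the course content, you can audit the course for free.

  • If you subscribe, you get a 7-day free trial during which you can cancel at no penalty. After that, we don't give refunds, but you can cancel your subscription at any time. See our full refund policy.

  • Yes. Coursera provides financial aid to learners who cannot afford the fee. Apply by clicking the Financial Aid link beneath the "Enroll" button on the left. Follow the on-screen prompts to complete the application; you will be notified once it is approved. You need to complete these steps for each course in the Specialization, including the Capstone Project.

  • This course does not carry university credit, but some universities may choose to accept course Certificates for credit. Check with your institution to learn more. Online Degrees and Mastertrack™ Certificates on Coursera provide the opportunity to earn university credit.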

More questions? Visit the Learner Help Center.