185,083 recent views

100% online

Prerequisites: probabilities and expectations, basic linear algebra, basic calculus, Python 3.0 (at least 1 year), and implementing algorithms from pseudocode.

### What you will learn

• Formalize problems as Markov Decision Processes

• Understand basic exploration methods and the exploration / exploitation tradeoff

• Understand value functions as a general-purpose tool for optimal decision-making

• Know how to implement dynamic programming as an efficient solution approach to an industrial control problem

### Skills you will gain

• Artificial Intelligence (AI)

• Machine Learning

• Reinforcement Learning

• Function Approximation

• Intelligent Systems

Week 1

## Welcome to the Course!

4 videos (20 min total), 2 readings
4 videos
Course Introduction (5 min)
2 readings
Reinforcement Learning Textbook (10 min)
Read Me: Pre-requisites and Learning Objectives (10 min)

## An Introduction to Sequential Decision-Making

8 videos (46 min total), 3 readings, 2 quizzes
8 videos
Learning Action Values (4 min)
Estimating Action Values Incrementally (5 min)
Optimistic Initial Values (6 min)
Upper-Confidence Bound (UCB) Action Selection (5 min)
Jonathan Langford: Contextual Bandits for Real World Reinforcement Learning (8 min)
Week 1 Summary (3 min)
3 readings
Module 1 Learning Objectives (10 min)
Chapter Summary (30 min)
1 practice exercise
Sequential Decision-Making (45 min)
Week 2

## Markov Decision Processes

7 videos (36 min total), 2 readings, 2 quizzes
7 videos
Examples of MDPs (4 min)
The Goal of Reinforcement Learning (3 min)
Michael Littman: The Reward Hypothesis (12 min)
Examples of Episodic and Continuing Tasks (3 min)
Week 2 Summary (1 min)
2 readings
Module 2 Learning Objectives (10 min)
1 practice exercise
MDPs (45 min)
Week 3

## Value Functions & Bellman Equations

9 videos (56 min total), 3 readings, 2 quizzes
9 videos
Value Functions (6 min)
Rich Sutton and Andy Barto: A Brief History of RL (7 min)
Bellman Equation Derivation (6 min)
Why Bellman Equations? (5 min)
Optimal Policies (7 min)
Optimal Value Functions (5 min)
Using Optimal Value Functions to Get Optimal Policies (8 min)
Week 3 Summary (4 min)
3 readings
Module 3 Learning Objectives (10 min)
Chapter Summary (13 min)
2 practice exercises
[Practice] Value Functions and Bellman Equations (45 min)
Value Functions and Bellman Equations (45 min)
Week 4

## Dynamic Programming

10 videos (72 min total), 3 readings, 2 quizzes
10 videos
Iterative Policy Evaluation (8 min)
Policy Improvement (4 min)
Policy Iteration (8 min)
Flexibility of the Policy Iteration Framework (4 min)
Efficiency of Dynamic Programming (5 min)
Warren Powell: Approximate Dynamic Programming for Fleet Management (Short) (7 min)
Warren Powell: Approximate Dynamic Programming for Fleet Management (Long) (21 min)
Week 4 Summary (2 min)
Congratulations! (3 min)
3 readings
Module 4 Learning Objectives (10 min)