Policy gradient formalism

A course from National Research University Higher School of Economics
Practical Reinforcement Learning
119 ratings
Course 4 of 7 in the Advanced Machine Learning Specialization
From this lesson
Policy-based methods
We spent the previous three modules on value-based methods: learning state values, action values, and so on. Now it is time to look at an alternative approach that doesn't require you to predict all future rewards in order to learn something.

Meet the instructors

  • Pavel Shvechikov
    Researcher at HSE and Sberbank AI Lab
    HSE Faculty of Computer Science
  • Alexander Panin
    Lecturer
    HSE Faculty of Computer Science
