#86 – David Silver: AlphaGo, AlphaZero, and Deep Reinforcement Learning - Lex Fridman Podcast

David Silver leads the reinforcement learning research group at DeepMind. He was lead researcher on AlphaGo and AlphaZero, co-lead on AlphaStar and MuZero, and has done a lot of important work in reinforcement learning.

Support this podcast by signing up with these sponsors:
– MasterClass: https://masterclass.com/lex
– Cash App – use code “LexPodcast” and download:
– Cash App (App Store): https://apple.co/2sPrUHe
– Cash App (Google Play): https://bit.ly/2MlvP5w

EPISODE LINKS:
Reinforcement learning (book): https://amzn.to/2Jwp5zG

This conversation is part of the Artificial Intelligence podcast. For more information about the podcast, go to https://lexfridman.com/ai or connect with @lexfridman on Twitter, LinkedIn, Facebook, Medium, or YouTube, where you can watch the video versions of these conversations. If you enjoy the podcast, please rate it 5 stars on Apple Podcasts, follow on Spotify, or support it on Patreon.

Here’s the outline of the episode. On some podcast players you should be able to click a timestamp to jump to that time.

OUTLINE:
00:00 – Introduction
04:09 – First program
11:11 – AlphaGo
21:42 – Rule of the game of Go
25:37 – Reinforcement learning: personal journey
30:15 – What is reinforcement learning?
43:51 – AlphaGo (continued)
53:40 – Supervised learning and self play in AlphaGo
1:06:12 – Lee Sedol retirement from Go play
1:08:57 – Garry Kasparov
1:14:10 – Alpha Zero and self play
1:31:29 – Creativity in AlphaZero
1:35:21 – AlphaZero applications
1:37:59 – Reward functions
1:40:51 – Meaning of life
