John Schulman on dead ends, scaling RL, and building research institutions

A conversation with John Schulman on the first year LLMs could have been useful, building research teams, and where RL goes from here.

00:00 - Speedrunning ChatGPT
09:22 - Archetypes of research managers
11:56 - Was OpenAI inspired by Bell Labs?
16:54 - The absence of value functions
18:23 - Continual learning
21:09 - Brittle generalization
24:05 - Co-training generators and verifiers, GANs
27:06 - John’s personal use of AI for research
28:54 - Day in the life
33:01 - Slowdowns in consequential ML ideas
36:21 - "Peer review" within the labs
39:19 - Distribution shift in researchers
43:33 - Future of RL
45:33 - Will the labs coordinate if the world needs them to?
44:46 - Forecasting ills in AGI and engineering
47:53 - Thinking Machines

About this episode