Sutton on RL vs LLMs


Overview

  • LLMs vs RL: Sutton argues large language models mimic human outputs rather than building true world models—they predict what people say, not what will happen in the world.

  • Goals and Ground Truth: Without goals, there's no definition of "right" action. Reinforcement learning provides ground truth through rewards, enabling continual learning that LLMs fundamentally lack.

  • AI Succession: Sutton views the transition to digital intelligence as inevitable and potentially positive—a major stage in the universe's evolution from replication to design.

Takeaways

Dwarkesh Patel interviews Richard Sutton, 2024 Turing Award recipient and reinforcement learning pioneer. Sutton contends that scalable intelligence requires learning from experience in pursuit of clear goals, not imitating human-generated text.

"Intelligence is the computational part of the ability to achieve goals. You have to have goals or you're just a behaving system."

Copyright 2025, Ran Ding