people, ideas, machines

Exploration, exploitation, and thinking

I’ve been thinking about AI and machine learning for around a decade. One of the useful and surprising things about that is how often the ideas apply to life. In reinforcement learning (RL), for example, there’s a tension between two forces: exploration and exploitation. Exploration is trying new things. Exploitation is sticking with what already works.

Good learners start by exploring a lot. Later, when they know more, they can safely exploit. But if they skip the exploration phase, they get stuck. They keep repeating whatever happened to work first, even if it isn’t very good.
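To make that concrete, here is a minimal sketch of the classic multi-armed bandit version of this trade-off, written in Python. Everything in it is made up for illustration: the three arms, their payout rates, the epsilon schedule, and the run_bandit helper itself. The point is only to show the mechanic: a purely greedy agent locks onto whatever worked first, while an agent that explores early (and never entirely stops) finds the better option.

import random

def run_bandit(eps_start, eps_min, eps_decay, steps=5000, seed=0):
    """Epsilon-greedy agent on a toy 3-armed bandit (all numbers are illustrative)."""
    rng = random.Random(seed)
    true_p = [0.3, 0.5, 0.8]   # hidden payout rates; arm 2 is best, but the agent doesn't know that
    counts = [0, 0, 0]         # how often each arm has been pulled
    values = [0.0, 0.0, 0.0]   # running estimate of each arm's payout
    eps = eps_start
    total = 0.0
    for _ in range(steps):
        if rng.random() < eps:   # explore: try a random arm
            arm = rng.randrange(3)
        else:                    # exploit: pick the arm that currently looks best
            arm = max(range(3), key=lambda a: values[a])
        reward = 1.0 if rng.random() < true_p[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]   # incremental mean update
        total += reward
        eps = max(eps_min, eps * eps_decay)   # explore less over time, but never drop below eps_min
    return round(total / steps, 2)

# Pure exploitation from step one: it settles on the first arm it tries and never
# discovers that arm 2 pays out far more often.
print("greedy only:         ", run_bandit(eps_start=0.0, eps_min=0.0, eps_decay=1.0))

# Explore first, exploit later, and keep a little exploration forever.
print("explore then exploit:", run_bandit(eps_start=1.0, eps_min=0.05, eps_decay=0.999))

Run it and the greedy agent’s average reward stays pinned near the first arm’s 0.3, while the exploring agent’s ends up much closer to the best arm’s 0.8.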

Humans aren’t so different. We also need a period of exploration, where we struggle and figure things out for ourselves. That’s how we build the mental muscles for judgement. If you outsource that too early, you never develop the muscles at all.

AI complicates this. Used well, it can make people smarter. It’s like calculators: once we had them, we could focus on harder maths. But we learned to add first. Used too early, AI removes the struggle. And the struggle is the point. That’s how you learn to think.

Recent studies (listed at the end) find that people who lean on AI too quickly engage less deeply. They remember less. They do less real thinking. Over time the habit becomes a kind of thought atrophy.

Here is the paradox. AI could help us explore more, yet it is often used for premature exploitation. That’s why so many students and young workers risk getting stuck. They trade the long-term reward of thinking for the short-term reward of an easy answer.

This is the exploitation trap. It gives you an answer, but at the cost of the skills you need to find answers yourself, maybe even better ones. And the younger you are, the bigger the cost, because you may skip the exploration phase entirely.

The right balance is more like the one RL agents follow: explore first, exploit later.

In practice it’s never all one or the other. We always need both. Even RL agents keep exploring a little, no matter how experienced they get. For people it helps to name the goal. If the goal is output, use AI. If the goal is learning, by all means use AI, but embrace the struggle.

An RL agent that skips exploration never learns its environment. A generation that skips exploration may never learn to think.

Sources:

The Impact of Generative AI on Critical Thinking: Self-Reported Reductions in Cognitive Effort and Confidence Effects from a Survey of Knowledge Workers (Microsoft & Carnegie Mellon, 2025)

ChatGPT's Impact On Our Brains According to an MIT Study (TIME, 2025)

Does ChatGPT Make You Dumber? What a New MIT Study Really Found (Marketing AI Institute, 2025)

Increased AI use linked to eroding critical thinking skills (Phys.org, 2025)

Evaluating the Impact of AI Dependency on Cognitive Ability among Young Adults (AMH International, 2024)

#AI #RL #musing