AI-Enhanced Learning: Promise, Pitfalls, and Proof-Points

How to turn AI tutors into real performance gains—without losing the craft.

In a classroom in Austin, Texas, kids knock out academics in just two hours a day and, according to co-founder MacKenzie Price, still learn twice as much, twice as fast.

That’s the routine at Alpha School, where students follow AI-powered, self-paced learning plans while teachers lean into the human elements of learning: "emotional support, motivation, and developing personal connections to students."

The Wall Street Journal says this innovative model may represent a profound shift in how we approach education.[1]

The Alpha School isn’t an isolated case—it’s part of an emerging trend in education. 

Assessing whether this approach could be applied broadly across education and professional sales training requires a careful look at the research. The big question now isn’t whether AI belongs in learning, but how to roll it out so the technology delivers real gains without dulling hard-earned skills. The research is starting to show both the gains and the risks, and the lessons are worth a closer look.

The Promise: Doubling Learning Outcomes

The evidence for AI's potential is pretty compelling. A Harvard University study led by Gregory Kestin and Kelly Miller found that students using an AI tutor showed more than twice the learning gains compared to those in traditional classes. 

Their experiment involved 194 students in an undergraduate physics course. They divided the students into two groups. One group began with an AI-supported lesson completed at home, while the other tackled the same material through an active learning lesson in class. In the second week the groups swapped formats, so every student experienced both approaches. 

Before each lesson, students took a test to establish baseline knowledge. After each lesson, they took tests to measure content mastery and evaluate their learning experience.

The results? Students using their custom AI tutor achieved a median score improvement of 1.75 points compared to just 0.75 points in the control group. The AI-tutored group also reported higher engagement and motivation, completing tasks in significantly less time—a median of 49 minutes versus 60 minutes for active learning.[2]

This echoes what education researchers have known for decades. In the 1980s, educational psychologist Benjamin Bloom identified what's now called "Bloom's Two Sigma Problem": one-to-one tutoring produces learning outcomes two standard deviations better than group instruction—essentially moving students from the 50th to the 98th percentile.
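Bloom's "two sigma" figure maps directly onto the normal distribution: a gain of two standard deviations moves a student from the 50th percentile to roughly the 98th. A quick sanity check using only Python's standard library:

```python
from statistics import NormalDist

# A two-standard-deviation improvement on a normally distributed
# outcome moves a median (50th-percentile) student to the percentile
# given by the standard normal CDF evaluated at 2.
new_percentile = NormalDist().cdf(2.0) * 100
print(f"{new_percentile:.1f}th percentile")  # prints "97.7th percentile"
```

The commonly quoted "98th percentile" is this 97.7 figure rounded up.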

The advance is that AI technology may finally make personalized tutoring economically viable at scale.

More proof comes from Ghana. About 1,800 students in grades 3 to 9 across 11 schools used an AI math tutor called Rori over WhatsApp. The Rori group outscored the control group by a healthy margin (effect size 0.37). This study is particularly notable because it demonstrates that effective AI tutoring can work even on basic mobile devices on low-bandwidth networks, breaking traditional barriers to educational technology in developing regions. [3]

The Perils: When AI Undermines Learning

But not all AI implementations lead to positive outcomes. Researchers at the University of Pennsylvania conducted an experiment where high school students were given access to GPT-4 for homework. They used two distinct GPT-based tutoring systems: one with a standard ChatGPT-like interface and one with carefully designed safeguards. While homework scores shot up, students using the standard ChatGPT without proper guidance scored 17% worse on final exams than the control group.[4]

This illustrates what Professor Ethan Mollick calls "illusory knowledge": students believe they’re learning, but they’re actually weakening their own skill development. As he notes, "they don't actually realize that getting AI to do their homework is undermining their learning."

Similarly, researchers at the University of Cologne found specific patterns of AI use that hindered learning. Their study, combining observational data with controlled experiments, revealed that students who approached AI as "solution providers" to complete exercises without personal cognitive effort showed significant learning impairments. This was especially pronounced when copy-paste functionality was available, enabling students to bypass essential mental processing.[5]

This study also spotted an important perception gap: students consistently overestimated how much they had learned using AI. The self-perceived benefits exceeded the actual learning improvements, creating a dangerous gap between confidence and competence.

These findings should be a big reality check: student confidence shot up, but skills didn’t.

The Implementation Gap: Design Makes All the Difference

The key insight from these studies is that AI implementation—how it's designed and integrated into learning—matters a lot.

In the UPenn study, giving students a GPT with a basic tutor prompt rather than standard ChatGPT boosted homework scores without lowering final exam grades. The researchers attribute this to how their GPT tutor broke problems into manageable steps, required student input at critical junctures, and provided explanations alongside solutions.

The carefully crafted AI tutor from the Harvard study was designed with pedagogical best practices, active learning principles, and growth mindset promotion. The researchers designed their system to ask guiding questions rather than just providing answers, encouraging students to articulate their understanding and identify misconceptions.

This reveals a crucial distinction: AI systems that promote thinking rather than replace it drive better outcomes. As the Alpha School demonstrates, the best implementations transform the role of human teachers rather than eliminating them, allowing them to focus on emotional support, motivation, and interpersonal connections.

What It Means for Sales Training (and Other Professional Learning)

AI tutors aren’t just good for students—they’re a gift for revenue teams too, particularly in sales orgs where performance varies dramatically between top and average performers.

BCG’s research quantifies the opportunity: with AI assistance, under-performers jump 43%, while above-average reps inch up about 17%. That asymmetry is a big opportunity for sales teams: nudge the middle 60% of your salesforce toward the leaders, and you can dramatically improve overall results.

Consider the compounding impact: improving conversion rates by just 10% at each stage of your sales process can nearly double revenue with the same lead volume. When you design AI-powered training systems, you get those gains without adding headcount.
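The compounding claim is simple arithmetic. As an illustration only (the seven-stage funnel below is a hypothetical, not a figure from the research), a 10% relative lift at every stage multiplies through the whole pipeline:

```python
# Illustrative sketch: revenue from a sales funnel scales with the
# product of the per-stage conversion rates. If each of 7 stages
# improves by 10% (relative), the overall lift is 1.1 ** 7.
stages = 7
lift_per_stage = 1.10
overall_lift = lift_per_stage ** stages
print(f"{overall_lift:.2f}x")  # prints "1.95x" — nearly double
```

With fewer stages the effect is smaller (1.1 ** 4 is about 1.46x), which is why longer, multi-stage processes benefit most from small, uniform improvements.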

Designing AI Training That Actually Works

The research points to five principles for effective AI learning:

  1. Make people think, not copy-paste. Design tools that push reps to engage in active thinking rather than simply handing them the answers.

  2. Surface blind spots. Help learners understand what they know and don't know to combat the illusory knowledge (and avoid the over-confidence trap).

  3. Ease off the training wheels. Gradually reduce AI support as skills develop.

  4. Elevate people skills (don’t replace them). Let humans handle the motivation, relationships, emotions, and nuance of learning.

  5. Test without the AI crutch. Measure reps on live calls and deals to verify genuine learning and skills.

The Alpha School's two-hour approach works because it uses AI for what it does best while reserving the human connection for teachers. As their co-founder explains, teachers become guides who do "what humans do best: emotional support, motivation and developing a personal connection to students."

Companies are all over the map with AI in learning. The real question isn’t if AI will shape training, but how to roll it out in a way that actually lines up with solid learning science.

The research is pretty clear: when AI tools are built around proven learning principles (active thinking, feedback loops, spaced practice), people learn more. Skip the science, and the same tools can produce suboptimal or even negative results.

That leaves leaders with some big calls to make about how to approach these technologies in their learning environments.

As The Wall Street Journal noted, tech wasn’t ready to replace teachers back in 2012, but by 2025 the game has changed. The question is whether we’ll leverage this technology to "make America smart again" by moving everyone "up the value chain" through improved education and training.[1]

References

[1] Kessler, A. (2025, March 16). Make America Smart Again. The Wall Street Journal.

[2] Kestin, G., Miller, K., Klales, A., Milbourne, T., & Ponti, G. (2024). AI Tutoring Outperforms Active Learning. Harvard University Department of Physics.

[3] Henkel, O., Horne-Robinson, H., Kozhakhmotova, N., & Lee, A. (2024). Effective and Scalable Math Support: Evidence on the Impact of an AI-Tutor on Math Achievement in Ghana. University of Oxford & Rising Academies.

[4] Bastani, H., Bastani, O., Sungu, A., Ge, H., Kabakci, O., & Mariman, R. (2024). Generative AI Can Harm Learning. University of Pennsylvania, Wharton AI & Analytics.

[5] Lehmann, M., Cornelius, P.B., & Sting, F.J. (2024). AI Meets the Classroom: When Does ChatGPT Harm Learning? University of Cologne & Rotterdam School of Management, Erasmus University.