ClinicalSim.ai
6 min read

What Medical Learners Actually Want from AI Standardized Patients

New CHI 2026 research reveals six key requirements for AI-SP design—straight from the medical students who would use them.

ClinicalSim.ai Team


The Study

A research team from CUHK Shenzhen, Nankai University, and MIT Media Lab spent months with clinical-year medical students. They wanted to know: what would actually make AI standardized patients useful?

They interviewed 12 students and ran three co-design workshops. The findings don't match what most vendors assume about medical simulation.

The Problem

The paper's title says it all: "It Talks Like a Patient, But Feels Different."

Current AI systems generate natural-sounding dialogue. But students describe the experience as flat. Something essential is missing.

This is a design problem more than a technical one.

What Students Actually Asked For

1. Different Modes for Different Goals

Students don't want one generic AI patient. They want options:

  • An OSCE mode with rigid, standardized, exam-like encounters
  • A practice mode with open-ended scenarios where they can explore
  • A skill-building mode with scaffolded practice and adjustable difficulty

The simulation should match what you're trying to learn, rather than aim for some abstract "realism."

2. Clear Information Rules

Current AI patients feel like guessing games. Students want transparent rules:

  • Visual cues (jaundice, rashes) should be immediately apparent
  • Information should require appropriate clinical questions
  • Rules should stay consistent across encounters
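These rules translate directly into software. Here's a minimal sketch of how gated disclosure could work (all names and findings are hypothetical, invented for illustration, not from the paper or our product): each finding is either immediately visible or unlocked by a matching clinical question, and the mapping stays fixed for the whole encounter.

```python
# Hypothetical sketch of transparent information rules for an AI patient.
# Each finding is either "visible" (apparent before any question) or gated
# behind a clinical topic the student must actually ask about. The mapping
# never changes mid-case, so the rules stay consistent across encounters.

CASE_FINDINGS = {
    "jaundice":        {"gate": None,             "detail": "Visible scleral icterus"},
    "alcohol_history": {"gate": "social_history", "detail": "Drinks 6 units/day"},
    "ruq_pain":        {"gate": "pain_history",   "detail": "Right upper quadrant, dull"},
}

def visible_findings():
    """Cues like jaundice or rashes are shown up front, no guessing required."""
    return {k: v["detail"] for k, v in CASE_FINDINGS.items() if v["gate"] is None}

def elicit(topic_asked: str):
    """Return only the findings unlocked by an appropriate clinical question."""
    return {k: v["detail"] for k, v in CASE_FINDINGS.items() if v["gate"] == topic_asked}

print(visible_findings())        # jaundice is apparent immediately
print(elicit("social_history"))  # alcohol history requires asking about it
```

The point isn't the data structure; it's that students can be told the rules up front and trust that the same question unlocks the same information every time.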

3. Ways to Actually Examine Patients

There are physical exams and procedures that students can't safely practice on human actors. Students wanted AI patients to fill this gap:

  • Body maps for visual examination
  • Sensory cues (sounds, images)
  • Virtual test ordering with realistic timing
  • Safe practice of procedures that would be dangerous on humans

The goal is practicing how communication and evidence-gathering work together, even without perfect sensory fidelity.

4. Backup Options When Voice Fails

Voice-based AI breaks down. Students anticipated this:

  • Voice as the primary input method
  • Text fallback when speech recognition fails
  • Keyword shortcuts for specific queries
  • Hints when they get stuck

This prevents technical failures from derailing the learning experience.
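That cascade is straightforward to implement. A purely illustrative sketch (function names, shortcuts, and the confidence threshold are our assumptions, not an actual speech-recognition API): prefer confident voice input, fall back to typed text, and expand keyword shortcuts before the query reaches the AI patient.

```python
# Illustrative input cascade: voice first, text fallback, keyword shortcuts.
# All names here are hypothetical, not a real speech-recognition library.

SHORTCUTS = {
    "hpi":  "Can you tell me more about when this started?",
    "meds": "What medications are you currently taking?",
}

def resolve_input(voice_text, voice_confidence, typed_text=None, threshold=0.8):
    """Pick the student's query: confident voice > typed fallback > None."""
    if voice_text and voice_confidence >= threshold:
        raw = voice_text
    elif typed_text:
        raw = typed_text    # text fallback when speech recognition fails
    else:
        return None         # caller should prompt for typed input or offer a hint
    return SHORTCUTS.get(raw.strip().lower(), raw)  # expand keyword shortcuts

print(resolve_input("meds", 0.95))                  # shortcut expands to a full question
print(resolve_input("???", 0.3, typed_text="hpi"))  # low confidence falls back to text
```

Returning `None` rather than guessing is deliberate: a wrong transcription silently passed to the AI patient is exactly the kind of failure that derails an encounter.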

5. Control Over Difficulty and Feedback

Students reframed AI patients as practice tools rather than tests:

  • Adjustable difficulty from "helpful mode" to "stress-test mode"
  • Light feedback during encounters (trust indicators, facial expressions)
  • Structured review after sessions
  • Links from mistakes to textbook references

6. Variable Patient Personalities

Real patients differ wildly. Students wanted to practice with:

  • Selectable personas (elderly, pediatric, anxious)
  • Variable emotional responses (crying, frustration, relief)
  • Different cultural backgrounds and health beliefs
  • Replayable scenarios with the same case but different patient types

This makes affect a training variable you can control.
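In practice, that means persona and difficulty become parameters on a scenario rather than a brand-new case each time. A hypothetical configuration sketch (field names and values are our invention, not the paper's):

```python
# Hypothetical scenario config: the same clinical case replayed with
# different personas, emotional ranges, and difficulty settings.

from dataclasses import dataclass

@dataclass(frozen=True)
class PatientPersona:
    age_group: str                    # e.g. "elderly", "pediatric"
    temperament: str                  # e.g. "anxious", "stoic"
    health_beliefs: str = "biomedical"
    emotional_range: tuple = ("neutral",)

@dataclass(frozen=True)
class Scenario:
    case_id: str                      # the underlying clinical case stays fixed
    persona: PatientPersona
    difficulty: str = "helpful"       # "helpful" ... "stress-test"

base_case = "chest-pain-01"
replays = [
    Scenario(base_case, PatientPersona("elderly", "anxious",
                                       emotional_range=("worried", "tearful"))),
    Scenario(base_case, PatientPersona("middle-aged", "stoic"),
             difficulty="stress-test"),
]

# Same case, two very different encounters:
for s in replays:
    print(s.case_id, s.persona.temperament, s.difficulty)
```

Keeping the case fixed while varying the persona is what makes affect a controlled training variable instead of random noise.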

The Real Insight

Students judge AI patients by whether the system helps them learn, not by how natural the conversation sounds.

Educational value matters more than conversational fluency.

What actually matters:

  • Can I tell what mode I'm in?
  • Can I adjust the difficulty?
  • Do I understand how I'm doing?
  • Does the feedback help me improve?

Where AI Patients Fit

The researchers position AI patients as complements to human actors:

Human SPs are better suited for high-stakes assessment, embodied presence, nuanced social interaction, and expert feedback.

AI patients are better suited for unlimited repetition, configurable difficulty, structured feedback, and on-demand availability.

Each tool does what it's best at.

What This Means for ClinicalSim.ai

This research validates some choices we made:

  • Voice-first with text backup matches what students wanted
  • Structured scenarios with clear objectives align with their needs
  • Immediate feedback addresses the gap they identified
  • On-demand availability fills the gap between limited human SP sessions

It also points to what's next, including explicit difficulty modes, more patient personas, and stronger post-session debriefs.

The Bottom Line

Medical students want reliable, transparent, controllable tools that let them practice high-stakes conversations without the logistical headaches. Perfect human replication is less important than consistent educational value.

The "empathy gap" is a design challenge. Building for instructional usability matters more than optimizing for conversational fluency.


Based on: Gao, Z., et al. (2026). "It Talks Like a Patient, But Feels Different": Co-Designing AI Standardized Patients with Medical Learners. CHI 2026, Barcelona, Spain.