ClinicalSim

What Programs Lost When Step 2 CS Disappeared, and What Hasn't Replaced It

USMLE Step 2 CS was permanently discontinued in 2021. Five years later, residency programs still have no standardized way to assess communication skills. Milestones 2.0 raised the bar, but gave programs no new tools to meet it.

ClinicalSim.ai Team


On January 26, 2021, the NBME and the Federation of State Medical Boards announced that they were permanently discontinuing USMLE Step 2 Clinical Skills. The exam had been suspended since May 2020 because of COVID-19, and after a year of review, the co-sponsors decided not to relaunch it. They framed the decision as an opportunity to "work with colleagues in medical education and at the state medical boards to determine innovative ways to assess clinical skills" (USMLE.org, January 2021).

That was five years ago. The innovative ways haven't arrived.

What Step 2 CS actually did

Step 2 CS wasn't beloved. It was expensive ($1,300+ per attempt), required travel to one of five testing centers, and tested a narrow slice of clinical skills in a high-pressure, artificial setting. Medical students and program directors complained about it for years.

But it did something that nothing else in the system did: it provided a standardized, independent assessment of whether a medical student could communicate with a patient, take a focused history, and document a clinical encounter. Every student who matched into a residency program had passed the same communication bar, regardless of which medical school they attended or how their school ran its own OSCEs.

That common denominator is gone. And the downstream effects are showing up in residency programs that now have to assess communication competency without any shared reference point for what incoming residents can actually do.

The gap no one has filled

When the discontinuation was announced, the USMLE program said that clinical reasoning and communication skills would be assessed elsewhere in the exam sequence. Step 3 still includes computer-based case simulations, and some communication content has been added to Step 1. But these are written exams testing clinical reasoning, not live assessments of a trainee's ability to sit with a patient and have a conversation.

Medical schools have responded in scattered, uncoordinated ways. Some expanded their OSCE programs. Others lean more heavily on clinical rotation evaluations. A few have built capstone clinical skills assessments for fourth-year students. But there's no standardization across these approaches, which means a program director in Chicago has no way of comparing the communication training a resident received at one medical school versus another.

The question that Step 2 CS answered, however imperfectly (can this person communicate safely with a patient?), now has no consistent answer at the point of residency entry.

Milestones 2.0 raised the standard

In the same period that Step 2 CS disappeared, the ACGME rolled out Milestones 2.0, which created harmonized Interpersonal and Communication Skills (ICS) subcompetencies across all specialties for the first time. Before the harmonization work, ICS was described in 176 different ways across the 26 core specialties (Edgar, Roberts, and Holmboe, Journal of Graduate Medical Education, 2018). Programs used different frameworks, different language, and different assessment criteria for what was supposed to be the same competency.

Milestones 2.0 fixed the language problem. Three harmonized ICS subcompetencies now apply across every specialty: ICS-1 covers patient and family-centered communication, ICS-2 covers interprofessional and team communication, and ICS-3 covers communication within healthcare systems. That's real progress, and it gives Clinical Competency Committees a shared framework for the first time.

But a shared framework without shared assessment tools is a mandate without a method.

The assessment gap in practice

The ACGME's own survey data shows the disconnect. 96.1% of respondents said they understood ICS-1 as a concept. 87.4% agreed they should be using it. But only 80.9% said they knew how to effectively assess it (ACGME, Strengthening ICS via Harmonized Subcompetencies). For ICS-3, the numbers followed the same pattern: 91.7% understood it, 87.0% agreed it should be used, and only 81.1% knew how to assess it.

That means roughly 1 in 5 GME stakeholders responsible for evaluating communication skills don't feel equipped to do it. And these are the people sitting on Clinical Competency Committees making milestone decisions about trainees.

A 2025 study by Santen et al. in Academic Medicine analyzed ACGME harmonized milestone data from PGY-1 residents across the six largest specialties and found that program-level differences accounted for roughly 22.5% of the variance in ICS ratings and 23.6% of the variance in professionalism ratings. In plain terms: about a quarter of the rating a resident receives on these competencies has more to do with where they train than what they actually do. That's not a small effect for an assessment system that's supposed to measure the resident.

What programs are actually doing (and why it's not enough)

Most programs have defaulted to the tools they already had: faculty observation during clinical encounters, informal feedback from attendings, and the occasional standardized patient encounter if their simulation center can accommodate it. The problem is that workplace-based assessment depends heavily on the assessor's frame of reference. A 2023 study by Kogan, Conforti, and Holmboe in the Journal of Graduate Medical Education found that faculty often need explicit frame-of-reference training to discriminate reliably between learner performance levels, and that without it, direct observation produces inconsistent ratings.

So a resident's communication assessment depends on which attending was in the room, what kind of day the attending was having, and whether the clinical situation happened to involve a conversation that was observable. That's a thin dataset for a competency that ACGME considers foundational across every specialty.

The math problem gets worse when you look at reliability. Research on communication assessment has shown that approximately 45 patient-completed assessments are needed to produce a highly reliable estimate of a single provider's communication skills (Holmboe et al., Assessing Interpersonal and Communication Skills, JGME 2021). Most programs collect 2-3 documented observations over the course of months. The gap between what's needed and what's actually happening is enormous.
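The size of that reliability gap can be sketched with the Spearman-Brown prophecy formula, a standard psychometric result relating the number of ratings to the reliability of their average. The single-rating reliability value below is an assumption chosen for illustration, not a figure from the cited study:

```latex
% Spearman-Brown prophecy formula: reliability R_n of the mean of n ratings,
% given the reliability r of a single rating (r = 0.15 is an assumed value).
R_n = \frac{n\,r}{1 + (n - 1)\,r}
% n = 45: R = (45 \times 0.15) / (1 + 44 \times 0.15) = 6.75 / 7.60 \approx 0.89
% n = 3:  R = (3 \times 0.15)  / (1 + 2 \times 0.15)  = 0.45 / 1.30 \approx 0.35
```

Under that assumed single-rating reliability, 45 observations yield a reliable estimate while 3 observations yield something closer to a coin flip, which is the shape of the gap the JGME research describes.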

Five years of "innovative approaches" that haven't materialized

The USMLE's January 2021 announcement specifically mentioned the intention to develop innovative assessment approaches. Five years later, there is no new standardized exam, no unified clinical skills assessment framework, and no widely adopted replacement for what Step 2 CS provided. The various efforts by medical schools and professional organizations have produced guidelines and recommendations, but nothing that functions as a shared, standardized benchmark.

This matters for residency programs because they're now absorbing the full burden of communication assessment that used to be partially distributed across the USMLE system. Programs didn't get additional resources, additional faculty time, or additional assessment tools to take on this work. They just got the responsibility.

And the stakes keep rising. Communication failures were identified as a contributing factor in roughly 30% of 23,658 malpractice cases analyzed in CRICO Strategies' 2015 benchmarking report, with associated costs of $1.7 billion and nearly 2,000 patient deaths over five years (CRICO Strategies, Malpractice Risks in Communication Failures, 2015). HCAHPS scores tied to patient communication directly affect hospital reimbursement through CMS Value-Based Purchasing. The ACGME's Milestones 2.0 framework explicitly requires programs to demonstrate that graduates can communicate effectively at each developmental stage. The external pressure to get communication right has increased at the exact moment the system's capacity to assess it decreased.

What filling this gap actually requires

The solution to the post-Step 2 CS assessment vacuum isn't another high-stakes exam. Step 2 CS was a binary pass/fail that tested communication in an artificial setting on a single day. Programs need something different: longitudinal data on how trainees communicate across multiple encounters, mapped to the specific ICS milestones that CCCs have to evaluate.

That means structured practice opportunities where communication skills can be observed, measured, and documented repeatedly over time. It means assessment data that's consistent across encounters rather than dependent on which faculty member happened to be present. And it means giving programs tools that don't require 45 patient assessments or dozens of faculty hours to produce a reliable picture of where a trainee stands.

Purpose-built simulation can fill this gap without recreating the problems that made Step 2 CS unpopular. On-demand practice eliminates the scheduling bottleneck. Milestone-aligned assessment generates the longitudinal data CCCs need. And standardized scenarios ensure that every trainee's communication is evaluated against the same criteria, regardless of which clinical encounters they happened to get during rotations.

Five years after Step 2 CS disappeared, programs are still waiting for the system to catch up. The assessment bar went up with Milestones 2.0, the only standardized communication assessment went away, and no one has filled the space in between. The tools to close that gap exist now. Programs shouldn't have to wait another five years for someone to build the replacement the system promised and never delivered.

References

  1. Work to Relaunch USMLE Step 2 CS Discontinued. USMLE.org announcement. 2021. [Link]
  2. Edgar L, Roberts S, Holmboe E. Milestones 2.0: A Step Forward. Journal of Graduate Medical Education. 2018. [Link]
  3. Strengthening Interpersonal and Communication Skills via Harmonized Subcompetencies. ACGME. 2021. [Link]
  4. Santen SA, Ryan MS, Fancher TL, et al. Variability in Learner Performance Using the ACGME Harmonized Milestones During the First Year of Postgraduate Training. Academic Medicine. 2025. [Link]
  5. Kogan JR, Conforti LN, Holmboe ES. Faculty Perceptions of Frame of Reference Training to Improve Workplace-Based Assessment. Journal of Graduate Medical Education. 2023. [Link]
  6. Holmboe ES, et al. Assessing Interpersonal and Communication Skills. Journal of Graduate Medical Education (Milestones 2.0 supplement). 2021. [Link]
  7. Malpractice Risks in Communication Failures: 2015 Annual Benchmarking Report. CRICO Strategies. 2015. [Link]