Why do we need to innovate our assessments to improve how we evaluate clinical reasoning skills?

I’ve always been interested in the way people think. I was a first-generation college student, and my parents encouraged me to follow my passions, which led me to a degree in Tibetan Buddhism. I wasn’t sure what to do after graduating, but soon realized I wanted to be an academic. I earned degrees in philosophy and psychology at Virginia Commonwealth University and completed a cognitive science master’s program at James Madison University.

I went on to pursue a PhD in quantitative methods at the University of Texas at Austin. While wrapping up, I was fortunate enough to join the NBME Assessment Science and Psychometric Internship program. Learning about the extensive research happening at NBME gave me a new perspective on non-academic work. Shortly after my internship ended, the organization hired me as a measurement scientist.

Natural Language Processing advances assessment capabilities

My first project at NBME was working on the use of Natural Language Processing (NLP) for automated scoring of USMLE® Step 2 Clinical Skills (CS). Natural Language Processing is a subfield of Artificial Intelligence that explores the processing of natural language by computer systems. Everyday examples of NLP applications include Alexa, Siri and ChatGPT.

In assessment, traditional uses for NLP systems include automated scoring of essays and free-text responses. Recent advances have also made it possible to explore using NLP in test development, for example to help create exam content and to predict how a test question will perform.

Although the project of using NLP for automated scoring of Step 2 CS was never operationalized because the exam was discontinued, the research behind it is still being built upon to assess clinical reasoning.

Assessing the process of clinical reasoning

My current research focuses on studying and developing assessments of clinical reasoning, which encompasses a variety of overlapping and interconnected skills and behaviors. Many measures focus on the outcome of the process; I am interested in studying and developing measures of the process itself.

Let’s compare it to a multiple-choice math test. If a student chooses the right answer, are we confident they had the skills to solve the problem? Maybe they did – they solved it using skills they were taught. Or maybe they guessed correctly. Or they had the wrong idea about how to solve the problem and chose the answer that was closest to what they thought it should be. This is like a clinical reasoning assessment that is focused only on the outcome of the reasoning process.

Multiple-choice questions are a reliable tool to assess knowledge of a broad range of topics; however, technology has vastly improved, allowing us to assess learners with new methods. To return to the previous example, let’s say the math test doesn’t have answers to choose from – students must show their work and circle their answer. With this type of assessment, we can learn about their thought process, recognize their problem-solving approaches, identify errors in thinking and guide them to improve.

Innovating clinical reasoning assessment through NLP

We can build on this to explore the application of NLP models in clinical skills assessment for medical students. In practice, patients give a lot of information – how does the physician sort through it to come up with the correct diagnosis and treatment plan? That’s what clinical reasoning is – the thought process a student or physician uses to get to the root of the problem and take the appropriate steps to address it, and getting it right is critical. We hope these developments in assessing clinical reasoning will support higher-quality assessments that reflect the skills students need to build, ultimately helping them develop into better physicians.

In many respects, my research is nothing new. There is a plethora of articles and books, and there are many brilliant researchers who develop and push the boundaries of clinical reasoning assessment. What I hope to bring is a different perspective. I’ve studied cognitive science and quantitative methods, and I work at an organization that has a relationship with most medical schools in the United States. I aim to leverage my skills and work to build upon and amplify the research of others, and to collaborate with them. I also have the unique opportunity to give back to the program that jumpstarted my career in measurement science – this will be my fourth year serving as coordinator of the NBME Assessment Science and Psychometric Internship program.

Ultimately, I hope my research serves to create assessments that make learners better health care professionals. That is one of the honors of working at NBME – the work has a broad impact. Clinical reasoning is just one piece of that puzzle, and if I do my job well it means that I’ve played a small part in supporting the health of the public.