I often find myself witness to conversations about assessment methods adopted by medical schools. Conversations regarding these methods have ranged from formal feedback discussed with faculty and administration to less elegant rants during my dedicated study periods leading up to board examinations. On either occasion, I wasn’t the only one. And on either occasion, the general consensus was always the same: “We need better assessment strategies.”
A seemingly collective consensus, this statement reflected a deeper, more unresolved question. A question highlighting the shortcomings, flaws and, in some cases, failure to assess us as future doctors.
This bothered me — a lot.
I do not wish for this piece to be dismissed as another rant (thankfully, I am way past my ‘dedicated’ study block). Surely, our assessment methods have evolved over time, largely in the right direction. Many of these have been based on sound pedagogical theories. But in practice, there have been significant unintended consequences.
To understand the issue surrounding assessments, we must understand that it has become increasingly challenging to train physicians suited to face contemporary changes. To future physicians who have access to a repository of ever-expanding information on their smartphones, being tested on ‘high-yield’ minutiae serves little purpose. Being able to think critically (and perhaps even imaginatively) in order to make sense of that information for patient care is what counts. And thus, no matter how standardized an examination is, lack of contextual reference renders it futile.
For example, it is common to encounter interesting minutiae (think: leucine zipper motif, HLA-subtypes, optochin sensitivities) in tests otherwise used to evaluate clinical competence. This is especially common during the preclinical years of medical school. Unfortunately, the premise that basic science facts stimulate critical thinking (or even clinical thinking) is deeply flawed. Mechanistic reasoning introduces dangerous cognitive bias, whereby evidence-based medicine is subordinate to pathophysiological reasoning.
One could argue that having topic guidelines and themes determined by expert clinicians (and dare I say, not just academics) ensures these tests are criterion-referenced in the best possible way. This addresses the above problem of ensuring relevant content in assessments. But the nature of assessments still remains a barrier.
Multiple choice questions (MCQs), for instance, are a good way to assess the cognitive domain of medicine. (The other domains being ‘motor’ and ‘affective’ which represent the skills and attitudes of medicine, respectively.) Efforts to incorporate diagnostic reasoning and script concordance items in MCQs rather than esoteric factoids provide better insight into clinical judgment. Rigorous standardization has ensured validity and reliability of these.
But can an assessment truly test sound clinical judgment in an artificial situation? Especially when real patient presentations are vague and may require acceptable yet imperfect interventions in diagnosis and treatment? Patients after all do not come with multiple choice options!
The same principle might apply when testing skills that are more hands-on. Be it a clinical exam with simulation mannequins or even the USMLE Step 2 Clinical Skills test with standardized patients, strict checkboxes on an examiner’s pro forma evaluation cannot be the metric to grade future physicians. This is because real patient encounters tend to be more fluid, built upon conversations in real time. Much as we might wish otherwise, the spontaneity of these conversations while eliciting relevant history and performing a physical cannot be captured in an evaluator’s checklist.
This challenge is even greater when assessing communication skills and professionalism. Lack of well-established definitions has made their assessment tricky. How would one, for instance, grade a joke by a student to reduce a patient’s anxiety during a clinical encounter?
Thus, there has been an increasing trend to place more importance on ‘competencies,’ i.e. predetermined abilities as curriculum outcomes. Medicine, however, requires mastering competencies spread over multiple cognitive, psychomotor and attitudinal domains. The assessment methods should therefore be robust enough to incorporate this diversity in skills, a task that cannot be accomplished by individual assessments alone.
In a series of recommendations, the National Research Council (the operating arm of the U.S. National Academies of Sciences, Engineering, and Medicine) proposed adopting a ‘systems-based approach to assessments.’ Building upon this as well as the work of others, the proceedings of the 2018 Ottawa Conference on the Assessment of Competence in Medicine and the Healthcare Professions proposed a robust framework for a systems-based assessment in medical education. This development made waves within the academic community for its methodical approach to assessment.
At first glance, a ‘system of assessment’ seems to merely combine multiple individual assessments to form, well, a system. But it accomplishes more than just that. Relying on multiple sources ensures that the inherent weaknesses and biases of individual assessments are overcome. This also provides a larger sample of data that can be used to generate quality feedback for students.
Perhaps most importantly, a ‘systems approach’ integrates complementary and overlapping attributes in a coordinated manner. This provides multiple ways to assess a competency over time and also promises to be more cost-efficient by eliminating redundancies.
The question remains: if implemented, what will this strategy look like? For starters, ‘traditional methods’ that overemphasize testing knowledge at the expense of other competencies can be done away with. Having a broader range of learning outcomes can drive more holistic learning. The Ottawa conference mentions workplace-based portfolios like records, reflections, community projects, rounds and handoffs as more desirable alternatives. Combining these with written tests, Objective Structured Clinical Exams (OSCEs) or Problem-Based Learning (PBL) courses generates better feedback to determine the progress of a student.
C.P.M. van der Vleuten of Maastricht University has been a major proponent of this strategy and is highly regarded for his contributions to the field of assessment. He suggests combining learning tasks (mentioned above) with learning artifacts (e.g. research reports and presentations that allow faculty to assess a student’s grasp of a subject). He also recommends supportive activities like reflection under supervision to make students more self-aware in this process.
Ultimately, assessments have been considered more important for student learning than the choice of instructional method itself. In fact, innovative instructional methods have barely shown significant effects on learning when compared to traditional methods. It is therefore imperative that we call for better assessment strategies considering the integral role they play in our learning as students.
In other words, it is not just an assessment ‘of’ our learning, but an assessment ‘for’ our learning.
As medical students, we (unfortunately) confront significant financial hurdles and mental health challenges in addition to recalibrating our caffeine homeostasis — all in the pursuit of clinical competence. For us to believe that an arbitrary score is a good substitute for said competence is selling us short.
As for that deep, unresolved question on “better assessment strategies,” one can only hope that these conversations are followed by decisive action. It is, after all, high time that medical education kept pace with actual medical practice.
Image credit: Custom drawing by Akanksha Mishra for this column.
Medical education today struggles to keep pace with actual medical practice. Moving from an information-driven curriculum to a value-driven one has propelled a vast array of research and scholarship in teaching methods, assessments and competencies. In this column, I hope to share insights on some of these areas as well as call for learning that is more adaptive and less standardized.