Throughout my science education career, whenever I was asked what I did, I responded, “I write standardized tests.” Let me assure you, this doesn’t win you many fans outside of science education assessment circles. But in my opinion, there is nothing better for developing an understanding of, and intuition for, how students learn than interviewing hundreds of students and listening to their thinking as they reason through questions.
When I listen to students think aloud as they answer questions, I learn a lot about what they know and about their exam-taking processes too. For example, while I was interviewing a student on a multiple true-false physiology question, the student answered all the true-false statements and then said, “Wait, let me go back. There’s always some exception I might be missing.” For this student, physiology always broke the rules, and the exams they typically took tried to test whether they knew the exceptions. Although my intention for the question was to have students apply general conceptual knowledge, this student, like most others I interviewed, instead spent a lot of time making sure they had recalled all the right information. Eventually, moments like this led to a simple change in question format that created an interesting shift in the way questions elicited thinking from faculty and students alike.
The interview mentioned above occurred while I was writing a programmatic physiology assessment, the Phys-MAPS.2 The goal of this assessment, and of the others in the Bio-MAPS suite, was to build tools that could measure student learning across biology majors. Our working group3 chose to build all the assessments in a multiple true-false format: for each question, a short scenario is described, followed by several (often 4-6) statements about the scenario that students identify as either true or false. We chose this format because it is well suited to assessing whether students hold both correct and incorrect ideas about a topic simultaneously,4 which is highly pertinent to learning across a major. The multiple true-false format also facilitates easy and quick grading for large numbers of students while still allowing for an understanding of student thinking comparable to that from essay assessments.5
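To make the grading point concrete, here is a minimal sketch of why multiple true-false responses are so quick to score: each answer reduces to comparing a student’s vector of true/false responses against an answer key. The question IDs, data layout, and function below are hypothetical illustrations, not the actual Bio-MAPS scoring code.

```python
# Hypothetical illustration of machine-scoring multiple true-false (MTF)
# responses; not the actual Bio-MAPS scoring pipeline.

# Answer key: each question is a scenario with several true/false statements.
KEY = {
    "Q1": [True, False, True, True],          # 4 statements
    "Q2": [False, True, False, True, False],  # 5 statements
}

def score(responses):
    """Fraction of statements marked correctly for each question."""
    return {
        qid: sum(r == k for r, k in zip(responses.get(qid, []), key)) / len(key)
        for qid, key in KEY.items()
    }

# One student's responses, scored instantly regardless of class size.
student = {"Q1": [True, False, False, True], "Q2": [False, True, False, True, True]}
print(score(student))  # {'Q1': 0.75, 'Q2': 0.8}
```

Note that the same comparison works unchanged for the “likely or unlikely to be true” variant described below, since student responses remain binary.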

[Figure: Example of Modified Multiple True-False Design (from a question similar to, but not on, the Phys-MAPS)1]
However, as I was creating the physiology-specific assessment and Dr. Mindi Summers was creating the ecology-evolution-specific assessment, we ran into challenges writing statements that were absolutely “true” or “false.” Sometimes we had to write overly complex scenarios because too many constraints were needed to make an answer definitively “true” or “false.” In addition, the discipline experts we consulted refused to ever declare something “true” or “false” (especially, but not solely, the evolutionary biologists). Thus, many of our statements had to be rewritten as “likely to be true” or “unlikely to be true,” making them bulky and long.
Dr. Summers was the first to suggest, in one of our working group meetings, modifying the true-false format by changing the prompt. What initially read “Based on this information and your knowledge of biology, evaluate each statement as true or false” became “Based on this information and your knowledge of biology, evaluate each statement as likely or unlikely to be true.” I was instantly sold. I thought back to the student who had spent so much extra time searching their brain for exceptions to the general rules. Surely, this was going to help!
It did. For starters, the discipline experts we were consulting were much more willing to agree that the answers were scientifically accurate. And for good reason! We science experts do not often work in the absolutes of “true” and “false”; in fact, I’m pretty sure a whole field of math (probability) was created for exactly this reason. I also saw a difference in how students responded to the new language. In my interviews, students took considerably less time on the assessment, and I never again heard a student stop to try to remember all the exceptions they might know. Better yet, I started hearing language reflecting that students were applying knowledge rather than trying to remember facts. For example, with the previous true-false format, I often heard “Oh, I just learned this,” and then I would watch the student close their eyes and agonize trying to remember a piece of information, when all the information they needed to answer the question was right in front of them. With the new “likely or unlikely to be true” format, I was hearing more of “well, that’s generally true, so I think it would work here too.” Students appeared to have shifted to a more conceptual, rather than factual, mindset.
But what really convinced me that we were on to something worthwhile was that some students became aware of what they were truly being asked to do: “Wait, so basically what you want me to do is hypothesize whether this would be true [in this new scenario] based on what I already know?” YES!!! (I do my inner happy dance every time.)
We educators hear the message from a million places that we should teach science the way we do science. I maintain that this should extend to how we assess science knowledge and skills too: asking students to apply their knowledge in new contexts where there is no known answer. But when science explores the unknown, how do you ask about the unknown and still have a right answer to grade (easily, on a scantron, that is)? As scientists, we use our knowledge to make predictions all the time, not thinking that our hypotheses will absolutely be true, but that they are the most likely outcome given what we already know. Why not show our students how much we value that skill by asking them to do the same?
1 Answer: Likely to be true.
2 More information about the Phys-MAPS and all of the Bio-MAPS programmatic assessments can be found at http://cperl.lassp.cornell.edu/bio-maps.
3 The Bio-MAPS working group includes Drs. Michelle Smith, Jennifer Knight, Alison Crowe, Sara Brownell, Brian Couch, Mindi Summers, Scott Freeman, Christian Wright, and myself.
4 Couch, B. A., Hubbard, J. K., and Brassil, C. E. (2018). Multiple-true-false questions reveal the limits of the multiple-choice format for detecting students with incomplete understandings. BioScience 68, 455–463.
5 Hubbard, J. K., Potts, M. A., and Couch, B. A. (2017). How question types reveal student thinking: An experimental comparison of multiple-true-false and free-response formats. CBE Life Sci. Educ.