Making the Grade

I required listening quizzes as part of my jazz history course. These consisted of 10 excerpts selected from the assigned listening. Students would choose from titles and artists on numbered and lettered lists. So, if I played “West End Blues” by Louis Armstrong, a student would answer with the letter of the title and the number of the artist. In general, these quizzes were a good way to measure whether students could recognize major tunes by major artists, one of the goals of the course.

One day, when I was playing in a children’s concert, I asked my son to administer a listening quiz for me. He had taken the course the year before, so he understood the routine. Normally, I would allow students to choose up to three items to be replayed to check their answers. My son simply played the excerpt CD with no replays. I discovered that the results without the replays were pretty much the same as always. Those who had listened to the list a few times tended to get high grades, while those who had relied on their in-class listening and “winged it” tended to receive low grades. Nevertheless, I still think about what grades really mean.

When we were first married, one of our hobbies was trying new recipes. We came up with a simple 3-point rating system to evaluate them. A rating of “3” meant the dish was good enough for guests. A “2” meant it was good enough for us. And a “1” meant we’d better not make it again. This is what happens when teachers rate food.

I’ve been following some of the recent discussion about “grade inflation.” It seems to me that while we shouldn’t assign high grades for poor or mediocre work, the deeper problem is that letter grades exist at all. Our little 3-point scale for recipes was really a pass/fail assessment, in a way, a measure of whether the recipe was “ready for prime time” or not. If the “dish” were student achievement, we might have said the work was either professionally competitive, basically competent, or in need of improvement.

We’ve had A, B, C, D, and F grades for a long time. Yet grading students like eggs has always troubled me. Professors are fallible. Students are not objects to be sorted. And the meaning of the letters has become distorted over time. In grad school, I was required to earn at least a B- for a course to “count.” A mere C was not acceptable. In itself, this requirement sounds like “grade inflation,” until we think about it.

Grad students are like pre-sorted eggs. Sixteen years of education and entrance requirements ensure that a high percentage of grad students are very bright. It would be surprising if their grades did not appear to be “inflated.” The same is true of Ivy League universities. When a school can reject 90 to 95 percent of its applicants, its selected talent pool should be able to earn satisfactory GPAs without breaking a sweat. If George W. Bush was a C student at Yale, he was bright enough to be average at an excellent university. Further, as the population has grown, elite universities have been able to admit an ever-smaller percentage of the general population. I would expect a significant number of A grades from this cohort, just as I would expect a pro basketball team to have a high percentage of agile, tall people who play the game very well.

However, selectivity and ability can’t explain away the problem of grading. Professors must still decide what the standards are. What knowledge and skills should a student master by the end of the course? How should that knowledge and those skills be measured? Are the measurements valid? What about effort and growth? Should a student who starts with little or nothing and becomes proficient receive a higher grade than a student who starts with a lot and does very little? There are many more questions like these.

I tried to set standards that students could meet if they put in the work. At the end of a course, one of my students told me, “I get it now, Dr. Murray. A student would have to try to fail one of your courses.” I took this as a compliment. I had offered alternate assignments, project options, study guides, practice tests and quizzes, and a variety of ways to earn points. The technical term is “multiple measures.” Unlike some engineering courses I took, where a midterm counted for 30% and a final for 70%, I made sure that no single component could make or break a grade. If all the work was completed and test grades were satisfactory, my students were practically guaranteed a B or C. Sadly, I had some failures, but usually those students understood that I couldn’t award points for doing nothing. And I made it clear that every student who earned enough points would get an A. I never tied grades to “the bell curve” (10% A, 20% B, and so on). These students had already been selected through college admissions, so I saw no reason to expect their results to follow the Gaussian distribution of a large, unselected population.
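To put numbers on the “make or break” point, here is a small sketch in Python. The scores and weights are hypothetical, invented purely for illustration rather than taken from any actual gradebook: under a 30/70 midterm-and-final split, one bad exam sinks the course grade, while the same bad exam buried among many smaller measures barely dents it.

```python
# Hypothetical comparison: a 30/70 two-exam weighting vs. a
# points-based scheme built from many smaller components.

def weighted_grade(midterm, final, w_mid=0.30, w_final=0.70):
    """Course percentage when only a midterm and a final count."""
    return w_mid * midterm + w_final * final

def points_grade(scores, possible):
    """Course percentage when many small components add up."""
    return 100 * sum(scores) / sum(possible)

# A student who does solid work all term but bombs the final.
print(weighted_grade(midterm=85, final=55))       # 64.0 -- one exam sinks the grade

# The same student under multiple measures: quizzes, projects,
# listening lists, and exams, each worth a modest share of the total.
scores   = [9, 8, 10, 9, 85, 88, 90, 55]          # last entry is the same bombed exam
possible = [10, 10, 10, 10, 100, 100, 100, 100]
print(round(points_grade(scores, possible), 1))   # 80.5 -- no single score can break it
```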

While standards are important, there is still the element of professional judgment. A colleague once told me they would not pass a “weak” student. Another said they didn’t want their course to be the only reason a student couldn’t graduate or complete the major. At the end of the semester, those of us who assign grades must be able to look in a mirror and grade ourselves. If we acted with integrity, if we were fair and impartial, and if we were true to our profession, I’d say we earned a 3. If not, we need to find a new recipe.
