Jon Held's Test Grading Method

test grading

This page describes a method of scoring college-level tests that I developed and have used in every course I've taught since. I developed it because I had the task of teaching what I believe was the largest engineering class ever taught at K-State, and I had to find a much faster way of grading a problems-type test.
Basically, this is a system that lets tests be graded very quickly. It applies mostly to problems-type courses. I loved it and the students did too.
This system allows me to get the test back to the students the next class period. I feel that quick feedback is very important in a learning environment. The students learn about their previous mistakes before starting new material.

The students get more points for correct work and lose less for mistakes. It all evens out in the end because you grade on a curve anyway. The students can skip one of the four problems and still get a 90 if they do everything else perfectly. Their best problem is worth 40% of the grade and their worst only 10%.

This system promotes accuracy and completeness: it rewards finishing one problem more than getting a good start on three. Just like in the real world, one finished and usable bridge is much better than three partially finished bridges.

The class was Engineering Economics. The semester before, the class was not offered due to a faculty shortage; normally there were two or three sections with 40 or so students per section. So that semester I had two semesters' worth of students combined into one giant lecture. I agreed to teach it; I enjoy a challenge. The class was so large we had to find a larger lecture hall than what was available in the engineering buildings. It was an auditorium. The class was a disaster due to the classroom layout: the blackboards were far too small and too far away from the students, along with many other problems. There were over 240 students in the class. This test method was really the only good thing to come out of that semester.

Basically the system involves four problems on a test. The problems have to be such that they involve many (maybe 10 or so) steps or decisions to come up with a final numerical answer. A typical engineering design course fits this mold very well. I often would have two-part problems: a part A that basically asked, "Do you have any idea what's going on?" and then a part B that was more involved.

These problems are graded on a four-level system: perfect, almost perfect, more than half right, and less than half right.

The marks I use to represent these four levels are 3, 2, 1, 0. They are completely arbitrary; they could just as well be P for perfect, A for almost, H for half, and Z for zero. This is all that is recorded in the grade book and on the test sheet handed back to the student. I think you should take the time to quickly put some marks on the student's work, with a couple of circles, x's, check marks, and/or a line at the end, to show what was there and what was not correct. I always grade in ink, and if the problem is not correct or not finished I draw a line across the paper at the end of the student's work. The important thing with this system is that you don't have to spend the time to fully understand the student's work in order to do this grading.

When the students take the test they work in pencil, but I require them to write the final answer in ink and box it. The answer is faster and easier to find when it stands out like that, and you know what they think the answer is. Often the correct answer might be on the paper, because they may have worked the problem two ways, but they chose the incorrect one as their answer. To be perfect, they are required to have all the work shown and the answer boxed, in ink, with the correct number and units.

Almost perfect (2) is rarely given on the first grading. A 1 (more than half right) is marked, and it is left for the student to rework the problem after the test is returned (that's the next 50-point test), discover their mistake, and then argue (not much argument here) for the almost-perfect grade. Almost perfect is given when they prove it was one stupid math error, with no theory errors.

The retake test (50 points) is a take-home test: the exact same problems, due one week later. Students can work with anybody, or even come in and ask me. The idea is to learn about their mistakes, correct them, and get it right. Basically, I make it worth some points to force them to learn, hopefully, what they didn't know the first time. If they got a 105 (everything perfect) on the first test, they just hand it back in; otherwise they redo the problems that were not perfect. If a student missed a test due to illness, they take the makeup at the end of the semester to replace that grade, but they still hand in the retake for the test they missed. This retake is the best part of this system in my opinion. Most of the time, I believe, students never learn from their mistakes: they take a test, get it back a week later after new work has been lectured on, look at the grade, stuff it away in the notebook, and never look at it again.

Perfect is obviously easy to grade. More than half right is almost as easy and quick to grade. You ask yourself, "Does it appear that they are in the correct ballpark?" Less than half right sometimes takes a judgment call and a little more time; they have missed many important steps, and they get nothing for this. Normally it takes 10 seconds or less to grade a problem. Basically you check the answer, quickly scan the method, make a couple of marks, and write a number: 3, 1, or 0. Having their answer in ink reduces the student's ability to change mistakes after the fact to argue for a 2, and a couple of marks on the paper reduce the ability to add material later and upgrade a 0 to a 1. There have been times I have photocopied all work earning a zero and caught a couple of students cheating like that. In that case I offer them a chance to drop the class before I push for expulsion from the university. They've been caught red-handed; hopefully they learn something from the experience.

This zero credit for less than half right forces the students to concentrate on the problems that they know how to work, rather than just putting something on the paper to try to get some partial credit for material they know little about. That kind of work takes a long time to grade using the old method. It means hard decisions: is it worth 3 points or 4 out of 25, and so on. That's what makes grading the old way take so long. With this method, if they are clueless it's best to leave the problem blank, punt that one, and try another.

My tests are closed book, but I allow each student to bring in one 8.5x11 handwritten crib sheet, front and back, even the edges, LOL. No photocopies; they have to spend the time writing and organizing it. It's a good study aid that makes them at least open the book and copy the formulas. With open-book tests I found the poorer students often wasted too much time thumbing through the text trying to find a similar example problem. This handwritten cheat sheet also eliminates the advantage a student with an expensive programmable calculator might have over one who could not afford such a luxury.

This system results in four numbers as the grade recorded in your record book and also on the returned test, like 1331 or 0321. This is then turned into a score with the following set of rules. You never do this conversion yourself; all you do is record the four numbers. The computer does the conversion when it calculates and prints student grades and class averages. The students learn the conversion very quickly and are quite satisfied with a four-number record of test performance.

Points earned by each problem, ranked from the student's best to worst:

Mark          3    2    1    0
Best         40   35   30    0
2nd best     30   25   20    0
3rd best     20   15   10    0
Worst        10    5    5    0

Extra credit is also given for having no 0's: +10 points for no 0's and no 3's, +5 points for no 0's and at least one 3. This seems backwards, but a student with a 3 has already got their extra credit from the 40 points for that problem.
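The conversion rules can be sketched in a few lines of Python. This is only a minimal illustration, not the original spreadsheet or program; the names `RANK_POINTS` and `score_test` are hypothetical, and the sketch assumes every test has exactly four problems recorded as marks 0-3.

```python
# Points by problem rank (best first) and by mark, taken from the rules above.
RANK_POINTS = [
    {3: 40, 2: 35, 1: 30, 0: 0},  # best problem
    {3: 30, 2: 25, 1: 20, 0: 0},  # 2nd best
    {3: 20, 2: 15, 1: 10, 0: 0},  # 3rd best
    {3: 10, 2: 5, 1: 5, 0: 0},    # worst
]


def score_test(marks: str) -> int:
    """Convert a four-mark record such as "2211" into a test score."""
    digits = [int(ch) for ch in marks]
    if len(digits) != 4 or any(d not in (0, 1, 2, 3) for d in digits):
        raise ValueError(f"expected four marks from 0-3, got {marks!r}")

    # Rank the problems from best to worst, then look up each one's points.
    ranked = sorted(digits, reverse=True)
    total = sum(points[mark] for points, mark in zip(RANK_POINTS, ranked))

    # Extra credit only when there are no 0's:
    # +5 if there is at least one 3, otherwise +10.
    if 0 not in digits:
        total += 5 if 3 in digits else 10
    return total


if __name__ == "__main__":
    for record in ("3333", "3330", "2211", "0010"):
        print(record, score_test(record))
```

Sorting the marks first is what makes the order of the problems on the test irrelevant: only how many 3's, 2's, 1's, and 0's a student earned matters.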

I developed, modified, and tested this system using old pregraded tests I had in my files. Before the beginning of the semester, I tried this system and many similar schemes and compared them to my old way of 25 points per problem, with partial credit given for the amount of work completed correctly. This is the sliding scale that works well for me and my style of giving partial credit for partial completeness. I would imagine, or at least hope, that my style is very similar to most teachers', but there are some very harsh graders out there who wouldn't like this system. I suspect their harshness might be due to not wanting to devote the time it takes to do a good job of grading, and even they might like this system. Using this method I can grade 40 student tests in under an hour, where the old-fashioned way took 6 to 8 hours. Very rarely did a score in the 60-100% range differ between the two methods by more than 5 points out of 100. The fast method normally gave the students, on average, grades 2-3% higher. This is not a problem; grading on a curve takes care of that, and you can just make the test 2 or 3% harder. The tests that scored below 60% using the old system quite often would get 30% or 0% using this system. That is one of the best features of this system: it sends a very strong signal to those students.

I always offer to grade the test by the normal method if the student desires it after the fact. I try to be very fair when doing that, and I do it with the student watching and explain my decisions on points given or deducted. Normally the fast method ends up being a couple of points higher (that's part of the reason for the designed-in 2-3%), but if the old method gives them a better score, that's what goes into the grade book. I use a letter to replace the first problem's mark and then follow that with the test score. My computer program recognizes this as a normal percentage score and substitutes it automatically.

I also offer a makeup test at the end of the semester to replace a missed or low-scoring test. It can't replace a retake. Previously excused absences are handled with an "equivalent" old or modified test taken in the office at a convenient time. Differences between tests (slight differences in problem difficulty) don't matter as much with this grading system.


TRY IT, I'll bet you might LIKE IT

       --------------------       ----------------------

It saves time, lots of it.

The students get their test back the next class period.  It's still fresh in their mind.

You could give more tests. I don't, except for the 50-point take-home redo tests. I think the students learn so much from these retakes. Using this take-home idea, you don't have to spend (AKA waste) a class period going over the test, just the highlights of problems everybody did poorly on. If that happens, you may well not have done a very good job of teaching, or you expected too much. It always amazes me how often they don't get the problems right on the take-home retest.

Another big advantage of this system is that it identifies the poor students very quickly and sends them a very strong signal to get with the program or get out. Before adopting this system I often had students who were out in left field, earning just some partial credit on every problem, mostly a gift from the grader. Every test they took was a failing grade, but not by enough to really get their attention. They always had hope that if they just got a good grade on the next test they would pass. Come time to make the decision to drop, they were still hopeful. This system identifies them quickly and gets their attention; a 30 or a zero on the first test is sufficient to wake them up.

Teaching is much more fun with reduced drudgery of grading.

The students learn more.

At the end of the semester, when you make the borderline decisions on where to draw the line between A's and B's, you have more information on each student's performance. Instead of just a score, you know whether they did average work across the board or mixed perfects and nothings. You can also run statistics on which areas of your teaching you might improve, relating overall student performance to individual problems.
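As one hedged illustration of that kind of statistic, the Python sketch below averages the mark earned on each problem position across a class. The function name `mean_mark_per_problem` and the sample records are hypothetical and are not taken from any real grade book.

```python
from statistics import mean


def mean_mark_per_problem(records: list[str]) -> list[float]:
    """Average mark (0-3) for each problem position across all records."""
    if not records:
        raise ValueError("need at least one record")
    n = len(records[0])
    if any(len(r) != n for r in records):
        raise ValueError("all records must have the same number of problems")
    # Column i holds every student's mark on problem i + 1.
    return [mean(int(r[i]) for r in records) for i in range(n)]


if __name__ == "__main__":
    # Hypothetical class of four students on one four-problem test.
    sample = ["3310", "3210", "1300", "3330"]
    for number, avg in enumerate(mean_mark_per_problem(sample), start=1):
        print(f"problem {number}: average mark {avg:.2f}")
```

A problem whose average sits near zero for the whole class points at the material to revisit, while a student whose own marks swing between 3's and 0's shows up in the raw records without any extra computation.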

Many teachers I have talked to feel that this is just too coarse a grading system. They feel they need to spend the time to decide whether a student's performance on a problem should be given a 16 or an 18 out of 25 points, even after I point out that at the end of the semester the 16 or 18 will most likely make no difference in deciding whether the student falls into the category of A, B, C, D, or F. If you only have five categories in the end, why not let statistics do its thing and start with similar data in the beginning? I think we all have had enough statistics to believe that a sample size of 44 problems during the semester (5 tests, each with 4 problems, 5 redo tests, and a 4-problem final) should be enough to allow statistics to do its job. I think reluctance to try this is mostly due to people being afraid to experiment with something new. The advantages here are worth the risk.

The bottom line.  The students like it, they learn more and it's less work for the instructor.  It's a win/win situation.


        ------------------         ------------------           

I've found it very difficult to convince other faculty members to try this system. To my knowledge none ever has. If you do, please send me an email and let me know how you liked it.

It does slightly inflate the grades (about 3 percent). I designed that into it, and that is one of the reasons the students like it. If you adjust the difficulty of the test slightly, everything is right back to normal.

A computer is used (almost needed) to convert the four-number grades (e.g., 2310) to 100-point grades. A spreadsheet with a little programmed logic does a fine job.

A perfect score of 3333 gives a student 105%. I've never had a student get more than 100% total by the end of the semester. Once again the curve and statistics take care of that.



A student's records might look like this.


test 1   retake   test 2   retake   test 3   retake   test 4   retake   make up   final
1101     2333     0010     3133     2211     3333     1022     3333     2321      3021


These would correspond to scores of

70       100      30       100      85       105      70       105      90        75

With the final counting double, four 50-point retakes, and this student's makeup replacing test 2, the student's final average is

675/800 = 84.4%


I have often made up a five-problem final, with the additional best problem worth 50, 45, 35, 0 for marks of 3, 2, 1, 0, and scaled the total test score up from 150 to 200 points.
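A rough Python sketch of that five-problem variant follows. It assumes the four lower-ranked problems keep the point values of the regular four-problem test and that the no-zero extra credit is left out before scaling. Neither detail is spelled out here, so treat both as guesses. `FINAL_RANK_POINTS` and `score_final` are hypothetical names.

```python
# Points by rank (best first). The first row is the added best problem;
# the other four rows are ASSUMED to match the regular four-problem test.
FINAL_RANK_POINTS = [
    {3: 50, 2: 45, 1: 35, 0: 0},
    {3: 40, 2: 35, 1: 30, 0: 0},
    {3: 30, 2: 25, 1: 20, 0: 0},
    {3: 20, 2: 15, 1: 10, 0: 0},
    {3: 10, 2: 5, 1: 5, 0: 0},
]


def score_final(marks: str) -> float:
    """Score a five-problem final out of 200 (extra credit ASSUMED omitted)."""
    digits = [int(ch) for ch in marks]
    if len(digits) != 5 or any(d not in (0, 1, 2, 3) for d in digits):
        raise ValueError(f"expected five marks from 0-3, got {marks!r}")

    # Rank the problems from best to worst and add up the raw points (max 150).
    ranked = sorted(digits, reverse=True)
    raw = sum(points[mark] for points, mark in zip(FINAL_RANK_POINTS, ranked))

    # Scale the 150-point raw total up to a 200-point final.
    return raw * 200 / 150


if __name__ == "__main__":
    for record in ("33333", "33330", "00000"):
        print(record, round(score_final(record), 1))
```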