Should we grade our own assessments? Some science teachers use "internal reliability statistics" that measure whether questions on a test are potentially confusing or misleading. For example, if many otherwise high-scoring students get a question wrong while otherwise low-scoring students get it right, the question may be poorly worded or confusing.
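One common way to quantify this kind of item analysis is an upper/lower-group discrimination index: compare how often the top-scoring students answer an item correctly versus the bottom-scoring students. A minimal sketch, assuming scored responses are available as 0/1 values per student per item (the function name and data layout here are illustrative, not from any particular gradebook tool):

```python
def discrimination_index(responses):
    """Compute a discrimination index D for each test item.

    responses: list of per-student lists of 0/1 item scores.
    Returns a list of D = p_upper - p_lower per item, where p_upper and
    p_lower are the proportions of the top and bottom scoring groups
    (conventionally the top and bottom 27%) answering the item correctly.
    A strongly negative D flags the pattern described above: high scorers
    miss the item while low scorers get it right.
    """
    totals = [sum(r) for r in responses]
    # Rank students by total score (stable sort keeps ties in input order).
    order = sorted(range(len(responses)), key=lambda i: totals[i])
    k = max(1, len(responses) * 27 // 100)  # size of each tail group
    lower, upper = order[:k], order[-k:]
    n_items = len(responses[0])
    d_values = []
    for j in range(n_items):
        p_upper = sum(responses[i][j] for i in upper) / k
        p_lower = sum(responses[i][j] for i in lower) / k
        d_values.append(p_upper - p_lower)
    return d_values
```

On a toy class where the last item is answered correctly only by the weakest students, that item's D comes out negative, signaling a question worth rewording.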
One teacher admitted that when he first used this process, he was horrified to score only a 70 on this test of his assessment; that is, 30% of his test was potentially misleading. However, he was able to revise the test and score higher on the second internal reliability check. He also noted that when he participated in this process, some language arts tests scored as low as 30% on this measure. This, surely, is cause for some concern.
Clearly, there is great potential here for teachers to be defensive; after all, their work is being graded. However, there is also a potentially great benefit to students: this process can eliminate misleading questions and ensure that assessments actually test the knowledge and skills they are intended to. How can we create an environment where teachers embrace the opportunity to get objective feedback on their assessments and the chance to revise them?