Is it fair to use student test scores to evaluate teachers?  There has been a great deal of interest in this question generated by President Obama’s Race to the Top initiative.  However, the question has always been hanging around education, at least throughout my career.  In the independent school world, the question comes up with regard to using Advanced Placement test scores to “evaluate” whether an AP teacher is doing a good job.  It is not uncommon for administrators to say, “Since his (or her) students always get 4s or 5s on the AP, they must be a good teacher.”

When I hear “good teacher,” I think of Parker Palmer’s quote from The Courage to Teach:

They (good teachers) are able to weave a complex web of connections among themselves, their subjects, and their students so that students can learn to weave a world for themselves.

Different teachers use a variety of techniques to achieve this goal, but they all have the inner drive to connect with their students. When I draw on my conversations with students and teachers about what good teaching looks like, the common ingredient is that good teachers go beyond their craft and extend themselves with great generosity to help students learn and grow. In doing this they make a connection that allows the student to be receptive to learning and creates a lasting bond between student and teacher.

In one conversation with a teacher, when I asked what good teaching means to her, she said:

The people that I think about as really good teachers would do anything to help you learn alongside them. They are willing to give you that time, at lunch, office hours, all sorts of different times. If you are struggling they are going to help support you until you get to where you want to be.

Within the context of building relationships, establishing trust is an extremely important component of good teaching.  Here is what a teacher said to me about the importance of relationships.

I think the affective or personality traits have to come first, because if I know everything in the world about the subject but the kids can’t relate to me, they don’t trust me, or they don’t feel like I am in it for the right reasons, then I am not sure they will ever hear all the great things I have to say.

The teacher working side-by-side with his or her students builds a connection that fosters the trust that allows for learning to occur.  Good teachers know this to be true.  Here is what a student shared with me about her good teachers.

I think teachers who care for their students are demonstrating one of the best qualities you can find in good teachers. I think it comes down to treating your students on equal footing, you get to know them well, you show a lot of personal interest in them as individuals, and you help them to learn to the best of their ability and have the best experience in your classroom.

In the book, The Collaborative Teacher, the authors write about what it means to be a spectacular teacher.

Spectacular teachers build trusting relationships, get to know their students, provide classroom structure, engage students, and show respect and interest.

Notice that all of these stories or quotes speak about the importance of getting to know students, caring for them, providing structure, engaging them in learning, showing respect for them and interest in them, and of course having a passion for what you teach.  There is rarely a reference to “my favorite teacher helped me score well on the AP, SAT, CRCT, ITBS, or any other standardized test.”  This really does not matter in the long run.

Finally, in their book, The Skillful Teacher: Building Your Teaching Skills, Jon Saphier, Mary Ann Haley-Speca, and Robert Gower write:

Teaching is one of the most complex human endeavors imaginable.  We know good teaching is many things, among them a caring person and a skillful practitioner. There is an old saying that you are born to be a teacher.

Saphier has a website, Research for Better Teaching, that outlines many of the qualities of good teaching and provides resources (videos) to watch good teaching in action.

From my reading and research, it is impossible to measure good teaching by looking ONLY or MOSTLY at students’ test scores.  Value-added measurement (VAM) is the process by which school districts use student test scores to evaluate teachers.  VAM is being used in Los Angeles, New York City, Washington, DC, New Orleans, Seattle, and other cities across the country.  While research tells us that teacher quality has a significant effect on student achievement, the research is not limited to a teacher’s influence on test scores.

In his article, Neither Fair Nor Accurate: Research-based Reasons Why High-Stakes Tests Should Not Be Used to Evaluate Teachers, Wayne Au outlines six reasons why using VAM is not an effective process for evaluating teachers.

1. Statistical error rates are high, as much as 35% when using one year’s worth of data.

2. Year-to-Year Instability: test scores of the same student taught by the same teacher vary from year to year.

3. Day-to-Day score instability: test scores can vary due to random factors like conditions at home, illness, and others.

4. Nonrandom Student Assignments: grouping and tracking can greatly influence student test scores.  Teachers have little or no control over these factors.

5. Imprecise measurement: high-stakes tests and other standardized tests are unable to account for the complexities of learning or all the factors that influence learning. 

6. Out-of-School factors that influence learning, such as poverty, health, lack of nutrition, parenting, and others.  Teachers have no influence over these factors.

In a New York Times article, Hurdles Emerge in Rising Efforts to Rate Teachers, Daniel Koretz, a Harvard professor, said:

Because tests were too easy and predictable, it was impossible to know whether rising scores in a classroom were due to inappropriate test preparation or gains in real learning. Rankings that include the tougher standards will not be available until the next academic year.

As you can see, even if there were a reasonable correlation between student test scores and quality teaching, it still does not seem logical to use this as the primary basis for evaluating teachers.

In Georgia, which recently won $400 million in Race to the Top funding, there is a controversy brewing over this very question (see AJC article from January 2 and AJC on January 1).  The state is developing a process in which 50% of a teacher’s evaluation will be related to a VAM of student test scores.  Tim Callahan, spokesman for the Professional Association of Georgia Educators, is quoted as saying:

The organization is OK with making student test scores a part of evaluations, but not 50 percent. The educator-advocacy group also has concerns about the lack of teacher input and the transition to new leadership both in the governor’s office and the Department of Education.

I do understand the need to evaluate teachers effectively (see my blog post, Meaningful Evaluation of Faculty).  In this post, I put forth a slightly different approach that is being developed at The Westminster Schools for its Faculty Assessment and Annual Review.  As a formative assessment process, it has five components built in: (1) a faculty self-evaluation tool; (2) a student feedback process depending upon the age group; (3) peer, colleague, or mentor observation and feedback; (4) principal, department chair, or dean of faculty observation and feedback; and (5) feedback from others (for example, the director of athletics if the faculty member is coaching).  It is a more comprehensive process.  Granted, it takes more time than measuring quality based on student test scores, but it is more authentic and more effective in promoting faculty professional growth.

The more I read about teaching and listen to the stories of good teachers and their students, the more convinced I am that skillful teachers are fashioned over time.  Good teachers are constantly striving for a balance between honing their skills and nurturing their students’ desire to learn.  I would be more hopeful if we professionalized teaching and used a variety of criteria, test scores being one of many, to evaluate teachers.