Original Listserv Posts

Ann Goethals: Yes and so it goes. The PAR (peer assistance and review) program my district (Niles 219) is rolling out this year is what the system needs. Veteran teachers are released from classroom duties to mentor and evaluate first-year teachers. So far this year I have been in each of my 12 teachers' classrooms 3-4 times in an ongoing effort to make evaluation a "teaching" experience. After initial hostility and suspicion (evals have been used as baseball bats a time or two in the past dontyaknow) we are engaged in some meaningful dialogue about teaching and learning. New teachers are learning quickly. And my ability to be in classrooms more makes my evaluation more genuine than a director witnessing a once- or twice-a-year dog and pony show with all the bells and whistles of the day (era) on display.

Unfortunately, like all things educational, this expensive program (two FTEs out of the classroom) will be "considered" for renewal during our contract. But I know this much: I could have used this program in 1987 when I started and probably would not have seriously considered leaving the classroom, because frankly, I was drowning and didn't really come up for air until year 4.

Michael Smith: What I'm wondering about is whether the teacher is the appropriate unit of analysis. I think not. It suggests that "good" teachers only do "good things," if you see what I mean. George and I submitted a proposal to follow up on work George has done that analyzes the impact of the distribution of time across instructional episodes of different sorts. Seems to me that it's more important to know, say, the impact of spending the first 10 minutes each day journaling (George found a negative impact on writing post-tests) or having students work in peer response groups, and so on. Unfortunately, it wasn't funded.

I think that the issue of teacher assessment is hugely important, so I'm in on however we decide to pursue it.

Here's a summary of what we proposed: Observations will be segmented into episodes by noting changes in material, objective, or type of activity. The episodes will then be coded by function: instruction, assessment, management, and diversion. Our coding of the instructional and assessment episodes is informed by relevant research. Episodes of instruction and assessment will be coded as promoting declarative or procedural knowledge. Instructional episodes will be further coded for content: composition, literature, reading, vocabulary, grammar, etc. Episodes focusing on composition will be coded for the processes/knowledge on which they focused (studying models of form, discussing content, drafting, revising, pre-writing, writing in journals, learning a rubric (criteria) for judging writing, providing feedback, teacher conferences, and so forth). Episodes focusing on reading will be coded for the processes/knowledge on which they focused (content, pre-reading, genre, literary terms, strategy instruction, and so forth). Reading episodes will also be coded for the kinds of materials students read (non-literary, literary, multi-cultural, popular cultural, and so forth). All instructional episodes will be coded for the classroom organizations employed: whole class, small group, or individual work. All lessons will also be rated as highly connected, moderately connected, or essentially unconnected to previous instruction. We will work on observing and coding for high reliabilities. In past studies, we have achieved reliabilities of over //r// = .9 on segmenting episodes and coding content of episodes with exact agreement of over .8.
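For readers less familiar with the two reliability figures mentioned above, here is a minimal sketch of how such numbers might be computed. It is not the authors' procedure: the post does not say how //r// was calculated, so the sketch assumes a Pearson correlation between two coders' episode counts per observation, and every data value and function name below is hypothetical.

```python
from math import sqrt


def exact_agreement(codes_a, codes_b):
    """Proportion of episodes to which two coders assigned the identical code."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("coders must rate the same non-empty set of episodes")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least two values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical function codes from two coders for the same ten episodes.
coder_a = ["instruction", "instruction", "assessment", "management", "diversion",
           "instruction", "assessment", "instruction", "management", "instruction"]
coder_b = ["instruction", "instruction", "assessment", "management", "instruction",
           "instruction", "assessment", "instruction", "management", "assessment"]
agreement = exact_agreement(coder_a, coder_b)

# Hypothetical counts of episodes each coder segmented in five observations
# (an assumed way of checking agreement on segmentation).
episodes_a = [6, 8, 5, 9, 7]
episodes_b = [6, 7, 5, 9, 8]
r = pearson_r(episodes_a, episodes_b)

print(f"exact agreement = {agreement:.2f}, r = {r:.2f}")
```

Exact agreement is easy to explain to teachers but does not correct for agreement that would occur by chance; a chance-corrected statistic such as Cohen's kappa is a common alternative.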

Liz Kenney: It may be true that the teacher is not the appropriate unit of analysis in deciding what leads most predictably to student learning. But the reality on the ground is that teachers are being evaluated for their "effectiveness" and that the way this happens is being radically reimagined across the country. I would love to be part of more conversations about how to make this evaluation process more effective and more meaningful.

Declan Fitzpatrick

Technically I'm on the administrator side of the issue (I'm not a certified administrator, and I don't formally evaluate people).

Where I work we have been working with all administrators in the district to "tune our eye" so that we all agree on what effective instruction looks like.

We are using an observation tool that aligns with our long term professional development plan, and doing brief observations in pairs, discussing what we saw, applying a rating scale based on the standards, and capturing the data by placing dots on charts. We are not tracking data by individual teacher; we are looking for trends and patterns across buildings and across the district.

Here are the areas we look at:


 * **Evidence of planning**:
   * Long term planning
   * Daily planning
 * **Learning Objectives**: Explicit, essential, relevant
   * What are they?
   * Who can tell you?
   * Why are they important?
   * How are they modeled?
 * **Assessment and Feedback**: Clear targets, quality design, descriptive feedback, purposeful use
   * How is feedback fast, specific, actionable, and relevant?
   * How does it redirect new learning?
 * **Engagement**: Collaborative structures, inquiry learning, positive supports
   * How many are interacting?
   * What motivates participation?
   * What guides their behavior?
   * How do you know students are thinking at high levels?

These criteria don't speak for themselves. We are using key terms and phrases from the four areas of PD that we are engaged in. We are still in the process of developing some level of inter-rater reliability, making the standards meaningful for teachers, and earning buy-in, but this is by far the most significant work we have done with principals in the four years I have been here.

The idea here is to develop a prototype for a teacher evaluation system that is fair, has teacher buy-in, draws on multiple sources of evidence, is affordable, and would have stakeholder support. This project may evolve in scope and form, depending on how contributors shape it.

Bernie Josefsberg

“Numbers in policy debates cannot be understood without probing how they are produced by people: what makes people decide to count something and then find instances of it; how the measurers and measured are linked; what incentives people have to make the numbers appear high or low; and what opportunities they have to behave strategically…” Deborah A. Stone

“Any reader of my blog knows already that I’m a skeptic of the usefulness of Value-added models for guiding high stakes decisions regarding personnel in schools. As I’ve explained on previous occasions, while statistical models of large numbers of data points – like lots of teachers or lots of schools – might provide us with some useful information on the extent of variation in student outcomes across schools or teachers and might reveal for us some useful patterns – it’s generally not a useful exercise to try to say anything about any one single point within the data set. Yes, teacher “effectiveness” estimates tend to be based on the many student points across students taught by that teacher, but are still highly unstable. Unstable to the point, where even as a researcher hoping to find value in this information, I’ve become skeptical.” Bruce D. Baker

//The Fidelity Issue “R” Us//

//A Proposal for the Second Edition of the __Superintendent’s Fieldbook__//

Policy bells are pealing the imminent arrival of the National Common Core Standards and the intensified dose of academic rigor they purportedly provide. State level directives insist that local districts do all that is necessary to come to grips with the Standards prior to the appearance of their accompanying “next generation” of assessments. “All that is necessary” represents the local component of a distally designed system of standards and tests that by now has succeeded in displacing most if not all competing notions of how public schooling should be done.

In the absence of policy alternatives, “les jeux sont faits” with respect to how improving public schools should be done. Standards and the tests may be viewed as “the colder, action at a distance” elements of the system. To achieve their intended impact, these elements, in turn, must rely upon the vivifying warmth of school-house actors. When taken together, whether across the nation or across a single school district, these actors comprise a universe of human vagaries, inclinations, commitments and capabilities. In short, they are the messy, noisy, and non-psychometric recipients of the handiwork devolving from the CCSSI, the CCSSO, the NGA Center, the SBAC, and the PARCC.

Proposed is a dialogue on the Common Core and Next Generation Assessments between two such actors: a superintendent in his 39th professional year and a high school English teacher in his 16th. Their dialogue will touch upon the possibilities, requisites and cautions embedded in this newest version of standards and accountability. Since both have lived with and through earlier versions, they may be described as “Standards Reenactors” about whom it may fairly be said that “The Fidelity Issue ‘R’ Us.”

Steve Gevinson: I'm late coming to the conversation, and I haven't been to the wiki, and I apologize for all of that. But I wanted to mention that during my time as a teacher evaluator -- 8 years as division chair (technically an administrator) -- I found that the best piece of our formal evaluation process, which was mostly based on the Charlotte Danielson model, was talking to teachers about their student evaluations. We had the division secretary administer a standard survey with about 25 statements with 5 possible responses (Strongly Agree to Strongly Disagree) to students of non-tenured teachers at the end of first semester. The students also were able to write free responses, which were later typed up by the secretary to preserve student anonymity. We ran the surveys through some machine that produced many useful statistics. For my own purposes and to prepare for my discussions with teachers, I was able to develop quite useful comparison charts. Teachers were given their results in advance of our meeting and asked to reflect on them in writing and submit the reflection before we met. What I liked most about the process was that teachers were seeing pretty objective data on how their teaching was working according to their students, and I could use the comparison data for various purposes, ranging from reassuring teachers about poor scores in, say, returning papers in a timely fashion (as English teachers typically score relatively poorly on that question, even though, I hasten to add, we must always strive to improve in that area) to discouraging teachers from dodging distinct indications of problems by criticizing the instrument or the students. These discussions with teachers were by far the most substantive and productive that I ever engaged in during the evaluation process. Best of all, they fostered genuine, serious reflection by teachers on their practices and opened opportunities for thinking together about solutions.
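As an illustration only, the kind of comparison chart described above could be approximated along these lines. This is a rough sketch under stated assumptions, not the process actually used: the post names only the endpoints of the 5-point scale, so the middle labels, the numeric scoring, the item wording, and the division averages below are all hypothetical.

```python
from statistics import mean

# Assumed numeric scoring; only the two endpoint labels come from the post.
SCALE = {
    "Strongly Agree": 5,
    "Agree": 4,
    "Neutral": 3,
    "Disagree": 2,
    "Strongly Disagree": 1,
}


def item_means(responses):
    """Average score per survey item; `responses` maps item -> list of labels."""
    return {
        item: mean(SCALE[label] for label in labels)
        for item, labels in responses.items()
    }


def compare(teacher_means, division_means):
    """Per-item gap between one teacher's mean and the division-wide mean."""
    return {
        item: round(teacher_means[item] - division_means[item], 2)
        for item in teacher_means
    }


# Hypothetical student responses for one teacher on two survey statements.
teacher_responses = {
    "Returns papers promptly": ["Agree", "Neutral", "Disagree", "Agree"],
    "Explains ideas clearly": ["Strongly Agree", "Agree", "Agree", "Strongly Agree"],
}
# Hypothetical division-wide averages for the same statements.
division_means = {
    "Returns papers promptly": 3.0,
    "Explains ideas clearly": 4.1,
}

teacher_means = item_means(teacher_responses)
gaps = compare(teacher_means, division_means)
for item, gap in gaps.items():
    print(f"{item}: teacher {teacher_means[item]:.2f}, gap {gap:+.2f}")
```

Showing the gap against a division average, rather than the raw score alone, is what makes it possible to tell a teacher that a low score on an item such as returning papers is typical for the department.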
As I saw the purpose of the evaluation process, we were trying to promote improvement and growth in teachers. This tool worked quite well with probationary teachers, and I think it would have worked well, too, with tenured teachers, but we never negotiated it into the evaluation process, as I think we should have. There's much more to say about teacher evaluation, obviously, but we can say, at least in a school like mine, that it should be one piece of the teacher-building process that starts in a school with effective hiring procedures, includes superb mentoring, incorporates formal evaluation as part of a growth model, and provides ongoing professional development opportunities. Formal discussions with a trusted, sympathetic, knowledgeable supervisor (not saying that I was this, but that is what one needs in the process for it to work well) about thoughtful student responses to one's teaching should be a key part of the overall effort. Pete is certainly right in pointing out the foolishness and wrongheadedness of the push for teacher evaluation based on standardized tests and simplistic thinking, and right, too, in maintaining that great working conditions are most likely to support and perpetuate great teaching, not least in the way they provide the opportunity to develop a superb teaching culture in a school. A strong teaching culture is more effective, I would say, than any formal evaluation system in promoting excellent teaching.

Steve Gevinson
 * **Bernard Josefsberg**
 * **Jonathan Budd**
 * **November 3, 2011**

This was written by Andy Horne, Dean of the UGA COE, as a draft, so it's not for distribution. But he has some good ideas on evaluation, so I'll include it here.

Dear Allen,

I’m so far behind on so many “good intentions” and one includes getting back to you with discussions about how education is doing locally, state-wide, and nationally. At the Blank Foundation meeting where we had a chance for a brief conversation I mentioned that I have some disagreements with the current move to privatize so much of education and to find alternative models when much is working within the public sector. You had suggested we might follow up with further discussions and I dropped the ball.

I’m one of the first to admit we have some major problems with public education. When I was a teacher (decades ago) I was dissatisfied with the pay schedule because all teachers were compensated on the same scale regardless of the quality of teaching, and there was tremendous variation, so I’m glad to see that one of the current moves is to find ways of identifying talent and leadership rather than discourage the creative and innovative teachers by forcing a fixed scale that offers no differentiation. Having said that, I’ve believed for a long time we should identify our poorly performing educators, give them a chance for professional development and remediation, and if they cannot meet adequate standards, move them out of education. But that takes us to the next topic: evaluation.

I am quite opposed to the current models of educator evaluation, which look at students’ standardized test scores as the major accountability measure of a teacher’s impact. I know we have moved away from process evaluation and into outcome measures, but we are way off target in our current expectations. First, one-shot, high-stakes measures are flimsy at best. Second, the highly selective areas (math, science, reading) neglect the broader goal of education: educated and responsible citizens capable of creativity, innovation, and leadership, none of which are measured in many of the standards – and they can be with broader approaches. Third, I’m not excusing teaching ability, but we do know how the current testing measures correlate with poverty, family conditions, and community circumstances and while some teachers are exceptional enough to overcome the experiences, I’ve not yet found any model that can be taken to scale large enough to meet our educational demands. We are moving in that direction with improved education, and we have some examples that have worked well (the Ron Clark Academy, but in that case – how many “teacher of the year” candidates can America produce, how many have the solid support of parents, the selectivity, and the resources available, and how many will pay sufficiently for the talented to be in those conditions?). We have other examples right in our communities. Our current Professional Development School program is a model – taking schools with 95% free/reduced price lunch, 85% minority, and having performance well above state averages and outperforming private and charter schools hands down. But this is taking a premier program (University of Georgia) with talented university faculty combined with hand-picked classroom teachers and excellent leadership (Clarke County Superintendent Phil Lanoue, Principal, advisory board) to form a collaborative which has grown from one school to six. And it is demonstrating that change can happen, in public schools, with daunting conditions, but it is highly resourced and will go to scale slowly – but it demonstrates these situations can happen.

I am in favor of stricter review, higher standards, and the demands for increased performance, but I would hope we can be data driven with a broad approach to understanding education and an appreciation for the understanding that educators actually do have about model learning programs for children.

I fear a major motivator for changing the process is not educational improvement, but rather three beliefs that seem to be driving much of the demand for change:


 * 1) A desire to shrink government and with the reduction of government, corresponding reduction of public services
 * 2) An awareness that there are enormous profits that can be made through the privatization of education, as is evidenced by the income being produced by some of the newly formed alternative education models (much of it a transfer of tax-payer provided funds from state to private accounts)
 * 3) A desire to produce workers to meet specific business interests rather than an educated and informed citizenry that have flexible skills and adequate preparation in innovation, creativity, problem solving and decision making.

As I have watched the move toward reform in education, I have seen it place a greater emphasis on rote learning, boring instruction, and reduced engagement in life-preparing educational experiences. Our Torrance Center for Creativity in Learning is an international center for research and development in creativity and innovation in education and the leaders of that center have expressed extreme dismay at the direction our schools are taking.

Anyway, I am attaching a number of documents that might be of interest. I know I am biased in this discussion, but I approach it with a passion for creating the best learning environments for all of our children – not just the privileged few – and toward providing the caliber of education that will keep our country the leader in innovation, creativity, new ideas, and citizens that can lead with distinction, rather than the model we have seen in so many of our public arenas recently. We need change – dramatic change – but it should be change led by scholarship, research, empirical data – not a belief system unfounded in experience or evaluation.

If you are interested in further discussions or if you would like to visit our Professional Development School Collaboration, please let me know. I’ll be glad to do follow-ups and to share any additional information you may request.

Warm regards, and thank you so much for your care and interest in education.

Arthur M. Horne, Ph.D.

Dean and Distinguished Research Professor

Peter Smagorinsky: One way to create space for better evaluations is to reduce the amount of testing required of students, and by implication teachers. Dedicate that time and those resources instead to situated evaluations of real teaching.