Zapped by VAM: Critique of the Value-added Model (VAM) for Evaluating Teachers

This is a letter I wrote to the LA Times regarding their article on using the value-added model for evaluating teacher effectiveness. Access the LA Times article at this URL:

www.latimes.com/news/local/la-me-teachers-value-20100815,0,2695044.story

For a critique of the LA Times Value-added Model data base, see this web page from the National Education Policy Center:

nepc.colorado.edu/publication/due-diligence

JKM

August 21, 2010

Mr. Felch, Mr. Song, & Mr. Smith,

I'm assuming that you got my message about the possibility of discussing the Times' value-added data. I explained several points of concern about the interpretability of the value-added model (VAM) data. Let me take as an example the case you present in your article of August 14 titled "Grading the Teachers." You give the example of Ms. Caruso, who teaches third grade at Third Street Elementary in Hancock Park. According to your report, Ms. Caruso is rated in the bottom 10% of effectiveness in your VAM data because she is not "boosting" students' test scores. According to her principal, Suzie Oh, Ms. Caruso is one of her most effective teachers. Keep in mind throughout this analysis that the US Department of Education (2010) found a 25% error rate in the use of VAM data for evaluating teachers.

Let's examine what her students' test scores may mean, since apparently they are not an accurate measure of her teaching effectiveness. The article states that "on average" her students were in the 80th percentile at the beginning of third grade with her but "had sunk 11 percentile points in math and 5 points in English." First of all, a five-point average difference in test scores may not be statistically significant unless it falls outside the range of normal variation in test scores. There is a normal level of what we call "wobbling" in any student's test scores from grade to grade. A 5-point average "wobble" for a group of students is not in any way meaningful, let alone an indicator that the teacher is less effective or has caused this "wobble." It is much more likely due to the nature of the grade-level tests, which also reflect the different levels of difficulty in the academic content and skills students learn in second grade versus third grade, and so on. The curriculum is not even or equal in difficulty as students move up the grades. Additionally, different students arrive with various levels of skill from earlier grades, so they may find third grade "a breeze" or they may encounter some challenge. So if Ms. Caruso ordinarily has three or four students who encounter greater challenges in English language arts in third grade, perhaps due to the shift from decoding to comprehension in reading, these students can "throw off" her average, causing this difference in the class averages over time. This is not unexpected and does not reflect one whit on her effectiveness as a teacher.
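
To make this arithmetic concrete, here is a minimal sketch in Python. The class size of 25 and the 30-point drop for four struggling readers are invented for illustration; they are not figures from the Times' data.

```python
# Hypothetical class: 25 students, all at the 80th percentile in grade 2.
# Every number here is invented for illustration only.
grade2_scores = [80] * 25

# In grade 3, suppose 4 students hit the decoding-to-comprehension shift
# and score 30 percentile points lower; the other 21 hold steady.
grade3_scores = [80] * 21 + [50] * 4

avg2 = sum(grade2_scores) / len(grade2_scores)
avg3 = sum(grade3_scores) / len(grade3_scores)

print(f"Grade 2 class average: {avg2:.1f}")              # 80.0
print(f"Grade 3 class average: {avg3:.1f}")              # 75.2
print(f"Apparent class-wide 'drop': {avg2 - avg3:.1f}")  # 4.8
```

Four students out of 25 account for the entire five-point "decline"; the other 21 scores did not move at all.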

Now let's look at what may have happened in students' test scores in math, where you indicate an 11-point "sinking." This difference may be more than a standard deviation & therefore statistically significant, but is it significant in terms of Ms. Caruso's teaching effectiveness in math? Again, the first place to look is at the curriculum. Is third grade math more difficult than second grade math? We certainly hope so! Consequently, the third grade standardized test in math is more difficult than the second grade test was. It's logical. So what does this tell us about Ms. Caruso's effectiveness? Nothing. She is keeping her third graders at or above grade level in math, ticking right along as they move on to fourth grade. Good for her! The second place to look is at the second- versus third-grade math achievement tests. I understand from several LAUSD teachers that the format of the 2nd grade math test is different from the 3rd grade test. In second grade, students are read the math questions aloud, and only one or two problems appear on each page, with lots of graphics to reinforce the words. In third grade, however, students are required to read the test questions on their own, in smaller print, and answer several problems per page. Is this change in format, with its increased demands on reading skills & more problems per page, responsible for the "sinking" in students' math scores? These changes alone could explain the difference, casting doubt on the assumption that Ms. Caruso's teaching effectiveness is the reason. In fact, Ms. Caruso meets the definition of an effective teacher in the federal Race to the Top education law. Her students came to her 3rd grade performing well above the mean (above grade level) and left her classroom performing well above the mean. Ms. Caruso should not be chastised because her students stayed at the same high level of performance under her tutelage as when they came to her.
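
A rough standard-error calculation shows why the two drops deserve different statistical treatment. The per-student year-to-year "wobble" of 20 percentile points and the class size of 25 are assumptions chosen only to illustrate the logic; neither figure comes from the Times' data set.

```python
import math

# Assumed values for illustration only (not from the LA Times' data).
per_student_wobble_sd = 20   # year-to-year noise in one student's score
class_size = 25

# Standard error of a class-average change under pure noise.
se_class_mean = per_student_wobble_sd / math.sqrt(class_size)   # 4.0

for drop in (5, 11):
    z = drop / se_class_mean
    print(f"A {drop}-point average drop is {z:.2f} standard errors from zero")

# A 5-point drop (~1.25 SE) is within normal wobble; an 11-point drop
# (~2.75 SE) is unlikely to be noise alone. But statistical significance
# says nothing about the cause: curriculum difficulty and test format are
# explanations that have nothing to do with the teacher.
```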

Obviously, Ms. Caruso was disappointed and distressed by her "value-added score." Obviously she did not have the benefit of this expert analysis of the meaning (in this case, meaninglessness) of her so-called effectiveness scores. She labels herself an "ineffective teacher" based on her value-added score and asks in all sincerity, "What do I need to do to bring my average up?" Sadly, the only answer that value-added data can give her is "Teach to the test!" This requires her to become much better at guessing what will be on the standardized test, which will become her focus, rather than being assured that her students are making good progress and that she is doing a very good job of teaching them the grade-level content and skills. Ms. Caruso will not have anyone to explain to her that the value-added model is least valid & reliable when it is based on student test scores that are either very high or very low. Many teachers whose test scores show very little variation from year to year, because of consistently effective teaching practices and programs in their schools, will be labeled "ineffective" by this VAM statistical analysis. Ms. Caruso and her colleagues will not have a chance to challenge and question the theoretical and pedagogical assumptions of the researcher who designed the VAM study or the possible alternative interpretations of the data. Her self-perception as a teacher and her professional prestige within her school community have been damaged by this score, yet she will have very little opportunity to correct her own or her community's misunderstandings and false judgments about her.

As a fellow educator and researcher, my heart goes out to Ms. Caruso and her fellow LAUSD teachers. We as an education community must do all we can to prevent the misuse of data to demoralize and denigrate teachers and to skew teaching to fit a faulty research model.

Jill Kerper Mora
Associate Professor Emerita
San Diego State University
jmora@mail.sdsu.edu

Here is the first letter I wrote to the LA Times reporters in response to their VAM articles of August 2010.

Mr. Felch, Mr. Song, and Mr. Smith,

I am a semi-retired professor at San Diego State University in the School of Teacher Education. My area of expertise is preparing teachers to educate English Language Learners (ELL). I am a nationally and internationally recognized expert on teacher effectiveness in teaching language-minority students. For example, see these articles of mine recently published in the Encyclopedia of Bilingual Education:

Mora, J. K. (2008). Bilingual teacher licensure. In J. González (Ed.), Encyclopedia of bilingual education (Vol. 1, pp. 95-100). Thousand Oaks, CA: Sage Publications.

Mora, J. K. (2008). Teacher qualifications. In J. González (Ed.), Encyclopedia of bilingual education (Vol. 2, pp. 816-821). Thousand Oaks, CA: Sage Publications.

I am also familiar with the value-added model (VAM) of teacher evaluation and have written extensively critiquing this model, in particular as it applies to teachers who serve populations of language-minority students. You can see a sample of my writing on the Voice of San Diego, where I was an invited "Blogger for a Day" on this topic.

I offer you my qualifications because I am urgently interested in dialoguing with you regarding the proposed release of data generated by Richard Buddin & your newspaper's staff using value-added model research. As you point out in your articles and in the demographic data about the LAUSD, the district's population is around 50% English Language Learners (ELL). However, none of the VAM research, to the best of my knowledge, has been conducted on student populations with this large a percentage of ELL. Additionally, none of the value-added models with which I am familiar take into account students' second-language learning factors in teacher effectiveness. This is of grave concern in interpreting the statistics you have generated, for several reasons I would like the opportunity to discuss with you. Among these are 1) the reliability & validity of standardized test scores for ELL with different levels of language proficiency, and 2) the norming of standardized tests, which rarely includes large populations of ELL students. Both of these issues call into question your basic assumptions about what test scores tell us about how much students have learned from a particular teacher. In addition, you appear not to have considered how the "expected gains" in test scores of ELL are affected by their level of language proficiency AND by the match or mismatch between the emphases on different skills at different grade levels. There are other problems in the research design & data set that I wish to explore further.

You claim to have isolated "the contribution of a teacher or school to student learning." This assertion is problematic given that ELL may be enrolled in any of several different types of programs. These programs themselves vary in quality & in the projected achievement outcomes for ELL students. This is very well documented in the research literature, including a meta-analysis of the research by the federal National Literacy Panel on Language-Minority Children & Youth. Therefore, rather than measuring a "teacher effect," you may in fact be measuring a "program effect," which means that you cannot claim to have isolated the effect of a teacher. I do not see this limitation of VAM documented in your video or in Dr. Buddin's technical paper.

In short, I am deeply concerned that the LA Times' research and data set have not been properly vetted, critiqued & reviewed by experts in language-minority education who can properly address the limitations of VAM research with populations such as LAUSD's. This means that many teachers who are effective with these populations may be labeled less effective because of flaws in the data or faulty research assumptions in the design of the study.

In addition, as an expert in teacher effectiveness in general, what I often term "generic" teaching skills, I believe that your research is based on flawed assumptions and misconstrued statistics that seriously affect the interpretability of your data set. This is very serious because teachers' reputations are on the line if these data are released to the public. It will be impossible for teachers whose professional reputations are diminished by an invalid "effectiveness score" to explain the complexities of VAM research, especially as it applies to the demographics of teachers' student populations and the types of programs they are under mandate to implement (not programs of their own choosing based on scientific research findings). The public, however, will find these statistics compelling & will not question their soundness or validity. In fact, they will assume that any attempt to explain these "scores" is akin to "making excuses." This shifts the conversation in education reform away from many vital issues & concerns that must be part of policy formulation and program improvement if we are to improve students' academic achievement.

I have written frequently in the comment blogs for several articles this past week on this subject. Of course, I post anonymously in that venue and cannot address the technical aspects of this project nor support my comments with my scholarly credentials & experience. Consequently, I ask for an opportunity to explore these issues with you in depth, perhaps face to face. I would also like us to be joined by key colleagues with expertise in the areas of language-minority education and value-added model research.

Below I have provided the postings I made in response to the Los Angeles Times articles on the Value-added Model (VAM) data analysis of 6,000 LAUSD teachers.

Posted August 18, 2010

Entry 1

As an education researcher, I am very familiar with the "value-added" teacher evaluation model and the research on the feasibility (reliability & validity) of using students' test scores to evaluate teachers. This model has proven itself to be meaningless. A few researchers have created some very complex & elaborate statistical models that must account for dozens of variables simultaneously. At the end of this mathematical pipeline, there are so many caveats placed on the interpretation of the results that we can't use the data to identify high- vs. low-performing teachers. In other words, it can't be done. Then add the element of less than full native-speaker English language proficiency, which affects the scores of 37% of all CA students statewide, and more in the LA district. Tests in English designed for & standardized on native English-speaking populations are invalid for drawing conclusions about English language learners & how much they have learned from their teachers. The reason is obvious: they can't understand the language of the test. It is hubris for the LA Times to think that a few reporters on its staff can outdo the education research community & figure this out. Stop this nonsense!

Entry 2

More about the Value-added Model for teacher evaluation: The research (what little there is) indicates that a value-added model cannot be used with a teacher unless there are at least three years' worth of data. This data must be collected on a teacher who teaches at only one school at the same grade level if there is any hope at all of eliminating extraneous variables that may affect conclusions about the continuity of his/her teaching. As far as I can tell, any teacher whose alleged effectiveness score has been reported and who falls into any of these categories should challenge the LA Times as to its validity: 1) teachers with fewer than 3 years of test score data at a single grade level at the same school; 2) teachers who have English language learners in their classrooms and/or whose percentage of ELL has changed over 3 years' time; 3) teachers who have students "pulled out" for any type of instruction with another teacher during the school day (ELD, special education, Resource, etc.), because their students' test scores cannot possibly reflect only their own effectiveness. This eliminates all secondary teachers in departmentalized settings. Heaven forbid we mistakenly reward the science teacher because the English teacher down the hall taught the students how to read their textbook! This is just for starters. Whatever was the LA Times thinking?

Entry 3

If you look carefully at the Race to the Top federal education law, an effective teacher is defined (I'm paraphrasing here) as one who has a majority of students in his/her class who grow one academic year (grade level) in test scores for one academic year of instruction. English Language Learners do not make this rate of growth, even when they have excellent teachers, because they are still learning English & can't be expected to perform on grade level on a standardized test. Does this mean that the LA Times' "team of reporters" can measure the effectiveness of teachers who teach English learners & compare their effectiveness to that of teachers who don't? In my opinion, these teachers have grounds for a lawsuit against the LA Times for defamation of character, because the Times has no valid statistical or scientific evaluative data on which to base its rankings.

Entry 4

What exactly is the message here? Have you looked at the LA Times' technical report on the statistical methods it used to crunch the numbers on these teachers? Can you look at it with an expert eye? I can & I have, and my conclusion is that their methodology is a bunch of mumbo jumbo that does not stand up to scientific scrutiny. They have based their model on some sort of "expected gains" for students in English language arts & math. English language learners' expected gains in English proficiency, which are not measured by an English language arts test, vary considerably from year to year. Nothing in their technical paper indicates that they have taken this well-documented phenomenon into account in making judgments about teachers' effectiveness. The research they cite on value-added is very thin, certainly insufficient to support the scholarly & scientific soundness of their methodology. So what "message" does this send about, & to, teachers? The real message here is that the LA Times is willing to publish unreliable, unscientific & invalid evaluative data that damages teachers professionally, without properly vetting the data & with disregard for the consequences. Keep in mind: in this case, the "messenger" generated its own news. The LA Times must be held accountable.

Entry 5

From what you are saying, I conclude that you don't know much about the volatility of test scores across grade levels. I'm still studying the technical report, but the fallacious assumptions you & the statisticians are making are numerous. Tests are not like pouring water into a beaker, with each school year expected to "pour in" the same amount, and any variation attributable to a teacher's effectiveness. This is not true for many students, for reasons that are beyond the teacher's control & do not reflect one way or another on his/her teaching effectiveness. That's my point. The theories behind this statistical analysis are flawed; therefore the statistics are flawed, & no reasonable conclusions can be drawn from the data, let alone rankings of teachers against other teachers based on false research assumptions. The LA Times has chosen to bypass the usual scholarly peer review process, in which the highly technical features of a study are vetted by experts who can identify flaws & unfounded conclusions. Instead, it has put teachers' reputations out there for public scrutiny based on faulty data & a dubious scientific methodology. This is causing tremendous damage to these teachers. The Times must be held accountable!

Entry 6

Some posters are saying here that there is "no accountability at all" for teachers. This is very, very wrong. Teachers are held accountable in many, many ways. To mention a few: there are report cards, parent conferences, classroom visits for observation & evaluation by principals & other administrators, and the opinions of colleagues about how well managed a teacher's classroom is and how well his/her students do as they move up through the grades or into different subjects. There are curriculum supervisors, usually at the district level, who make sure that teachers are teaching the content standards & benchmark measures for their students. The professional opinions of administrators & colleagues are very important in making decisions about a teacher's performance, & more valid than test scores that may be meaningless as indicators of effective teaching. Teachers must teach for two or three years before earning tenure. It is not difficult for a qualified observer (usually a fellow educator) to stop teachers who are struggling in their assignments, or who are not fit for the profession, from advancing. Few such teachers are able to complete their credential requirements, let alone be granted tenure by a district. The important question: Where is the LA Times' accountability to the LA district's teachers?

Entry 7 August 19, 2010

As an education researcher, I am familiar with the value-added model research. The technical paper that supports what the LA Times did here is extremely thin on scholarship & misrepresents the current state of thinking about VA teacher evaluation. The author did not cite the National Academies' 20-page letter to Secretary of Education Arne Duncan, dated October 5, 2009, in which the Board on Testing & Assessment of the National Research Council summarized why there is insufficient science to support the use of the value-added model for evaluating teachers. We now see that the LA Times' technician (an economist, not an education researcher) & its team of reporters think they are smarter than the National Academies, which advise the nation on issues of science, engineering & medicine. The LA Times owes the teachers of LAUSD an apology (at least) & should scrap its plans for any more releases of this kind until experts from the education research community have had the opportunity to review & comment on its statistical model & procedures. This release of so-called teacher effectiveness data was not properly vetted, & we must hold the newspaper accountable for the resulting damage to the professional reputations of teachers.

Entry 8

Please do not misunderstand what I am saying about ELL. I am saying that if students who are not yet fluent in English are given a test designed for native English speakers on grade-level content & skills, the test does not give valid & reliable information about what the students know & have learned. This is because of the mismatch between their English language skills & the test. It does not reflect on the effectiveness of their teacher. If you look at the statewide database of test scores for ELL, you will see that their CELDT level, which indicates their level of English proficiency, is highly correlated with their scores on grade-level achievement tests. I am sure that your bilingual children are very bright & are doing well in school. Congratulations. I want to see all bilingual learners have the very best opportunities to do just as well. Unfairly evaluating their teachers based on faulty test data & faulty statistical analyses will not help us achieve this objective. I am bilingual myself, & my bilingualism is one of the greatest gifts my parents gave me, so I applaud you for giving the same gift to your children.

Entry 9

I totally agree. It is impossible to isolate a single teacher’s impact on his/her students’ test scores. These teachers must speak up in the strongest way possible to protest this terrible injustice to their professionalism & their public reputations. It is not about the “status quo” being unacceptable. It is about a newspaper perpetrating a huge injustice on these noble public servants.

Entry 10

You should read the LA Times' technical report that describes their statistical model & analysis. You will see that they don't have a scientifically sound basis for making any type of rankings. As an expert in education research & a strong advocate for teachers, I am appalled at what the LA Times has done. I hope they get a lot of heat from the public and not just the education community.

August 20, 2010

Entry 11

The most obvious case of standardized achievement tests not being valid measures of students' learning is when the students are not fully proficient in English, which is the case with a large percentage of students in the LA district. We cannot conclude that students perform poorly on these tests because they have not been taught well. Our first assumption must be that they cannot demonstrate what they have learned because of the mismatch between their language proficiency (comprehension, reading & writing) & the questions on the test. This also means that the norming group for the test must be examined, since test scores are reported as percentile rankings: apples are being compared to oranges when ELLs' scores are compared to & ranked against a norming population made up of native English speakers who are fully proficient. Then we must also take into account that the tests in the LA Times data sets changed from the SAT-9 to the CST, so they are not using a single measure longitudinally, another factor that invalidates their results. I can list more reasons: mismatch between what is taught & what is on the test; variability in the difficulty & reading levels of the tests from year to year; etc. As a statistician, you can see how these factors pollute the data set.

Entry 12

Not only must the statistical principles applied here be sound; the pedagogical assumptions must also be sound. These must be documented through the body of research available on the subject. The value-added research has shown that no statistical inferences can be drawn unless a pattern is observable for at least 3 years for a teacher at the same grade level or subject & in the same school, in order to reduce "noise" from other variables. An N of 60 students is not sufficient, especially when they are drawn from different school/grade/classroom contexts. This data is so contaminated that any conclusions drawn from it are highly suspect, if they are interpretable at all. This "study" is a scientific research disaster!
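
A back-of-the-envelope calculation illustrates the point about N = 60. The score standard deviation of 20 percentile points is an assumed value chosen for illustration, not a figure from Buddin's paper.

```python
import math

# Assumed for illustration: individual scores vary with an SD of about
# 20 percentile points, and a teacher is judged on roughly 60 scores.
score_sd = 20
n_students = 60

se = score_sd / math.sqrt(n_students)
print(f"Standard error of the teacher's mean score: {se:.1f} points")  # ~2.6

# Modest true differences between teachers are easily swamped by sampling
# noise of this size, and pooling students from different schools, grades,
# and classrooms adds systematic variation on top of the sampling error.
```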

Entry 13

As an education researcher, I am very concerned about the ethical issues involved in releasing the data from Richard Buddin's study to the public with teachers' names attached to alleged rankings. Ethically, researchers are required to get a release form from any participant in a study if they are identified in any way in association with the data used in the study. This is called the protection of human subjects. Can the LAUSD allow the LA Times to use its employees' data without their prior written consent simply because they are teachers? I don't think the school district could release student data with the students' names attached without prior written parental consent. Could Mr. Buddin be held liable for abuse of ethical standards in conducting research? Could the LA Times be liable for its breach of these standards? The research community should be raising a strong voice of protest here over the treatment (mistreatment) of research subjects. The LAUSD & the LA Times should be ashamed of their failure to consider ethical standards in releasing this data to the public. This whole fiasco is going to backfire on the Times & on the LAUSD if the district doesn't intercede on behalf of its teachers.

Entry 14

This study by Richard Buddin is not scientifically sound & does not pass muster with the educational research community. It is what we who use statistics in research call a classic case of "garbage in, garbage out." Yet these data have a huge potential for damaging many teachers' professional reputations. This is not a matter of some kids complaining but a matter of standards & ethics. Don't trivialize the concerns of teachers. The LA Times has yet to hear from the academic research community. It will soon, & when it does, it had better pay close attention. Teachers have no problem with being evaluated & held accountable using evaluative procedures that are reliable, valid, standardized & that meet ethical & legal standards. They do object, however, to a public flogging by a local newspaper based on shoddy & unvetted so-called research.

Entry 15

I am glad that you think your "evaluation" is an accurate reflection of your actual effectiveness, but what do you say to those thousands of your colleagues who do not feel the same way? What do you say to teachers from other grades whose classroom settings don't fit the value-added model, so they either get off the hook & are not subjected to this type of so-called evaluation, or whose students' scores cannot be used to reflect their effectiveness for a multitude of reasons? Unless all teachers can be evaluated using the same standards of measurement, standards that are fair, accurate & complete, a subset of teachers should not go along with this simply because they don't dispute their individual results. We educators, teachers & researchers (like me) alike, need to be unified on this.

Entry 16

I believe that many teachers are concerned about the ethical issues I raise. I don't know whether there is a legal issue or cause of action for teachers, though many may be harmed by the public release of the data, which is unfair & demoralizing to teachers, individually & collectively. I have grave concerns about this research based on my knowledge & scholarship in education research, with expertise in educating language-minority students. The technical paper posted by the LA Times, written by Richard Buddin, is the basis for my analysis, as well as the video of reporters Jason Song & Jason Felch explaining the "value-added" approach they used to create the database. From a scientific perspective, the theories & assumptions on which the data analysis & conclusions are based are unsound. The studies they cite to support their research are few in number, quality & focus, & do not include the research critiquing the value-added model, such as the report by the National Academy of Sciences & the National Research Council (October 2009), which concluded that the value-added model is not "ready for prime time" because it lacks a scientific base. The Times has not vetted its data through the appropriate scholarly & scientific processes.

Entry 17

No, research data is not fact. As an accountant, you must be familiar with the old axiom about data: garbage in, garbage out. The data analysis methods used to create this data set are flawed because the educational assumptions behind the analysis are false. One assumption that must hold for these data to have any validity & reliability at all is that test scores accurately measure what students have learned under the instruction of a single teacher during an academic year. This is what the LA Times claimed to have isolated. They call it the "teacher effect." However, their method of analysis does not isolate a single teacher's "effect," nor do standardized test scores tell us what students have learned from a teacher, most especially in the case of the 50% of LAUSD students who are or were classified as English Language Learners. You put garbage into the computer, crunch it up a bit using faulty procedures & you still get garbage out. The public should not be misled into believing that teachers' effectiveness can be judged based on garbage statistics.

August 21, 2010

Entry 18

You speak about a "student's achievement line" as if such a thing exists that can "make a case for/against a teacher." The federal Race to the Top law defines an effective teacher as one who produces a year of academic growth for a year of instruction in the majority of his/her students. In other words, if the majority of a teacher's students pass from his/her grade level to the next having learned the grade-level curriculum, this is a successful teacher. We cannot use this sort of statistical analysis of students' test scores to differentiate levels of effectiveness, in part because the model becomes increasingly unreliable at very high & very low scores. According to the ethical standards of testing & research, a test designed for one purpose (measuring student achievement) cannot be used for another purpose (allegedly measuring teacher effectiveness). This is fundamental to any possibility of interpreting the data.

Entry 19

You are exactly right. The value-added model (VAM) used to generate these teacher ratings can only be used on a narrow subset of teachers who fit the theoretical model. K-2 teachers can't be "evaluated" this way because their students don't take standardized tests. Middle & high school teachers don't fit the model because they don't have the same students for a major portion of the school day, their students are taught by 4-6 other teachers, & their subjects aren't tested on uniform standardized tests (thank goodness!). So the sad truth of the LA Times' so-called research is that grade 3-5 teachers get put on the hot seat & take it on the chin for the rest of the teaching force. In addition, VAM has never been piloted on test scores from a population of students like LAUSD's. These data are fatally flawed & should not be made public.

August 22, 2010

Entry 20

Thank you to the researchers & knowledgeable experts who have stepped up to the plate to challenge the LA Times on its unethical use of a not-ready-for-prime-time data analysis model for evaluating teachers using students' test scores. One issue that must be addressed is the fact that the value-added model (VAM) has not been piloted with any student population whose demographics are similar to LAUSD's. The Times reports that 50% of LAUSD's students are classified as English Language Learners (ELL). Yet VAM data analysis does not address the language factor. One of the core assumptions of VAM analysis is that the standardized tests used are valid & reliable measures of what students know & have learned from their teachers. Students who are not yet fully proficient in English cannot demonstrate their learning on a test in a language they do not understand, read or write. The validity of these tests is also questionable, since they were normed on a population in which only a small percentage of students share these students' linguistic characteristics. The language factor skews & distorts VAM data in a number of ways, which the lone researcher Buddin & the LA Times fail to address in their data analysis & interpretation. This increases the probability of erroneous evaluations in the so-called "effectiveness scores" of LA teachers who work with ELL.

August 23, 2010

Entry 21

How much "imperfection" do you believe teachers should tolerate in the way they are evaluated? The value-added model (VAM) in general misclassifies teachers 25% of the time. We have only one unpublished VAM study done on a student population like LAUSD's. We expert researchers know that standardized tests are not valid for telling us how much English Language Learners have learned, because they don't understand, read or write the language of the test. So a research design that ignores the language factor cannot possibly be accurate in rating teachers' effectiveness or comparing teachers; consequently, the LA Times' VAM study is subject to a much higher error rate. Where do teachers whose professional reputations are damaged go to get their reputations back when erroneous evaluations are made public? What effect does this have on the morale of our teaching force & on our ability to attract highly qualified college graduates into teaching? Is it sound public policy to have a publishing company publicly evaluating our teachers using unscientific & faulty research methods? What does the LA Times have to gain versus the harm done to public servants & the public interest in education? It's all about curriculum control. If publishers can have teachers bowing down to their tests while using their textbooks & materials, they control public education.

Entry 22

George Skelton claims that "parents have a right to know" teachers' value-added evaluations from the LA Times. He doesn't know what he is talking about. What about teachers' rights to privacy and, more importantly, to be evaluated fairly and accurately by their employers, not by their local newspaper? There are many reasons to object to the LA Times' value-added model (VAM) study & the release of so-called effectiveness scores for 6,000 LAUSD teachers. 1) The Times had only one researcher, an economist on loan from RAND, design the statistical analysis using the VAM model, a not-ready-for-prime-time research model that has not been proven scientifically sound or valid. 2) The Times did not have its data set reviewed & critiqued by experts from the professional education research community, who are now stepping forward to point out the huge flaws & erroneous assumptions of the research methodology & data set. 3) The VAM is very limited in its use for providing data on teachers, because only a narrow subset of classrooms in grades 3-5 fits the theoretical model; therefore, value-added can never be used consistently & uniformly for evaluating teachers, a fundamental criterion for any system of evaluation. 4) Newspapers are not research or academic institutions & are pure amateurs at educational research. Is it sound public policy to turn teacher evaluation over to a market-driven publishing company that has vested interests in tests, instructional materials, etc.? 5) The Times cannot force the LAUSD to use its data to evaluate teachers. Accountability is a two-way street.

August 24, 2010

Entry 23

It is so very sad to see what the LA Times has done here & how uninformed & misguided policymakers are jumping on the bandwagon. First of all, it is shocking & very disturbing to have a newspaper conglomerate getting into the business of evaluating public school teachers, using its own "hired-gun" researcher, who is the Lone Ranger in this so-called study. Then it uses a theoretical & statistical model so flawed as to make the data almost uninterpretable & full of errors. It ignores the academic education research community & goes solo, claiming to be doing what the LAUSD refused to do because the district is beholden to the teachers' union, when in fact the Times knows this data can't be used for evaluation purposes in any case. The Value-added Model (VAM) is extremely limited. It can only be applied to teachers whose classrooms fit the theoretical model, in which a single teacher's impact on test scores can be isolated from among hundreds of other variables that affect students' standardized test scores. 6,000 grade 3-5 teachers are taking it on the chin because they come closest to fitting the VAM assumptions (even though many do not). The public & policymakers will now demand to see VAM data for the rest of LAUSD's & CA's teachers, but none can or will be produced. The VAM only deforms, rather than reforms, education.

August 25, 2010

Entry 24

Consider the fact that the Race to the Top definition of an effective teacher is one whose students grow one grade level in academic achievement during one school year of instruction. A highly effective teacher is defined in the law as one whose students grow more than one grade level per year. Remember the LA Times article featuring Ms. Caruso, the teacher who was labeled in the lowest 10% on the effectiveness scale because her students on average dropped 11 points in math, from the 80th percentile to the 69th? Well, her students came to her 3rd grade performing well above the mean & left performing well above the mean. This means that they were above grade level at the start of 3rd grade and still above grade level when they left. Statistically, she is branded as marginally effective because the value-added model only looks at score increases & doesn't account for the normal wobbling of scores. Those "hang 'em high" folks who support the LA Times would have Ms. Caruso tarred & feathered even though she meets the definition in the federal law of an effective teacher. Are any of the parents of her students complaining that she kept them a grade level above average in achievement? I doubt it. The value-added data are being misinterpreted & misused to damage good teachers.
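
For the record, here is the percentile-to-standard-deviation arithmetic behind that claim, assuming for illustration that the scores are roughly normally distributed (real score distributions will differ somewhat).

```python
# Convert percentile ranks to standard-deviation units under a rough
# normality assumption (an illustrative simplification).
from statistics import NormalDist

standard_normal = NormalDist()
for label, pct in [("Start of grade 3", 80), ("End of grade 3", 69)]:
    z = standard_normal.inv_cdf(pct / 100)
    print(f"{label}: {pct}th percentile = {z:.2f} SD above the mean")

# Start of grade 3: 80th percentile = 0.84 SD above the mean
# End of grade 3:   69th percentile = 0.50 SD above the mean
# Both cohorts sit well above the 50th percentile, i.e., above grade
# level, both entering and leaving Ms. Caruso's classroom.
```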

Entry 25

As a teacher educator and researcher, I am appalled at what the LA Times is doing with this value-added model (VAM) data. I am familiar with the research base for VAM. It can only be described as thin, with only a very few pilot studies, which show that the model does not produce interpretable data with any consistency. It is a very seductive notion to think that we can quantify something as abstract & complex as teacher effectiveness. It is presumptuous to think that we can isolate the impact of a single teacher on students' learning through elaborate statistical operations. And it is folly to think that parents can use this data to make informed decisions about their children's education in our public schools. The LA Times has bypassed the educational research community, where research methodologies & data analysis procedures are vetted through a process of peer review & critique to ensure the quality & credibility of any conclusions that might be drawn from the data & any uses made of a study's results. The newspaper has not done due diligence in protecting teachers & the public from the many flaws & limitations of this VAM study. An article just today in the Wall Street Journal about the limitations of student test score data for evaluating teachers points out that value-added data has been found to be inaccurate 25% of the time. At that rate, roughly 1,500 of the 6,000 LAUSD teachers rated will receive erroneous rankings on the Times' contrived teacher effectiveness scale. Where do these teachers go to get their professional reputations back if & when their "scores" are made public? I sincerely hope that the LA Times will listen to the academic & scholarly community & seriously consider our grave concerns about this VAM study before releasing teachers' data to the general public.
