One of the first schooling experiments I ever conducted was, in fact, an indirect test of this proposition. My study was inspired by a very famous educational researcher at the time, W. James Popham (mentioned previously as a critic of value-added teacher assessment), who conducted a series of experiments designed to find a way to measure teaching proficiency but inadvertently found instead that neither teacher experience nor training had any effect upon student learning.
The rationale for his studies was innocuous enough. Popham hypothesized that perhaps one reason we cannot differentiate exemplary teachers from abysmally ineffective ones (always defined, incidentally, by how much their students learned) was that our standardized tests simply weren’t sensitive enough to measure teacher performance. Then, as now, these large, amorphous tests weren’t closely matched to the school curriculum, so the commercial tests didn’t necessarily assess what teachers actually taught in their classrooms.
How, then, could they be used to measure teaching performance, especially since up to 60% of the variation in these test scores is attributable to individual differences in student backgrounds, leaving only 40% to be explained by other factors (of which teacher differences may account for only a small percentage)? So Popham decided to start from scratch and develop a series of teaching performance tests. First, he designed experimental units based upon discrete instructional objectives, which reflect small pieces of instruction that can be tested directly, such as:
Sample Instructional Objective: “Given any two single-digit numbers, the student will be able to supply their sum.”
Then, each instructional objective was accompanied by a test item that assessed its mastery:
Sample Test Item Assessing This Objective: 7 + 4 = ___.
The use of instructional objectives and tests based upon them served two crucial functions:
- They ensured that the teachers knew exactly what they were expected to teach, and
- The resulting tests assessed exactly what the teacher was expected to teach, nothing more and nothing less.
Thus, for our exceedingly simple illustrative instructional objective above (Popham used more complex ones in his studies involving high school students), there are exactly 100 (and only 100) test items that can be generated, one for each ordered pair of single-digit addends, to assess the degree to which students mastered the objective (and presumably how well the teacher performed her or his job).
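To make that count concrete, here is a minimal sketch (mine, not anything from Popham’s studies) that enumerates the complete item pool implied by the sample objective:

```python
# Minimal sketch: enumerate the full item pool implied by the sample
# objective ("given any two single-digit numbers, supply their sum").
# Ten choices for each addend gives 10 x 10 = 100 possible items.
from itertools import product

items = [(a, b, a + b) for a, b in product(range(10), repeat=2)]
assert len(items) == 100  # exactly 100 (and only 100) test items

# The sample item from the text is one member of this pool:
a, b, key = 7, 4, 11
assert (a, b, key) in items
print(f"{a} + {b} = ___   (answer key: {key})")
```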
Before advocating the use of his tests as a full-blown measure of teacher proficiency, however, Popham wisely decided to validate his approach via a technique called the “known-groups” approach. The logic behind this technique involved finding two groups of teachers who were “known” to differ on the “thing” being assessed, having them teach the same instructional unit to comparable classrooms, and then seeing if the students taught by the two groups differed in the amount they learned.
In this case, the “thing” was teacher proficiency in eliciting learning, so the first task was to find two groups of teachers, one of which was known to be much more proficient than the other. But therein lay a classic Catch-22: how could anyone identify proficient versus nonproficient teachers if a test didn’t yet exist that was capable of rank-ordering instructional success?
No problem for Popham. He simply defined his proficient group as professionally trained, credentialed, experienced teachers and his nonproficient group as individuals who had never had any formal teacher training or teaching experience, such as housewives, electricians, and auto mechanics. (The housewives taught social studies, while the other two groups taught topics in their respective vocations.)
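With the two groups defined, the known-groups logic reduces to a simple comparison. Here is a toy sketch with invented numbers (not Popham’s data): if training and experience matter, classrooms taught by the credentialed group should post higher scores on the objective-matched post-test.

```python
# Toy known-groups comparison (invented numbers, not Popham's data).
# Each value is a hypothetical classroom mean on the objective-matched
# post-test, one classroom per teacher.
from statistics import mean

trained_teachers = [72.1, 68.4, 75.0, 70.2, 69.8]  # credentialed, experienced
untrained_adults = [71.5, 69.0, 74.2, 70.8, 68.9]  # no training or experience

diff = mean(trained_teachers) - mean(untrained_adults)
print(f"Trained mean:   {mean(trained_teachers):.1f}")
print(f"Untrained mean: {mean(untrained_adults):.1f}")
print(f"Difference:     {diff:+.2f} points")

# The known-groups validation succeeds only if this difference is
# reliably positive. Popham's surprise, noted above, was that it
# essentially was not: training and experience made no difference.
```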