Common Core Assessments May Be Content-Neutered

David Steiner

Ze’ev Wurman pointed out to me (and others) an interesting article written by David Steiner, former New York State Commissioner of Education and Common Core advocate, in Education Next.  In it he unwittingly makes a case against the Common Core.

First he points out how content-poor the Common Core ELA standards are:

Formally, the ELA Standards “lay out a vision of what it means to be a literate person in the twenty-first century” by specifying and encouraging the development of “the skills in reading, writing, speaking, and listening that are the foundation for any creative and purposeful expression in language.”

These skills are important, but one cannot learn skills in the abstract: imagine trying to think critically about nothing in particular. In a February 2013 essay on the topic, E.D. Hirsch cites a 2012 study by the National Research Council, which found that “21st-century skills [are] dimensions of expertise that are specific to–and intertwined with–knowledge within a particular domain of content and performance.” Skills must be tied to content if they are to be learned effectively…

…Unfortunately, realizing this skill-knowledge potential requires more than simply adopting the Common Core Standards. The challenge is that the Standards themselves do not require specific content beyond classical mythology, one (any) play by Shakespeare, and a selection of founding American documents. (The exhortation to demonstrate knowledge of several centuries of American literature is laudatory, but hardly specific enough to guide curriculum design.) In short, the Common Core Standards do not provide curricular content – presumably because their authors realized full well that if they had specified content, few if any states would have agreed to adopt them. The fact that the ELA Standards are largely silent on content would matter far less if this country had agreed on a shared curriculum – but we have not.

The Common Core’s college and career readiness anchor standards for reading are not content standards; they are, for the most part, reading skills. It’s good to see that admission from Dr. Steiner. He’s right that few states would have adopted ELA standards that specified content, but this also illustrates one of the primary reasons the Massachusetts ELA standards were superior: they were true content standards. With that in mind, an appropriate ELA assessment needs content and context. Steiner points out that there will be problems with that as well.

Given our historical lack of consensus over curricula, it thus falls to assessments to influence the depth and quality of instruction. If the new tests assess knowledge in ways that demand mastery of sequenced domain knowledge, sophisticated vocabulary, rich content, and cross-disciplinary learning, educators across the country would have a much greater incentive to bring challenging content into their classrooms and thus realize the implicit promise of the new standards.

“If” being the key word here. Steiner admits that the concept of “fairness” may keep PARCC and Smarter Balanced from doing that.

Unfortunately, there is reason for concern about the quality of these exams, and in particular whether they will push the rest of our education system to teach high-quality content.

One concern stems from the way test designers have come to interpret the industry-guiding principles of building tests – principles often referred to as those of Universal Design (see Table 2 here). Universal Design guidelines are intended to ensure that assessments are fair to all students. Some of these guidelines are eminently reasonable and important – for example, allowing students with special needs (such as visually-impaired students) to take an appropriate version of the test, or avoiding language that is likely to insult a particular group of test takers.

The applications of other design principles, however, are well intentioned but neither reasonable nor academically astute. Although they certainly didn’t invent them, the granular design criteria that PARCC and Smarter Balanced require test designers to adopt will perpetuate a patronizing version of fairness. This is because in the pursuit of absolute equality in every test taker’s “experience” of the test, these criteria exclude potentially upsetting passages and any other material that creates disparity, including content that rewards those with greater background knowledge.

Let me elaborate. Test designers are to avoid background knowledge that might be known to some groups but not others. For example, Smarter Balanced’s “Bias and Sensitivity Guidelines” point to the word foyer as unfair: “assuming a student knows what a “foyer” is would be unfair because the term: 1) is more likely to be known by some groups of students than by other groups of students, 2) is not required by the Common Core State Standards, and 3) is not likely to have been routinely used in the classroom.” Other forbidden content in these Guidelines includes a passage that requires knowledge of opera and how composers use the orchestra or singers; a quotation from the Old Testament (or other religious material); a passage describing the use of sailboats for racing (or any “luxuries”); and a video of a dancer requiring knowledge of ballet. PARCC’s Fairness Guidelines are similar: “avoid depicting situations that are associated with spending money on luxuries, such as eating in exclusive restaurants, joining a country club, taking a cruise…”

The technical explanation, in part, is that test designers try to build questions that avoid Differential Item Functioning (DIF) – items in which students from different groups (commonly gender or ethnicity) with the same underlying achievement levels have a different probability of giving a certain response on that particular item. To take an example, imagine that a particular sub-group of students do more poorly than expected (based on their performance on other questions testing the same math skill) on a math item that uses the word “foyer,” while other groups of students do just as well as expected. The “foyer” item functions differentially and would be deemed unfair. The difficulty is defining the “underlying” achievement level. If it were defined to include more sophisticated vocabulary and wider domain knowledge, individual items testing for these elements would not display the dreaded differential functioning and could be used in our assessments. Unfortunately, achievement is typically conceived in a much narrower sense, excluding much of the vocabulary and knowledge expected of well-educated people in the workplace and in life… (snip)

…The problem with patronizing fairness is not just the sheer absurdity of the self-censorship involved; rather, these broad restrictions underestimate students and, by stripping out content, serve them badly – especially the most underprivileged. How so?

We know that more privileged students are far more likely to have the opportunity to learn advanced vocabulary and a broad range of academic, historical, geographic, and other content from a variety of sources outside the classroom. Our least advantaged students, by contrast, are more dependent on public schools to impart much of this information. If they do not learn from their teachers what a foyer is – or, far less trivially, how to read and make reference to complex, even disturbing texts about fundamental issues – many of them will have no other chance to do so. And if teachers know that the exams that matter will scrupulously avoid covering, even indirectly, knotty issues that provoke strong opinions and advanced concepts that may prove novel for students, it makes perfect sense for them to avoid such content altogether. The absence of those materials on the test licenses this impoverishment in the classroom.

This is not merely a matter of specific vocabulary deficits or lack of attention to important issues. Rather, as E.D. Hirsch has noted, it is the contextual knowledge available to the middle-class student that gives her a sustained advantage throughout her education. Our insistence on tests that assess de-contextualized, carefully controlled, thoroughly “fair” dots of information forces test designers to create artificial assessments. The resulting tests cannot include many serious passages of literature that would be “discriminatory” by virtue of including instances of vocabulary, syntax, and background knowledge that would privilege the more affluent. The damaging truth is that in our drive to make our exams content-neutral, they may end up content-neutered, and the disadvantaged students will suffer the most.

So what is going to be assessed? Weak content. The Common Core’s lack of content, coupled with this concept of fairness, will make these assessments worthless as a measure of what students actually know.

Worse yet, that will drive curriculum, as Steiner admits. He wants assessments to drive curriculum, and he believes that a quality assessment actually requires a common curriculum, not just common standards.

First, in selecting passages and questions, test designers need to include rich textual excerpts that are not entirely anodyne. They should embrace serious topics and test for the understanding of vocabulary and ideas that we expect all educated individuals to know about and be ready to discuss thoughtfully. Rather than scrupulously avoiding the topic of death in Romeo and Juliet or God in the Mayflower Compact, our tests should include the very passages – the ones that make these texts worth reading – so that educators are encouraged, not penalized, for teaching what is worth teaching. If we want to teach serious texts for serious reasons, we must test seriously, too.

Second, test designers should use the assessments to send even stronger signals about curriculum. Many countries write exams that specify multiple periods of history to be studied and then give students the choice to answer questions on those they have studied. For literature exams, they provide a rotating list of set texts that teachers and students can study in depth, knowing they will be asked questions on some of them. This model has the advantage of specifying at least a portion of the curriculum explicitly, ensuring that it meets standards of rigor, complexity, and richness. The model would also be fairer, since disadvantaged students who depend on their school to read these works would indeed have worked on them.

He makes two admissions here. The first is that the Common Core Standards and their aligned assessments will penalize teachers for “teaching what is worth teaching.” The second is that the ultimate goal is to have more control over curriculum.

The solution to content-poor standards and content-neutered assessments, however, is to set content and build assessments at the state and local levels. The problems he cites could have been avoided if Common Core’s backers had encouraged that route rather than trying to centralize education around a set of common standards and common assessments.

2 thoughts on “Common Core Assessments May Be Content-Neutered”

  1. I’ve tried to tell all the pro-CC conservatives I could that something like this is coming, but none would listen. I’m just a black-helicopter guy for thinking such things, you know.
    None are so blind as those who will not see.

    1. In CA it is the liberals who have ushered in CC$$. We really need to knock off the left/right finger pointing because the reality of CC$$ is federal takeover via corporate control. The feds got the billionaire boys club to do what the Constitution forbids them to do – control education on a national level instead of parents and teachers on a local level.
