Parent Question:
Our school district is using a program that has received many bad reviews, including by EdReports. We raised that with our School Superintendent, and she indicated that EdReports is revamping its review process so their evidence doesn’t mean anything. What do you think?
Teacher Question:
EdReports and the Knowledge Matters Campaign and others are requiring that high-quality texts build background knowledge—a good thing. However, they expect this to happen through a topical approach rather than a broader thematic approach. One curriculum that is touted as strong in this area addresses one topic for 18 weeks! So the question I am asking is: what is the difference between a topical approach and a thematic approach, and which is preferred?
Shanahan Responds:
Over the past few weeks, I’ve been inundated by emails and phone calls about EdReports and a couple of other textbook review protocols (those issued by Knowledge Matters Campaign and Reading League).
I usually stay away from this kind of topic since I help design commercial programs and try to avoid conflicts of interest. At this point, though, the problems with these reviews have become so broad and so general that I can discuss them without any danger of conflict.
I’ve noticed six problems with these reviews and have suggested solutions to each.
1. Educators are placing way too much trust in these reviews.
These review processes have been undertaken by groups who want to make sure that curriculum materials are up to snuff. But what that means differs from review process to review process. Each review organization has different beliefs, goals, and methodologies. Accordingly, reliance upon any one of these reviews may be misleading.
The federal government requires non-profits to file 990 forms. These forms require a declaration of purposes and description of their activities. For instance, the Reading League says that it encourages “evidence-aligned instruction to improve literacy outcomes” and EdReports aims to provide “evidence-based reviews of instructional material.”
The Knowledge Matters Campaign is a bit different. It is not a free-standing entity but part of another non-profit, “Standards Work”. In that 990, the Knowledge Matters Campaign is described as an “advocacy effort” that showcases “high-quality, knowledge-building curriculum.” There is nothing wrong with advocating for favorite commercial programs. That just isn’t the best basis for providing objective reviews or sound review tools. They can provide such reviews, but I think their documents should transparently state those prior commitments. One can only wonder about a review process that starts with the programs it likes and then formulates review criteria based on them.
The Reading League and EdReports both explicitly claim to support “evidence-aligned” curricula. This is a bit of a shift for EdReports since originally its goal was to ensure alignment with the Common Core State Standards. The Knowledge Matters Campaign does not seem to make that “evidence-based” assertion, though its review protocol mimics the others in how it uses research citations.
The point here is that these kinds of reviews often have motives other than those of the educators who use them. Unless you can be certain of their motives – both in terms of declared purposes and in their alignment with those claims – buyer beware! It’s one thing to try to ensure that school practices are in accord with what we know; it is quite another to establish review criteria based on other considerations, no matter how well-meaning those considerations may be.
All these reviews are spotty at best when it comes to this alignment, so I would discourage curriculum selection processes that depend entirely or mainly on any of these reviews. I wouldn’t ignore them entirely; I would just independently verify each demerit they assign – including determining whether that criterion even matters.
A cool thing that the Reading League does is that it shares the publisher’s responses to its “red flag” warnings. That kind of transparency is good because it should help school districts consider both sides of an issue. For instance, in a program I’m involved in, the Reading League flagged a particular practice their reviewers didn’t like. The fact that this device appeared in three lessons out of about 900 in the program, or that we could provide substantial research support for the specific practice, made no difference to them. In such instances, having both the review claim and the publisher response should help districts examine such issues and decide for themselves whether that is really a problem, or a big enough problem to matter.
That’s how it should work. When districts surrender all judgment to these organizations – refusing to consider or purchase any program that gets dinged no matter the evidence – then the game is lost. Instead of getting the best programs available, schools will end up with the programs that best meet some groups’ ideological positions.
2. What Constitutes Evidence?
Despite the rhetoric of these groups, the term “evidence aligned” is meaningless. Often there is no direct evidence that what is being required has ever benefited children’s learning in a research study.
By contrast, the National Reading Panel and the What Works Clearinghouse have required – before they will say anything works – that studies directly evaluate the effectiveness of that practice and find it to be advantageous to learners.
The review documents cite research studies for each criterion. This may seem convincing to some district administrators. However, if you look more closely, you’ll find that the evidence is woefully uneven. In some cases, the research is substantial, in others there is no direct evidence – only poorly controlled correlational studies or evidence that a particular topic is important, but with no proof about how pedagogy might best address that important issue.
I wish they would all take the approach that the What Works Clearinghouse (WWC) takes. The Clearinghouse allows its guest experts to make any claims they want, but then it checks the nature and quality of the evidence supporting those claims. Its reports tell you what the experts say, and then report how strong that case is in terms of actual research support.
That kind of reporting would allow districts to know that the phonics requirements for Grades K-2 were supported by substantial research, but that the phonics claims for the upper grades are proffered with little evidence.
That approach would still allow the encouragement of decodable texts and of favored approaches to teaching background knowledge. But it would require an admission that those criteria are not really evidence aligned.
Sadly, too many district administrators assume that all these criteria must be evidence aligned. They are wrong. These reviews adopt review criteria and then seek any kind of evidence to support those choices – no matter how unconvincing and uneven that evidence may be.
3. Grain Size Problems
Not all the criteria included in these reviews appear to be of equal importance.
For example, the Reading League requires that reading programs include explicit teaching of both phonics and handwriting. A program that lacks either can be smacked for the omission, and some districts, by policy, will then prohibit the consideration of such programs.
Don’t get me wrong. It makes sense for schools to explicitly teach both phonics and handwriting. Core reading programs should include phonics, given that such instruction contributes fundamentally to early reading development, and it seems prudent to align the decoding lessons with the rest of the program (though, admittedly, there is no direct research evidence supporting that concern).
The benefits of teaching handwriting, however, do not accrue directly to reading, and I am aware of no data that shows either necessity or benefit of aligning such instruction with the rest of the reading lessons. It would not be perverse for a district to purchase separate reading and handwriting programs.
To me these criteria are not equivalent. A low mark in one should be a real concern. A low mark in the other may be informative but it should not be determinative. There are many such false equivalences throughout these evaluation schemes: some criteria flag essentials and some could safely be ignored.
4. Measurement Problems
Even when a criterion is spot on, one might wonder how to determine whether that criterion has been sufficiently addressed. The Knowledge Matters Campaign encourages the teaching of comprehension strategies – a reasonable thing to do given the extensive research supporting their benefits – and yet, how much strategy teaching will be sufficient to meet the standard? It is easy to see how two well-meaning and careful reviewers could disagree about an issue like that.
The teacher letter included above points out a reading program that devotes 18 weeks to one content topic. Such a program would surely meet the Knowledge Matters Campaign criteria, though to me that sounds like overkill – certainly not something that possesses research support. If I reviewed it, I’d be critical that the program is too narrow in focus while other reviewers might conclude that it addresses the knowledge building criteria appropriately.
The more specific the review criteria, the more reliable the reviews should be. However, the more specific they are, the harder they are to justify given the nature of the research. For my part, I’d prefer that everyone have all the information available:
“We reviewed this program and judged that it met our knowledge building criteria. That’s a plus. However, it is so narrowly focused that we wondered if that is best (and we know of no direct research evidence on this matter). Students taught from this program are likely to know more about electricity than any group of fourth graders in the history of mankind. If they are ever again asked to read about electricity, they will likely achieve the highest reading comprehension ever recorded – if they do not run screaming from the testing room.”
There is a body of research on sentries and their ability to protect military installations. These studies find that the more specific and exacting the criteria for entering a camp (e.g., how exactly the password must be given), the more likely it is that friendly troops will be shot. The more liberal those entry procedures, the more likely it is that an enemy will gain entrance. The Knowledge Matters Campaign criteria look to me to be the most general, and the Reading League ones seem the most specific. That probably means that if you go with one, you will be more likely to reject sound programs, and with the other, weaker programs may sneak through.
My preference would be for districts to appreciate the limitations of these reviews. That doesn’t mean ignoring their information, but considering their claims with the same gimlet eye that should be used with any of the claims made for the commercial programs. Do they just say there is a problem, or do they specifically document their concerns, perhaps like this:
“We did not think programs should include lessons that encouraged this-or-that kind of an activity. We reviewed six grade levels of this program and found it included 420 fluency lessons. It earned a demerit because twice in the second-grade program it encouraged the this-or-that activity.”
That way, a district could decide whether such an inclusion mattered much to them, in terms of both how serious the infraction is and how extensive. It would also be good if the producer of that curriculum weighed in, either to admit they screwed up or to defend their approach. By reporting not just that there was an infraction of the review criteria, but also the extent of the problem, districts would be better able to use the reviews appropriately.
5. Effectiveness versus Potential Effectiveness
Product reviews don’t tell us what works in terms of improving reading achievement. No, they only reveal the degree to which the program designs are consistent with research, standards, or someone’s ideology.
The National Reading Panel reported that fluency instruction in grades 1-4, and with struggling readers in grades 1-9, improved reading achievement. These program reviews all require that programs address fluency, and in some cases they even specify some preferred details about that instruction.
The idea is that a program that includes fluency teaching like the fluency teaching delivered in the studies is going to be advantageous. That is a hope and not a fact, because most core programs have no data showing that their fluency lessons boost reading achievement.
We are aiming for a possibility. The idea is that when research proves that an approach can be effective, we should encourage schools to replicate such instruction in the hopes that they will obtain the same results.
This is nothing like the standards that we have for medical and pharmacological research. They must show that their version of a treatment works; it is not enough to show that they are trying to do something like what has worked elsewhere.
This is an important distinction.
The Bookworms program apparently received low reviews from EdReports, despite having rigorous, refereed research studies showing its actual effectiveness – not that it was designed to look like what was done in the studies, but that its design really did pay off in more student learning.
I’m flabbergasted that EdReports (and the other reviewers) don’t leave themselves an out here: if there is sound, high-quality research showing the effectiveness of a specific program, who cares whether it matches your predictive review criteria? Program A looks like the research; Program B doesn’t look as much like the research, but it is very effective in teaching children.
The review agencies should have a provision saying that they will give an automatic pass to any program with solid direct research support concerning its actual effectiveness.
I would still review such a program. However, my purpose would be to try to figure out how a program that failed to meet my criteria did so well. Perhaps the reviews were sloppy, which might require more rigorous training of reviewers. Another possibility is that the criteria themselves may be the problem. Maybe some of the “non-negotiables” should be a lot more negotiable after all.
6. Usability
I’m a bit thrown by the usability requirements in some of these reviews. I agree with them in one sense. If teachers struggle to use a program then it’s unlikely to be effective no matter what non-negotiables it addresses.
However, I know of no research that can be used as the basis of evaluating usability, so what constitutes it is more of an act of reason than of evidence-alignment. Knowledge Matters Campaign wants programs to include not just what it is that teachers are supposed to do, but explanations for why those things should be done. I love that but have no idea whether that would improve practice.
I think the reason for this emphasis on usability may come from the fidelity evaluations that are now common in instructional research studies. Researchers, to ensure that it is their instruction that is making the difference, do many things to try to guarantee fidelity to their plan. These include teaching the lessons themselves, using video teachers, scripting the lessons, observing their delivery, and so on. That kind of thing makes great sense in a 12-week research study that didn’t include any second-language students or kids below the 40th percentile and was taught only to children whose parents granted approval.
It is a lot harder to argue for especially narrow, prescriptive, non-adjustable approaches – lessons aimed at making certain the teachers don’t screw things up by varying from what’s in the teacher’s guide – in real classrooms. The idea of teaching everyone the same lesson, no matter what they already know, may make sense to some “reading advocates.” Nevertheless, it is a lousy idea for kids and reading achievement.
Many districts, in their selection procedures, require tryouts of new programs – or at least they used to. Some of their teachers try to deliver the lessons for several weeks to see how workable the product may be. This makes a lot more sense to me than the armchair usability criteria in these reviews. Again, districts make a big mistake in ceding their responsibility to these reviews. Some things should be done carefully in house.
What are the big take-aways here?
1. Developing commercial programs for teaching reading is a serious endeavor that can provide valuable support to teachers. However, such program designs are fallible. They require the contributions of dozens, perhaps hundreds, of people whose knowledge and efforts can vary greatly. It is sensible for districts to purchase such programs, and essential that they take great care in doing so, to end up with supports that really help teachers and students succeed.
2. There are benefits to having third-party reviews of these kinds of programs, both by government agencies (e.g., What Works Clearinghouse) and by non-profits that are not commercially entangled with the corporations that sell these programs. These external reviews can warn consumers (the school districts) of egregious problems, and they can push publishers to do better.
3. These kinds of reviews are likely to be most useful when they depend substantially on high quality research – approving programs that have been proven to provide learning advantages to students and encouraging the close alignment of programs with existing research data.
4. Just as school districts need to be skeptical of undocumented claims by commercial companies (the folks who sell the programs), they must be just as skeptical of the claims of those who critique them. The more transparent, specific, and well-documented these critiques are, the better. Districts should be wary of simply accepting any negative judgments by reviewers – requiring evidence that the criteria are truly essential to quality and that research really rejects what a program is doing.
5. Districts should adopt sound procedures for choosing programs. These procedures should include consideration of these reviews. However, no district should adopt policies that automatically accept or reject programs based on these reviews.
Here are links to each of these product reviewing organizations:
https://www.edreports.org/reports/ela
https://knowledgematterscampaign.org/review-tool/#research-compendium
https://www.thereadingleague.org/compass/curriculum-decision-makers/
https://ies.ed.gov/ncee/WWC/Search/Products?productType=2
LISTEN TO MORE: Shanahan On Literacy Podcast
Don't you think that the Knowledge Matters Campaign should also disclose that their parent company, Standards Works, received money from ARC, Core Knowledge, and Open Up Resources for undisclosed reasons? The programs that these companies/organizations produce are all reviewed favorably on the Knowledge Matters website.
This information is freely available online.
Tim, what do you think about the IES review process?
https://ies.ed.gov/pubsearch/pubsinfo.asp?pubid=REL2017219
Matt--
As for those IES criteria, unlike the ones I wrote about, I agree with all of the criteria they include (they have been careful to stick to the research, rather than making up stuff just because they like it). However, I think the list is incomplete, particularly when one considers the role of cognitive strategies and language in comprehension development, nor does it connect reading and writing at any level (spelling and phonics, writing about text and comprehension). If I were director of reading for a school district (again) and wanted to create text selection criteria, I would start with these and then look at the others to see if something defensible was in them that was not here.
tim
Mark--
I wasn't aware of that conflict of interest. At one time I was on the Advisory Board of KMC. I was happy to help folks who were encouraging that we teach kids more content. However, recently they decided to change direction a bit, and it appeared to me that they were out to get certain commercial programs, so I dropped off. Conflicts of interest are a serious problem, and at least some of the made-up stuff in their criteria may have landed there if they were trying to float their own boat.
Of course, I could mention something similar with Reading League. Their standards promote the use of decodables to an extent that far exceeds the conclusions of any scientist who has written about this or the results of any of the published studies. Maybe it is not surprising that they are marketing decodables and draw income from that.
Whether or not there is any real conflict in either of these cases (perhaps the individuals who worked on these criteria were not aware of where the money comes from), the appearance of it should caution school districts about their use. I'm aware of no such conflicts with EdReports. I just hope they improve their process without knuckling under to the pressure groups that have an axe to grind.
tim
Leave me a comment – I would like to have a discussion with you!