Achievement Gap

  • Where Do Achievement Gaps Come From?

    Written on August 10, 2017

    For almost two decades now, educational accountability policy in the U.S. has included a focus on the performance of student subgroups, such as those defined by race and ethnicity, income, or special education status. The (very sensible) logic behind this focus is the simple fact that aggregate performance measures, whether at the state, district, or school level, often mask large gaps between subgroups.

    Yet one of the unintended consequences of this subgroup focus has been confusion among both policymakers and the public as to how to interpret and use subgroup indicators in formal school accountability systems, particularly when those indicators are expressed as simple “achievement gaps” or “gap closing” measures. This is not only because achievement gaps can narrow for undesirable reasons and widen for desirable reasons, but also because many gaps exist prior to entry into the school (or district). If, for instance, a large Hispanic/White achievement gap for a given cohort exists at the start of kindergarten, it is misleading and potentially damaging to hold a school accountable for the persistence of that gap in later grades – particularly in cases where public policy has failed to provide the extra resources and supports that might help lower-performing students make accelerated achievement gains every year. In addition, the coarseness of current educational variables, particularly those usually used as income proxies, limits the detail and utility of some subgroup measures.

    A helpful and timely little analysis by David Figlio and Krzysztof Karbownik, published by the Brookings Institution, addresses some of these issues, and the findings have clear policy implications.
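
    To make concrete the point above that gaps can narrow for undesirable reasons and widen for desirable ones, here is a minimal sketch. All scores are invented for illustration and are not drawn from any study; the `gap` helper and the scenario labels are hypothetical names.

```python
# Illustrative only: every number below is invented.
def gap(higher_group_mean, lower_group_mean):
    """Simple achievement gap: difference in average scores between two subgroups."""
    return higher_group_mean - lower_group_mean

# Each scenario holds (higher-scoring group mean, lower-scoring group mean).
scenarios = {
    # The higher-scoring group declines while the lower-scoring group is flat.
    "narrows, undesirable": {"before": (70, 50), "after": (64, 50)},
    # Both groups improve, but the higher-scoring group improves more.
    "widens, desirable": {"before": (70, 50), "after": (80, 56)},
}

for label, s in scenarios.items():
    change = gap(*s["after"]) - gap(*s["before"])
    print(f"{label}: gap change = {change:+d}")
# narrows, undesirable: gap change = -6
# widens, desirable: gap change = +4
```

    In the first scenario a "gap closing" measure would register improvement even though no group scored higher; in the second it would register failure even though both groups did.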

  • Improving Accountability Measurement Under ESSA

    Written on May 25, 2017

    Despite the recent repeal of federal guidelines for states’ compliance with the Every Student Succeeds Act (ESSA), states are steadily submitting their proposals, and they are rightfully receiving some attention. The policies in these proposals will have far-reaching consequences for the future of school accountability (among many other types of policies), as well as, of course, for educators and students in U.S. public schools.

    There are plenty of positive signs in these proposals, which are indicative of progress in the role of proper measurement in school accountability policy. It is important to recognize this progress, but impossible not to see that ESSA perpetuates long-standing measurement problems that were institutionalized under No Child Left Behind (NCLB). These issues, particularly the ongoing failure to distinguish between student and school performance, continue to dominate accountability policy to this day. Part of the confusion stems from the fact that school and student performance are not independent of each other. For example, a test score, by itself, gauges student performance, but it also reflects, at least in part, school effectiveness (i.e., the score might have been higher or lower had the student attended a different school).

    Both student and school performance measures have an important role to play in accountability, but distinguishing between them is crucial. States’ ESSA proposals make the distinction in some respects but not in others. The result may end up being accountability systems that, while better than those under NCLB, are still severely hampered by improper inference and misaligned incentives. Let’s take a look at some of the key areas where we find these issues manifested.

  • New Evidence On Teaching Quality And The Achievement Gap

    Written on November 17, 2016

    It is an extensively documented fact that low-income students score more poorly on standardized tests than do their higher income peers. This so-called “achievement gap” has persisted for generations and is still one of the most significant challenges confronting the American educational system.

    Some people tend to overstate -- while others tend to understate -- the degree to which this gap is attributable to differences in teacher (and school) effectiveness between lower and higher income students (with income usually defined in terms of students’ eligibility for subsidized lunch assistance). As discussed below, the evidence thus far suggests that lower income students are more likely than higher income students to have less “effective” teachers (with effectiveness defined in terms of the ability to help raise student test scores, or value-added), although the magnitude of these discrepancies varies by study. There are also some compelling theories as to the possible mechanisms behind these (often modest) discrepancies, most notably the fact that schools in low-income neighborhoods tend to have fewer resources, as well as more trouble recruiting and retaining highly qualified, experienced teachers.

    The Mathematica Policy Research organization recently released a very large, very important study that addresses these issues directly. It focuses on shedding additional light on the magnitude of any measurable differences in access to effective teaching among students of different incomes (the “Effective Teaching Gap”), as well as the way in which hiring, mobility, and retention might contribute to these gaps. The analysis uses data on teachers in grades 4-8 or 6-8 (depending on data availability) over five years (2008-09 to 2012-13) in 26 districts across the nation.

  • Do Subgroup Accountability Measures Affect School Ratings Systems?

    Written on October 28, 2016

    The school accountability provisions of No Child Left Behind (NCLB) institutionalized a focus on the (test-based) performance of student subgroups, such as English language learners, racial and ethnic groups, and students eligible for free- and reduced-price lunch (FRL). The idea was to shine a spotlight on achievement gaps in the U.S., and to hold schools accountable for serving all students.

    This was a laudable goal, and disaggregating data by student subgroups is a wise policy, as there is much to learn from such comparisons. Unfortunately, however, NCLB also institutionalized the poor measurement of school performance, and so-called subgroup accountability was not immune. The problem, which we’ve discussed here many times, is that test-based accountability systems in the U.S. tend to interpret how highly students score as a measure of school performance, when it is largely a function of factors out of schools’ control, such as student background. In other words, schools (or subgroups of their students) may exhibit higher average scores or proficiency rates simply because their students entered the schools at higher levels, regardless of how effective the school may be in raising scores. Although NCLB’s successor, the Every Student Succeeds Act (ESSA), perpetuates many of these misinterpretations, it still represents some limited progress, as it encourages greater reliance on growth-based measures, which look at how quickly students progress while they attend a school, rather than how highly they score in any given year (see here for more on this).

    Yet this evolution, slow though it may be, presents a somewhat unique challenge for the inclusion of subgroup-based measures in formal school accountability systems. That is, if we stipulate that growth model estimates are the best available test-based way to measure school (rather than student) performance, how should accountability systems apply these models to traditionally lower scoring student subgroups?
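
    The distinction drawn above between how highly students score and how quickly they progress can be sketched with a toy calculation. This is only an illustration under loud assumptions: the schools and scores are invented, and a simple average gain stands in for a growth measure. Actual growth models are considerably more involved.

```python
from statistics import mean

# Hypothetical (entry_score, exit_score) pairs per student; all numbers invented.
schools = {
    "School A": [(80, 84), (75, 78), (85, 88)],  # higher-scoring intake, small gains
    "School B": [(50, 60), (45, 57), (55, 64)],  # lower-scoring intake, large gains
}

def status(students):
    """Average exit score: how highly students score in a given year."""
    return mean(exit_score for _, exit_score in students)

def growth(students):
    """Average gain from entry to exit: a crude stand-in for a growth measure."""
    return mean(exit_score - entry_score for entry_score, exit_score in students)

for name, students in schools.items():
    print(f"{name}: status = {status(students):.1f}, growth = {growth(students):.1f}")
```

    School A looks stronger on the status measure, largely because of where its students started, while School B shows larger gains. Which of these a rating system rewards depends on which measure it uses.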

  • An Alternative Income Measure Using Administrative Education Data

    Written on September 16, 2016

    The relationship between family background and educational outcomes is well documented and the topic, rightfully, of endless debate and discussion. A student’s background is most often measured in terms of family income (even though it is actually the factors associated with income, such as health, early childhood education, etc., that are the direct causal agents).

    Most education analyses rely on a single income/poverty indicator – i.e., whether or not students are eligible for federally-subsidized lunch (free/reduced-price lunch, or FRL). For instance, income-based achievement gaps are calculated by comparing test scores between students who are eligible for FRL and those who are not, while multivariate models almost always use FRL eligibility as a control variable. Similarly, schools and districts with relatively high FRL eligibility rates are characterized as “high poverty.” The primary advantages of FRL status are that it is simple and collected by virtually every school district in the nation (collecting actual income would not be feasible). Yet it is also a notoriously crude and noisy indicator. In addition to the fact that FRL eligibility is often called “poverty” even though the cutoff is by design 85 percent higher than the federal poverty line, FRL rates, like proficiency rates, mask a great deal of heterogeneity. Families of two students who are FRL eligible can have quite different incomes, as could two families of students who are not eligible. As a result, FRL-based estimates such as achievement gaps might differ quite a bit from those calculated using actual family income from surveys.

    A new working paper by Michigan researchers Katherine Michelmore and Susan Dynarski presents a very clever means of obtaining a more accurate income/poverty proxy using the same administrative data that states and districts have been collecting for years.
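
    The heterogeneity that a binary FRL indicator masks, described above, can be illustrated with a short sketch. It does not implement the working paper's approach. The poverty line figure, incomes, and scores are invented, and `frl_eligible` is a hypothetical helper that applies the cutoff of 85 percent above the poverty line mentioned earlier.

```python
from statistics import mean

POVERTY_LINE = 24_000               # hypothetical figure for one family size
FRL_CUTOFF = 1.85 * POVERTY_LINE    # cutoff set 85 percent above the poverty line

# (family_income, test_score) per student; all numbers invented.
students = [
    (10_000, 40), (15_000, 45), (40_000, 60), (43_000, 62),  # FRL eligible
    (50_000, 64), (150_000, 85),                             # not eligible
]

def frl_eligible(income):
    """Binary indicator: True if family income is at or below the FRL cutoff."""
    return income <= FRL_CUTOFF

eligible = [(inc, score) for inc, score in students if frl_eligible(inc)]
not_eligible = [(inc, score) for inc, score in students if not frl_eligible(inc)]

# FRL-based achievement gap: difference in average scores between the two groups.
frl_gap = mean(s for _, s in not_eligible) - mean(s for _, s in eligible)

# Income range hidden inside each category of the binary indicator.
eligible_range = max(i for i, _ in eligible) - min(i for i, _ in eligible)
not_eligible_range = max(i for i, _ in not_eligible) - min(i for i, _ in not_eligible)

print(f"FRL gap: {frl_gap}")
print(f"Income range among eligible students: {eligible_range}")
print(f"Income range among non-eligible students: {not_eligible_range}")
```

    In this toy data, a family earning 10,000 and one earning 43,000 fall into the same category, as do families earning 50,000 and 150,000, so a single gap figure says little about how outcomes vary with income inside each group.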

  • The Role Of Teacher Diversity In Improving The Academic Performance Of Students Of Color

    Written on October 14, 2015

    Last month, the Albert Shanker Institute released a report on the state of teacher diversity, which garnered a fair amount of press attention – see here, here, here, and here. (For a copy of the full report, see here.) This is the second of three posts, which are all drawn from a research review published in the report. The first post can be found here. Together, they help to explain why diversity in the teaching force – or lack thereof – should be a major concern.

    It has long been argued that there is a particular social and emotional benefit to children of color, and especially those children from high-poverty neighborhoods, from knowing—and being known and recognized by—people who look like themselves who are successful and in positions of authority. But there is also a growing body of evidence to suggest that students derive concrete academic benefits from having access to demographically similar teachers.

    For example, in one important study, Stanford professor Thomas Dee reanalyzed test score data from Tennessee’s Project STAR class size experiment, still one of the largest U.S. studies to employ the random assignment of students and teachers. Dee found that a one-year same-race pairing of students and teachers significantly increased the math and reading test scores of both Black and White students by roughly 3 to 4 percentile points. These effects were even stronger for poor Black students in racially segregated schools (Dee, 2004).

  • The Persistence Of School And Residential Segregation

    Written on April 24, 2015

    School segregation is a frequent topic of discussion in U.S. education policy debates, and rightfully so (Orfield et al. 2014). The segregation of schools by race, ethnicity and income both reflects and perpetuates inequitable opportunities in the U.S. (e.g., Reardon and Bischoff 2011a; Kaufman and Rosenbaum 1992).

    Needless to say, school segregation, within and between districts, is primarily a function of residential segregation – the spatial isolation of individuals and families by characteristics such as race, ethnicity, income, language, education, etc. There are several different ways to measure segregation, since it can be gauged by different traits (e.g., income, ethnicity), and at different levels – e.g., state, county, city, neighborhood, etc. The choices of variables can have a substantial impact on the conclusions one draws about segregation's levels and trends (Reardon and Owens 2014). One generalization, though, is in order: In the U.S., we have tended to gravitate toward “our own kind,” whether in terms of income or race and ethnicity. This disquieting reality is neither accidental nor mostly the result of individual preferences. In addition to the obvious historical causes (e.g., Jim Crow), segregation arises and is perpetuated by a complex mix of (often institutionalized) factors, such as the spatial patterning of housing costs, density zoning, “steering,” “redlining,” overt discrimination, etc. (e.g., Ondrich et al. 2002). And, finally, there is the stark fact that the nation's poor have very few choices in terms of housing and neighborhood, and many of those choices they do have are bad ones.

    That said, it bears keeping in mind that the majority of families and individuals in America do indeed have the means to make meaningful choices about where and how they live, and even those who desire to live in an integrated neighborhood also weigh many other, meaningful factors – such as housing costs, convenience to stores and transportation, crime rates, schooling options, and so on. There is some evidence of progress in residential (e.g., Ellen et al. 2012) and school integration (e.g., Stroub and Richards 2013) by race and ethnicity, but increasing segregation by income (e.g., Reardon and Bischoff 2011b). Nevertheless, on the whole, integration tends to be unstable, while segregation tends to be more persistent.

  • The Big Story About Gender Gaps In Test Scores

    Written on March 19, 2015

    The OECD recently published a report about differences in test scores between boys and girls on the Programme for International Student Assessment (PISA), which is a test of 15-year-olds conducted every three years in multiple subjects. The main summary finding is that, in most nations, girls are significantly less likely than boys to score below the “proficient” threshold in all three subjects (math, reading and science). The report also includes survey items and other outcomes.

    First, it is interesting to me how discussions of these gender gaps differ from those about gaps between income or ethnicity groups. Specifically, when we talk about gender gaps, we interpret them properly – as gaps in measured performance between groups of students. Any discussion of gaps between groups defined in terms of income or ethnicity, on the other hand, is almost always framed in terms of school performance.

    This is partially because schools in the U.S. are segregated by income and ethnicity, but not really by gender, and also because some folks have a tendency to overestimate the degree to which income- and ethnicity-based achievement gaps stem from systematic variation in schooling inputs, whereas in reality they are more a function of non-school factors (though, of course, schools matter, and differences in school quality reinforce the non-school-based impact). That said, returning to the findings of this report, I was slightly concerned with how, in some cases, they were reported in the media.

  • Rethinking The Use Of Simple Achievement Gap Measures In School Accountability Systems

    Written on November 17, 2014

    So-called achievement gaps – the differences in average test performance among student subgroups, usually defined in terms of ethnicity or income – are important measures. They demonstrate persistent inequality of educational outcomes and economic opportunities between different members of our society.

    So long as these gaps remain, it means that historically lower-performing subgroups (e.g., low-income students or ethnic minorities) are less likely to gain access to higher education, good jobs, and political voice. We should monitor these gaps; try to identify all the factors that affect them, for good and for ill; and endeavor to narrow them using every appropriate policy lever – both inside and outside of the educational system.

    Achievement gaps have also, however, taken on a very different role over the past 10 or so years. The sizes of gaps, and extent of “gap closing,” are routinely used by reporters and advocates to judge the performance of schools, school districts, and states. In addition, gaps and gap trends are employed directly in formal accountability systems (e.g., states’ school grading systems), in which they are conceptualized as performance measures.

    Although simple measures of the magnitude of or changes in achievement gaps are potentially very useful in several different contexts, they are poor gauges of school performance, and shouldn’t be the basis for high-stakes rewards and punishments in any accountability system.

  • The Global Relationship Between Classroom Content And Unequal Educational Outcomes

    Written on July 29, 2014

    Our guest author today is William Schmidt, a University Distinguished Professor and co-director of the Education Policy Center at Michigan State University. He is also a member of the Shanker Institute board of directors.

    It is no secret that disadvantaged students are more likely to struggle in school. For decades now, public policy has focused on how to reduce the achievement gap between poorer and more affluent students. Despite numerous reform efforts, these gaps remain virtually unchanged – a fact that is deeply frustrating, and also a little confusing. It would be reasonable to assume that background inequalities would shrink over the years of schooling, but that’s not what we find. At age eighteen, rather, we find differences that are roughly the same size as we see at age six.

    Does this mean that schools can’t effectively address inequality? Certainly not. I devoted a whole book to the subject, Inequality for All, in which I argued that one of the key factors driving inequality in schools is unequal opportunity to learn, or OTL.

    It is very unlikely that students will learn material they are not exposed to, and there is considerable evidence that disadvantaged students are systematically tracked into classrooms with weaker content. Rather than mitigating the effects of poverty, many American schools are exacerbating them.



