The Geography of 2016 Revisited
Why did Trump win in 2016? What are the correlates of Trump's victory? We've looked at this question before. Since then we discovered a good way to visually represent linear models that allows us to make apples-to-apples comparisons between conditioners. It is perhaps time to return to the problem with sharper tools. As argued in the original piece, people tend to vote for their party, no matter who's on the ticket. It's therefore nonsense to analyze counties that voted for Trump; the best predictor of that, as always, is counties that vote Republican. What we must ask instead is: what are the properties of the counties that swung towards Trump? That is, what is the relationship between the characteristics of counties and the difference in the Republican share of the vote between 2012 and 2016?
We define the Trump swing at the county level as the change in the Republican share between the last two Presidential elections. Then we ask a straightforward question: How do counties that swung towards Trump differ from counties that swung away from him?
We obtain socioeconomic data from the USDA and electoral data from the MIT lab. Of the 3,111 counties for which we have data, 2,476 swung towards Trump and only 677 swung away from him. But, get this: the former are home to only 126m residents, while the latter host 220m. The unweighted average swing towards Trump across the sample is +3.6%, while the population-weighted swing is -1.3%. Weighted by electoral college votes, the average Trump swing comes to +3.1%. That's the crux of the Dems' dilemma.
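To see how the unweighted and population-weighted averages can carry opposite signs, here is a minimal sketch. The three counties, their vote shares and their populations are made up for illustration; none of the numbers come from the USDA or MIT data, and the function names are mine.

```python
def county_swing(rep_share_2012, rep_share_2016):
    """Trump swing: change in the Republican vote share, in percentage points."""
    return rep_share_2016 - rep_share_2012


def weighted_mean(values, weights):
    """Average of values, each counted in proportion to its weight."""
    total = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total


# Hypothetical counties: two small ones swing to Trump, one big one swings away.
swings = [
    county_swing(60.0, 66.0),  # +6 points
    county_swing(55.0, 59.0),  # +4 points
    county_swing(40.0, 38.0),  # -2 points
]
populations = [20_000, 50_000, 900_000]

unweighted = sum(swings) / len(swings)            # every county counts once
pop_weighted = weighted_mean(swings, populations)  # every resident counts once

print(f"unweighted: {unweighted:+.2f}, population-weighted: {pop_weighted:+.2f}")
```

Many small counties moving one way and a few large ones moving the other is all it takes for the two averages to disagree.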
Electoral college votes roughly track state population but do not assure equal representation. In big, densely populated states like California, Florida and Texas, one vote represents around 700,000 people. In small states like Vermont, Rhode Island, Delaware and Alaska, one vote represents around 200,000 people. The electoral voice of people in the least populous states counts three-to-four times as much as that of people in the most populous states. The District of Columbia may not have representation in Congress, but its 700,000 residents get 3 electoral college votes. The population of Massachusetts is ten times as large, but it only gets 11. Broadly speaking, the electoral college system is rigged towards sparsely populated states in the interior of the country and against populous states on the coasts.
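The arithmetic for the DC and Massachusetts comparison is easy to check. The sketch below uses only the round figures quoted above (700,000 residents and 3 votes for DC, ten times that population and 11 votes for Massachusetts); the function name is mine.

```python
def residents_per_vote(population, electoral_votes):
    """Number of residents represented by one electoral college vote."""
    return population / electoral_votes


dc = residents_per_vote(700_000, 3)
massachusetts = residents_per_vote(10 * 700_000, 11)

# How many times heavier a DC resident's electoral voice is.
ratio = massachusetts / dc

print(f"DC: {dc:,.0f} per vote; Massachusetts: {massachusetts:,.0f} per vote")
print(f"ratio: {ratio:.2f}")
```

On these round numbers, one DC vote stands for roughly 233,000 residents and one Massachusetts vote for roughly 636,000.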
This is of great political import, for the two parties are now entrenched on opposite ends of this spectrum. The main consequence for Presidential elections is that the Democrats have a virtual lock on the popular vote and the Republicans have a systematic advantage in electoral college votes. It was no coincidence that Hillary won the popular vote by 3m votes; Obama won the popular vote by 5m in 2012 and 9.5m in 2008. The difference, of course, is that Hillary lost the election despite convincingly winning the popular vote. In fact, the only time that the Republican candidate won the popular vote in the past twenty years is 2004, when Bush's Iraq wager briefly looked like it had worked and the economy was expanding. Even then, Bush's margin was about 3m votes, smaller than either of Obama's.
We begin with the unemployment rate. While low in an absolute sense, the unemployment rate in Trump counties is higher.
In fact, the gap persists through the economic cycle. Counties that swung for Trump have a systematically higher unemployment rate than counties that swung away from Trump.
Case and Deaton uncovered the startling fact that, in the US, "all cause" mortality for non-Hispanic Whites stopped falling in the 1990s, and started climbing outright in the 2000s. They traced the reversal of all cause mortality to what they call 'deaths of despair'. The next figure graphs the evolution of four such mortality rates at the national level.
We see that deaths due to interpersonal violence declined through 1980-2015. The pattern of deaths due to alcohol abuse is more complicated but it is clear that the hazard rate for this risk has been stable, at least for a decade. Death due to self-harm is a euphemism for suicide. We can see that the decline in suicides was arrested in 2000 and has since been increasing. But the most striking graph is that for deaths due to drug overdoses. This hazard rate increased seven-fold in 1980-2015.
Do Trump counties have higher rates of overdose deaths? Yes, they do. The difference is not very large. But it is statistically significant.
How important was this factor? In order to get to the bottom of this, we must get a handle on the variables that predict the swings for or against Trump. I ran hundreds of regressions to understand what the correlations are telling us. The basic story is clear from the following "kitchen sink" regression. We define the fixed-effect of a conditioner as the product of its interquartile range and the absolute value of the slope coefficient. The interpretation is straightforward: The fixed-effect, the height of the bars in the graph below, is the predicted effect on the Trump swing if we change the value of the conditioner from the 25th percentile to the 75th percentile. The error bars are confidence intervals for the fixed-effect.
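The bar height defined above is simple to compute. Here is a minimal sketch for a single predictor; the regressions in this piece have several predictors, so this illustrates the scaling rather than the actual estimation. The data are made up, the function names are mine, and I assume the "inclusive" quartile convention.

```python
import statistics


def ols_slope(x, y):
    """Slope of the least-squares line of y on a single predictor x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def interquartile_range(x):
    """75th percentile minus 25th percentile (inclusive method, an assumption)."""
    q1, _median, q3 = statistics.quantiles(x, n=4, method="inclusive")
    return q3 - q1


def iqr_effect(x, y):
    """Predicted change in y when x moves from its 25th to its 75th percentile,
    reported as a magnitude (absolute value of the slope times the IQR)."""
    return interquartile_range(x) * abs(ols_slope(x, y))


# Hypothetical predictor and swing, in percentage points.
predictor = [10.0, 15.0, 20.0, 25.0, 30.0]
swing = [12.0 - 0.5 * v for v in predictor]

print(iqr_effect(predictor, swing))
```

Because every slope is multiplied by the spread of its own predictor, the bars are all in units of the response and can be compared directly, whatever the units of the predictors.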
The kitchen-sink regression is problematic because many of the conditioners we threw in the sink are highly correlated. E.g., net migration is highly correlated with county population growth so that it enters with the wrong sign if we control for the latter. Similarly, high school graduation rates are correlated with college graduation rates, and change in overdose deaths is correlated with the level of overdose deaths in 2015.
The single factor model with overdose as the predictor is significant.
Okay, so what is a good empirical model of 2016? The next figure displays the fixed-effects in our selected model. We throw out variables that are highly correlated with others or are statistically insignificant. That leaves five variables that each have a very interesting interpretation. Note that we weight the regressions with electoral college votes. This is more relevant to 'the causes of 2016'; but the results are not too different in the unweighted OLS. And, as we saw above, the issue is not why people voted for Trump (the population-weighted swing was against him), but why did people, who live in counties that matter most electorally, vote for Trump in higher numbers than they had for Romney?
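For readers who want to see what weighting does to a regression, here is a minimal sketch of weighted least squares with one predictor. It is not the selected five-variable model. The data and weights are invented, and how electoral college votes are assigned to individual counties is not spelled out here, so the `weights` list is purely a placeholder.

```python
def weighted_line(x, y, w):
    """Weighted least-squares fit of y = intercept + slope * x.

    Each observation's squared residual is multiplied by its weight,
    so heavily weighted counties pull the line towards themselves.
    """
    total = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / total
    my = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical counties: college graduation rate (%) and Trump swing (points).
college = [12.0, 18.0, 25.0, 40.0]
swing = [6.0, 4.5, 2.0, -3.0]
weights = [3.0, 3.0, 11.0, 55.0]  # placeholder weights, not real data

ols = weighted_line(college, swing, [1.0] * len(college))  # equal weights = OLS
wls = weighted_line(college, swing, weights)

print(f"unweighted slope: {ols[0]:.3f}, weighted slope: {wls[0]:.3f}")
```

With equal weights the function reduces to ordinary least squares, which is the sense in which the two sets of results can be compared.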
The most important conditioner is college graduation rates. The higher the college graduation rate, the smaller, or even more negative, the Trump swing. The difference in Trump swing between counties with 15 percent and 25 percent residents who finished college (the interquartile range) is 2.75%. The college graduation rate of Trump counties is nearly half that of counties where Romney received a larger vote share than Trump.
Recall the college wage premium. We have an economy that is systematically biased in favor of the educated and the skilled. Trump counties have largely missed out on that prosperity.
Population growth is perhaps even more significant. Trump counties are above all places that are bleeding people. This is clear from both population growth rates and net migration rates. The unweighted average of population growth rates for Trump counties is -0.9%; that for anti-Trump counties (i.e., counties where Romney got a higher share of the vote than Trump) is +6.5%. Weighting by electoral college votes, we obtain -0.5% and +7.0% respectively.
We recover the same result from net migration rates. Trump counties are places that people are literally leaving behind. The average net migration rate for Trump and anti-Trump counties is -1.6% and +3.8% respectively.
Let's go back to our estimates. The effect of higher median income is positive once we control for college graduation rate. This means that, on the margin, it is the relatively better-off counties among the 'left behind' that moved to Trump. This is reminiscent of Eric Hoffer's thesis that it is not the oppressed but those who had status, who feel entitled to it and sense it slipping away, who are prepared to burn it all down and start from scratch.
Interestingly, international migration rate enters with a negative sign; meaning that a higher proportion of foreign born residents predicts a swing away from Trump. Turns out that Americans who live with foreigners kinda like them. Those who don't, who live in towns where they hardly ever see anyone born in another country, find it easier to buy Trump's xenophobic spiel.
I want to understand these overdose death rates better. Let's examine their diachronic pattern. It looks like pro and anti-Trump regions were not so differentiated in the early-2000s.
In order to track the gap more precisely, we use Student's t-test. The next figure shows the t-statistic for the hypothesis that the difference in the rate of overdose deaths in pro-Trump and anti-Trump counties is indistinguishable from zero. Regional polarization in these risks becomes significant in earnest in 2006. We can see a suggestion of the Great Recession in the time variation through 2009-2012. But the cross-sectional gradient is robust from 2006 onward.
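A year-by-year t-statistic of this kind can be sketched as follows. I assume the classic pooled-variance (equal variance) form of Student's two-sample test; whether the figure uses exactly this variant is not stated. The data layout (a dict mapping year to a list of county death rates) and the function names are hypothetical.

```python
import math
import statistics


def student_t(a, b):
    """Two-sample Student t-statistic with pooled variance.

    Tests the null hypothesis that the two groups have the same mean.
    """
    na, nb = len(a), len(b)
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / (na + nb - 2)
    std_err = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.fmean(a) - statistics.fmean(b)) / std_err


def t_by_year(pro, anti):
    """Map each year to the t-statistic of pro-Trump vs anti-Trump county rates."""
    return {year: student_t(pro[year], anti[year]) for year in sorted(pro)}


# Made-up overdose death rates for a handful of counties in two years.
pro_trump = {2000: [4.0, 5.0, 6.0], 2010: [12.0, 14.0, 16.0]}
anti_trump = {2000: [4.0, 5.0, 6.0], 2010: [9.0, 10.0, 11.0]}

for year, t in t_by_year(pro_trump, anti_trump).items():
    print(year, round(t, 2))
```

A statistic near zero means the two groups are indistinguishable in that year; values beyond roughly two in absolute size are conventionally read as significant in large samples.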
The import of these patterns is clear. The social forces behind the Trump phenomenon have been gathering pace for decades, driven by intensifying regional polarization. More precisely, they have been driven by rising despair in regions that have become increasingly peripheralized. The trauma has been evident in the numbers for a while. It would simply not do to deny the facts. The system is not working for a plurality of people, who, if they do not constitute a numerical majority, certainly constitute an effective majority, thanks to the US Constitution.
Postscript. The weights do make a difference. But a minor one. The gradients change neither sign nor order. We omit the error bars with the understanding that all six estimates are significant at 1 percent. Note that, unlike above where it refers to the death rate due to drug overdose, "overdose" in the graph below refers to growth in log death rate due to drug overdose. An added benefit of normalizing the slope coefficients by the interquartile range is thus revealed. We don't have to worry about the effect of the log transform on the gradient. In effect, we are measuring all predictors in half-energy units. What we are comparing is the "middling half" counterfactual: How would an otherwise average county respond if we intervene in the system and flip the predictor from the 25th to the 75th percentile?
Post-postscript. Examining the data with a finer comb reinforces the results. We discretize our response, Trump's vote share less Romney's vote share, into quintiles. The first quintile, Q1, swung away from Trump; the last four, Q2, Q3, Q4, and Q5, swung to Trump. The average swing for the quintile buckets is -4.2%, 1.3%, 3.7%, 6.4%, 11.2% respectively. The average for Q3 is roughly equal to the unweighted average Trump swing across all counties, +3.7%. Recall also that the population-weighted average Trump swing is -1.4% and the electoral college weighted swing is +3.0%. Note also that "Youth" is people aged 16-24 who are neither in school nor working, and "Mortality" refers to the mortality rate for people aged 25-55 years.
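The quintile bucketing can be sketched in a few lines. The ten swings below are invented, the function names are mine, and both the "inclusive" quantile convention and the rule that a value sitting exactly on a cut point goes to the lower bucket are my assumptions.

```python
import bisect
import statistics


def quintile_labels(values):
    """Label each value Q1..Q5 according to the quintile it falls in."""
    cuts = statistics.quantiles(values, n=5, method="inclusive")  # 4 cut points
    return [f"Q{bisect.bisect_left(cuts, v) + 1}" for v in values]


def bucket_means(values, labels):
    """Average of the values within each bucket, keyed by label."""
    buckets = {}
    for value, label in zip(values, labels):
        buckets.setdefault(label, []).append(value)
    return {label: statistics.fmean(vs) for label, vs in sorted(buckets.items())}


# Hypothetical county swings, in percentage points.
swings = [-5.0, -3.0, 0.5, 2.0, 3.0, 4.5, 6.0, 7.0, 10.0, 12.5]
labels = quintile_labels(swings)

for label, mean in bucket_means(swings, labels).items():
    print(label, round(mean, 2))
```

Averaging any other county characteristic within the same buckets gives the kind of Q1 to Q5 comparison described above.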
Post-post-postscript. You know you are writing an unfinished piece when you have to write a post-post-postscript. I'm digging deeper into overdose deaths. The growth in this category of 'deaths of despair' has been extraordinary. The number of people dying due to drug abuse has increased seven-fold since 1980, with most of the rise concentrated in the late-1990s and the 2000s.
All states have participated in the movement. Some more than others. The next figure shows overdose death rates across the states. Rates are highest in West Virginia, New Mexico, Kentucky, and lowest in the Dakotas and Nebraska.
Level variables are often confounded by static cross-sectional differences. So it is often better to look at growth rates. The next figure shows the change in log of overdose death rates. New York and California have been the least affected by the epidemic. West Virginia stands out again. But Kentucky, New Hampshire, Oklahoma, Massachusetts, Indiana, and Ohio are not far behind. Up until 1995, New York and California had higher rates of drug deaths than these states.
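The "change in log" measure is worth spelling out, since it is what removes the static level differences. A minimal sketch follows; the state rates are invented, and the only figure taken from the text is the seven-fold national increase.

```python
import math


def log_growth(rate_start, rate_end):
    """Change in the log of a rate between two dates.

    Depends only on the ratio rate_end / rate_start, not on the level.
    """
    return math.log(rate_end) - math.log(rate_start)


# Two hypothetical states with very different levels but the same doubling.
low_level_state = log_growth(2.0, 4.0)
high_level_state = log_growth(10.0, 20.0)

# The seven-fold national increase quoted in the text, in log terms.
national = log_growth(1.0, 7.0)

print(round(low_level_state, 3), round(high_level_state, 3), round(national, 3))
```

Both hypothetical states show the same log growth of about 0.69 despite a five-fold difference in levels, and a seven-fold rise corresponds to a log change of about 1.95.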
Growth in drug deaths is strongly correlated with the Trump swing.
The fixed-effect of drug deaths is larger than previously reported. In an unweighted regression, it is almost as great as population growth (which measures whether young people are voting against these counties with their feet).
But once we weight by electoral college votes, overdose deaths emerge as the dominant variable. The drug epidemic thus interacts with the electoral college system in the path to 2016. The empirical gradients displayed in the next figure must be central to any credible diagnosis of 2016.