Using big data to look at life expectancy

Apr 12, 2016, 1:45 PM EDT
(Source: Tambako The Jaguar/flickr)

The more that scientific researchers turn to big data to understand patterns of all kinds, the more of its potential is revealed. Take the recent study on income and life expectancy in the U.S. (The Association Between Income and Life Expectancy in the United States, 2001-2014), published in The Journal of the American Medical Association on Sunday, which uses huge swaths of data over a period of 15 years to shed light on a correlation its authors say is well-established but poorly understood. The results, confined to the U.S., show that the average life expectancy of the lowest-income classes in America is now equal to that in Sudan or Pakistan.

The study took income data for the U.S. population from 1.4 billion deidentified tax records between 1999 and 2014, and mortality data from Social Security Administration death records. It noted that the data was "used to estimate race- and ethnicity-adjusted life expectancy at 40 years of age by household income percentile, sex, and geographic area, and to evaluate factors associated with differences in life expectancy."

Conducted as part of the Health Inequality Project, the study found that the richest men in the U.S. now outlive the poorest by 15 years (equal to the rich/poor life expectancy divide in Sudan) and that the richest women outlive the poorest women by a decade. David Cutler, one of the Harvard-based study’s authors, who is the Otto Eckstein Professor of Applied Economics and a professor at the Harvard Kennedy School and the Harvard T.H. Chan School of Public Health, said there are two goals in publishing these figures:

"One is to present this data, but the other is to create this data set so it can then be used by policymakers and researchers everywhere. This data has never been looked at with this level of granularity before."

That granularity sheds light on exactly why life expectancies differ by state or region, confirming at a new level of detail that mortality rates vary with income. While this is not the first study to establish the relationship between life expectancy and income, the researchers were able to compare the life expectancies of people in different regions with the same jobs to better understand how income levels correlate with mortality rates.

They also found no plateau at either end of the income scale: higher income was associated with greater longevity, and lower income with lower survival. Cutler ruled out elements such as insurance coverage or unemployment as factors that contribute to these huge differences in life expectancy:

"It’s not an overwhelming correlation with medical care or insurance coverage. It’s not that the labor market is getting better — it’s not correlated with unemployment, or the expansion or contraction of the labor force, or how socially connected people feel. The only thing it seems to be correlated with is how educated and affluent the area is, so low-income people live longer in New York or San Francisco, and they live shorter in the industrial Midwest.”

The timing is propitious: Tuesday is Equal Pay Day in the U.S. While there is no telling exactly how these results will factor into future policy, the study itself reveals the potential to gain highly granular insight into critical cultural, economic and anthropological patterns in the U.S.