The New York Times and Salon have reported that many states, as well as the National Conference of State Legislatures (NCSL), have expressed concern over the accuracy of the census data produced by the new algorithm the Census Bureau developed to protect individuals’ privacy. The census has always added some uncertainty to its data, but a key innovation of this new framework, known as “differential privacy,” is a numerical value describing how much privacy loss a person will experience. That value determines the amount of randomness, or “noise,” that needs to be added to a data set before it is released, and it sets up a balancing act between accuracy and privacy. Too much noise would mean the data would not be accurate enough to be useful in redistricting, in enforcing the Voting Rights Act or in conducting academic research. Too little, and someone’s personal data could be revealed.
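To make the trade-off concrete, here is a minimal sketch of the textbook Laplace mechanism, a standard way of adding differentially private noise to a single count. It is not the Census Bureau's algorithm, which is considerably more elaborate. The function names, the sensitivity of 1, and the epsilon values below are assumptions chosen purely for illustration; epsilon plays the role of the "numerical value" for privacy loss described above.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two independent exponential draws with mean `scale`
    # follows a Laplace distribution centered at 0 with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(true_count, epsilon, rng, sensitivity=1.0):
    # Textbook Laplace mechanism: noise scale = sensitivity / epsilon.
    # Smaller epsilon means less privacy loss but more noise.
    scale = sensitivity / epsilon
    return true_count + laplace_noise(scale, rng)


def mean_abs_error(true_count, epsilon, trials=20000, seed=0):
    # Average distance between the noisy and true counts over many releases.
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += abs(noisy_count(true_count, epsilon, rng) - true_count)
    return total / trials


if __name__ == "__main__":
    # Hypothetical epsilon values, not the bureau's actual settings.
    for epsilon in (0.1, 1.0, 10.0):
        print(f"epsilon={epsilon}: mean abs error ~ {mean_abs_error(1000, epsilon):.2f}")
```

In this sketch the typical error shrinks as epsilon grows, which is the balancing act in miniature: a release with stronger privacy protection is a release with noisier counts.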
The bureau released demonstration data to states to test the new method using data from the 2010 census, and experts quickly realized that the results were very different from the original 2010 numbers, particularly in rural areas. Beyond miscounts, these errors could completely misrepresent entire communities, and state officials worry that even minor errors could have significant long-term consequences.
Various states have indicated that they will no longer have accurate information about their communities. The data distortion might misrepresent a city’s population size by 25% or more or, in the case of an age group, by more than 100%. As a result, data necessary for things like enforcing voting rights, funding schools, planning for emergencies, tracking opioid addiction, and city planning could become too inaccurate to be meaningful.
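One likely reason small places show such large percentage swings is simple arithmetic: an error of a given absolute size is a much larger share of a small count than of a large one. The sketch below illustrates this with made-up numbers. The populations and the assumed error of 50 people are hypothetical and are not taken from the bureau's demonstration data.

```python
def relative_error_pct(absolute_error, true_count):
    # Express an absolute miscount as a percentage of the true count.
    if true_count <= 0:
        raise ValueError("true_count must be positive")
    return 100.0 * abs(absolute_error) / true_count


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    assumed_error = 50
    examples = {
        "state": 3_000_000,
        "city": 20_000,
        "small town": 200,
        "one age group in a small town": 40,
    }
    for label, population in examples.items():
        pct = relative_error_pct(assumed_error, population)
        print(f"{label} ({population:,} people): {pct:.4f}% error")
```

Under these assumed numbers, the same 50-person error is negligible for the state, amounts to a quarter of the small town, and exceeds the entire size of the small age group.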
The data released by the bureau is expected to be accurate on the state level but its sub-state level data — region, county, city, town — will be intentionally distorted.
John Abowd, the associate director for research and methodology and chief scientist of the US Census Bureau, said in a letter to officials in Nevada that the algorithm was “written specifically for the 2020 Census and cannot be directly applied to any other data.” Abowd said that “[T]he Census Bureau is committed to publishing accurate data for the 2020 Census, however our obligation to protect privacy means that we cannot publish perfectly accurate data for every conceivable use case.” He argued that the bureau expects the “impact of the error introduced by the use of formal privacy will be less than the error resulting from other factors.” Abowd said that as the bureau works to improve the algorithm, “we are also researching a variety of contingency plans to ensure that the 2020 Census Data Products meet the Census Bureau’s data quality standards.”