A Note on the Misinterpretation of the US Census Re-identification Attack

  • Conference paper
Privacy in Statistical Databases (PSD 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13463)

Abstract

In 2018, the US Census Bureau designed a new data reconstruction and re-identification attack and tested it against its 2010 data release. The specific attack executed by the Bureau allows an attacker to infer the race and ethnicity of respondents with average 75% precision for 85% of the respondents, assuming that the attacker knows the correct age, sex, and address of the respondents. The Bureau interpreted the attack as exceeding its privacy standards, and so introduced stronger privacy protections for the 2020 Census in the form of the TopDown Algorithm (TDA).

This paper demonstrates that race and ethnicity can be inferred from the TDA-protected census data with substantially better precision and recall, using less prior knowledge: only the respondents’ addresses. Race and ethnicity can be inferred with average 75% precision for 98% of the respondents, and with 100% precision for 11% of the respondents. The inference is done by simply assuming that each respondent’s race/ethnicity is the majority race/ethnicity of that respondent’s census block.

We argue that the conclusion to draw from this simple demonstration is NOT that the Bureau’s data releases lack adequate privacy protections. Indeed, it is the Bureau’s stated purpose of the data releases to allow this kind of inference. The problem, rather, is that the Bureau’s criteria for measuring privacy are flawed and overly pessimistic. There is no compelling evidence that TDA was necessary in the first place.
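The majority-rule inference described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's code: the function name `majority_inference`, the `min_share` threshold, and the toy block counts are all hypothetical, and it assumes one reading of the abstract's figures, namely that "precision" is the fraction of correct guesses among the respondents for whom a guess is made, and that the accompanying percentage is the share of all respondents who receive a guess.

```python
from typing import Dict, Tuple

# Hypothetical input shape: block id -> {race/ethnicity label -> head count}.
BlockCounts = Dict[str, Dict[str, int]]


def majority_inference(blocks: BlockCounts, min_share: float = 0.0) -> Tuple[float, float]:
    """Guess the block's majority group for every resident of a block.

    A guess is only made in blocks where the majority group's share is at
    least `min_share` (an assumed knob, not taken from the paper).
    Returns (precision, coverage):
      precision = correct guesses / guesses made
      coverage  = guesses made / all respondents
    """
    total = 0    # all respondents
    claimed = 0  # respondents for whom a guess is made
    correct = 0  # guesses that match the respondent's actual group
    for counts in blocks.values():
        block_pop = sum(counts.values())
        total += block_pop
        if block_pop == 0:
            continue
        majority = max(counts.values())
        if majority / block_pop >= min_share:
            claimed += block_pop
            # Exactly the members of the majority group are guessed correctly.
            correct += majority
    precision = correct / claimed if claimed else 0.0
    coverage = claimed / total if total else 0.0
    return precision, coverage


# Toy data, invented for illustration only.
toy_blocks: BlockCounts = {
    "block_a": {"group_1": 8, "group_2": 2},
    "block_b": {"group_1": 5, "group_2": 5},
    "block_c": {"group_2": 10},
}

# Guess for everyone, then only in fully homogeneous blocks.
precision_all, coverage_all = majority_inference(toy_blocks)
precision_pure, coverage_pure = majority_inference(toy_blocks, min_share=1.0)
print(f"all blocks: precision={precision_all:.2f}, coverage={coverage_all:.2f}")
print(f"homogeneous blocks: precision={precision_pure:.2f}, coverage={coverage_pure:.2f}")
```

Raising `min_share` trades coverage for precision, which is one way to read the abstract's pairing of 75% precision for 98% of respondents with 100% precision for 11% of respondents. The only per-person knowledge the sketch needs is which block a respondent lives in, which corresponds to knowing the address.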

Notes

  1. https://www.census.gov/programs-surveys/decennial-census/decade/2020/planning-management/process/disclosure-avoidance/2020-das-development.html.

  2. An idea borrowed from Ruggles et al. [13].

  3. For instance, https://geocoding.geo.census.gov.

  4. https://www.nhgis.org/privacy-protected-2010-census-demonstration-data, Vintage 2021-06-08.

References

  1. Brennan Center for Justice. Court Rejects Alabama Challenge to Census Plans for Redistricting and Privacy (2020). https://www.brennancenter.org/our-work/analysis-opinion/court-rejects-alabama-challenge-census-plans-redistricting-and-privacy

  2. Brennan Center for Justice. Fair Lines America Foundation vs. US Dept of Commerce (2022). https://www.brennancenter.org/our-work/court-cases/fair-lines-america-foundation-v-us-dept-commerce

  3. Abowd, J.: Staring down the database reconstruction theorem (2019). https://www2.census.gov/programs-surveys/decennial/2020/resources/presentations-publications/2019-02-16-abowd-db-reconstruction.pdf

  4. Abowd, J.: Second declaration of John M. Abowd, Fair Lines versus US Dept. of Commerce, appendix B - 2010 reconstruction-abetted re-identification simulated attack (2021). https://www2.census.gov/about/policies/foia/records/disclosure-avoidance/appendix-b-summary-of-simulated-reconstruction-abetted-re-identification-attack.pdf

  5. U.S. Census Bureau: Disclosure avoidance for the 2020 census: an introduction (2021). https://www2.census.gov/library/publications/decennial/2020/2020-census-disclosure-avoidance-handbook.pdf

  6. Garfinkel, S., Abowd, J.M., Martindale, C.: Understanding database reconstruction attacks on public data: these attacks on statistical databases are no longer a theoretical danger. Queue 16(5), 28–53 (2018)

  7. Hauer, M.E., Santos-Lozada, A.R.: Differential privacy in the 2020 census will distort COVID-19 rates. Socius 7, 2378023121994014 (2021)

  8. Kenny, C.T., Kuriwaki, S., McCartan, C., Rosenman, E.T., Simko, T., Imai, K.: The use of differential privacy for census data and its impact on redistricting: the case of the 2020 US census. Sci. Adv. 7(41), eabk3283 (2021)

  9. McKenna, L., Haubach, M.: Legacy techniques and current research in disclosure avoidance at the U.S. Census Bureau (2019). https://www.census.gov/library/working-papers/2019/adrm/legacy-da-techniques.html

  10. Mueller, T., Santos-Lozada, A.R.: The 2020 US census differential privacy method introduces disproportionate error for rural and non-white populations. SocArXiv (2021)

  11. Muralidhar, K.: A re-examination of the Census Bureau reconstruction and reidentification attack. Priv. Stat. Databases (2022)

  12. Ruggles, S., et al.: Implications of differential privacy for Census Bureau data and scientific research. Minnesota Population Center, University of Minnesota, Minneapolis (Working Paper 2018-6) (2018)

  13. Ruggles, S., Van Riper, D.: The role of chance in the Census Bureau database reconstruction experiment. Popul. Res. Policy Rev. 41(3), 781–788 (2021). https://doi.org/10.1007/s11113-021-09674-3

  14. Santos-Lozada, A.R.: Changes in census data will affect our understanding of infant health. Socius 7, 23780231211023640 (2021)

  15. Santos-Lozada, A.R., Howard, J.T., Verdery, A.M.: How differential privacy will affect our understanding of health disparities in the United States. Proc. Natl. Acad. Sci. 117(24), 13405–13412 (2020)

  16. Winkler, R.L., Butler, J.L., Curtis, K.J., Egan-Robertson, D.: Differential privacy and the accuracy of county-level net migration estimates. Popul. Res. Policy Rev. 41, 1–19 (2021)

Author information

Corresponding author

Correspondence to Paul Francis.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Francis, P. (2022). A Note on the Misinterpretation of the US Census Re-identification Attack. In: Domingo-Ferrer, J., Laurent, M. (eds) Privacy in Statistical Databases. PSD 2022. Lecture Notes in Computer Science, vol 13463. Springer, Cham. https://doi.org/10.1007/978-3-031-13945-1_21

  • DOI: https://doi.org/10.1007/978-3-031-13945-1_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-13944-4

  • Online ISBN: 978-3-031-13945-1

  • eBook Packages: Computer Science, Computer Science (R0)
