Summary
This paper describes the data protection framework and risk mitigation strategies employed by the All of Us Research Program to enable access to individual-level health data whilst safeguarding participant privacy. Applying an adversarial model to 329,084 participants, the authors confirmed that systematic data transformations (geographic generalisation, event suppression, date randomisation) kept re-identification risk below a federally accepted threshold. The analysis revealed disparate risk profiles across demographic groups, highlighting the need for multi-pronged protection that includes authentication, monitoring, and enforcement mechanisms.
UK applicability
The findings are relevant to UK health research infrastructure and data governance frameworks, such as UK Biobank and NHS Digital initiatives, particularly for aligning privacy protection standards and re-identification risk assessment methods with international best practice.
Key measures
Re-identification risk (95th percentile across all participants, with variation by race, ethnicity, and gender); compliance with a risk threshold of ≤0.09
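As a rough illustration of the measure above, the sketch below computes a 95th-percentile risk per group and checks it against the 0.09 threshold. It is not the authors' method: the nearest-rank percentile rule, the function names, the group labels, and the risk values are all hypothetical and chosen only to make the arithmetic visible.

```python
import math
from collections import defaultdict

THRESHOLD = 0.09  # risk threshold quoted in this record


def percentile_nearest_rank(values, pct):
    """Return the pct-th percentile using the nearest-rank rule (an assumption)."""
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def risk_summary(records, pct=95):
    """Map each group (plus 'all') to (percentile risk, within-threshold flag)."""
    by_group = defaultdict(list)
    for group, risk in records:
        by_group[group].append(risk)
        by_group["all"].append(risk)
    summary = {}
    for group, risks in by_group.items():
        value = percentile_nearest_rank(risks, pct)
        summary[group] = (value, value <= THRESHOLD)
    return summary


# Hypothetical per-participant risk estimates, not data from the study.
records = [
    ("group_a", 0.01),
    ("group_a", 0.02),
    ("group_b", 0.05),
    ("group_b", 0.12),
]
summary = risk_summary(records)
for group, (value, ok) in sorted(summary.items()):
    print(f"{group}: p95={value:.2f} within_threshold={ok}")
```

Reporting the percentile per group as well as overall is what lets a disparity show up: in this toy input one group passes the threshold while the pooled figure does not.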
Outcomes reported
The study assessed re-identification risk for 329,084 participants in the All of Us Research Program using an adversarial model, confirming that expected risk did not exceed 0.09 (a federally accepted threshold). It examined how risk varied across participant demographics and described the multi-pronged data protection strategy employed.