The “All of Us” Research Program
NIH's Plan to Build a Million Person Health Metric Database
Chief Data Officer
National Institutes of Health
About the Lecture
The All of Us Research Program is collecting health data from at least a million people to advance precision medicine research and fuel new insights into human health. The lecture will discuss the ways the program is looking to accelerate medical research and advance health equity through its robust research ecosystem, as well as the opportunities for researchers to begin using the rich dataset.
The platform currently supports more than 4,200 registered researchers with more than 3,600 active projects. Information about each workspace’s research purpose and team is publicly available. De-identified, row-level phenotypic data has been available since May 2020 through the Workbench’s Registered Tier, and a new Controlled Tier includes data from nearly 100,000 whole genome sequences from All of Us participants, alongside other data types.
The Researcher Workbench now includes data from more than 372,000 participants. Inclusion of data from all of society’s diverse ethnic and genetic backgrounds is essential to the program serving its broad goals. Thus far, nearly 80% of participants identify with groups historically underrepresented in medical research, including 45% who identify with a racial or ethnic minority group.
The promise of medical research will only be made possible through a pervasive and unyielding commitment to inclusion that provides an opportunity to study the vast diversity of human experience. By building this health research resource and making it broadly available and accessible to the research community, All of Us is looking to change the paradigm of health research so that it is more effective for everyone.
Selected Reading & Media References
About the Speaker
Andrea Ramirez is the chief data officer of the US National Institutes of Health’s (NIH) “All of Us Research Program.” In addition to her work with All of Us, Andrea maintains a clinical practice in pharmacogenomics, atypical diabetes, and general endocrinology.
Before becoming the Chief Data Officer, she served as senior advisor to the Chief Executive Officer of the All of Us Research Program. Before joining NIH, Andrea was a physician and scientist at Vanderbilt University Medical Center studying the genomics of metabolic disorders and precision diabetes care, where she also led the data science team at the All of Us Data and Research Center.
Andrea’s work has been funded by the Albert Schweitzer Fellowship, the Sarnoff Cardiovascular Research Foundation, and through several NIH awards.
Andrea earned a BS at North Carolina State University and an MD at Duke University. She completed internal medicine and clinical pharmacology training at Vanderbilt University Medical Center and training in endocrinology, diabetes, and metabolism at NIH.
On March 3rd, 2023, from the Powell Auditorium of the Cosmos Club in Washington, D.C., and by Zoom webinar broadcast on the PSW Science YouTube channel, President Larry Millstein called the 2,473rd meeting of the Society to order at 8:05 p.m. ET. He welcomed new members, and the recording secretary read the minutes of the previous meeting.
President Millstein then introduced the speaker for the evening, Andrea Ramirez, Chief Data Officer at the National Institutes of Health. Her lecture was titled “The ‘All of Us’ Research Program.”
Ramirez began by highlighting the importance of observation in medicine and the founding principles of observational cohort studies, particularly the Framingham Heart Study, which was instrumental in understanding the impact of blood pressure, cholesterol, and smoking on heart disease. The speaker emphasized the need for a personalized approach to medicine and the challenges of understanding the genome and drug response.
Ramirez discussed the complexity of implementing genomic medicine in clinical practice, using diabetes as an example. She highlighted the need for more observation and learning to better understand the implications of genomics in diverse populations. The speaker compared the precision of astrophysics in studying outer space to the need for doctors to observe and learn more about their patients with diabetes.
The speaker then reviewed the All of Us Research Program, a longitudinal study aiming to enroll one million or more participants to capture a broad range of common and rare diseases. Ramirez emphasized the core values of the program, including openness, diversity, transparency, and engagement with participants. The program collects a variety of data, including biospecimens, electronic health records, surveys, and mobile device data. The speaker emphasized the importance of returning value to participants through access to their information and prioritizing engagement. The program partners with a variety of healthcare organizations to ensure inclusivity and diversity in recruitment.
Ramirez then announced the upcoming release of a large set of whole genome sequences using a new technology that can identify structural variants. The program is also expanding to include ancillary studies to allow researchers to re-contact participants and ask more specific questions. The program hopes to continue to grow and engage more participants to help researchers answer deeper questions in medical research.
The lecture then turned to the All of Us Research Program's five-year goal structure and its approach to conducting demonstration projects. The five-year goals include scaling enrollment and retention, making data available to researchers, launching ancillary studies, and supporting a diverse global community of researchers. The demonstration projects aim to characterize and validate the cohort for quality, utility, and diversity of data and tools, not to make discoveries. By conducting demonstration projects, researchers can understand how to use the data and identify any quality issues or changes in the data set. Ramirez gave examples of two demonstration projects, one on diversity and the other on pediatric data. She noted that the program is continuing to conduct demonstration projects on phenotype and genotype data.
Next, the question and answer period began. One member asked what demonstration projects are being conducted. The speaker responded that there have been demonstration projects on phenotype data and genotype data, in which the program generated data and made it available to researchers. Another member asked what hypothesis was tested in the example case used in pediatrics. Ramirez explained that pediatric obesity had been changing over time and that researchers were able to confirm their diagnoses.
After the question and answer period, President Millstein thanked the speaker, made the usual housekeeping announcements, and invited guests to join the Society. President Millstein adjourned the meeting at 10:06 p.m.
Temperature in Washington, D.C.: 7° C
Weather: Cloudy with light precipitation
Attendance: Attending in person: 30; concurrent live stream viewers: 20, for a total live viewership of 50. The number of online viewers in the first two weeks of posting was 111.
Cameo Lance, Recording Secretary