All of Us Researcher Workbench Guide

Learn about the All of Us Researcher Workbench, the secure data analysis platform where you can access and analyze the NIH All of Us data.

What kind of tools does the Workbench offer?

Powerful tools in the Researcher Workbench support data analysis and collaboration. These tools include:

  • Shared Workspaces to access, store, and analyze data,
  • Notebooks capable of high-powered queries and analysis using R or Python,
  • A Dataset Builder to search and save collections of health information about cohorts, and
  • A Cohort Builder that allows researchers to create, review, and annotate groups of participant data.

Data curation

The data curation processes are described in the Research Hub and reproduced below:

EHR Data Harmonization

The All of Us Research Program uses the Observational Medical Outcomes Partnership Common Data Model (OMOP CDM) to standardize EHR data for all researchers.
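As a rough illustration of what standardizing to a common data model means in practice, the sketch below maps diagnosis codes from two different source vocabularies onto one standard concept ID and then counts participants with a single query. It is a toy example, not the program's actual curation pipeline: the `SOURCE_TO_STANDARD` mapping, the `harmonize` and `count_persons_with_concept` helpers, and the sample records are all hypothetical, and the concept ID shown should be checked against the OMOP vocabulary before any real use. Only the table name `condition_occurrence` and its three columns follow the OMOP CDM.

```python
import sqlite3

# Hypothetical source-to-standard mapping (illustrative values only).
# Two coding systems describe the same diagnosis; both map to one concept ID.
SOURCE_TO_STANDARD = {
    ("ICD10CM", "E11.9"): 201826,
    ("ICD9CM", "250.00"): 201826,
}


def harmonize(records):
    """Turn (person_id, vocabulary, source_code) records into OMOP-style rows.

    Codes with no known mapping get concept ID 0, the OMOP convention
    for "no matching concept".
    """
    rows = []
    for person_id, vocabulary, code in records:
        concept_id = SOURCE_TO_STANDARD.get((vocabulary, code), 0)
        rows.append((person_id, concept_id, code))
    return rows


def count_persons_with_concept(rows, concept_id):
    """Load rows into an in-memory table and count distinct participants."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE condition_occurrence ("
        "person_id INTEGER, "
        "condition_concept_id INTEGER, "
        "condition_source_value TEXT)"
    )
    conn.executemany("INSERT INTO condition_occurrence VALUES (?, ?, ?)", rows)
    (n,) = conn.execute(
        "SELECT COUNT(DISTINCT person_id) FROM condition_occurrence "
        "WHERE condition_concept_id = ?",
        (concept_id,),
    ).fetchone()
    conn.close()
    return n


if __name__ == "__main__":
    # Made-up records from two sites that use different coding systems.
    records = [
        (1, "ICD10CM", "E11.9"),
        (2, "ICD9CM", "250.00"),
        (3, "ICD10CM", "I10"),  # not in the toy mapping, so it stays unmapped
    ]
    rows = harmonize(records)
    print(count_persons_with_concept(rows, 201826))
```

Because both source codes resolve to the same standard concept, one query finds both participants regardless of which coding system their provider used. The original code is kept in `condition_source_value`, so the source record remains traceable.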

Data Refinements

After harmonizing the EHR data to meet the specifications of the OMOP CDM, we process the data to protect participant privacy. We also conform and clean the data to improve its quality.

Data Dictionary

The All of Us Data Dictionary documents what data are available from participants and what modifications the program makes to protect participant privacy. It describes each data field and notes whether it is a standard OMOP field or a custom field created to capture data unique to the program. The Data Dictionary also indicates whether the data in each field come from participant health records or from information the participants provide themselves, such as survey data. It details some of the ways we clean the data to improve data quality and lists many of the program's custom concept IDs for easy reference. This resource includes version information so you can see what has been changed, added, or removed since the previous curated dataset.

Explore the Registered Tier Data Dictionary and the Controlled Tier Data Dictionary.