GPC Reusable Observable Unified Study Environment (GROUSE)

The GPC Reusable Observable Unified Study Environment (GROUSE) is a unique de-identified data resource created by merging Medicare and Medicaid claims with electronic health records from all 13 GPC sites. It makes it possible to understand all types of care a patient receives, without being restricted to specific health systems.

The GPC CDRN selected three types of conditions: one rare disease (amyotrophic lateral sclerosis), one type of cancer (breast cancer), and one public health problem (healthy vs. unhealthy weight). These conditions are used to (a) quantify the completeness of the health system-derived data repositories and (b) evaluate the distributions of health and care processes for patients within the GPC versus the larger Medicare and Medicaid populations in our region, in order to understand how studies of the GPC population generalize to the broader populations in our states.

Three pre-defined cohorts are currently covered by the GROUSE IRB for using Centers for Medicare and Medicaid Services (CMS) claims for research purposes: breast cancer, amyotrophic lateral sclerosis (ALS), and obesity.

Goals and Objectives

Within the context of GPC and PCORnet, our study focuses on the following overarching goals:

(1) To understand the development, treatment, progression, and consequences of ALS, breast cancer, and healthy vs. unhealthy weight (overweight and obesity).

(2) To evaluate and enhance the quality of data derived from electronic health records (EHR) and claims, including through integration of other third-party data resources.

(3) To examine care disparities at the individual, community, and institution levels (e.g., comparing access to care among Medicare/Medicaid-insured, commercially insured, and uninsured populations) and to evaluate the generalizability of pragmatic interventions.

(4) To evaluate healthcare utilization and the economic impact of multiple acute and chronic conditions, and to identify “hotspotting” areas to better inform policies for lowering costs and patients’ financial burden.

(5) To serve as a greater national resource for understanding the development, treatment, progression, and consequences of acute and chronic diseases treated within the United States healthcare system, in support of quality care for the conditions championed by the Patient Powered Research Networks in PCORnet as well as the other conditions studied by our peer CDRNs.

Data Sources

There are three main sources of data currently included in the GROUSE data repository:

  1. Medicare and Medicaid Claims
    1. Geographical coverage: all 8 main GPC states (KS, MO, IA, WI, NE, MN, TX, UT) and parts of surrounding states
    2. Longitudinal coverage: Medicare (2011 – 2017); Medicaid (2011 – 2012)
  2. Electronic Health Records
  3. Public Datasets on social determinants of health (e.g., American Community Survey)

All three data sources are transformed to conform to the PCORnet Common Data Model (CDM) specifications. The PCORnet CDM is a specification that defines a standard organization and representation of data for the PCORnet Distributed Research Network. GPC has implemented open-source algorithms to transform Medicare and Medicaid research identifiable files into the PCORnet CDM format.

GROUSE Cohorts

Three pre-defined cohorts are currently covered by the GROUSE IRB for using Centers for Medicare and Medicaid Services (CMS) claims for research purposes: breast cancer, amyotrophic lateral sclerosis (ALS), and obesity.

If you have defined clear study objectives and found GROUSE to be a fitting data resource for your research, please follow the access request process. More information on GROUSE is available on the GitHub GROUSE wiki page.