HTAC Pipeline · Step 7 of 7

Prevalence Results

The final output: suppressed, stratified prevalence estimates stored as PrevalenceEstimate rows within a StudyRun. Those rows feed an internal operations console, optional maps and charts, governed file extracts, and, where in scope, a read-only programmatic interface.

What a result represents

Each PrevalenceEstimate row is one cell in a multi-dimensional cube: one condition × one health system × one geographic level × one geographic value × one stratifier × one stratifier value. A single study run for 22 conditions across 11 sites, 4 geo levels, and 9 stratifiers can produce tens of thousands of cells.
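As a rough sketch of the arithmetic behind that estimate, the snippet below multiplies the four dimension sizes quoted above and then expands them by an assumed number of populated (geo value, stratum value) pairs per combination. That multiplier is purely hypothetical; the real figure depends on how many geographies and strata each site actually populates.

```python
from math import prod

# Dimension sizes taken from the example in the text.
CONDITIONS = 22
SITES = 11
GEO_LEVELS = 4
STRATIFIERS = 9

# Combinations of (condition, site, geo level, stratifier),
# before expanding into individual geo values and stratum values.
base_combinations = prod([CONDITIONS, SITES, GEO_LEVELS, STRATIFIERS])

# Hypothetical assumption for illustration only: on average, each
# combination yields 5 populated (geo value, stratum value) cells.
ASSUMED_CELLS_PER_COMBINATION = 5

estimated_cells = base_combinations * ASSUMED_CELLS_PER_COMBINATION

print(base_combinations)  # 8712
print(estimated_cells)    # 43560
```

Even a small multiplier on the 8,712 base combinations lands in the tens of thousands, which is why suppression and filtering matter at review time.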

StudyRun organizes everything

A StudyRun groups all estimates from one execution of the pipeline: it records the run date, roster version, study period, which conditions were queried, and the run status (Pending → Running → Complete / Failed). Multiple study runs allow year-over-year trend comparisons.
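The status lifecycle can be sketched as a small state machine. The names `RunStatus`, `ALLOWED_TRANSITIONS`, and `advance` are hypothetical, and treating Complete and Failed as terminal states is an assumption; the page only states the Pending → Running → Complete / Failed order.

```python
from enum import Enum


class RunStatus(Enum):
    PENDING = "Pending"
    RUNNING = "Running"
    COMPLETE = "Complete"
    FAILED = "Failed"


# Assumed transition table: Complete and Failed are treated as terminal.
ALLOWED_TRANSITIONS = {
    RunStatus.PENDING: {RunStatus.RUNNING},
    RunStatus.RUNNING: {RunStatus.COMPLETE, RunStatus.FAILED},
    RunStatus.COMPLETE: set(),
    RunStatus.FAILED: set(),
}


def advance(current: RunStatus, target: RunStatus) -> RunStatus:
    """Return the target status if the move is allowed, else raise ValueError."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"Cannot move from {current.value} to {target.value}")
    return target


status = advance(RunStatus.PENDING, RunStatus.RUNNING)
status = advance(status, RunStatus.COMPLETE)
print(status.value)  # Complete
```

Under this sketch a failed run would be retried as a new StudyRun rather than reset in place, which keeps each run's estimates tied to one execution.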

Data quality reports

Each run can also generate DataQualityReport rows — one metric per site per run date. Metrics include completeness rates, missingness in key fields, and concept mapping coverage. DQ flags (Pass / Warn / Fail) help analysts identify sites with data issues before using their estimates.
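One way a completeness metric could map to a DQ flag is shown below. The function name `dq_flag` and both cut-offs (0.95 and 0.80) are hypothetical; the page does not state the actual thresholds, so treat them as placeholders to be set by the study's data quality policy.

```python
def dq_flag(completeness: float,
            warn_below: float = 0.95,
            fail_below: float = 0.80) -> str:
    """Classify a completeness rate (0.0 to 1.0) as Pass, Warn, or Fail.

    The default cut-offs are illustrative assumptions, not published values.
    """
    if not 0.0 <= completeness <= 1.0:
        raise ValueError("completeness must be between 0.0 and 1.0")
    if completeness < fail_below:
        return "Fail"
    if completeness < warn_below:
        return "Warn"
    return "Pass"


print(dq_flag(0.99))  # Pass
print(dq_flag(0.90))  # Warn
print(dq_flag(0.50))  # Fail
```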

Output: PrevalenceEstimate Record

Field	Role	Description
`study_run`	Link	The run this estimate belongs to
`condition`	Link	The HTAC health condition (e.g., Diabetes)
`health_system`	Link (optional)	Empty when the row is a network-wide aggregate across sites
`geo_level`	Category	state / county / zip / census_tract
`geo_value`	Text (optional)	FIPS code, ZIP, or census tract; empty for state-level rows
`stratifier`	Category	One of nine stratification dimensions (race, language, age group, etc.)
`stratifier_value`	Text	The stratum label (e.g., Black or African American, Somali)
`numerator`	Count (optional)	People with the condition in this stratum; empty when suppressed
`denominator`	Count (optional)	Active people in the study window for this stratum; empty when suppressed
`prevalence_rate`	Rate (optional)	Cases per 10,000 active people; empty when suppressed
`is_suppressed`	Flag	Set when the numerator is below the published threshold; detail fields are cleared
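The count, rate, and suppression fields above can be sketched as follows. `EstimateCell` and `build_cell` are hypothetical names, and the threshold of 11 is taken from the "n < 11" note in the sample output on this page; a real deployment may configure it differently.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed default, based on the "n < 11" note in the sample output.
SUPPRESSION_THRESHOLD = 11


@dataclass(frozen=True)
class EstimateCell:
    numerator: Optional[int]
    denominator: Optional[int]
    prevalence_rate: Optional[float]  # cases per 10,000 active people
    is_suppressed: bool


def build_cell(numerator: int, denominator: int,
               threshold: int = SUPPRESSION_THRESHOLD) -> EstimateCell:
    """Compute the rate per 10,000, clearing detail fields when suppressed."""
    if denominator <= 0 or numerator < 0 or numerator > denominator:
        raise ValueError("expected 0 <= numerator <= denominator, denominator > 0")
    if numerator < threshold:
        # Small cell: clear counts and rate, keep only the flag.
        return EstimateCell(None, None, None, True)
    rate = round(numerator * 10_000 / denominator, 1)
    return EstimateCell(numerator, denominator, rate, False)


print(build_cell(163, 500).prevalence_rate)  # 3260.0
print(build_cell(14, 42).prevalence_rate)    # 3333.3
print(build_cell(5, 40))                     # all detail fields None, is_suppressed=True
```

Clearing the denominator along with the numerator follows the field descriptions above and prevents a reader from back-calculating a small count from the rate.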

Sample Output: What Published and Suppressed Rows Look Like

Condition	Site	Geo	Stratifier	Stratum	Numerator	Denominator	Rate (per 10k)	Suppressed?
Hypertension	Statewide	State	Total	All	163	500	3,260.0	No
Diabetes	Regional Health A	County	Race	Black or African American	14	42	3,333.3	No
HIV	Essentia Health	County	Race	Am. Indian / Alaska Native	—	—	—	Yes (n < 11)

Figures above are illustrative. After the roster and condition definitions are loaded, run the condition-query job for the active study period to materialize estimates in this environment.

How Results Are Accessed

Operations

Staff console

Authorized analysts review, filter, and export estimate tables through the password-protected console (this deployment: /admin/). Access is limited to defined roles.

Integration

Read-only service interface

Downstream systems can pull stratified cells over HTTPS at /api/htac/v1/ with filters for condition, geography, stratifier, and suppression status, subject to the same publication rules as the console.
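A minimal sketch of how a downstream client might assemble a filtered request URL is shown below. Only the base path /api/htac/v1/ comes from this page; the host `https://example.org`, the helper `build_query_url`, and the query parameter names are hypothetical, so check the deployed interface for the actual names. The code builds the URL only and makes no network call.

```python
from typing import Optional
from urllib.parse import urlencode, urljoin

BASE_PATH = "/api/htac/v1/"  # path stated on this page


def build_query_url(host: str, **filters: Optional[str]) -> str:
    """Join host and base path, then append any non-empty filters.

    Filter names are illustrative assumptions, not a documented contract.
    """
    params = {k: v for k, v in sorted(filters.items()) if v is not None}
    url = urljoin(host, BASE_PATH)
    return f"{url}?{urlencode(params)}" if params else url


url = build_query_url(
    "https://example.org",      # placeholder host
    condition="Diabetes",
    geo_level="county",
    stratifier="race",
    is_suppressed="false",
)
print(url)
# https://example.org/api/htac/v1/?condition=Diabetes&geo_level=county&is_suppressed=false&stratifier=race
```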

Reporting

Maps and dashboards

Maps and trend views are usually deployed in a separate reporting layer that reads the same published tables or service endpoints, so public traffic does not sit on operational databases.

Disclosure

Public-use files

Full public-use extracts are released only through the organization’s data-request process, with data use agreements (DUAs) and redistribution terms attached.

Operational snapshot

Total estimate cells: 840

Published: 145

Suppressed: 695

Top conditions by cell count

Condition	Estimate cells
Depression	168
Asthma	168
Hypertension	168
Diabetes	168
Opioid Use Disorder	168
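The snapshot figures on this page can be cross-checked with a few lines of arithmetic. The variable names are illustrative; the numbers are copied from the snapshot and the table above.

```python
# Figures copied from the operational snapshot on this page.
total_cells = 840
published = 145
suppressed = 695

cells_per_condition = {
    "Depression": 168,
    "Asthma": 168,
    "Hypertension": 168,
    "Diabetes": 168,
    "Opioid Use Disorder": 168,
}

# Published and suppressed cells should account for every cell.
assert published + suppressed == total_cells

# In this snapshot the five listed conditions account for all cells.
assert sum(cells_per_condition.values()) == total_cells

suppressed_share = suppressed / total_cells
print(f"{suppressed_share:.1%}")  # 82.7%
```

A suppressed share this high is plausible for a small synthetic population, where many strata fall below the small-cell threshold.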

Study Runs

Pipeline demonstration run 3

Run date: May 15, 2026 | Roster version: May 15, 2026 | Complete

Synthetic federated pipeline demonstration output.

HTAC 2022 Baseline

Run date: March 1, 2023 | Roster version: Jan. 15, 2023 | Pending

Baseline prevalence study covering the 2022 calendar year.

Pipeline Complete

From raw EHR records in OMOP CDM to suppressed, stratified prevalence estimates — spanning the conditions, sites, stratifiers, and geography levels configured in your study run.

From Demo to Deployment

This demonstration environment illustrates the full federated pipeline in a sandbox context. A production deployment for a health system consortium or public health agency would layer in: a governance structure with a steering committee and scientific review process; data use agreements with each contributing health system and administrative data provider; a privacy-preserving record linkage process using tokens generated at each contributing site; validated condition codesets reviewed by clinical and epidemiological staff; a data quality monitoring process running on a quarterly or annual cadence; and a publication process that governs who sees what results and under what conditions.

Building that production layer is the work this practice supports.

Latest pipeline demonstration (completed May 14, 2026 20:05). Counts on this page reflect synthetic federated data from that run.
