HTAC Pipeline · Step 7 of 7
Prevalence Results
The final output: suppressed, stratified prevalence estimates stored as PrevalenceEstimate rows within a StudyRun. Those rows feed an internal operations console, optional maps and charts, governed file extracts, and—where included in scope—a read-only programmatic interface.
What a result represents
Each PrevalenceEstimate row is one cell in a multi-dimensional cube: one condition × one health system × one geographic level × one geographic value × one stratifier × one stratifier value. A single study run for 22 conditions across 11 sites, 4 geo levels, and 9 stratifiers can produce tens of thousands of cells.
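To make the "tens of thousands" figure concrete, the arithmetic can be sketched as below. The four dimension sizes come from the sentence above; the average number of geographic values and stratum labels passed to `estimated_cells` are hypothetical inputs, not measured values, and the result is an upper bound because not every combination is populated.

```python
import math

# Dimension sizes quoted in the text above.
dimensions = {
    "conditions": 22,
    "health_systems": 11,
    "geo_levels": 4,
    "stratifiers": 9,
}

# Distinct (condition, site, geo level, stratifier) combinations,
# before multiplying by geographic values and stratum labels.
base_combinations = math.prod(dimensions.values())


def estimated_cells(avg_geo_values: float, avg_strata: float) -> int:
    """Rough upper bound on cell count.

    Both arguments are hypothetical averages: geographic values per geo
    level and stratum labels per stratifier.
    """
    return round(base_combinations * avg_geo_values * avg_strata)


print(base_combinations)          # 8712
print(estimated_cells(1, 5))      # 43560
```

Even with a single geographic value per level and five strata per stratifier, the count already exceeds forty thousand cells.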
StudyRun organizes everything
A StudyRun groups all estimates from one execution of the pipeline: it records the run date, roster version, study period, which conditions were queried, and the run status (Pending → Running → Complete / Failed). Multiple study runs allow year-over-year trend comparisons.
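The status lifecycle can be pictured as a small state machine. This is a minimal sketch, assuming Complete and Failed are terminal states with no retry path; the `RunStatus` enum and `advance` helper are illustrative names, not part of the HTAC codebase.

```python
from enum import Enum


class RunStatus(Enum):
    PENDING = "Pending"
    RUNNING = "Running"
    COMPLETE = "Complete"
    FAILED = "Failed"


# Allowed transitions, following Pending -> Running -> Complete / Failed.
# Assumption: Complete and Failed are terminal.
ALLOWED_TRANSITIONS = {
    RunStatus.PENDING: {RunStatus.RUNNING},
    RunStatus.RUNNING: {RunStatus.COMPLETE, RunStatus.FAILED},
    RunStatus.COMPLETE: set(),
    RunStatus.FAILED: set(),
}


def advance(current: RunStatus, target: RunStatus) -> RunStatus:
    """Return the target status if the transition is allowed, else raise."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"Cannot move from {current.value} to {target.value}")
    return target


status = advance(RunStatus.PENDING, RunStatus.RUNNING)
status = advance(status, RunStatus.COMPLETE)
print(status.value)  # Complete
```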
Data quality reports
Each run can also generate DataQualityReport rows — one metric per site per run date. Metrics include completeness rates, missingness in key fields, and concept mapping coverage. DQ flags (Pass / Warn / Fail) help analysts identify sites with data issues before using their estimates.
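As an illustration of how a metric might map to a flag, the sketch below classifies a completeness rate. The cut-offs (0.95 and 0.80) and the function name `dq_flag` are assumptions chosen for the example; the actual thresholds are not stated on this page.

```python
def dq_flag(completeness: float,
            warn_below: float = 0.95,
            fail_below: float = 0.80) -> str:
    """Map a completeness rate in [0, 1] to Pass / Warn / Fail.

    The default thresholds are hypothetical, not the pipeline's values.
    """
    if not 0.0 <= completeness <= 1.0:
        raise ValueError("completeness must be between 0 and 1")
    if completeness < fail_below:
        return "Fail"
    if completeness < warn_below:
        return "Warn"
    return "Pass"


for rate in (0.99, 0.90, 0.50):
    print(rate, dq_flag(rate))
```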
Output: PrevalenceEstimate Record
| Field | Role | Description |
|---|---|---|
| study_run | Link | The run this estimate belongs to |
| condition | Link | The HTAC health condition (e.g., Diabetes) |
| health_system | Link (optional) | Empty when the row is a network-wide aggregate across sites |
| geo_level | Category | state / county / zip / census_tract |
| geo_value | Text (optional) | FIPS code, ZIP, or census tract; empty for state-level rows |
| stratifier | Category | One of nine stratification dimensions (race, language, age group, etc.) |
| stratifier_value | Text | The stratum label (e.g., Black or African American, Somali) |
| numerator | Count (optional) | People with the condition in this stratum; empty when suppressed |
| denominator | Count (optional) | Active people in the study window for this stratum; empty when suppressed |
| prevalence_rate | Rate (optional) | Cases per 10,000 active people; empty when suppressed |
| is_suppressed | Flag | Set when the numerator is below the published threshold; detail fields are cleared |
Sample Output — What a Published Row Looks Like
| Condition | Site | Geo | Stratifier | Stratum | Numerator | Denominator | Rate (per 10k) | Suppressed? |
|---|---|---|---|---|---|---|---|---|
| Hypertension | Statewide | State | Total | All | 163 | 500 | 3,260.0 | No |
| Diabetes | Regional Health A | County | Race | Black or African American | 14 | 42 | 3,333.3 | No |
| HIV | Essentia Health | County | Race | Am. Indian / Alaska Native | — | — | — | Yes (n < 11) |
Figures above are illustrative. After the roster and condition definitions are loaded, run the condition-query job for the active study period to materialize estimates in this environment.
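The rate and suppression logic behind the sample rows can be sketched as follows. This is a simplified illustration, not the pipeline's implementation: the threshold of 11 is taken from the "n < 11" note in the sample table, and the `EstimateCell` class and `build_cell` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional

SUPPRESSION_THRESHOLD = 11   # assumption, from "n < 11" in the sample table
RATE_MULTIPLIER = 10_000     # rates are expressed per 10,000 active people


@dataclass(frozen=True)
class EstimateCell:
    numerator: Optional[int]
    denominator: Optional[int]
    prevalence_rate: Optional[float]
    is_suppressed: bool


def build_cell(numerator: int, denominator: int,
               threshold: int = SUPPRESSION_THRESHOLD) -> EstimateCell:
    """Compute the rate, or clear all detail fields for small numerators."""
    if denominator <= 0 or not 0 <= numerator <= denominator:
        raise ValueError("expected 0 <= numerator <= denominator, denominator > 0")
    if numerator < threshold:
        # Suppressed: numerator, denominator, and rate are all cleared.
        return EstimateCell(None, None, None, True)
    rate = round(numerator / denominator * RATE_MULTIPLIER, 1)
    return EstimateCell(numerator, denominator, rate, False)


print(build_cell(163, 500).prevalence_rate)  # 3260.0
print(build_cell(14, 42).prevalence_rate)    # 3333.3
print(build_cell(5, 40).is_suppressed)       # True
```

The first two calls reproduce the Hypertension and Diabetes rates shown in the sample table.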
How Results Are Accessed
Operations
Staff console
Authorized analysts review, filter, and export estimate tables through the password-protected console (this deployment: /admin/). Access is limited to defined roles.
Integration
Read-only service interface
Downstream systems can pull stratified cells over HTTPS at /api/htac/v1/ with filters for condition, geography, stratifier, and suppression status, subject to the same publication rules as the console.
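A downstream client might assemble a filtered request URL along these lines. Only the `/api/htac/v1/` base path comes from this page; the `estimates/` resource, the query parameter names, and the host `example.org` are placeholders assumed for illustration, so check the deployed interface before relying on them.

```python
from urllib.parse import urlencode, urljoin

BASE_PATH = "/api/htac/v1/"  # base path given on this page


def build_estimates_url(host: str, resource: str = "estimates/", **filters) -> str:
    """Build an HTTPS URL with query filters.

    `resource` and the filter names are hypothetical; filters set to
    None are dropped, and the rest are sorted for a stable URL.
    """
    params = {k: v for k, v in sorted(filters.items()) if v is not None}
    url = urljoin(f"https://{host}", BASE_PATH + resource)
    query = urlencode(params)
    return f"{url}?{query}" if query else url


print(build_estimates_url(
    "example.org",
    condition="Diabetes",
    geo_level="county",
    is_suppressed="false",
))
```

The sketch only constructs the URL; authentication and the response format depend on the deployment and are not shown.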
Reporting
Maps and dashboards
Maps and trend views are usually deployed in a separate reporting layer that reads the same published tables or service endpoints, so public traffic does not sit on operational databases.
Disclosure
Public-use files
Full public-use extracts are released only through the organization’s data-request process, with data use agreements (DUAs) and redistribution terms attached.
Operational snapshot
Top conditions by cell count
| Condition | Estimate cells |
|---|---|
| Depression | 168 |
| Asthma | 168 |
| Hypertension | 168 |
| Diabetes | 168 |
| Opioid Use Disorder | 168 |
Study Runs
Pipeline demonstration run 3
Synthetic federated pipeline demonstration output.
HTAC 2022 Baseline
Baseline prevalence study covering the 2022 calendar year.
Pipeline Complete
From raw EHR records in OMOP CDM to suppressed, stratified prevalence estimates — spanning the conditions, sites, stratifiers, and geography levels configured in your study run.
From Demo to Deployment
This demonstration environment illustrates the full federated pipeline in a sandbox context. A production deployment for a health system consortium or public health agency would layer in: a governance structure with a steering committee and scientific review process; data use agreements with each contributing health system and administrative data provider; a privacy-preserving record linkage process using tokens generated at each contributing site; validated condition codesets reviewed by clinical and epidemiological staff; a data quality monitoring process running on a quarterly or annual cadence; and a publication process that governs who sees what results and under what conditions.
Building that production layer is the work this practice supports.