Incorporating a location-based socioeconomic index into a de-identified i2b2 clinical data warehouse

Bret J. Gardner, Jay G. Pedersen, Mary E. Campbell, James C McClay

Research output: Contribution to journal › Article

Abstract

Objective: Clinical research data warehouses are largely populated from information extracted from electronic health records (EHRs). While these data provide information about a patient's medications, laboratory results, diagnoses, and history, her social, economic, and environmental determinants of health are also major contributing factors in readmission, morbidity, and mortality and are often absent or unstructured in the EHR. Details about a patient's socioeconomic status may be found in the U.S. census. To facilitate researching the impacts of socioeconomic status on health outcomes, clinical and socioeconomic data must be linked in a repository in a fashion that supports seamless interrogation of these diverse data elements. This study demonstrates a method for linking clinical and location-based data and querying these data in a de-identified data warehouse using Informatics for Integrating Biology and the Bedside.

Materials and Methods: Patient data were extracted from the EHR at Nebraska Medicine. Socioeconomic variables originated from the 2011-2015 five-year block group estimates from the American Community Survey. Data querying was performed using Informatics for Integrating Biology and the Bedside. All location-based data were truncated to prevent identification of a location with a population <20 000 individuals.

Results: We successfully linked location-based and clinical data in a de-identified data warehouse and demonstrated its utility with a sample use case.

Discussion: With location-based data available for querying, research investigating the impact of socioeconomic context on health outcomes is possible. Efforts to improve geocoding can readily be incorporated into this model.

Conclusion: This study demonstrates a means for incorporating and querying census data in a de-identified clinical data warehouse.
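The truncation step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not describe how truncation was implemented, so everything below is an assumption for illustration only: the function name `truncate_geoids`, the use of 12-digit census block group GEOIDs, the roll-up order (block group, tract, county, state), and the rule of fully suppressing anything still under the threshold. It is not the authors' method, only one plausible way to keep every reported location at or above the 20 000-person threshold.

```python
from collections import defaultdict

MIN_POPULATION = 20_000  # threshold named in the abstract

# Assumed prefix lengths of a 12-digit block group GEOID:
# 12 = block group, 11 = tract, 5 = county, 2 = state.
PREFIX_LENGTHS = (12, 11, 5, 2)


def truncate_geoids(populations):
    """Map each GEOID to its longest prefix whose total population
    meets MIN_POPULATION; fully suppress ("") if none does.

    `populations` maps block group GEOID strings to population counts.
    """
    result = {}
    for length in PREFIX_LENGTHS:
        # Total population of every area at this level of detail.
        totals = defaultdict(int)
        for geoid, pop in populations.items():
            totals[geoid[:length]] += pop
        # Keep the finest level that is large enough for each GEOID.
        for geoid in populations:
            if geoid not in result and totals[geoid[:length]] >= MIN_POPULATION:
                result[geoid] = geoid[:length]
    # Hypothetical fallback: suppress locations too small even at state level.
    for geoid in populations:
        result.setdefault(geoid, "")
    return result
```

Rolling up from the finest level keeps as much geographic detail as the threshold allows; a real warehouse would also need to decide how to handle codes that cannot be geocoded at all.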

Original language: English (US)
Pages (from-to): 286-293
Number of pages: 8
Journal: Journal of the American Medical Informatics Association
Volume: 26
Issue number: 4
DOIs: https://doi.org/10.1093/jamia/ocy172
State: Published - Jan 31 2019

Fingerprint

Electronic Health Records
Informatics
Censuses
Social Class
Geographic Mapping
Environmental Health
Clinical Laboratory Techniques
Health
Research
History
Economics
Medicine
Morbidity
Mortality
Population

Keywords

  • American Community Survey (ACS)
  • census
  • i2b2
  • social determinants of health
  • socioeconomic status

ASJC Scopus subject areas

  • Health Informatics

Cite this

Incorporating a location-based socioeconomic index into a de-identified i2b2 clinical data warehouse. / Gardner, Bret J.; Pedersen, Jay G.; Campbell, Mary E.; McClay, James C.

In: Journal of the American Medical Informatics Association, Vol. 26, No. 4, 31.01.2019, p. 286-293.

@article{cfe00369bf2b40d8b29f6c049cca6927,
title = "Incorporating a location-based socioeconomic index into a de-identified i2b2 clinical data warehouse",
abstract = "Objective: Clinical research data warehouses are largely populated from information extracted from electronic health records (EHRs). While these data provide information about a patient's medications, laboratory results, diagnoses, and history, her social, economic, and environmental determinants of health are also major contributing factors in readmission, morbidity, and mortality and are often absent or unstructured in the EHR. Details about a patient's socioeconomic status may be found in the U.S. census. To facilitate researching the impacts of socioeconomic status on health outcomes, clinical and socioeconomic data must be linked in a repository in a fashion that supports seamless interrogation of these diverse data elements. This study demonstrates a method for linking clinical and location-based data and querying these data in a de-identified data warehouse using Informatics for Integrating Biology and the Bedside. Materials and Methods: Patient data were extracted from the EHR at Nebraska Medicine. Socioeconomic variables originated from the 2011-2015 five-year block group estimates from the American Community Survey. Data querying was performed using Informatics for Integrating Biology and the Bedside. All location-based data were truncated to prevent identification of a location with a population <20 000 individuals. Results: We successfully linked location-based and clinical data in a de-identified data warehouse and demonstrated its utility with a sample use case. Discussion: With location-based data available for querying, research investigating the impact of socioeconomic context on health outcomes is possible. Efforts to improve geocoding can readily be incorporated into this model. Conclusion: This study demonstrates a means for incorporating and querying census data in a de-identified clinical data warehouse.",
keywords = "American Community Survey (ACS), census, i2b2, social determinants of health, socioeconomic status",
author = "Gardner, {Bret J.} and Pedersen, {Jay G.} and Campbell, {Mary E.} and McClay, {James C}",
year = "2019",
month = "1",
day = "31",
doi = "10.1093/jamia/ocy172",
language = "English (US)",
volume = "26",
pages = "286--293",
journal = "Journal of the American Medical Informatics Association",
issn = "1067-5027",
publisher = "Oxford University Press",
number = "4",

}

TY - JOUR

T1 - Incorporating a location-based socioeconomic index into a de-identified i2b2 clinical data warehouse

AU - Gardner, Bret J.

AU - Pedersen, Jay G.

AU - Campbell, Mary E.

AU - McClay, James C

PY - 2019/1/31

Y1 - 2019/1/31

AB - Objective: Clinical research data warehouses are largely populated from information extracted from electronic health records (EHRs). While these data provide information about a patient's medications, laboratory results, diagnoses, and history, her social, economic, and environmental determinants of health are also major contributing factors in readmission, morbidity, and mortality and are often absent or unstructured in the EHR. Details about a patient's socioeconomic status may be found in the U.S. census. To facilitate researching the impacts of socioeconomic status on health outcomes, clinical and socioeconomic data must be linked in a repository in a fashion that supports seamless interrogation of these diverse data elements. This study demonstrates a method for linking clinical and location-based data and querying these data in a de-identified data warehouse using Informatics for Integrating Biology and the Bedside. Materials and Methods: Patient data were extracted from the EHR at Nebraska Medicine. Socioeconomic variables originated from the 2011-2015 five-year block group estimates from the American Community Survey. Data querying was performed using Informatics for Integrating Biology and the Bedside. All location-based data were truncated to prevent identification of a location with a population <20 000 individuals. Results: We successfully linked location-based and clinical data in a de-identified data warehouse and demonstrated its utility with a sample use case. Discussion: With location-based data available for querying, research investigating the impact of socioeconomic context on health outcomes is possible. Efforts to improve geocoding can readily be incorporated into this model. Conclusion: This study demonstrates a means for incorporating and querying census data in a de-identified clinical data warehouse.

KW - American Community Survey (ACS)

KW - census

KW - i2b2

KW - social determinants of health

KW - socioeconomic status

UR - http://www.scopus.com/inward/record.url?scp=85062590948&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85062590948&partnerID=8YFLogxK

U2 - 10.1093/jamia/ocy172

DO - 10.1093/jamia/ocy172

M3 - Article

VL - 26

SP - 286

EP - 293

JO - Journal of the American Medical Informatics Association

JF - Journal of the American Medical Informatics Association

SN - 1067-5027

IS - 4

ER -