The impact of standardizing the definition of visits on the consistency of multi-database observational health research

Erica A Voss, Qianli Ma, Patrick B Ryan
BMC Medical Research Methodology 2015, 15: 13

BACKGROUND: Use of administrative claims from multiple sources for research purposes is challenged by the lack of consistency in the structure of the underlying data and definition of data across claims data providers. This paper evaluates the impact of applying a standardized revenue code-based logic for defining inpatient encounters across two different claims databases.

METHODS: We selected members who had complete enrollment in 2012 from the Truven MarketScan Commercial Claims and Encounters (CCAE) and the Optum Clinformatics (Optum) databases. The overall prevalence of inpatient conditions in the raw data was compared to that in the common data model (CDM) with the standardized visit definition applied.

RESULTS: In CCAE, 87.18% of claims from 2012 that were classified as part of inpatient visits in the raw data were also classified as part of inpatient visits after the data were standardized to CDM, and this overlap was consistent from 2006 to 2011. In contrast, Optum had 83.18% concordance in classification of 2012 claims from inpatient encounters before and after standardization, but the concordance varied over time. The re-classification of inpatient encounters substantially affected both the observed prevalence of medical conditions occurring in the inpatient setting and the consistency of prevalence estimates between the databases. On average, before standardization, each condition in Optum was 12% more prevalent than that same condition in CCAE; after standardization, the prevalence of conditions had a mean difference of only 1% between databases. For 67% of the 7,039 conditions reviewed, the difference in prevalence between the two databases was smaller after standardization.
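The two kinds of figures reported above (concordance of inpatient classification, and the average between-database difference in prevalence) can be illustrated with a minimal sketch. The function names, the toy claim IDs, and the toy prevalence values are all hypothetical, and the relative-difference formula is an assumption; the abstract does not state exactly how the 12% and 1% figures were computed.

```python
def concordance(raw_inpatient_ids, cdm_inpatient_ids):
    """Share of claims flagged inpatient in the raw data that are
    still flagged inpatient after standardization (hypothetical helper)."""
    raw = set(raw_inpatient_ids)
    if not raw:
        raise ValueError("no inpatient claims in the raw data")
    return len(raw & set(cdm_inpatient_ids)) / len(raw)


def mean_relative_difference(prev_a, prev_b):
    """Mean of (a - b) / b over conditions present in both databases.
    Assumed metric; the paper's exact definition may differ."""
    shared = [c for c in prev_a if c in prev_b and prev_b[c] > 0]
    if not shared:
        raise ValueError("no shared conditions with non-zero prevalence")
    return sum((prev_a[c] - prev_b[c]) / prev_b[c] for c in shared) / len(shared)


# Toy data, not taken from the study.
raw_ids = [1, 2, 3, 4]   # claims classified inpatient in the raw data
cdm_ids = [2, 3, 4, 5]   # claims classified inpatient after standardization
print(concordance(raw_ids, cdm_ids))  # 3 of 4 raw inpatient claims retained

optum = {"condition_A": 0.012, "condition_B": 0.030}
ccae = {"condition_A": 0.010, "condition_B": 0.030}
print(mean_relative_difference(optum, ccae))
```

Running the same calculation on prevalence estimates taken before and after standardization would show whether the gap between databases narrows, which is the comparison the results describe.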

CONCLUSIONS: To improve the consistency of research results across databases, one should review sources of database heterogeneity, such as the way data holders process raw claims data. Our study showed that applying the Observational Medical Outcomes Partnership (OMOP) CDM with a standardized approach for defining inpatient visits during the extract, transform, and load process can decrease the heterogeneity observed in disease prevalence estimates across two different claims data sources.
