OMOP has conducted a series of experiments to generate empirical evidence about the performance of observational analysis methods in their ability to identify true risks of medical products and discriminate them from false findings. These experiments were designed to inform the development of a risk identification and analysis system, as envisioned by various pharmaceutical research companies and now mandated for the FDA by Congress through the FDA Amendments Act of 2007. We define a ‘risk identification and analysis system’ as a systematic and reproducible process to generate evidence efficiently to support the characterization of the potential effects of medical products from across a network of disparate observational healthcare data sources.
In June 2012, the OMOP research team presented results from its latest experiments, which shed light on recommendations for building a risk identification and analysis system and offer guidance for interpreting observational studies. The proceedings of the 2012 OMOP Symposium are available to listen to, and the materials and presentations are available to download. Below are the OMOP 2011-2012 Test Case Reference and Research Results:
(The results file is large and will take several minutes to download)
In OMOP's latest experiment, the team evaluated the performance of a risk identification system for four health outcomes of interest: acute myocardial infarction, acute liver injury, acute renal failure, and gastrointestinal bleeding. For these outcomes, OMOP established a reference set of 399 test cases: 165 ‘positive controls’ that represent medical product exposures for which there is evidence to suspect an association with the outcome, and 234 ‘negative controls’ that are drugs for which there is no evidence that they are associated with the outcome. The fundamental goal of OMOP’s research is to develop and evaluate standardized algorithms that can reliably discriminate the positive controls from the negative controls, and to understand how an estimated effect from an observational study relates to the true relationship between medical product exposure and adverse events.
The current experiment yielded several insights about the expected behavior of a risk identification system. We observed that self-controlled designs are optimal across all outcomes and all sources, but the specific settings are different in each scenario. All sources achieve good performance (area under the ROC curve > 0.80) for acute kidney injury, acute MI, and GI bleed, while acute liver injury has consistently lower predictive accuracy. A risk identification system should confidently discriminate positive effects with relative risk > 2 from negative controls, but smaller effect sizes will be more difficult to detect. There was no evidence that any of the five data sources was consistently better or worse than the others, but we did observe substantial variation in estimates across sources, pointing to the need to routinely assess consistency across a network of databases. The results underscore the importance of transparency and complete specification and reporting of analyses, as all study design choices were shown to have the potential to substantially shift effect estimates.
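To make the area-under-ROC-curve criterion concrete, the following is a minimal sketch of how discrimination between positive and negative controls could be scored. It is not OMOP's implementation: the function name `auc` and the relative-risk values are hypothetical, chosen only for illustration. The AUC is computed as the fraction of (positive control, negative control) pairs in which the positive control receives the higher estimate, with ties counted as one half.

```python
def auc(positive_scores, negative_scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive control
    has a higher score than a randomly chosen negative control
    (ties count as 0.5).
    """
    if not positive_scores or not negative_scores:
        raise ValueError("need at least one positive and one negative control")
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical relative-risk estimates, invented for this example only.
positive_rr = [2.4, 1.8, 1.1, 3.0]       # estimates for positive controls
negative_rr = [0.9, 1.2, 1.0, 0.8, 1.5]  # estimates for negative controls

example_auc = auc(positive_rr, negative_rr)
print(f"AUC = {example_auc:.2f}")  # prints "AUC = 0.90"
```

In this toy data, 18 of the 20 pairs are ranked correctly; the positive control with an estimate of 1.1 falls below two negative controls, which mirrors the point that smaller effect sizes are harder to separate from negative controls.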
Diversity in performance and heterogeneity in estimates arose not only from different study design choices (e.g., cohort versus case-control) but also from analytic choices within a study design (e.g., number of controls per case in a case-control study). We caution against generalizing these results to other outcomes or other data sources. However, we do think OMOP has now provided a well-defined procedure for how to profile a database and construct an optimal analysis strategy for a given outcome, which can be systematic, reproducible, and yield defined performance characteristics that can directly inform decision-making. OMOP’s research continues to reaffirm the notion that advancing the science of observational research requires an empirical and reproducible approach to methodology and systematic application.
The 2010 Experiment Results include results for all of the original data partners (central and distributed partners), as well as meta-analytic composite estimates, in the OMOP2010_METHOD_RESULTS file. The companion file OMOP2010_ANALYSIS_REF shows the parameter settings that were chosen for a given analysis_id in the results file.
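As a rough illustration of how the two files relate, the sketch below attaches parameter settings to result rows by matching on analysis_id. Only the analysis_id key comes from the description above; the other column names (`source`, `estimate`, `method`, `parameter_settings`), the sample rows, and the CSV layout are assumptions made for this example and may not match the actual files.

```python
import csv
import io

# Hypothetical excerpts standing in for the two files. Only the
# analysis_id column is taken from the text; all other columns and
# values are invented for illustration.
method_results_csv = """analysis_id,source,estimate
1001,SOURCE_A,1.9
1002,SOURCE_A,1.1
1001,SOURCE_B,2.3
"""

analysis_ref_csv = """analysis_id,method,parameter_settings
1001,METHOD_X,setting_a=1
1002,METHOD_Y,setting_b=2
"""


def attach_settings(results_rows, ref_rows):
    """Return result rows extended with the settings for their analysis_id."""
    settings_by_id = {row["analysis_id"]: row for row in ref_rows}
    merged = []
    for row in results_rows:
        ref = settings_by_id.get(row["analysis_id"])
        if ref is None:
            continue  # skip results with no matching reference entry
        merged.append({**ref, **row})
    return merged


results = list(csv.DictReader(io.StringIO(method_results_csv)))
reference = list(csv.DictReader(io.StringIO(analysis_ref_csv)))
merged_rows = attach_settings(results, reference)

for row in merged_rows:
    print(row["analysis_id"], row["source"], row["estimate"], row["parameter_settings"])
```

With the real files, the in-memory strings would be replaced by the downloaded data, and the column names adjusted to whatever the files actually contain.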
|Getting Start Guide for Exploring the OMOP 2011 Experiment Results 28sept2012.pdf||251.54 KB|