Standard Vocabulary Specification Version 4.5: Introduction

OMOP Implementation Specification

Standard Vocabularies in Observational Data Analysis (October 2013)

1. Introduction
This document describes the requirements and implementation of the Standard Vocabulary, which is part of the Observational Medical Outcomes Partnership (OMOP) Common Data Model, version 4.0. It covers the following vocabulary domains:
  • Condition
  • Procedure
  • Demographics
  • Observation
  • Visit
  • Death
  • Provider
  • Cost

    For each section, the Standard Vocabulary is defined and its characteristics discussed. Mappings from commonly used terminologies and classifications to the Standard Vocabulary are reviewed.
    The purpose of this document is to introduce each Standard Vocabulary, its source, the structure of its classification, and how to use it. The intended audience is researchers and developers who want to use the OMOP Common Data Model for drug-outcome research.

1.1 Background
    OMOP is a public-private partnership established to inform the appropriate use of observational healthcare databases, such as administrative claims and electronic health records, for studying the effects of medical products. The partnership is conducting methodological research to empirically evaluate how well various analytical methods identify true associations between medical product exposure and health outcomes of interest while avoiding false findings. As part of its research, OMOP is developing tools and capabilities for transforming, characterizing, and analyzing disparate data sources across the health care delivery spectrum, and is establishing a shared resource so that the broader research community can collaboratively advance the science [1]. The Standard Vocabulary is a foundational tool developed by the OMOP team to enable transparent and consistent content across disparate observational databases. It supports the OMOP research community in conducting efficient and reproducible observational research.
    The Standard Vocabulary contains all of the code sets, terminologies, vocabularies, nomenclatures, lexicons, thesauri, ontologies, taxonomies, classifications, abstractions, and other such data that are required for:
  • Creating the transformed (i.e., standardized) data from the raw data sets,
  • Searching and querying the transformed data, and browsing and navigating the hierarchies of classes and abstractions inherent in the transformed data, and
  • Interpreting the meanings of the data.
    The Standard Vocabulary is now released in Version 4.0. It differs from previous versions but does not violate any of the design principles. In particular, all concepts in previous versions are still available and identified using the same Concept IDs. New data domains (death, provider, and cost) were added to the concept list, and a lifecycle was introduced for concepts, relationships, and maps.
1.2 Definition of Terms
    For purposes of the OMOP Common Data Model (CDM), the terms in Table 1 are used.
    Table 1: Definition of Terms
    Term                 Description
    Standard Vocabulary  Contains all of the below in a set of tables
    Vocabulary Domain    A semantic category, such as drug, condition, or procedure, that is defined for OMOP purposes and needed for drug-outcome research
    Vocabulary           A combination of terminologies and classifications that belong to a Vocabulary Domain
    Terminology          A controlled list of concepts, such as a list of conditions
    Classification       A hierarchical system of concepts and concept relationships that defines semantically useful classes, such as chemical structures for drugs
    Concept              Basic unit of information defined in the vocabularies
1.3 Vocabulary Representation in the CDM
    Vocabulary information is represented in the CDM as described in Table 2 below.
    Table 2: Vocabulary Representation in the CDM
    Table                  Description
    Concept                A list of all valid vocabulary concepts across domains and their attributes. Concepts are derived from existing standards as described below.
    Concept_Synonym        A table with synonyms for concepts that have more than one valid name or description. This table is currently not shipped.
    Concept_Relationship   A list of relationships between concepts. Some of these relationships are generic (e.g., the “Subsumes” relationship); others are domain-specific.
    Source_To_Concept_Map  A map between commonly used terminologies and the Standard Vocabulary. For example, drugs are often recorded as NDC, while the Standard Vocabulary for drugs is RxNorm.
    Concept_Ancestor       A specialized table containing hierarchical relationships between concepts that may span several generations
    Vocabulary             A list of all terminologies that make up the Standard Vocabulary
    Relationship           A list of the types of relationships that may exist between two concepts
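
    As an informal illustration of how these tables relate to one another, the following Python sketch models three of them as in-memory structures. It is not part of the specification. The concept IDs, names, the NDC code, and the function names map_source_code and build_concept_ancestor are all invented for this example, and the real tables carry more attributes than are shown here. The sketch maps a source code to a standard concept and then derives multi-generation ancestor rows from direct "Subsumes" links, which is one plausible way to read the description of Concept_Ancestor above.

```python
# Illustrative sketch only: every ID, name, and code below is made up.
from collections import defaultdict

# Concept: concept_id -> concept_name (hypothetical entries)
concept = {
    1: "Drug class X",
    2: "Ingredient Y",
    3: "Clinical drug Y 10 mg tablet",
}

# Source_To_Concept_Map: (source vocabulary, source code) -> standard concept_id
source_to_concept_map = {
    ("NDC", "00000-0000-00"): 3,  # invented NDC code
}

# Concept_Relationship: direct "Subsumes" links as (parent_id, child_id)
subsumes = [(1, 2), (2, 3)]


def map_source_code(vocabulary, code):
    """Return the standard concept_id for a source code, or None if unmapped."""
    return source_to_concept_map.get((vocabulary, code))


def build_concept_ancestor(direct_links):
    """Expand direct parent-child links into (ancestor, descendant, levels) rows."""
    children = defaultdict(list)
    for parent, child in direct_links:
        children[parent].append(child)

    rows = set()
    for start in list(children):
        # Breadth-first walk: 'level' is the shortest distance to each descendant.
        frontier, seen, level = [start], {start}, 0
        while frontier:
            level += 1
            next_frontier = []
            for node in frontier:
                for child in children.get(node, []):
                    if child not in seen:
                        seen.add(child)
                        rows.add((start, child, level))
                        next_frontier.append(child)
            frontier = next_frontier
    return sorted(rows)


concept_ancestor = build_concept_ancestor(subsumes)
standard_id = map_source_code("NDC", "00000-0000-00")
# All ancestors of the mapped concept, across any number of generations.
ancestors = [a for a, d, _ in concept_ancestor if d == standard_id]
```

    In this toy data, the source code resolves to concept 3, and the derived ancestor rows let a query for concept 1 or concept 2 find records coded with concept 3 without walking the hierarchy at query time.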

    This document discusses the content of the tables. For a detailed discussion of the technical specifications of these database tables, please refer to the OMOP Common Data Model specifications Version 4.5 (