Standard Vocabulary Specification Version 4.5: Domains

OMOP Implementation Specification

Standard Vocabularies in Observational Data Analysis (October 2013)

4. Vocabulary Domains


4.1. Drug Domain

4.1.1. Vocabularies
The standard drug vocabulary consists of the following components found in figure 2.

  • Drug reference terminology RxNorm2, maintained by the National Library of Medicine (NLM) – vocabulary_id 8.
  • Classifications for mechanism of action, physiological effect, chemical structure and indication/Contraindication: National Drug File-Reference Terminology (NDF-RT)3, developed by a consortium of the NLM, the U.S. Department of Veterans Affairs, Veterans Health Administration (VHA) and Appelon, Inc. – vocabulary_id 7.
  • Drug products and classes developed by the Department of Veterans Affairs (VA Product and VA Class)4 – vocabulary_id 28 and 32.
  • Two classifications for therapeutic class: The Anatomical Therapeutic Chemical (ATC)5 classification maintained by the WHO Collaborating Centre for Drug Statistics Methodology, vocabulary_id 21 and the Enhanced Therapeutic Classification (ETC)6, maintained by First Databank (FDB) – vocabulary_id 20.
  • Indication and Contraindications: FDA-approved and off-label indications as well as Contraindications are provided by First Databank (FDB Indications)7 – vocabulary_id 19.

RxNorm and the classification systems NDF-RT, VA Class, ATC, ETC and FDB Indications form a combined drug vocabulary. RxNorm is used to define individual drug products as well as their ingredients. On top of these RxNorm-based concepts there are a number of drug classification systems, some derived from NDF-RT and some from ATC and ETC. In addition to that, FDB-based indication and Contraindications are linked to the RxNorm-based concepts.


Figure 2: OMOP Standard Terminology and Classification of Drugs
In addition to these Standard Vocabularies, the Drug of Interest (DOI) cohorts represent complex definitions of drug exposure. These concepts are listed under vocabulary_id 33 and are linked through concept relationships and ancestry records to the RxNorm concepts that are part of their definitions. Please find a more detailed discussion under the Cohort Domain further below.


4.1.2. Implementation of RxNorm
RxNorm is structured into elements that reflect the active ingredients, strengths, and dose form comprising each drug (table 11). For each element, a separate RxNorm concept is defined.
Table 11: Concept Types and Examples as Defined by RxNorm

Element | Definition | Examples
Ingredient | A compound or moiety that gives the drug its distinctive clinical properties. | Aspirin
Clinical Drug | Combination of ingredient, strength and dose form. | Aspirin 500 MG Oral Tablet
Branded Drug | Ingredient, strength, and dose form plus brand name. | Aspirin 500 MG Oral Tablet [Bayer Aspirin]
Brand Name Pack | Fixed product combination of Branded Drugs. | {24 (Acetaminophen 500 MG / Diphenhydramine 25 MG Oral Tablet [Tylenol Extra Strength P.M.]) / 50 (Acetaminophen 500 MG Oral Tablet [Tylenol]) } Pack [Tylenol Extra Strength Day and Night Value Pack]
Generic Pack | Fixed product combination of Clinical Drugs. | {24 (Acetaminophen 500 MG / Diphenhydramine 25 MG Oral Tablet) / 50 (Acetaminophen 500 MG Oral Tablet) } Pack
Brand Name | A proprietary name for a family of products containing a specific active ingredient. | Bayer Aspirin
Dose Form | The physical form of a drug intended for administration or consumption. | Oral Tablet
Clinical Drug Component | Ingredient plus strength. | Aspirin 500 MG
Branded Drug Component | Branded ingredient plus strength. | Aspirin 500 MG [Bayer Aspirin]
Clinical Drug Form | Ingredient plus Dose Form. | Aspirin Oral Tablet
Branded Drug Form | Branded ingredient plus Dose Form. | Aspirin Oral Tablet [Bayer Aspirin]

All RxNorm elements are imported into the CONCEPT table for the convenience of researchers navigating drug information and codes. However, only five elements are used in the Standard Vocabulary. Four of them are the low-level drug concepts (concept_level 1): Clinical Drug, Branded Drug, Branded Pack and Generic Pack. These roll up into the fifth element, Ingredient, which is implemented as the parent concept (concept_level 2). All other RxNorm elements are not part of the Standard Vocabulary, are not used for mapping and classification, and therefore have a concept_level of 0.
The resulting Standard Drug Vocabulary structure derived from RxNorm is shown in figure 3. See Appendix A for details and counts of concepts based on RxNorm.
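The level rule for RxNorm elements can be illustrated with a minimal sketch. The class-name strings below follow the element names in table 11 and are assumptions about how the classes are spelled in the CONCEPT table; the function name is hypothetical.

```python
# Sketch of the concept_level rule for RxNorm-derived concepts.
# Assumption: concept classes are spelled as the element names in table 11.
LEVEL_1_CLASSES = {"Clinical Drug", "Branded Drug", "Brand Name Pack", "Generic Pack"}
LEVEL_2_CLASSES = {"Ingredient"}


def rxnorm_concept_level(concept_class: str) -> int:
    """Return the concept_level implied by an RxNorm concept class."""
    if concept_class in LEVEL_1_CLASSES:
        return 1  # marketed drug products
    if concept_class in LEVEL_2_CLASSES:
        return 2  # parent concepts of the drug products
    return 0  # loaded into CONCEPT but outside the Standard Vocabulary


for name in ("Branded Drug", "Ingredient", "Dose Form"):
    print(name, rxnorm_concept_level(name))
```

Any class not listed explicitly falls through to level 0, which mirrors the statement that all remaining RxNorm elements are loaded but not used for mapping or classification.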

Figure 3: RxNorm Implementation.
Structures in bold belong to the Standard Vocabulary: Level 1 drug products (Clinical/Branded Drugs/Packs) and 2 (Ingredients). All the other RxNorm elements are loaded into the CONCEPT table, but are not part of the Standard Vocabulary. All relationships derived from RxNorm are available.
Table 12: Relationships Imported or Inferred for RxNorm-derived Concepts

Relationship ID | Relationship Name | Defines Ancestry | Description | Reverse Relationship
2 | Has precise ingredient (RxNorm) | | Relationship between Brand Name and Clinical Drug Component concepts and Precise Ingredients | 136
3 | Has tradename (RxNorm) | | Relationship between the clinical (generic) and branded equivalent concept | 137
4 | Has dose form (RxNorm) | | Relationship between Clinical and Branded Drug, Pack and Drug Form and their Dose Form concepts | 138
5 | Has form (RxNorm) | | Relationship between Ingredient and Precise Ingredient concepts | 139
6 | Has ingredient (RxNorm) | | Relationship between Clinical and Branded Drugs, Packs or Drug Components and their respective Ingredients and Brand Names | 140
7 | Constitutes (RxNorm) | x | Relationship between Clinical and Branded Drug Components and their respective Drugs and Packs | 141
8 | Contains (RxNorm) | x | Relationship between Clinical and Branded Packs and their respective Clinical and Branded Drugs | 142
9 | Reformulation of (RxNorm) | x | Relationship between Brand Names that have been reformulated | 143
10 | Subsumes | x | Hierarchical relationship among RxNorm concepts | 144

See Appendix B for details and counts of relationships relevant for RxNorm.


4.1.3. Implementation of NDF-RT and VA NDF
NDF-RT and VA NDF together form a terminology and classification system of drugs (figure 4). Similarly to RxNorm, it defines drugs by ingredient, strength and form as VA Products (vocabulary_id 28), which are part of NDF. The next level above VA Product is the Pharmaceutical Preparation, which is the equivalent of an RxNorm Ingredient. Typically, two such Pharmaceutical Preparations have a hierarchical relationship to each other: The higher-level concept defines the ingredient name, and the lower-level concept is the ingredient salt or isomer form. This is equivalent to the RxNorm Ingredient and Precise Ingredient. For each Pharmaceutical Preparation, relationships to higher-level drug classes are defined. Exceptions to this rule are VA Classes, which are defined for each VA Product and are listed under vocabulary_id 32.

Figure 4: Structure of NDF-RT and VA NDF and the Relationship to RxNorm.
NDF-RT and VA NDF are fully loaded into the CONCEPT table. Some concept classes and relationships have been somewhat re-worded for better readability. For example, the original NDF-RT relationship "has PE" is now "Has physiological effect (NDF-RT)". Only the VA Class, Mechanism of Action, Physiological Effect, Chemical Class and Indications/Contraindications are defined as Standard Vocabulary. Since they are higher-level drug classes above the Ingredient level (concept_level 2) they are assigned concept_level 3 and for the top concepts in each class concept_level 4. All non-Standard Vocabulary concepts are concept_level 0.
The Indication or Contraindication class actually contains condition concepts. Any such concept can be an indication for one drug and at the same time a Contraindication for another. The distinction is made through the relationships. Indications of Pharmaceutical Preparations are connected through the "May treat (NDF-RT)" (relationship_id 21) or "May prevent (NDF-RT)" (relationship_id 23) relationships, while Contraindications are defined through "Contraindication to (NDF-RT)" (relationship_id 22). Only the indication relationships are used for the concept ancestor definition (see below).
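As a rough illustration of how the relationship, rather than the concept, decides the role, the sketch below splits the targets of one drug into indications and Contraindications. The tuple layout (concept_id_1, concept_id_2, relationship_id), the direction from drug to condition, and all concept IDs are assumptions made for the example; only the relationship IDs 21, 22 and 23 come from this specification.

```python
# Relationship IDs as listed for NDF-RT in this section.
MAY_TREAT = 21
CONTRAINDICATION_TO = 22
MAY_PREVENT = 23
INDICATION_RELATIONSHIPS = {MAY_TREAT, MAY_PREVENT}


def split_by_role(relationships, drug_concept_id):
    """Return (indications, contraindications) for one drug concept.

    relationships: iterable of (concept_id_1, concept_id_2, relationship_id),
    assumed here to point from the drug to the condition concept.
    """
    indications, contraindications = set(), set()
    for concept_id_1, concept_id_2, relationship_id in relationships:
        if concept_id_1 != drug_concept_id:
            continue
        if relationship_id in INDICATION_RELATIONSHIPS:
            indications.add(concept_id_2)
        elif relationship_id == CONTRAINDICATION_TO:
            contraindications.add(concept_id_2)
    return indications, contraindications


# Hypothetical IDs: condition 2001 is an indication for drug 1001
# and a Contraindication for drug 1002.
rows = [(1001, 2001, MAY_TREAT), (1002, 2001, CONTRAINDICATION_TO), (1001, 2002, MAY_PREVENT)]
print(split_by_role(rows, 1001))
print(split_by_role(rows, 1002))
```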
Indication or Contraindication concepts also have equivalence relationships to SNOMED-CT: "Indication/contraindication to SNOMED (NDF-RT)" (relationship_id 247).
NDF-RT concepts have equivalence relationships to RxNorm. Usually, the higher-level Pharmaceutical Preparation (active ingredient) has an equivalence relationship to an RxNorm Ingredient (concept_level 2), and the VA Product is linked to the RxNorm concept_level 1 concepts (Clinical or Branded Drug and Packs). This is defined as relationship "NDFRT equivalent to RxNorm (RxNorm)" (relationship_id 28). Sometimes these relationships are missing in the Source Vocabulary, and therefore additional relationships were inferred and added ("NDFRT to RxNorm equivalent by concept_name (OMOP)", relationship_id 286). Inferred relationships were also added to connect RxNorm Ingredients to VA Classes (see below). See Appendix A for details and counts of concepts based on NDF-RT and VA NDF.
Table 13: Relationships for NDF-RT and VA NDF derived Concepts

Relationship ID | Relationship Name | Defines Ancestry | Description | Reverse Relationship
12 | Induces (NDF-RT) | | Relationship between VA Products or Pharmaceutical Preparations and Conditions (adverse events it causes) | 146
13 | May diagnose (NDF-RT) | | Relationship between NDF-RT concepts and Conditions it may diagnose | 147
14 | Has physiological effect (NDF-RT) | x | Relationship between NDF-RT concepts and the physiological effect it causes | 148
15 | Has contraindicating physiological effect (NDF-RT) | | Relationship between VA Products or Pharmaceutical Preparations and the physiological effect that is responsible for its Contraindication | 149
16 | Has ingredient (NDF-RT) | x | Relationship between VA Products or Pharmaceutical Preparations and the chemical class containing its ingredient | 150
17 | Has contraindicating chemical class (NDF-RT) | | Relationship between VA Products or Pharmaceutical Preparations and the chemical class that is responsible for its Contraindication | 151
18 | Has mechanism of action (NDF-RT) | x | Relationship between NDF-RT concepts and its Mechanism of Action | 152
19 | Has contraindicating mechanism of action (NDF-RT) | | Relationship between VA Products or Pharmaceutical Preparations and the Mechanism of Action that is responsible for its Contraindication | 153
20 | Has pharmacokinetics (NDF-RT) | | Relationship between VA Products or Pharmaceutical Preparations and pharmacokinetic mechanisms | 154
21* | May treat (NDF-RT) | x | Relationship between NDF-RT concepts and its Indication | 155
22 | Contraindication to (NDF-RT) | | Relationship between NDF-RT concepts and its Contraindication | 156
23* | May prevent (NDF-RT) | x | Relationship between NDF-RT concepts and its Indication | 157
24 | Has active metabolites (NDF-RT) | | Relationship between VA Products or Pharmaceutical Preparations and its metabolite (Chemical Structure) | 158
25 | Has site of metabolism (NDF-RT) | | Relationship between VA Products or Pharmaceutical Preparations and its site of metabolization (Pharmacokinetics) | 159
26 | Effect may be inhibited by (NDF-RT) | | Relationship between NDF-RT concepts and other NDF-RT concepts that may inhibit its activity | 160
27 | Has chemical structure (NDF-RT) | x | Relationship between Pharmaceutical Preparations and its Chemical Structure | 161
28 | NDFRT equivalent to RxNorm (RxNorm) | x | Equivalence relationships between NDF-RT and RxNorm concepts | 162
247 | Indication/Contraindication to SNOMED (NDF-RT) | | Equivalence relationship between an Indication concept and a SNOMED Clinical Finding | 248
275 | Has therapeutic class (NDF) | x | Relationship between VA Classes and VA Therapeutic Classes | 276
277 | Drug-drug interaction for (NDF) | | Relationship between Drug Interaction concepts and Pharmaceutical Preparations | 278
279 | Has pharmaceutical preparation (NDF) | x | Relationship between VA Products and Pharmaceutical Preparations | 280
281 | Inferred ingredient of (OMOP) | x | Inferred relationship between drug and ingredient from unambiguous cases where drugs have only one ingredient | 282
285 | RxNorm to NDF-RT equivalent by concept_name (OMOP) | x | Equivalence relationship between NDF-RT and RxNorm concepts based on concept_name identity | 286

*Both relationships 21 and 23 are used by NDF-RT to characterize the indication for a drug.
Many of the relationships between RxNorm and NDF-RT are used to build CONCEPT_ANCESTOR records, in effect creating classifications for each drug of the type VA Class (clinical classification), Mechanism of Action (biological classification), Chemical Class (chemical classification) and Indication or Contraindication. The exact path of the chain of individual relationships between concepts and classes is complex but can be ignored by the user.
See Appendix B for details and counts of relationships relevant for NDF-RT and VA NDF.
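The way chains of ancestry-defining relationships collapse into CONCEPT_ANCESTOR records can be sketched as a transitive closure. This is a simplified illustration, not the procedure used to build the vocabulary: it assumes the relationships have already been filtered to those that define ancestry and normalized into (parent, child) pairs, and the concept labels are placeholders.

```python
from collections import defaultdict, deque


def build_concept_ancestor(parent_child_edges):
    """Map (ancestor, descendant) to the minimum number of levels between them."""
    children = defaultdict(set)
    for parent, child in parent_child_edges:
        children[parent].add(child)

    records = {}
    for ancestor in list(children):
        # Breadth-first search: the first visit of a node is its shortest path.
        queue = deque((child, 1) for child in children[ancestor])
        while queue:
            node, depth = queue.popleft()
            if (ancestor, node) in records:
                continue
            records[(ancestor, node)] = depth
            for grandchild in children.get(node, ()):
                queue.append((grandchild, depth + 1))
    return records


# Placeholder hierarchy: two class levels, one ingredient, two drug products.
edges = [
    ("class_top", "class_mid"),
    ("class_mid", "ingredient"),
    ("ingredient", "drug_a"),
    ("ingredient", "drug_b"),
]
records = build_concept_ancestor(edges)
print(records[("class_top", "drug_a")])
```

Once such records exist, a user only needs a single lookup by ancestor and descendant and does not have to follow the individual relationships.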


4.1.4. Implementation of ATC
Within ATC, drugs are divided into fourteen anatomical main groups (1st level), each of which is divided into pharmacological/therapeutic subgroups (2nd level). The 3rd and 4th levels are chemical/pharmacological/therapeutic subgroups and the 5th level is the chemical substance.
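The five ATC levels can be read off the code itself. The sketch below relies on the standard ATC code layout (one letter, two digits, one letter, one letter, two digits), which is general knowledge about ATC rather than something this specification states; the function names are hypothetical.

```python
# Code length per ATC level: A (1st), A02 (2nd), A02B (3rd), A02BC (4th), A02BC01 (5th).
ATC_LEVEL_BY_LENGTH = {1: 1, 3: 2, 4: 3, 5: 4, 7: 5}


def atc_level(code: str) -> int:
    """Return the ATC level (1 to 5) implied by the length of the code."""
    try:
        return ATC_LEVEL_BY_LENGTH[len(code)]
    except KeyError:
        raise ValueError(f"not a valid ATC code length: {code!r}") from None


def atc_parents(code: str) -> list:
    """Return the higher-level ATC codes that contain this code, top level first."""
    return [code[:n] for n in (1, 3, 4, 5) if n < len(code)]


print(atc_level("A02BC"))
print(atc_parents("A02BC"))
```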

ATC concepts are loaded into the standard vocabulary with a concept vocabulary_id 21 and are assigned a concept_class of ‘Anatomical Therapeutic Chemical Classification’ and a concept_level of 3. See Appendix A for details and counts of the concepts based on ATC.

The hierarchical relationships between the ATC concepts are captured using the ‘Subsumes’ relationship (relationship_id 10). RxNorm Clinical Drugs are linked to the ATC classification system using a separate relationship (relationship_id 131) provided by FDB (see below) and a manually added relationship (relationship_id 289) between low-level ATC concepts and RxNorm Ingredients. Finally, relationships were added between RxNorm Ingredients and ATC concepts wherever an ATC concept had an existing ancestor relationship to an RxNorm level 1 drug product with only one ingredient (unambiguous class assignment). All relationships between RxNorm and ATC are used to build CONCEPT_ANCESTOR records, making ATC concepts classifications of drugs. For details and counts of relationships see Appendix B.

Table 14: List of ATC Relationships

Relationship ID | Relationship Name | Defines Ancestry | Description | Reverse Relationship
10 | Subsumes | x | Hierarchical relationship among ATC concepts | 144
131 | ATC to RxNorm (FDB) | x | Equivalence relationship between low-level ATC concepts and RxNorm Clinical Drugs provided by FDB | 245
289 | ATC to RxNorm equivalent by concept_name (OMOP) | x | Equivalence relationship between low-level ATC concepts and RxNorm Ingredients based on string comparison | 290
281 | Inferred ingredient of (OMOP) | x | Inferred relationship between drug and ingredient from unambiguous cases where drugs have only one ingredient | 282


4.1.5. Implementation of ETC
The ETC system is FDB's therapeutic classification system and is implemented similarly to ATC. Concepts are assigned vocabulary_id 20, the concept class ‘Enhanced Therapeutic Classification’ and a concept_level of 3. See Appendix A for details and counts of the concepts based on ETC.

The hierarchical relationships among the ETC concepts are captured using the ‘Subsumes’ relationship (relationship_id 10). RxNorm Clinical Drugs are tied to the ETC classification system using a separate relationship provided by FDB (relationship_id 130, see table 15) and a manually added relationship (relationship_id 287) between low-level ETC concepts and RxNorm Ingredients. For details and counts of relationships see Appendix B.

Table 15: List of ETC Relationships

Relationship ID | Relationship Name | Defines Ancestry | Description | Reverse Relationship
10 | Subsumes | x | Hierarchical relationship among ETC concepts | 144
130 | ETC to RxNorm (FDB) | x | Equivalence relationship between low-level ETC concepts and RxNorm Clinical Drugs provided by FDB | 244
281 | Inferred ingredient of (OMOP) | x | Inferred relationship between drug and ingredient from unambiguous cases where drugs have only one ingredient | 282
287 | ETC to RxNorm equivalent by concept_name (OMOP) | x | Equivalence relationship between low-level ETC concepts and RxNorm Ingredients based on string comparison | 288

All relationships between RxNorm and ETC are used to build CONCEPT_ANCESTOR records, making ETC concepts classifications of drugs.


4.1.6. Implementation of FDB Indication and Contraindication
FDB developed Indication and Contraindication concepts for all drugs from a variety of sources such as FDA MedWatch, journal articles, expert treatment guidelines (such as the American Society of Health-System Pharmacists' AHFS Drug Information and The Medical Letter) and product package inserts8. All Indication and Contraindication concepts are part of the Standard Vocabulary and are assigned vocabulary_id 19, concept_level 3 and the concept_class ‘Indication or Contraindication’. They have one or more of the CONCEPT_RELATIONSHIP records listed in table 16.

Table 16: FDB Indication or Contraindication Relationships

Relationship ID | Relationship Name | Defines Ancestry | Description | Reverse Relationship
10 | Subsumes | x | Hierarchical relationship among FDB Indication or Contraindication concepts | 144
126 | Has FDA-approved drug indication (FDB) | x | Relationship between RxNorm concepts and FDA-approved indications (label) | 240
127 | Has off-label drug indication (FDB) | x | Relationship between RxNorm and off-label but commonly accepted indications | 241
129 | Has drug Contraindication (FDB) | | Relationship between RxNorm and Contraindications (label) | 243
247 | Indication/Contraindication to SNOMED | | Equivalence relationship between FDB Indications and Contraindications and SNOMED concepts | 248

Indications are linked to RxNorm drug product concepts and SNOMED-CT condition concepts.

RxNorm level 1 drug products are linked to both indication (FDA-approved and off-label) and Contraindication concepts through relationship_id 126, 127 or 129. For each Indication concept, a list of SNOMED-CT concepts is provided through relationship_id 247. In addition, mapping between ICD-9-CM and FDB Indication or Contraindication concepts is provided in the SOURCE_TO_CONCEPT_MAP table.

Only relationships 10, 126 and 127 are used to build CONCEPT_ANCESTOR records, effectively making Indications (labeled and off-label) classifications of drugs.

See Appendix A for details and counts of the concepts and Appendix B for details and counts of relationships relevant to FDB Indications/Contraindications.


4.1.7. Levels
The resulting combined Standard Vocabulary for the Drug domain has four concept_levels, with level 0 reserved for concepts outside the Standard Vocabulary (table 17).
Table 17: Standard Vocabulary levels for the Drug Domain

Level | Description
0 | Concepts not used for the Standard Vocabulary
1 | Clinical Drugs, Branded Drugs, Generic Packs and Branded Packs
2 | Ingredients
3 | Drug classes, indications, Contraindications
4 | Top level drug class concepts

Levels 1 and 2 are based on RxNorm and are stratified, i.e. concepts of the same level cannot have hierarchical relationships to each other (each level is one concept high). Level 1 concepts are marketed drug products administered to patients such as Clinical and Branded Drugs, Generic and Branded Packs.

Level 2 designates generic Ingredients. Brand names (branded ingredients) are not part of the Standard Vocabulary (but loaded as level 0 concepts).

Level 3 and 4 concepts are based on ETC, ATC and NDF-RT and sit on top of the RxNorm-based concept_level 1 and 2 drugs. Level 3 concepts are drug class concepts, and can have hierarchical relationships to each other (e.g. VA Class "ACE INHIBITORS" is part of the NDF-RT Mechanism of Action class "Enzyme Inhibitors"). Level 4 concepts are the top-level concepts for each class, e.g. "Cellular or Molecular Interactions" is the top concept for the NDF-RT Mechanism of Action class. Table 18 shows the example of the drug "Omeprazole 20 MG Enteric Coated Capsule [Prilosec]" and its hierarchical classification based on CONCEPT_ANCESTOR records (not all shown).

Table 18: Example of a Typical Hierarchical Ladder for Drugs

Concept ID Concept Name Concept Level Concept Class Vocabulary Concept Code
19034886 Omeprazole 20 MG Enteric Coated Capsule [Prilosec] 1 Branded Drug 8 207212
923645 Omeprazole 2 Ingredient 8 7646
4319354 2-Pyridinylmethylsulfinylbenzimidazoles 3 Chemical Structure 7 N0000175098
4351005 Sulfoxides 3 Chemical Structure 7 N0000008055
4350914 Heterocyclic Compounds 3 Chemical Structure 7 N0000008095
4352034 Heterocyclic Compounds, 2-Ring 3 Chemical Structure 7 N0000008260
4352033 Heterocyclic Compounds, 1-Ring 3 Chemical Structure 7 N0000008259
4351444 Benzimidazoles 3 Chemical Structure 7 N0000007536
4340570 Infectious Diseases 3 Indication or Contraindication 7 N0000000007
4344424 Paraneoplastic Endocrine Syndromes 3 Indication or Contraindication 7 N0000002143
4342919 Esophagitis 3 Indication or Contraindication 7 N0000001165
4345391 Heartburn 3 Indication or Contraindication 7 N0000001444
4343495 Neoplasms 3 Indication or Contraindication 7 N0000002128
4342948 Gastroesophageal Reflux 3 Indication or Contraindication 7 N0000001319
4342918 Esophageal Diseases 3 Indication or Contraindication 7 N0000001159
4342057 Gastroenteritis 3 Indication or Contraindication 7 N0000001317
4343631 Intestinal Diseases 3 Indication or Contraindication 7 N0000001698
4266745 Stomach Ulcer 3 Indication or Contraindication 7 N0000002830
4345754 Helicobacter Infections 3 Indication or Contraindication 7 N0000003419
4264962 Duodenal Ulcer 3 Indication or Contraindication 7 N0000001008
4323875 Active Transporter Interactions 3 Mechanism of Action 7 N0000000072
4324302 Small Ion Transport Pump Interactions 3 Mechanism of Action 7 N0000000066
4324013 Proton Pump Inhibitors 3 Mechanism of Action 7 N0000000147
4330747 Gastric Acid Alteration 3 Physiologic Effect 7 N0000009054
21001999 Gastroesophageal Reflux 3 Indication or Contraindication 19 1999
21004195 Prevention of Stress Ulcer 3 Indication or Contraindication 19 4195
21002025 Duodenal Ulcer due to H. Pylori 3 Indication or Contraindication 19 2025
21002024 Duodenal Ulcer 3 Indication or Contraindication 19 2024
21002232 Upper GI Bleed 3 Indication or Contraindication 19 2232
21002055 Gastric Hypersecretory Conditions 3 Indication or Contraindication 19 2055
21000608 Zollinger-Ellison Syndrome 3 Indication or Contraindication 19 608
21001993 Erosive Esophagitis 3 Indication or Contraindication 19 1993
21502545 GI Acid Secretion Reducing Agents - Antisecretory Agents 3 Enhanced Therapeutic Classification 20 2545
21502546 Peptic Ulcer Therapy 3 Enhanced Therapeutic Classification 20 2546
21500445 Gastric Acid Secretion Reducing Agents - Proton Pump Inhibitors (PPIs) 3 Enhanced Therapeutic Classification 20 445
21600046 DRUGS FOR ACID RELATED DISORDERS 3 Anatomical Therapeutic Chemical Classification 21 A02
21600080 DRUGS FOR PEPTIC ULCER AND GORD 3 Anatomical Therapeutic Chemical Classification 21 A02B
21600095 PROTON PUMP INHIBITORS 3 Anatomical Therapeutic Chemical Classification 21 A02BC
4279050 GASTROINTESTINAL MEDICATIONS 3 VA Class 32 GA000
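A hierarchical ladder like the one in table 18 can be retrieved by joining CONCEPT_ANCESTOR to CONCEPT. The following is a minimal sketch against an in-memory SQLite database. The column names (ancestor_concept_id, descendant_concept_id and the CONCEPT columns) are assumptions about the table layout, and only a handful of rows from table 18 are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE concept (
        concept_id INTEGER PRIMARY KEY,
        concept_name TEXT,
        concept_level INTEGER,
        concept_class TEXT,
        vocabulary_id INTEGER
    );
    CREATE TABLE concept_ancestor (
        ancestor_concept_id INTEGER,
        descendant_concept_id INTEGER
    );
    """
)

# A few rows taken from table 18.
concepts = [
    (19034886, "Omeprazole 20 MG Enteric Coated Capsule [Prilosec]", 1, "Branded Drug", 8),
    (923645, "Omeprazole", 2, "Ingredient", 8),
    (4324013, "Proton Pump Inhibitors", 3, "Mechanism of Action", 7),
    (21600095, "PROTON PUMP INHIBITORS", 3, "Anatomical Therapeutic Chemical Classification", 21),
    (4279050, "GASTROINTESTINAL MEDICATIONS", 3, "VA Class", 32),
]
conn.executemany("INSERT INTO concept VALUES (?, ?, ?, ?, ?)", concepts)
# Every other concept above is an ancestor of the branded drug.
conn.executemany(
    "INSERT INTO concept_ancestor VALUES (?, 19034886)",
    [(row[0],) for row in concepts[1:]],
)


def ancestors_of(conn, descendant_id, vocabulary_id=None):
    """List (concept_id, concept_name) of all ancestors, optionally for one vocabulary."""
    sql = (
        "SELECT c.concept_id, c.concept_name "
        "FROM concept_ancestor a "
        "JOIN concept c ON c.concept_id = a.ancestor_concept_id "
        "WHERE a.descendant_concept_id = ?"
    )
    params = [descendant_id]
    if vocabulary_id is not None:
        sql += " AND c.vocabulary_id = ?"
        params.append(vocabulary_id)
    sql += " ORDER BY c.concept_id"
    return conn.execute(sql, params).fetchall()


print(ancestors_of(conn, 19034886, vocabulary_id=21))
```

Filtering on vocabulary_id selects one classification system at a time, for example 21 for ATC or 32 for VA Class.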


4.1.8. Mapping

Mappings are provided in the SOURCE_TO_CONCEPT_MAP table for drug codes from source vocabularies. These are all alternative drug vocabularies to RxNorm, and generally for each product there are equivalent representations in RxNorm and each of the following:
  • National drug codes (NDC)9 – vocabulary_id 9. NDCs define the labeler, product and trade package size of all drugs in the FDA Drug Registration and Listing System. Each NDC is composed of the labeler code (assigned by the FDA) followed by the product and package codes (assigned by the drug manufacturer). OMOP collects NDCs from a variety of distributors, as no one source maintains a complete listing of NDCs.
  • Medi-Span Generic Product Identifier (GPI)10 codes – vocabulary_id 10.
  • Cerner Multum Main Multum Drug Codes (MMDC)11 – vocabulary_id 16.
  • Department of Veterans Affairs VA ID Drug identifiers from the VA National Drug File VA-NDF (VA Product)12 – vocabulary_id 28.
  • First Databank GCNSEQNO13 – vocabulary_id 53.
  • First Databank UK Multilex14 iProductID containing branded and generic drug products in the UK – vocabulary_id 22.
  • NLM Medical Subject Headings (MeSH)15 – vocabulary_id 46.
  • FDA Structured Product Labels (SPL)16 – vocabulary_id 50.

The records with these mappings are designated by the mapping_type 'DRUG'.
In addition to drug source codes, cross-references are available for drugs that are administered as part of a medical procedure (procedure drugs) as indicated by mapping_type = 'PROCEDURE DRUG'. Procedure drug mappings are available for the following sources (for details see below):
  • ICD-9-Procedures codes – vocabulary_id 3.
  • American Medical Association (AMA) CPT-4 codes – vocabulary_id 4*.
  • Centers for Medicare and Medicaid Services (CMS) HCPCS codes – vocabulary_id 5.
  • ICD-10-PCS codes – vocabulary_id 35.

These Source Vocabularies are mapped to the RxNorm-based drug concepts. If the precise drug product is known, the mapping goes to a concept_level 1 drug product: to the RxNorm Branded Drug for branded drugs where available, otherwise to the equivalent generic Clinical Drug. If only the ingredient is known, the mapping goes to a concept_level 2 Ingredient. For those drug vocabularies where the entire code list is available, mapping records are provided even if the equivalent RxNorm Concepts are unknown. In those cases, both the target_concept_id and target_vocabulary_id are 0.
See Appendix C for counts and coverage information for each source vocabulary.
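The mapping convention, including the records whose target is 0, can be sketched as a simple lookup. The dictionary stands in for the SOURCE_TO_CONCEPT_MAP table, the key layout and function names are hypothetical, and the two NDC strings are placeholders rather than real codes.

```python
from typing import NamedTuple, Optional


class Mapping(NamedTuple):
    target_concept_id: int
    target_vocabulary_id: int


def lookup_mapping(source_map, source_vocabulary_id, source_code, mapping_type="DRUG") -> Optional[Mapping]:
    """Return the mapping record for a source code, or None if no record exists."""
    return source_map.get((source_vocabulary_id, source_code, mapping_type))


def is_mapped(mapping: Optional[Mapping]) -> bool:
    """A record with target_concept_id 0 exists but has no known RxNorm equivalent."""
    return mapping is not None and mapping.target_concept_id != 0


# Placeholder source codes under vocabulary_id 9 (NDC).
source_map = {
    (9, "NDC-PLACEHOLDER-1", "DRUG"): Mapping(19034886, 8),  # mapped to an RxNorm Branded Drug
    (9, "NDC-PLACEHOLDER-2", "DRUG"): Mapping(0, 0),  # listed, but equivalent unknown
}

for code in ("NDC-PLACEHOLDER-1", "NDC-PLACEHOLDER-2", "NDC-PLACEHOLDER-3"):
    print(code, is_mapped(lookup_mapping(source_map, 9, code)))
```

Keeping "record with target 0" distinct from "no record at all" preserves the difference between a code that is known but has no RxNorm equivalent and a code that is absent from the code list.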


4.2. Condition Domain


4.2.1. Vocabularies
The OMOP Standard Vocabulary for conditions consists of SNOMED-CT and MedDRA concepts and hierarchies (figure 5).

Figure 5: OMOP Standard Terminology and Classification of Conditions based on SNOMED-CT

  • The Systematized Nomenclature of Medicine - Clinical Terms (SNOMED-CT)17 is maintained by the International Health Terminology Standards Development Organisation (IHTSDO). SNOMED-CT covers most areas of clinical information such as diseases, findings, procedures, microorganisms, pharmaceuticals etc., summarized under vocabulary_id 1. All condition concepts are taken from the “Clinical Finding” hierarchy.
  • As an alternative classification, the Standard Vocabulary includes the Medical Dictionary for Regulatory Activities (MedDRA)18, which is distributed by the International Federation of Pharmaceutical Manufacturers and Associations (IFPMA) and stored under vocabulary_id 15.

The entire SNOMED-CT is loaded into the vocabulary CONCEPT table for the convenience of the researcher, but only the “Clinical finding” domain is used as a primary vocabulary for conditions (see below for procedures and observations). These concepts form a rich hierarchy of diagnoses, diseases and symptoms and are also cross-referenced to other SNOMED-CT domains.

MedDRA is also a hierarchical system of clinical findings used for regulatory submission of adverse events of medical products. For the purposes of the Standard Vocabulary, MedDRA is implemented so it can serve as an alternative classification system on top of low-level SNOMED-CT-based condition concepts. MedDRA based concepts are assigned vocabulary_id 15.

In addition to these Standard Vocabularies, the Health Outcome of Interest (HOI) cohorts and Standardized MedDRA Queries (SMQ) represent complex definitions of conditions. These concepts are listed under vocabulary_id 33 and are linked through concept relationships and ancestry records to the SNOMED-CT and MedDRA concepts that are part of their definitions. Please find a more detailed discussion under the Cohort Domain further below.


4.2.2. Relationships
Relationships are defined within SNOMED-CT and MedDRA as well as between conditions, SMQ and HOI (see below). SNOMED-derived relationships were imported, and SNOMED "Is a" relationships were converted to OMOP "Subsumes" relationships (table 19). For a detailed list of relationships including all internal SNOMED relationships see Appendix B.
Table 19: Relationship Types Defined for Concepts of the Condition Domain

Relationship ID | Relationship Name | Defines Ancestry | Description | Reverse Relationship
10 | Subsumes | x | Hierarchical relationship among SNOMED concepts of the same concept_class and among MedDRA between concept_levels | 144
29-88 | Various SNOMED relationships | | Relationships between SNOMED concepts that do not belong to the same concept_class | 163-222
125 | MedDRA to SNOMED equivalent (OMOP) | x | Equivalence relationship between MedDRA and SNOMED concepts | 239
247 | Indication/Contraindication to SNOMED | | Equivalence relationship between FDB or NDF-RT Indication and SNOMED concepts | 248

The two main condition vocabularies are connected to each other as follows: Among the SNOMED-CT concepts, those with the concept_class "Clinical Finding" of concept_level 1 (leaves) and 2 (intermediate) are linked to MedDRA Preferred Term (concept_level 2) concepts. In addition, the Indication or Contraindication concepts in the FDB (vocabulary_id 19) and NDF-RT (vocabulary_id 7) vocabularies are also linked to SNOMED-CT "Clinical Finding" concepts.


4.2.3. Levels
In contrast to the Drug Domain, levels are not assigned as part of an overall condition concept and classification system, but instead within the SNOMED-CT and MedDRA vocabularies.

SNOMED-CT has no fixed hierarchy among its concepts, because any concept can theoretically be related to any other. It is therefore not possible to assign stratified concept levels. The rule adopted for the OMOP Standard Vocabulary is that all lowest concepts in SNOMED-CT without any descendant concepts are designated concept_level 1, and all higher-level concepts are designated concept_level 2. Concept_level 3 is the top level “Clinical finding” concept.
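The level rule for SNOMED-CT can be written down in a few lines. This is a sketch under the assumption that the "Subsumes" relationships are available as (parent, child) pairs; the function name and the three-concept hierarchy are illustrative placeholders.

```python
def assign_snomed_levels(parent_child_edges, top_concept):
    """Assign concept_level 1 to leaves, 3 to the top concept and 2 to everything else."""
    parents = {parent for parent, _ in parent_child_edges}
    concepts = parents | {child for _, child in parent_child_edges}
    levels = {}
    for concept in concepts:
        if concept == top_concept:
            levels[concept] = 3
        elif concept in parents:
            levels[concept] = 2  # has at least one descendant
        else:
            levels[concept] = 1  # no descendants
    return levels


# Placeholder hierarchy with one intermediate concept and two leaves.
edges = [
    ("Clinical finding", "intermediate"),
    ("intermediate", "leaf_a"),
    ("intermediate", "leaf_b"),
]
print(assign_snomed_levels(edges, "Clinical finding"))
```

Because the level depends only on whether a concept has descendants, two level 2 concepts can still be ancestor and descendant of each other, which is why these levels are not stratified.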

MedDRA is a stratified hierarchical vocabulary with 5 levels: Low Level Terms (LLT, concept_level 1), Preferred Terms (PT, concept_level 2), High Level Terms (HLT, concept_level 3), High Level Group Terms (HLGT, concept_level 4) and System Organ Classes (SOC, concept_level 5).

As a result, level 1 and 2 SNOMED-CT concepts that are mapped from other source vocabularies are members of two classification systems: (1) SNOMED-CT concepts connected through the hierarchical "Subsumes" or "Is a" relationships, and (2) MedDRA concepts that are linked through SNOMED-MedDRA equivalence relationships and hierarchical relationships within the MedDRA vocabulary (see table 20 as an example). The CONCEPT_ANCESTOR table can be used to easily obtain these SNOMED-CT and MedDRA classifications for each low-level condition code.
Table 20: Example of SNOMED-CT Condition concept 312327 "Acute myocardial infarction" and Hierarchical Classifications (ancestors, not all shown).

Concept ID Concept Name Concept Level Concept Class Vocabulary Concept Code
312327 Acute myocardial infarction 2 Clinical Finding 1 57054005
440142 Disease of mediastinum 2 Clinical Finding 1 49483002
4043346 Disease of thorax 2 Clinical Finding 1 118946009
4180628 Disorder of body system 2 Clinical Finding 1 362965005
432795 Traumatic AND/OR non-traumatic injury 2 Clinical Finding 1 417163006
4103183 Cardiac observations 2 Clinical Finding 1 301095005
40524164 Acute ischemic heart disease 2 Clinical Finding 1 32598000
4274025 Disease 2 Clinical Finding 1 64572001
4132088 Acute heart disease 2 Clinical Finding 1 127337006
40597938 Atherosclerotic heart disease 2 Clinical Finding 1 41702007
321588 Heart disease 2 Clinical Finding 1 56265001
4239975 Myocardial disease 2 Clinical Finding 1 57809008
441840 Clinical finding 3 Clinical Finding 1 404684003
35205180 Acute myocardial infarction 2 Preferred Term 15 10000891
37622445 Peripheral circulatory failure 2 Preferred Term 15 10034567
35204989 Cardiac disorder 2 Preferred Term 15 10061024
35204998 Cardiovascular disorder 2 Preferred Term 15 10007649
35205189 Myocardial infarction 2 Preferred Term 15 10028596
35202457 Cardiac disorders NEC 3 High Level Term 15 10007543
37604016 Circulatory collapse and shock 3 High Level Term 15 10009193
37602356 Arteriosclerosis, stenosis, vascular insufficiency and necrosis 4 High Level Group Term 15 10003216
35202051 Cardiac disorder signs and symptoms 4 High Level Group Term 15 10007539
37602360 Vascular disorders NEC 4 High Level Group Term 15 10047066
37202319 Thoracic disorders (excl lung and pleura) 4 High Level Group Term 15 10013369
35202055 Coronary artery disorders 4 High Level Group Term 15 10011082
35200000 Cardiac disorders 5 System Organ Class 15 10007541
37200000 Respiratory, thoracic and mediastinal disorders 5 System Organ Class 15 10038738
37600000 Vascular disorders 5 System Organ Class 15 10047065

SNOMED-CT has vocabulary_id 1, MedDRA 15.
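
The lookup described above can be sketched as a single join between CONCEPT_ANCESTOR and CONCEPT. The sketch below uses an in-memory SQLite database with an assumed, simplified subset of the table columns, and loads only a handful of the rows shown in table 20. The function name classifications is hypothetical and not part of the Standard Vocabulary.

```python
import sqlite3

# Assumed, simplified subset of the CONCEPT and CONCEPT_ANCESTOR columns.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE concept (
    concept_id    INTEGER PRIMARY KEY,
    concept_name  TEXT,
    concept_level INTEGER,
    concept_class TEXT,
    vocabulary_id INTEGER
);
CREATE TABLE concept_ancestor (
    ancestor_concept_id   INTEGER,
    descendant_concept_id INTEGER
);
""")

# A few rows copied from table 20 (not the full ancestor list).
concepts = [
    (312327, "Acute myocardial infarction", 2, "Clinical Finding", 1),
    (321588, "Heart disease", 2, "Clinical Finding", 1),
    (441840, "Clinical finding", 3, "Clinical Finding", 1),
    (35205189, "Myocardial infarction", 2, "Preferred Term", 15),
    (35200000, "Cardiac disorders", 5, "System Organ Class", 15),
]
conn.executemany("INSERT INTO concept VALUES (?, ?, ?, ?, ?)", concepts)

# Table 20 lists every other concept above as an ancestor of 312327.
conn.executemany(
    "INSERT INTO concept_ancestor VALUES (?, ?)",
    [(row[0], 312327) for row in concepts if row[0] != 312327],
)


def classifications(conn, concept_id):
    """Return the ancestors of concept_id, grouped by vocabulary_id."""
    rows = conn.execute(
        """
        SELECT c.vocabulary_id, c.concept_id, c.concept_name, c.concept_level
        FROM concept_ancestor ca
        JOIN concept c ON c.concept_id = ca.ancestor_concept_id
        WHERE ca.descendant_concept_id = ?
        ORDER BY c.vocabulary_id, c.concept_level, c.concept_id
        """,
        (concept_id,),
    ).fetchall()
    grouped = {}
    for vocabulary_id, cid, name, level in rows:
        grouped.setdefault(vocabulary_id, []).append((cid, name, level))
    return grouped


ami_classes = classifications(conn, 312327)
for vocabulary_id, ancestors in ami_classes.items():
    print(vocabulary_id, ancestors)
```

Grouping by vocabulary_id separates the SNOMED-CT classification (vocabulary_id 1) from the MedDRA classification (vocabulary_id 15) in one pass.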


4.2.4. Mapping
Mappings from vocabularies used in source data to SNOMED-CT "Clinical Finding" derived concepts are
provided in the vocabulary SOURCE_TO_CONCEPT_MAP table (mapping_type=’CONDITION’):

  • The ICD-9 Clinical Modification (ICD-9-CM)19 diagnostic morbidity codes (Volumes 1 and 2), maintained by the National Center for Health Statistics (NCHS) – vocabulary_id 2
  • ICD-10 Clinical Modification (ICD-10-CM) 20 morbidity classification for classifying diagnoses and reason for visits, provided by the Centers for Medicare and Medicaid Services (CMS) and the National Center for Health Statistics (NCHS) – vocabulary_id 34
  • Clinical Terms V3 (CTV3) 21 or Read codes maintained by Britain’s NHS Centre for Coding and Classification (NHSCCC) – vocabulary_id 17
  • Oxford Medical Information System (OXMIS)22 codes, also used in the UK – vocabulary_id 18.
For all these maps, source codes are mapped to the semantically closest SNOMED-CT concept, resulting in mapping to concept_level 1 and 2 concepts. This is in contrast to the mapping conventions in the other Domains, which usually (but not always) map to concept_level 1 concepts.

Apart from the Read codes, for which mappings to SNOMED-CT are provided with the code lists, all other maps had to be either inferred or newly created by OMOP.

In addition to these SNOMED mappings, a cross-reference is provided from ICD-9-CM to MedDRA (mapping_type ’CONDITION-MEDDRA’). Finally, a mapping from ICD-9-CM to FDB (but not NDF-RT) Indications or Contraindications is provided for convenience.

For those Condition vocabularies where the entire code list is available, mapping records are provided even if the equivalent SNOMED-CT or MedDRA concepts are unknown. In those cases, both the target_concept_id and target_vocabulary_id are 0.
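
An ETL routine therefore has to distinguish three cases: a source code with a real target, a source code that is listed but unmapped (target 0), and a source code that is not listed at all. The following is a minimal sketch of that distinction. The record type carries only an assumed subset of the SOURCE_TO_CONCEPT_MAP columns, and the sample rows (including the source codes and their targets) are invented for illustration, not taken from the actual map.

```python
from typing import NamedTuple, Optional


class MapRecord(NamedTuple):
    # Assumed subset of the SOURCE_TO_CONCEPT_MAP columns.
    source_code: str
    source_vocabulary_id: int
    target_concept_id: int
    target_vocabulary_id: int
    mapping_type: str


# Hypothetical sample rows: the codes and targets are illustrative only.
SAMPLE_MAP = [
    MapRecord("410.01", 2, 312327, 1, "CONDITION"),
    MapRecord("ZZZ.9", 2, 0, 0, "CONDITION"),  # listed, but no known target
]


def lookup_condition(records, source_code, source_vocabulary_id) -> Optional[int]:
    """Return the target concept_id, 0 if listed but unmapped, None if not listed."""
    for rec in records:
        if (
            rec.source_code == source_code
            and rec.source_vocabulary_id == source_vocabulary_id
            and rec.mapping_type == "CONDITION"
        ):
            return rec.target_concept_id
    return None


def describe(records, source_code, source_vocabulary_id) -> str:
    """Classify a source code as 'mapped', 'unmapped' or 'not listed'."""
    target = lookup_condition(records, source_code, source_vocabulary_id)
    if target is None:
        return "not listed"
    return "mapped" if target != 0 else "unmapped"


print(describe(SAMPLE_MAP, "410.01", 2))
print(describe(SAMPLE_MAP, "ZZZ.9", 2))
print(describe(SAMPLE_MAP, "123.4", 2))
```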

See Appendix C for counts and coverage information for each source vocabulary.

4.3. Procedure Domain


4.3.1. Vocabularies
Three common procedure code systems are supported in the Standard Vocabulary:

  • The International Statistical Classification of Diseases and Related Health Problems, Clinical Modification, Volume 3 (ICD-9-Procedure) codes maintained by the National Center for Health Statistics (NCHS) – vocabulary_id 3.
  • The Current Procedural Terminology (CPT-4)23, produced by the American Medical Association – vocabulary_id 4.
  • The Healthcare Common Procedure Coding System (HCPCS)24 Level II codes maintained by the Centers for Medicare and Medicaid Services (CMS) – vocabulary_id 5.
All three are used for the OMOP Standard Vocabulary as low-level concept codes for procedures. These low-level codes are linked to SNOMED-CT “Procedure” concepts (figure 6) for classification purposes.

Figure 6: OMOP Standard Terminology and Classification of Procedures
As described above, all of SNOMED-CT is loaded into the CONCEPT table, providing the Standard Vocabulary for Conditions and Procedures. The other concepts are available for the convenience of the researcher, but only concepts of the concept_class “Procedure” (similar to "Clinical Finding") are designated Standard Vocabulary concepts with a concept_level greater than 0.

ICD-9-Procedure, CPT-4 and HCPCS concepts are loaded from their respective sources, together with
their relationships to SNOMED-CT.

See Appendix A for details and counts of concepts of the Procedure domain.


4.3.2. Relationships
Hierarchical relationships are defined within the ICD-9-Procedure and SNOMED-CT vocabularies, while HCPCS and CPT-4 are flat terminologies. Relationships are also provided between ICD-9-Procedure and CPT-4 codes and lower-level SNOMED-CT "Procedure" codes. HCPCS to SNOMED-CT relationships are not yet realized. For a detailed list of relationships see Appendix B.
Table 21: Relationship Types Defined for Procedure Concepts

Relationship ID Relationship Name Defines Ancestry Description Reverse Relationship
10 Subsumes x Hierarchical relationship among ICD-9-Procedure and SNOMED "Procedure" concepts 144
91 ICD9 procedure to SNOMED category (OMOP) x Hierarchical relationship between ICD-9-Procedure and SNOMED codes 225
92 ICD9 procedure to SNOMED equivalent (OMOP) x Equivalence relationship between ICD-9-Procedure and SNOMED codes 226
93 CPT-4 to SNOMED category (OMOP) x Hierarchical relationship between CPT-4 and SNOMED codes 227
94 CPT-4 to SNOMED equivalent (OMOP) x Equivalence relationship between CPT-4 and SNOMED codes 228

All these relationships are hierarchical and used for constructing the CONCEPT_ANCESTOR table. As a result, SNOMED-CT Procedure concepts can be used as classification for the CPT-4, ICD-9-Procedure and HCPCS-based concepts. See table 22 for an example of SNOMED-CT Procedure concept 40601132 "Lithotripsy of kidney" and hierarchical classifications (descendants, not all shown).
Table 22: Example of SNOMED-CT Procedure Concept 40601132 "Lithotripsy of kidney"

Concept ID Concept Name Concept Level Concept Class Vocabulary Concept Code
40601132 Lithotripsy of kidney 2 Procedure 1 49242003
4087889 Extracorporeal shockwave lithotripsy of the kidney 2 Procedure 1 24376003
4171381 Percutaneous nephrolithotomy with disintegration of calculus 2 Procedure 1 42041003
4343007 Nephroscopic ultrasound fragmentation of ureteric calculus 1 Procedure 1 236173009
4201610 Ultrasonic fragmentation of urinary stone through percutaneous nephrostomy 1 Procedure 1 53514001
4197883 Extracorporeal shockwave lithotripsy of the kidney using fluoroscopic guidance 1 Procedure 1 431731009
4190183 Extracorporeal shockwave lithotripsy for renal calculus 1 Procedure 1 393072009
4022949 Other specified extracorporeal shockwave lithotripsy for renal calculus 1 Procedure 1 175979008
2109635 Lithotripsy, extracorporeal shock wave 1 CPT-4 4 50590
2003942 Ultrasonic fragmentation of urinary stones 1 ICD-9-Procedure 3 59.95
2008220 Extracorporeal shockwave lithotripsy [ESWL] of the gallbladder and/or bile duct 1 ICD-9-Procedure 3 98.52
2008221 Extracorporeal shockwave lithotripsy of other sites 1 ICD-9-Procedure 3 98.59
2003571 Percutaneous nephrostomy with fragmentation 1 ICD-9-Procedure 3 55.04
2008219 Extracorporeal shockwave lithotripsy [ESWL] of the kidney, ureter and/or bladder 1 ICD-9-Procedure 3 98.51
2721045* GLOBAL FEE FOR EXTRACORPOREAL SHOCK WAVE LITHOTRIPSY TREATMENT OF KIDNEY STONE(S) 1 HCPCS 5 S0400
2721426* EXTRACORPOREAL SHOCKWAVE LITHOTRIPSY FOR GALL STONES (IF PERFORMED WITH ERCP, USE 43265) 1 HCPCS 5 S9034

* The related HCPCS concepts are not accessible through the RELATIONSHIP or CONCEPT_RELATIONSHIP table.
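
Constructing CONCEPT_ANCESTOR from the ancestry-defining relationships amounts to computing a transitive closure over parent/child pairs. The following is a minimal sketch of that idea, not the procedure OMOP actually uses. It assumes the relationships form an acyclic graph, records the shortest and longest path length for each ancestor/descendant pair, and does not emit self-referencing rows. The edges between the table 22 concept IDs are invented for illustration, since the table shows only the resulting descendants and not the intermediate links.

```python
def build_ancestor_rows(edges):
    """Compute {(ancestor, descendant): (min_levels, max_levels)}.

    edges is an iterable of (parent_id, child_id) pairs taken from
    relationships that define ancestry. The graph is assumed to be acyclic.
    """
    children = {}
    for parent, child in edges:
        children.setdefault(parent, set()).add(child)

    memo = {}

    def descendants(node):
        if node in memo:
            return memo[node]
        found = {}
        for child in children.get(node, ()):
            # The child counts as distance 0 from itself; everything
            # reachable through it is shifted one level further down.
            reachable = {child: (0, 0), **descendants(child)}
            for desc, (lo, hi) in reachable.items():
                lo, hi = lo + 1, hi + 1
                if desc in found:
                    old_lo, old_hi = found[desc]
                    lo, hi = min(lo, old_lo), max(hi, old_hi)
                found[desc] = (lo, hi)
        memo[node] = found
        return found

    rows = {}
    for node in list(children):
        for desc, levels in descendants(node).items():
            rows[(node, desc)] = levels
    return rows


# Hypothetical edges between concepts from table 22 (illustration only).
SAMPLE_EDGES = [
    (40601132, 4087889),  # SNOMED "Subsumes"
    (4087889, 4190183),   # SNOMED "Subsumes"
    (40601132, 4190183),  # assumed additional direct link
    (4087889, 2109635),   # assumed "CPT-4 to SNOMED category" link
]

ancestor_rows = build_ancestor_rows(SAMPLE_EDGES)
for (ancestor, descendant), (lo, hi) in sorted(ancestor_rows.items()):
    print(ancestor, descendant, lo, hi)
```

Because the CPT-4 code hangs off a SNOMED-CT concept in the same graph, it appears as a descendant of every SNOMED-CT ancestor above it, which is what allows SNOMED-CT to act as a classification for the low-level codes.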

However, it must be emphasized that in contrast to the Drug and Condition domains, procedure classifications are not derived by medical science but rather through administrative considerations, and therefore cannot be expected to be semantically precise or generally accepted. In other words, the result of a query for hierarchical relationships, like the above "Lithotripsy of kidney", should be used only as a first step and should be reviewed manually to address a potentially significant number of false positive or false negative query results.


4.3.3. Levels
All HCPCS, CPT-4 and ICD-9-Procedure codes are designated concept_level 1. SNOMED-CT has no strict hierarchy and any concept can be related to any other. It is therefore not possible to assign stratified concept levels. Instead, all lowest-level leaf concepts are designated concept_level 1, all concepts above them are concept_level 2, and concept_level 3 is reserved for the top-level “Procedure” concept.
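
The SNOMED-CT level rule can be expressed as a simple classification of each node by its position in the hierarchy. The sketch below is one possible reading of that rule, with a hypothetical function name and a deliberately tiny, simplified set of edges. In particular, the direct link from "Procedure" to "Lithotripsy of kidney" is an assumption made to keep the example short.

```python
def assign_concept_levels(edges, top_concept):
    """Assign concept_level 1 to leaves, 3 to the top concept, 2 to the rest.

    edges is an iterable of (parent, child) pairs from "Subsumes" relationships.
    """
    edges = list(edges)
    parents = {parent for parent, _ in edges}
    nodes = parents | {child for _, child in edges}

    levels = {}
    for node in nodes:
        if node == top_concept:
            levels[node] = 3
        elif node not in parents:
            levels[node] = 1  # leaf: nothing below it
        else:
            levels[node] = 2  # has children, but is not the top concept
    return levels


# Simplified, partly assumed hierarchy using names from table 22.
SAMPLE_HIERARCHY = [
    ("Procedure", "Lithotripsy of kidney"),
    ("Lithotripsy of kidney", "Extracorporeal shockwave lithotripsy for renal calculus"),
]

procedure_levels = assign_concept_levels(SAMPLE_HIERARCHY, "Procedure")
print(procedure_levels)
```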


4.3.4. Mapping
HCPCS, CPT-4 and ICD-9-Procedure codes are widely used in source data in the US for coding procedures, and mapping is provided from these source codes to the corresponding concepts (to "self"). In addition, the following source codes are mapped to SNOMED Procedure concepts:


  • ICD-10 Procedure Coding System (ICD-10-PCS), provided by the National Center for Health Statistics (NCHS) as a medical classification used for procedural codes – vocabulary_id 35.
  • Logical Observation Identifiers Names and Codes (LOINC) – vocabulary_id 6. Only LOINC codes that require a diagnostic procedure are mapped to the equivalent SNOMED-CT procedure, while blood, urine and other tests are not.

For ICD-10-PCS codes, mapping records are provided even if the equivalent SNOMED-CT concepts
cannot be mapped. In those cases, both the target_concept_id and target_vocabulary_id are 0. In the case of LOINC, only codes for which mapping is available are listed in the SOURCE_TO_CONCEPT_MAP
table.
See Appendix C for counts and coverage information for each source vocabulary in the Procedure
domain.

4.4. Demographic Domain


4.4.1. Vocabularies
Demographic codes are standardized using the following vocabularies:

  • Gender: Administrative Sex codes issued by Health Level Seven (HL7) – vocabulary_id 12.
  • Race – vocabulary_id 13 and Ethnicity – vocabulary_id 44
The Standard Vocabulary for Race and Ethnicity follows the recommendations of the Subcommittee on Standardized Collection of Race/Ethnicity Data for Healthcare Quality Improvement. This Subcommittee of experts was assembled to generate a report regarding the lack of standardization of collection of race and ethnicity data at the federal, state, local, and private sector levels for the IOM Committee on Future Directions for the National Healthcare Quality and Disparities Reports25. Briefly, the report recommends the collection of Ethnicity and Race data as follows:
  1. Race and Hispanic ethnicity categories are collected according to the existing Directive of the
    Office of Management and Budget (OMB)26
  2. More detailed Race codes are adopted from the U.S. Centers for Disease Control and Prevention (CDC)27


4.4.2. Implementation, Relationships, Levels and Mapping
For the Gender vocabulary, 5 concepts are listed (Male, Other, Female, Unknown/Not Stated and Ambiguous). All Gender concepts are concept_level 1. No mapping or Relationship records are provided.

Only two Ethnicity concepts are implemented according to the Subcommittee recommendation: "Hispanic or Latino" and "Not Hispanic or Latino".

For Race, the OMB Race categories in combination with the granular CDC entities were adopted, but only in the two top hierarchical levels (realized as concept_level 1 and 2). Level 1 contains the OMB Race concepts "Asian", "Black or African American", "Other", "White", "Native Hawaiian or Other Pacific Islander" and "American Indian or Alaska Native". "Non-white" was added for use in simulated data.

There are relationship_id 10 "Subsumes" relationships between these level 1 and 2 concepts, which are adopted from the CDC list. There are no records in the SOURCE_TO_CONCEPT_MAP table as race is usually not coded using the CDC numbering scheme, and each ETL from source data will have to develop a source-specific mapping table.

See Appendix A for details and counts of concepts and Appendix B for relationships relevant to Gender, Race and Ethnicity.

4.5. Observation Domain


4.5.1. Vocabularies
The Observation table is a generic table to capture all clinical findings, observations, complaints, and medical
history, as they are reported in the data, as well as laboratory and radiological tests and their results.
The following Standard Vocabularies are defined for observations:

  • Laboratory tests and values: Logical Observation Identifiers Names and Codes (LOINC)28 is a coding system maintained by the Regenstrief Institute – vocabulary_id 6.
  • Regenstrief also maintains the "LOINC Multidimensional Classification" – vocabulary_id 49.
  • Qualitative lab results: A set of SNOMED-CT Qualifier Value concepts – vocabulary_id 1.
  • Laboratory units: Unified Code for Units of Measure (UCUM)29, maintained by the UCUM Organization – vocabulary_id 11.
  • All other findings and observables: SNOMED-CT - vocabulary_id 1. The Systematized Nomenclature of Medicine, Clinical Terms (SNOMED-CT), is maintained by the International Health Terminology Standards Development Organization (IHTSDO).


4.5.2. Implementation of LOINC
All available LOINC codes are loaded into the CONCEPT table. LOINC codes consist of six digits and reflect a multi-axial representation, determining for each lab test the component, kind of property, time aspect, system, precision and type of method.

Figure 7: OMOP Standard Terminology and Classification of Lab Observations
LOINC codes are a flat list of concepts. All concepts based on LOINC codes are designated concept_level 1. There is only one relationship defined among LOINC concepts: relationship_id 1 “Concept replaced by (LOINC)”. It codifies identity between two different codes when codes get deprecated. In addition, the "Subsumes" relationship (relationship_id 10) links LOINC concepts to the concept_level 2 LOINC multi-dimensional classification concepts.

Most databases use LOINC or proprietary coding schemes for laboratory tests. In the latter case, a mapping to LOINC has to be developed from scratch, since no standardized mapping can be provided.

See Appendix A for details and counts of concepts and Appendix B for relationships relevant to LOINC.


4.5.3. Implementation of Qualitative Lab Results
The result of the lab test is either a numeric value in combination with a unit, a verbatim text or a coded qualitative result. For the latter result categories, OMOP chose a small but meaningful subset of the available SNOMED-CT Qualifier Values as valid entries (table 23).
Table 23: Valid Values for Qualitative Lab Results

Lab result Concept ID SNOMED Source Code
Final 9188 281321000
Negative 9189 260385009
Not Detected 9190 260415000
Positive 9191 10828004
Trace 9192 260405006
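
During ETL, a coded qualitative result can be resolved against table 23 with a small lookup. The sketch below copies the five concept IDs from table 23. The function name, the case-insensitive matching, and the choice to return None for values outside the table are assumptions made for illustration.

```python
# Concept IDs copied from table 23, keyed by lower-cased result label.
QUALITATIVE_RESULT_CONCEPTS = {
    "final": 9188,
    "negative": 9189,
    "not detected": 9190,
    "positive": 9191,
    "trace": 9192,
}


def result_concept_id(raw_value):
    """Return the concept ID for a qualitative lab result, or None if not valid.

    Matching ignores case and surrounding or repeated whitespace (an assumption).
    """
    normalized = " ".join(raw_value.strip().lower().split())
    return QUALITATIVE_RESULT_CONCEPTS.get(normalized)


print(result_concept_id("Positive"))
print(result_concept_id("  not   detected "))
print(result_concept_id("borderline"))
```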


4.5.4. Implementation of UCUM
All standard UCUM concepts available are loaded into the CONCEPT table (concept_class=“UCUM Standard”). However, UCUM is an expandable standard that allows building custom units (concept_class=“UCUM Custom”). UCUM has no coding schema. Instead, the actual units are used as source code. For example, “Milligram per deciliter”, concept ID 8840, has the concept code “mg/dL”. All UCUM records are concept_level 1. There are no relationships defined for UCUM. Maps from source codes to these concepts are source data dependent; the mapping_type is “UNIT”.


4.5.5. Implementation of SNOMED-CT for Observations

Figure 8: OMOP Standard Terminology and Classification of Observations

Levels are organized in an identical fashion to the concepts derived from SNOMED-CT for conditions (see above).

Relationships are identical to the principles described for SNOMED-CT conditions (see above). Likewise, only “Subsumes” relationships define the hierarchy used for constructing the CONCEPT_ANCESTOR table. In cases where datasets contain verbatim or standardized text strings instead of coded observations, each of these text strings is mapped individually to the equivalent observation category using mapping_type=“OBSERVATION”. The mapping_type for the qualitative lab results is “RESULT CATEGORY”.

4.6. Visit Domain


4.6.1. Vocabularies
Visit codes are standardized using the following terminologies:

  • Place of Service: CMS Place of Service Codes30, maintained by the Centers for Medicare & Medicaid Services (CMS), are two-digit codes placed on health care professional claims to indicate the setting in which a service was provided – vocabulary_id 14.
  • OMOP Visit concepts – vocabulary_id 24.

All 49 currently valid Place of Service codes are loaded into the CONCEPT table as level 1. There are three Visit concepts defined in table 24.
Table 24: Standard Visit Terminology

Visit Concept ID
Inpatient Visit 9201
Outpatient Visit 9202
Emergency Room Visit 9203


4.6.2. Relationships, Levels and Mapping
Relationship_id 10 "Subsumes" relationships are defined from the 3 Visit concepts to each CMS Place of Service. However, these relationships represent only the most typical cases. For example, Place of Service "Outpatient Hospital" will most likely host Visits of type "Outpatient Visit". But for "Residential Substance Abuse Treatment Facility" this is not always the case.

All CMS Places of Service are concept_level 1 concepts. All Visits are concept_level 2 concepts. In the
CONCEPT_ANCESTOR table, Visit concepts are ancestors to Place of Service concepts.

Source data usually do not represent this information in a standardized fashion, and therefore mappings for Places of Service have to be established individually for each source. The 3 Visit concepts are also generally mapped manually in the ETL code.

4.7. Provider Domain


4.7.1. Vocabularies, Relationships, Levels and Mapping
The following Standard Vocabularies are defined for the Provider domain:

  • CMS Specialty Codes31, maintained by the Centers for Medicare and Medicaid Services – vocabulary_id 48.
  • NUCC32 Health Care Provider Taxonomy, maintained by the National Uniform Claims Committee – vocabulary_id 47.
Both coding systems are used for defining the provider specialty and can be used in parallel. CMS Specialty Codes are implemented as concept_level 2, NUCC as concept_level 1. Relationship_id 296 links CMS Specialty concepts to the equivalent NUCC Healthcare Provider concepts33. In the
concept_ancestor table, CMS Specialty concepts are ancestors to NUCC concepts.

Provider domain concepts are listed in the SOURCE_TO_CONCEPT_MAP as records linking the codes to the corresponding concept_ids. No other mapping information is available.

4.8. Cost Domain


4.8.1. Vocabularies, Relationships, Levels and Mapping
The following Standard Vocabularies are defined for the Cost domain.

  • Diagnosis-Related Groups (DRG)34 as implemented by the Centers for Medicare and Medicaid Services for the Medicare Part A "Inpatient Prospective Payment System (IPPS) For Acute Care Inpatient Hospital Stays" – vocabulary_id 40.
  • Major Diagnostic Categories (MDC)35 as implemented by the Centers for Medicare and Medicaid Services as a classification system for DRGs – vocabulary_id 41.
  • Ambulatory Payment Classification (APC)36 as implemented by the Centers for Medicare and Medicaid Services for the Medicare Part A "Prospective Payment System for Hospital Outpatient Department Services" – vocabulary_id 42.
  • Revenue code system defined by the National Uniform Billing Committee (NUBC) for the UB-04 claim form (Revenue Codes)37 – vocabulary_id 43.

In the Standard Vocabulary, both DRG and MS-DRG are implemented under vocabulary_id 40 with full validity information (valid_start_date, valid_end_date). In 2007, after version 25, DRG was revised to MS-DRG with a completely new numbering system, which means that the concept_codes in DRG are not unique (however, the concept_ids are). The change of concept_class from "DRG" to "MS-DRG" distinguishes this major revision. Relationship_id 299 links DRG to the corresponding MS-DRG38.
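
Because a DRG concept_code alone is ambiguous, resolving a code from source data requires the date of service as well. The following is a minimal sketch of such a lookup using the validity dates. The record type, the function name, the concept_ids and the exact dates in the sample rows are all hypothetical and chosen only to show one code that exists in both numbering systems.

```python
from datetime import date
from typing import NamedTuple, Optional


class DrgConcept(NamedTuple):
    # Assumed subset of the CONCEPT columns relevant for this lookup.
    concept_id: int
    concept_code: str
    concept_class: str
    valid_start_date: date
    valid_end_date: date


# Hypothetical rows: IDs and dates are illustrative, not actual vocabulary content.
SAMPLE_DRG_CONCEPTS = [
    DrgConcept(1001, "001", "DRG", date(1970, 1, 1), date(2007, 9, 30)),
    DrgConcept(2001, "001", "MS-DRG", date(2007, 10, 1), date(2099, 12, 31)),
]


def resolve_drg(concepts, concept_code, service_date) -> Optional[DrgConcept]:
    """Return the single concept valid for this code on service_date, else None."""
    matches = [
        c
        for c in concepts
        if c.concept_code == concept_code
        and c.valid_start_date <= service_date <= c.valid_end_date
    ]
    # Anything other than exactly one match is treated as unresolved.
    return matches[0] if len(matches) == 1 else None


old = resolve_drg(SAMPLE_DRG_CONCEPTS, "001", date(2005, 6, 15))
new = resolve_drg(SAMPLE_DRG_CONCEPTS, "001", date(2010, 6, 15))
print(old.concept_class, old.concept_id)
print(new.concept_class, new.concept_id)
```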

APC and Revenue Codes are implemented as a flat concept_level 1 set of concepts with no classifications or cross-references.

Cost domain concepts are listed in the SOURCE_TO_CONCEPT_MAP as records linking the codes to the corresponding concept_ids. No other mapping information is available.

4.9. Cohort Domain


4.9.1. Vocabularies
The Cohort domain has the purpose to allow researchers to define groupings of entities such as patients or providers. Therefore, the content is not strictly defined by Standard Vocabularies, but instead can be expanded for the needs of the research. However, the following three Vocabularies are defined for the Cohort domain:

  • Drugs of Interest (DOI)39: Special drug cohort definitions that represent drug classes OMOP research is focusing on – vocabulary_id 33.
  • OMOP Health Outcomes of Interest (HOI)40: Special condition cohort definitions that represent outcomes OMOP research is focusing on – also vocabulary_id 33.
  • Standardised MedDRA Queries (SMQ) as maintained by the Medical Dictionary for Regulatory Activities Maintenance and Support Organization (MSSO) for groupings of terms from one or more MedDRA classes that relate to a defined medical condition or area of interest – vocabulary_id 31.
Cohorts can be defined as a group of entities exposed to a common circumstance. For example, Health Outcome of Interest (HOI) cohorts specify a group of Persons sharing that HOI. The nature of the Cohort concepts should be defined elsewhere, but the concept_class and concept_name need to uniquely identify the cohort.


4.9.2. Implementation of Drugs of Interest
Drugs of Interest (DOI) are cohort definitions for OMOP research purposes. Each DOI cohort is defined by a list of Clinical Drugs. DOI concepts have vocabulary_id 33, concept_level 2 and hierarchical relationship_id 293 to RxNorm Clinical Drugs. DOIs are listed in the CONCEPT_ANCESTOR table as ancestors of their respective drug products.


4.9.3. Implementation of Health Outcomes of Interest


4.9.4. Implementation of Standardized MedDRA Queries
SMQs are implemented as a 5-level hierarchy which is connected to the MedDRA and SNOMED concepts (figure 9). Some SMQs have hierarchical relationships to other SMQs, and many are defined in terms of sets of MedDRA concepts they are connected to. These definitions might exist in a broad or a narrow incarnation. In the Standard Vocabulary, an SMQ with both a broad and a narrow definition is implemented twice, with "(narrow)" and "(broad)" as part of the concept_name. Relationships among SMQs are realized using the relationship_id 10 "Subsumes" relationship; the relationships between SMQ and MedDRA concepts use relationship_id 132 "SMQ consists of MedDRA (MedDRA)". Both relationships are used for constructing the CONCEPT_ANCESTOR table, such that querying for all descendants of an SMQ returns all MedDRA concepts, and from there all SNOMED-CT concepts, that are defined for the SMQ and for all of its child SMQs (table 25).

Figure 9: SMQ Classification and Relationships to MedDRA and SNOMED-CT
Table 25: Example of SMQ Concept 38000043 "Ischaemic heart disease" and Hierarchical Classifications (descendants, not all shown).

Concept ID Concept Name Concept Level Concept Class Vocabulary Concept Code
38000043 Ischaemic heart disease 2 Standardized MedDRA Query 31 122
38004629 Other ischaemic heart disease (broad) 1 Standardized MedDRA Query 31 124
38004628 Myocardial infarction (broad) 1 Standardized MedDRA Query 31 123
38000168 Other ischaemic heart disease (narrow) 1 Standardized MedDRA Query 31 124
38000047 Myocardial infarction (narrow) 1 Standardized MedDRA Query 31 123
35205160 Arteriosclerosis coronary artery 2 Preferred Term 15 10003211
35205164 Coronary artery disease 2 Preferred Term 15 10011078
35205165 Coronary artery dissection 2 Preferred Term 15 10048631
35205166 Coronary artery embolism 2 Preferred Term 15 10011084
35205167 Coronary artery insufficiency 2 Preferred Term 15 10052895
35205168 Coronary artery occlusion 2 Preferred Term 15 10011086
35205175 Coronary ostial stenosis 2 Preferred Term 15 10011105
35227367 Arteriosclerosis coronary artery 1 Lowest Level Term 15 10003211
35227368 Atheroma coronary artery 1 Lowest Level Term 15 10003600
35227369 Coronary artery atheroma 1 Lowest Level Term 15 10011073
35227370 Coronary artery atherosclerosis 1 Lowest Level Term 15 10011076
35227371 Coronary artery sclerosis 1 Lowest Level Term 15 10011087
35227372 Coronary atheroma 1 Lowest Level Term 15 10011092
40547656 Angina pectoris 2 Clinical finding 1 367416001
40634151 Acute myocardial infarction of atrium 2 Clinical finding 1 72977004
4243371 Coronary stricture 2 Clinical finding 1 59062007
4252385 Coronary artery bypass graft occlusion 2 Clinical finding 1 408546009
4258690 Furcation lesion of coronary artery 2 Clinical finding 1 440444007
4267568 Acute anteroseptal myocardial infarction 2 Clinical finding 1 62695002
4328721 Anomalous coronary artery origin 2 Clinical finding 1 75398000
4215140 Acute coronary syndrome 1 Clinical finding 1 394659003
4215259 First myocardial infarction 1 Clinical finding 1 394710008
4225958 Coronary artery stent thrombosis 1 Clinical finding 1 421327009

HOI has vocabulary_id 33, SMQ 31, MedDRA 15 and SNOMED-CT 1.
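
Since the broad and narrow variants of an SMQ are distinguished only by a suffix in the concept_name, a consumer that wants to pick one variant has to parse the name. The following is a minimal sketch of that step using SMQ rows from table 25. The function names and the regular expression are assumptions for illustration, not part of the Standard Vocabulary.

```python
import re

# Matches a trailing "(narrow)" or "(broad)" suffix in an SMQ concept_name.
_SCOPE_SUFFIX = re.compile(r"^(?P<base>.*?)\s*\((?P<scope>narrow|broad)\)\s*$")


def split_smq_scope(concept_name):
    """Split an SMQ concept_name into (base_name, scope); scope is None if absent."""
    match = _SCOPE_SUFFIX.match(concept_name)
    if match:
        return match.group("base"), match.group("scope")
    return concept_name, None


def group_by_base_name(smq_concepts):
    """Group (concept_id, concept_name) pairs as {base_name: {scope: concept_id}}."""
    grouped = {}
    for concept_id, concept_name in smq_concepts:
        base, scope = split_smq_scope(concept_name)
        grouped.setdefault(base, {})[scope] = concept_id
    return grouped


# SMQ rows copied from table 25.
SMQ_ROWS = [
    (38000043, "Ischaemic heart disease"),
    (38004629, "Other ischaemic heart disease (broad)"),
    (38004628, "Myocardial infarction (broad)"),
    (38000168, "Other ischaemic heart disease (narrow)"),
    (38000047, "Myocardial infarction (narrow)"),
]

smq_variants = group_by_base_name(SMQ_ROWS)
print(smq_variants["Myocardial infarction"])
```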



4.10. Type Concepts

4.10.1. Vocabularies, Relationships, Levels and Mapping
All Standard Vocabularies for Type concepts defined for the various domains are created as part of the
Common Data Model. They are specialty concepts with the purpose of indicating where the data are
derived from within the source:

  • Drug Exposure Type defining the origin of the Drug Exposure records. Examples are "Prescription dispensed in pharmacy", "Prescription dispensed through mail order", "Prescription written", "Medication list entry", etc. – vocabulary_id 36
  • Condition Occurrence Type defining the origin of the Condition Occurrence records. Examples are "Inpatient detail - primary", "Outpatient detail - 1st position", "EHR problem list entry", etc. – vocabulary_id 37
  • Procedure Occurrence Type defining the origin of the Procedure Occurrence records. Examples are "Inpatient detail - primary position", "Inpatient detail - 1st position", "Inpatient header - primary position", etc. – vocabulary_id 38
  • Observation Type defining the origin of the Observation records. Examples are "Problem list from EHR", "Lab observation numeric result", "Lab observation text", etc. – vocabulary_id 39
  • Death Type defining the origin of the Death records. Examples are "Payer enrollment status 'Deceased'", "Medical claim discharge status 'Died'", "Medical claim diagnostic code indicating death", etc. – vocabulary_id 45

All Type concepts are flat terminologies (concept_level 1) with no relationship or ancestry records. There are no mapping records in the SOURCE_TO_CONCEPT_MAP. Type concepts have to be assigned during ETL in a data source-dependent manner.