Phase II Evaluation of Clinical Coding Schemes: Completeness, Taxonomy, Mapping, Definitions, and Clarity

James R Campbell, Paul Carpenter, Charles Sneiderman, Simon Cohn, Christopher G. Chute, Judith Warren

Research output: Contribution to journalArticle

125 Citations (Scopus)

Abstract

Objective: To compare three potential sources of controlled clinical terminology (READ codes version 3.1, SNOMED International, and Unified Medical Language System (UMLS) version 1.6) relative to attributes of completeness, clinical taxonomy, administrative mapping, term definitions and clarity (duplicate coding rate). Methods: The authors assembled 1929 source concept records from a variety of clinical information taken from four medical centers across the United States. The source data included medical as well as ample nursing terminology. The source records were coded in each scheme by an investigator and checked by the coding scheme owner. The codings were then scored by an independent panel of clinicians for acceptability. Codes were checked for definitions provided with the scheme. Codes for a random sample of source records were analyzed by an investigator for "parent" and "child" codes within the scheme. Parent and child pairs were scored by an independent panel of medical informatics specialists for clinical acceptability. Administrative and billing code mapping from the published scheme were reviewed for all coded records and analyzed by independent reviewers for accuracy. The investigator for each scheme exhaustively searched a sample of coded records for duplications. Results: SNOMED was judged to be significantly more complete in coding the source material than the other schemes (SNOMED* 70%; READ 57%; UMLS 50%; *p < .00001). SNOMED also had a richer clinical taxonomy judged by the number of acceptable first-degree relatives per coded concept (SNOMED* 4.56; UMLS 3.17; READ 2.14, *p < .005). Only the UMLS provided any definitions; these were found for 49% of records which had a coding assignment. READ and UMLS had better administrative mappings (composite score: READ* 40.6%; UMLS* 36.1%; SNOMED 20.7%, *p <. 00001), and SNOMED had substantially more duplications of coding assignments (duplication rate: READ 0%; UMLS 4.2%; SNOMED* 13.9%, *p <. 004) associated with a loss of clarity. Conclusion: No major terminology source can lay claim to being the ideal resource for a computer-based patient record. However, based upon this analysis of releases for April 1995, SNOMED International is considerably more complete, has a compositional nature and a richer taxonomy. It suffers from less clarity, resulting from a lack of syntax and evolutionary changes in its coding scheme. READ has greater clarity and better mapping to administrative schemes (ICD-10 and OPCS-4), is rapidly changing and is less complete. UMLS is a rich lexical resource, with mappings to many source vocabularies. It provides definitions for many of its terms. However, due to the varying granularities and purposes of its source schemes, it has limitations for representation of clinical concepts within a computer-based patient record.

Original languageEnglish (US)
Pages (from-to)238-251
Number of pages14
JournalJournal of the American Medical Informatics Association
Volume4
Issue number3
DOIs
StatePublished - Jan 1 1997

Fingerprint

Systematized Nomenclature of Medicine
Unified Medical Language System
Clinical Coding
Research Personnel
Terminology
Medical Informatics
Vocabulary
Information Storage and Retrieval
International Classification of Diseases

ASJC Scopus subject areas

  • Health Informatics

Cite this

Phase II Evaluation of Clinical Coding Schemes : Completeness, Taxonomy, Mapping, Definitions, and Clarity. / Campbell, James R; Carpenter, Paul; Sneiderman, Charles; Cohn, Simon; Chute, Christopher G.; Warren, Judith.

In: Journal of the American Medical Informatics Association, Vol. 4, No. 3, 01.01.1997, p. 238-251.

Research output: Contribution to journalArticle

Campbell, James R ; Carpenter, Paul ; Sneiderman, Charles ; Cohn, Simon ; Chute, Christopher G. ; Warren, Judith. / Phase II Evaluation of Clinical Coding Schemes : Completeness, Taxonomy, Mapping, Definitions, and Clarity. In: Journal of the American Medical Informatics Association. 1997 ; Vol. 4, No. 3. pp. 238-251.
@article{03de5420651a487c81c9c7e4636e3951,
title = "Phase II Evaluation of Clinical Coding Schemes: Completeness, Taxonomy, Mapping, Definitions, and Clarity",
abstract = "Objective: To compare three potential sources of controlled clinical terminology (READ codes version 3.1, SNOMED International, and Unified Medical Language System (UMLS) version 1.6) relative to attributes of completeness, clinical taxonomy, administrative mapping, term definitions and clarity (duplicate coding rate). Methods: The authors assembled 1929 source concept records from a variety of clinical information taken from four medical centers across the United States. The source data included medical as well as ample nursing terminology. The source records were coded in each scheme by an investigator and checked by the coding scheme owner. The codings were then scored by an independent panel of clinicians for acceptability. Codes were checked for definitions provided with the scheme. Codes for a random sample of source records were analyzed by an investigator for {"}parent{"} and {"}child{"} codes within the scheme. Parent and child pairs were scored by an independent panel of medical informatics specialists for clinical acceptability. Administrative and billing code mapping from the published scheme were reviewed for all coded records and analyzed by independent reviewers for accuracy. The investigator for each scheme exhaustively searched a sample of coded records for duplications. Results: SNOMED was judged to be significantly more complete in coding the source material than the other schemes (SNOMED* 70{\%}; READ 57{\%}; UMLS 50{\%}; *p < .00001). SNOMED also had a richer clinical taxonomy judged by the number of acceptable first-degree relatives per coded concept (SNOMED* 4.56; UMLS 3.17; READ 2.14, *p < .005). Only the UMLS provided any definitions; these were found for 49{\%} of records which had a coding assignment. READ and UMLS had better administrative mappings (composite score: READ* 40.6{\%}; UMLS* 36.1{\%}; SNOMED 20.7{\%}, *p <. 00001), and SNOMED had substantially more duplications of coding assignments (duplication rate: READ 0{\%}; UMLS 4.2{\%}; SNOMED* 13.9{\%}, *p <. 004) associated with a loss of clarity. Conclusion: No major terminology source can lay claim to being the ideal resource for a computer-based patient record. However, based upon this analysis of releases for April 1995, SNOMED International is considerably more complete, has a compositional nature and a richer taxonomy. It suffers from less clarity, resulting from a lack of syntax and evolutionary changes in its coding scheme. READ has greater clarity and better mapping to administrative schemes (ICD-10 and OPCS-4), is rapidly changing and is less complete. UMLS is a rich lexical resource, with mappings to many source vocabularies. It provides definitions for many of its terms. However, due to the varying granularities and purposes of its source schemes, it has limitations for representation of clinical concepts within a computer-based patient record.",
author = "Campbell, {James R} and Paul Carpenter and Charles Sneiderman and Simon Cohn and Chute, {Christopher G.} and Judith Warren",
year = "1997",
month = "1",
day = "1",
doi = "10.1136/jamia.1997.0040238",
language = "English (US)",
volume = "4",
pages = "238--251",
journal = "Journal of the American Medical Informatics Association",
issn = "1067-5027",
publisher = "Oxford University Press",
number = "3",

}

TY - JOUR

T1 - Phase II Evaluation of Clinical Coding Schemes

T2 - Completeness, Taxonomy, Mapping, Definitions, and Clarity

AU - Campbell, James R

AU - Carpenter, Paul

AU - Sneiderman, Charles

AU - Cohn, Simon

AU - Chute, Christopher G.

AU - Warren, Judith

PY - 1997/1/1

Y1 - 1997/1/1

N2 - Objective: To compare three potential sources of controlled clinical terminology (READ codes version 3.1, SNOMED International, and Unified Medical Language System (UMLS) version 1.6) relative to attributes of completeness, clinical taxonomy, administrative mapping, term definitions and clarity (duplicate coding rate). Methods: The authors assembled 1929 source concept records from a variety of clinical information taken from four medical centers across the United States. The source data included medical as well as ample nursing terminology. The source records were coded in each scheme by an investigator and checked by the coding scheme owner. The codings were then scored by an independent panel of clinicians for acceptability. Codes were checked for definitions provided with the scheme. Codes for a random sample of source records were analyzed by an investigator for "parent" and "child" codes within the scheme. Parent and child pairs were scored by an independent panel of medical informatics specialists for clinical acceptability. Administrative and billing code mapping from the published scheme were reviewed for all coded records and analyzed by independent reviewers for accuracy. The investigator for each scheme exhaustively searched a sample of coded records for duplications. Results: SNOMED was judged to be significantly more complete in coding the source material than the other schemes (SNOMED* 70%; READ 57%; UMLS 50%; *p < .00001). SNOMED also had a richer clinical taxonomy judged by the number of acceptable first-degree relatives per coded concept (SNOMED* 4.56; UMLS 3.17; READ 2.14, *p < .005). Only the UMLS provided any definitions; these were found for 49% of records which had a coding assignment. READ and UMLS had better administrative mappings (composite score: READ* 40.6%; UMLS* 36.1%; SNOMED 20.7%, *p <. 00001), and SNOMED had substantially more duplications of coding assignments (duplication rate: READ 0%; UMLS 4.2%; SNOMED* 13.9%, *p <. 004) associated with a loss of clarity. Conclusion: No major terminology source can lay claim to being the ideal resource for a computer-based patient record. However, based upon this analysis of releases for April 1995, SNOMED International is considerably more complete, has a compositional nature and a richer taxonomy. It suffers from less clarity, resulting from a lack of syntax and evolutionary changes in its coding scheme. READ has greater clarity and better mapping to administrative schemes (ICD-10 and OPCS-4), is rapidly changing and is less complete. UMLS is a rich lexical resource, with mappings to many source vocabularies. It provides definitions for many of its terms. However, due to the varying granularities and purposes of its source schemes, it has limitations for representation of clinical concepts within a computer-based patient record.

AB - Objective: To compare three potential sources of controlled clinical terminology (READ codes version 3.1, SNOMED International, and Unified Medical Language System (UMLS) version 1.6) relative to attributes of completeness, clinical taxonomy, administrative mapping, term definitions and clarity (duplicate coding rate). Methods: The authors assembled 1929 source concept records from a variety of clinical information taken from four medical centers across the United States. The source data included medical as well as ample nursing terminology. The source records were coded in each scheme by an investigator and checked by the coding scheme owner. The codings were then scored by an independent panel of clinicians for acceptability. Codes were checked for definitions provided with the scheme. Codes for a random sample of source records were analyzed by an investigator for "parent" and "child" codes within the scheme. Parent and child pairs were scored by an independent panel of medical informatics specialists for clinical acceptability. Administrative and billing code mapping from the published scheme were reviewed for all coded records and analyzed by independent reviewers for accuracy. The investigator for each scheme exhaustively searched a sample of coded records for duplications. Results: SNOMED was judged to be significantly more complete in coding the source material than the other schemes (SNOMED* 70%; READ 57%; UMLS 50%; *p < .00001). SNOMED also had a richer clinical taxonomy judged by the number of acceptable first-degree relatives per coded concept (SNOMED* 4.56; UMLS 3.17; READ 2.14, *p < .005). Only the UMLS provided any definitions; these were found for 49% of records which had a coding assignment. READ and UMLS had better administrative mappings (composite score: READ* 40.6%; UMLS* 36.1%; SNOMED 20.7%, *p <. 00001), and SNOMED had substantially more duplications of coding assignments (duplication rate: READ 0%; UMLS 4.2%; SNOMED* 13.9%, *p <. 004) associated with a loss of clarity. Conclusion: No major terminology source can lay claim to being the ideal resource for a computer-based patient record. However, based upon this analysis of releases for April 1995, SNOMED International is considerably more complete, has a compositional nature and a richer taxonomy. It suffers from less clarity, resulting from a lack of syntax and evolutionary changes in its coding scheme. READ has greater clarity and better mapping to administrative schemes (ICD-10 and OPCS-4), is rapidly changing and is less complete. UMLS is a rich lexical resource, with mappings to many source vocabularies. It provides definitions for many of its terms. However, due to the varying granularities and purposes of its source schemes, it has limitations for representation of clinical concepts within a computer-based patient record.

UR - http://www.scopus.com/inward/record.url?scp=0031136437&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0031136437&partnerID=8YFLogxK

U2 - 10.1136/jamia.1997.0040238

DO - 10.1136/jamia.1997.0040238

M3 - Article

C2 - 9147343

AN - SCOPUS:0031136437

VL - 4

SP - 238

EP - 251

JO - Journal of the American Medical Informatics Association

JF - Journal of the American Medical Informatics Association

SN - 1067-5027

IS - 3

ER -