Controlled Terminology:
its use in (Static) Information
Models and in Applications
© Blue Wave Informatics LLP, 2012
➢ What is “controlled terminology”?
➢ Some definitions that I use.....
• A vocabulary is “Words used by language, book, branch of
science or author; (a) list of these (words)”
• In the informatics context, a vocabulary is a list of words or
phrases (“concepts”) that describe concepts in a
particular domain
• A dictionary is a listing (usually alphabetic) of concepts
accompanied by explanatory text for those concepts
(explaining meaning of the concept)
• it may also give corresponding words in another language
Classification (Taxonomy):
• Classification (verb and noun)
– Classification is the action or process of classifying
concepts; arranging concepts in classes or categories
according to shared qualities or characteristics
– Classification is a means of giving order to a group of
disconnected facts
• Taxonomy (noun)
– Taxonomy is the practice and science of classification.
Taxonomies, or taxonomic schemes, are composed of
taxonomic units known as taxa, or kinds of things that are
arranged in a hierarchical structure, related by subtype-supertype relationships.
– A taxonomy is a systematic classification of concepts
within a domain.
• A hierarchy is a system or organisation in which concepts are
ranked one above the other according to rules (in the wider
world, according to status or authority)
• A hierarchy is therefore a type of classification based on
ranking*
• * “ranking” is “putting something in its place within a system”
• Terminology is the study of terms and their use — of words
and compound words (phrases) that are used in specific
contexts (domains)
• Terminology also describes a formal discipline which
systematically studies or practices the description and
organisation of concepts in a domain (subject area) for one or
more uses – the product of this is sometimes called a
“systematic terminology”
Terminology Principles:
• Analysis and description of the concepts (“units of thought”)
within the particular domain
• Analysis and description of the relationships between the
concepts (the organisation between the terms)
• Identification of all the terms (words or phrases) also
associated with a concept (synonyms); each concept has one
unambiguous preferred term and any number of synonyms
(note: synonyms may be shared with other concepts). This
produces the definition of the concepts
• A terminology has aspects of a vocabulary (the set of
concepts used in a domain) and a dictionary (explaining
concepts) and taxonomy/classification (organising the
concepts – structure – relationships between concepts).
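The rule that each concept has one preferred term and any number of synonyms, some of which may be shared, can be sketched in a few lines of Python. This is only an illustration: the `Concept` class, the `concepts_for_term` helper and the two example concepts are hypothetical and do not come from any real terminology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    concept_id: str                    # hypothetical identifier
    preferred_term: str                # exactly one per concept
    synonyms: frozenset = frozenset()  # any number, possibly shared


def concepts_for_term(term, concepts):
    """Return the ids of every concept that the term could refer to."""
    wanted = term.casefold()
    return sorted(
        c.concept_id
        for c in concepts
        if wanted == c.preferred_term.casefold()
        or wanted in {s.casefold() for s in c.synonyms}
    )


# Invented example data: the synonym "Cold" is shared by two concepts
CONCEPTS = [
    Concept("C1", "Common cold", frozenset({"Cold", "Coryza"})),
    Concept("C2", "Cold sensation", frozenset({"Cold", "Feeling cold"})),
]
```

Looking up a shared synonym returns more than one concept, which is why the preferred term, not the synonym, has to be unambiguous.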
Controlled Terminology:
• The product of applying the principles of terminology to
produce “a body of terms* used with a particular technical
application in a subject of study, theory, profession etc.”
• It is authored (“controlled”) in such a way as to adhere to the
“Cimino Desiderata” for healthcare practice
• It is (therefore) specifically designed to support robust
semantics in machine processing as well as human use
• May be called a “systematic terminology” because the
principles are applied in an organised way
• * Because the “terms” are words or phrases, this is often also
referred to as “controlled vocabulary”
Interface v. Reference Terminology:
• An interface terminology is one designed for use directly in
• May have
– navigation concepts
– many synonyms
– short keys (and short cuts!)
• A reference terminology is one designed primarily for robust
definition of concepts
– used in analysis
– decision support
– may be very large or complex, so may not be easily used
(without modification) by end users/end user systems
What about the “o” word?:
• Traditionally, “ontology” is a “branch of metaphysics
concerned with the nature of being”
• An ontology (as opposed to “ontology”) is a representation of
a set of concepts within a domain and the relationships
between those concepts
• How therefore does an ontology differ from a terminology (if at
all)?
• An ontology is used in information science to “reason” about
the properties of the domain, and may be used to “define” the
domain. An ontology moves from the purely informational
representation into the area of assertional knowledge, but like
all boundaries, it is not easy to draw an exact line, and many
terminologies become more “ontological” over time
• Ontologies generally have a rich set of relationships –
– synonymy and sometimes antonymy (although that’s
– hyponymy/hypernymy (the is_a subsumption relationship)
– meronymy and holonymy (partitive relationships)
– negation (although this is really difficult)
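As a rough illustration of how the is_a subsumption relationship supports “reasoning”, the Python sketch below walks hypothetical child-to-parent links. The concept names are invented, and giving each concept a single parent is a simplifying assumption; real ontologies often allow several parents.

```python
# Hypothetical is_a links, child -> parent (single parent is a simplification)
IS_A = {
    "Sports car": "Car",
    "Car": "Vehicle",
    "Bicycle": "Vehicle",
}


def is_subsumed_by(concept, ancestor, is_a=IS_A):
    """True if `concept` equals `ancestor` or reaches it by following is_a links."""
    seen = set()
    current = concept
    # The `seen` set guards against looping forever if the links contain a cycle
    while current is not None and current not in seen:
        if current == ancestor:
            return True
        seen.add(current)
        current = is_a.get(current)
    return False
```

With these links, a query for “Vehicle” would also match “Sports car”, while “Bicycle” would not match “Car”.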
Ontology (II):
• Ontologies will generally include “classification(s)” for the
concepts within them
– Note that this is different from the subsumption relationship
because of the contextualisation of classification
• Ontologies generally have individual instances of concepts,
classes of concepts (also sometimes known as “types”, “sorts”,
“categories” or “kinds” – they are collections of concepts),
relationships between concepts and attributes of concepts
– As such, ontologies have an underlying information model
that they instantiate with the concepts themselves, rather
than using concepts from terminology(ies) to instantiate
Starting to Join Terminology to Information
➢ Using Terminology in Information Models
➢ Introducing Concept Domains
3 + 1 Pillars of CSI
• Following Charlie Mead, US NCI
Necessary, but not necessarily sufficient …
1. A common information model
2. Information model uses robust standard datatypes
3. Information model uses domain specific attribute semantics
from concept-based terminologies
1. Specification for information exchange
Information Models
• A static information model (a class model) has classes:
representations of types of things in relationship to each other
• At the logical model level, the things will have attributes –
properties of the thing that describe the thing (may or may
not be definitional to the thing)
Concept Domains
• Each class and attribute in a model has a name, a label, that
describes what it represents
• It should also have a definition, a description of the semantic
space that it encompasses
• For those classes and attributes that are instantiated using
vocabulary to represent real things (instances), and so use the
CD datatype, that description of the semantic space (the
semantic type) is the concept domain
• A Concept Domain defines the Semantic Space for a “thing”
attribute in an Information Model
Concept Domains (II)
• Therefore, a concept domain has a definition, usually
identical to the definition of the attribute in the model that it
• It also may have a description or usage notes
• If it can also give some examples of instances of things in that
semantic space, even better ☺
Make: the manufacturer of the car [Examples: Ford, General Motors, Ferrari]
Model: the version of the car, usually defined by a particular chassis [Examples:
Mondeo, Mustang, F450]
Colour: the hue of the paint on the body of the car [Examples: Red, Racing Green,
Value of Concept Domains
• Using concept domains is useful because it supports:
– the selection of vocabulary occurring at a separate time from the
information model design
– gaining consensus on the set of concepts to be used in the concept
– different implementations to use their own vocabulary but still share
semantic foundations
– management of changing instances over time
• BUT no information model, or application built upon such an
information model, is implementable or useable until the
concept domain is “bound” to a code system or value set –
this is sometimes known as “making a vocabulary declaration”
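The separation between defining a concept domain and later binding it to a value set can be sketched as follows, reusing the car example. Everything named here (`ConceptDomain`, `BINDINGS`, `unbound_domains` and the value set identifier) is hypothetical and only illustrates the idea that a model is not implementable while any domain remains unbound.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConceptDomain:
    name: str
    definition: str


MAKE = ConceptDomain("Make", "the manufacturer of the car")
COLOUR = ConceptDomain("Colour", "the hue of the paint on the body of the car")

# Vocabulary declarations made so far: concept domain name -> value set id.
# The identifier is a made-up placeholder, not a registered value set.
BINDINGS = {"Make": "example-value-set-car-makes"}


def unbound_domains(domains, bindings):
    """List the concept domains that still lack a vocabulary declaration."""
    return [d.name for d in domains if d.name not in bindings]
```

In this sketch “Colour” is still unbound, so a model using both domains could be designed and reviewed but not yet implemented.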
An Example – SiteStatusCode in BRIDG
An Example – in an Application
Key pieces in the vocabulary machinery to
instantiate concept
➢ Concepts
➢ Codes and Designations – Concept Representation
➢ Code Systems and Concept Identifiers
➢ Value Sets
Using the definitions and principles from ISO 21090 and
ISO 17583
Codes and Designations – Concept Representation
A concept is a unit of thought
With thanks to : David Robinson - NHSIA
Concept Definition
• A Concept is a unitary mental representation of a real or
abstract thing – an atomic unit of thought.
• Concepts, as abstract, language- and context-independent
representations of meaning, are important for the design and
interpretation of static information models. They constitute the
smallest semantic entities with which models are built. The
authors and the readers of a model use concepts and their
relationships to build and understand the models; these are
what matter to the human user of models.
• The vocabulary machinery exists to permit software
manipulation of these units of thought
➢ As models are layered and developed, the size and description of the
smallest semantic entity may change, to best meet the use case(s) and
requirements, and to show different views on reality
A concept can be labelled with a code
With thanks to : David Robinson - NHSIA
Code Definition
• A Code is a machine-processable Concept Representation
published by the author of a Code System as part of the Code
System
• It is the preferred unique identifier for that concept in that Code
System for the purpose of communication (preferred machine-readable
identifier), and is used in the 'code' property of an ISO
21090 CD data type
• Codes are sometimes meaningless identifiers, and sometimes they
are mnemonics that imply the represented concept to a human
– MedDRA code – has meaningless identifiers – “10040589”
– ISO (2 letter) Country codes – mnemonic – GB = Great Britain
• Meaningless identifiers are advised (see the Cimino Desiderata)
particularly in larger vocabulary systems
A concept can be labelled with a designation
With thanks to : David Robinson - NHSIA
Designation Definition
• A Designation is a language symbol for a concept that is
intended to convey the concept meaning to a human being
• A Designation may also be known as an appellation, symbol,
or term
• A Designation is typically used to populate the 'displayName'
property of an ISO 21090 CD data type
Concept Representation
• Putting together a code and a designation gives a concept
representation for a concept, a single unit of thought
• This is something that is both machine-readable and human-readable
X79Q8: Apple
Concept Representation (II)
• A Concept Representation is a vocabulary object that
enables the description and manipulation of a Concept in
systems and applications (such as information models, xml
• A Concept Representation exists in some form that is
computable, and can be used in information models and
• Concept Representations can take on a number of different
roles in the structure and processing of vocabulary in
information models
Code Systems – collections of concepts
and Concept Identifiers
Code System
• A Code System is a managed collection of Concept
Representations, including codes and/or designations, but
sometimes with more complex sets of rules, references
(definitions), and relationships
• A Code System may be described as “a collection of uniquely
identifiable concepts with associated representations,
designations, associations, and meanings”
• A Concept should be unique in a given Code System
– A concept may have synonyms
– A concept may be a singleton, or may be constructed of other concepts
(i.e. post-coordinated concepts)
• Although these things may be differentially referred to as
terminologies, vocabularies, or coding schemes, or even
classifications, the ISO 21090 CD datatype considers all such
collections ‘Code Systems’
– Examples include ICD-9-CM, SNOMED CT, LOINC, and MedDRA
➢ Hence a “terminology model”
Code System Properties
• Code systems should have:
– an identifier that uniquely identifies the Code System. For ISO 21090
conformant model instances, this SHALL be in the form of an ISO OID
– a description consisting of prose that describes the Code System, and
may include the Code System uses, maintenance strategy, intent and
other information of interest
• when using a code system to support instantiation of a model, it is this
description that should match or be compatible with the relevant concept
– administrative information proper to the Code System, independent of
any specific version of the Code System, such as ownership, source
URL, and copyright information
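A minimal Python sketch of a code system as a managed collection might look like the following. The `CodeSystem` class, the OID `1.2.3.999` and the fruit content are all made-up placeholders; the only behaviour it tries to show is that the system carries its own identifier and description, and that a code may not be reused within the same system.

```python
class CodeSystem:
    """A managed collection of concept representations (illustrative only)."""

    def __init__(self, oid, name, description):
        self.oid = oid                  # unique identifier of the code system
        self.name = name                # human-readable name
        self.description = description  # prose describing use and intent
        self._designations = {}         # code -> designation

    def add(self, code, designation):
        # A concept should be unique in a given code system,
        # so an existing code is never silently overwritten
        if code in self._designations:
            raise ValueError(f"code {code!r} already exists in {self.name}")
        self._designations[code] = designation

    def designation(self, code):
        return self._designations[code]


# Placeholder OID and content, not a registered code system
FRUIT = CodeSystem("1.2.3.999", "ExampleFruit", "Illustrative fruit concepts")
FRUIT.add("X79Q8", "Apple")
```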
Managing Change in Code Systems
• Code Systems should evolve over time
• Changes occur because of
– corrections and clarifications
– the understanding of the concepts being described evolves (e.g., new
genes and proteins are discovered)
– the concepts being described change (e.g., new countries emerge; old
countries are absorbed)
– the assessment of the relevance of particular concepts within the
knowledge resource changes (e.g., the addition of new parent-child
Code System Versions
• Depending upon how well the Code System adheres to Good
Vocabulary Practices (the “Cimino Desiderata”), changes
could be significant
• Changes in concept meaning – although discouraged – can
occur and can cause issues which could themselves be
• Therefore it can be important to know which version of a given
Code System was used in
– the creation of a system record or message instance
– (in some cases) the creation of an information model/schema
• Hence “Code System Version” is a property of the CD datatype
The Concept Identifier
• A Concept Identifier is a vocabulary object that
unambiguously and globally uniquely represents a
concept within the context of a Code System in a
machine readable way
• A Concept Identifier consists of:
the OID for Code System + Code (+ Designation/Display Name)
• To make a Concept Identifier human readable, add the
“display name” (the designation) thus:
the OID for Code System + Code (+ Designation/Display Name)
note that the designation (display name) is not mandatory for the
concept identifier, but it is considered good practice to always have the
designation for safety reasons (data unscrambling etc.)
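One way to picture the Concept Identifier is as a small immutable record in which only the code system OID and the code take part in identity, while the display name travels along for human readers. The Python sketch below assumes this reading; the class name, the `#` separator in the printed form and the OID `1.2.3.999` are hypothetical choices, and the code and designation come from the “X79Q8: Apple” example used earlier.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class ConceptIdentifier:
    code_system_oid: str
    code: str
    # Optional designation: excluded from equality and hashing, so two
    # identifiers match on OID + code whether or not a display name is sent
    display_name: Optional[str] = field(default=None, compare=False)

    def __str__(self):
        base = f"{self.code_system_oid}#{self.code}"
        return f"{base} ({self.display_name})" if self.display_name else base


with_name = ConceptIdentifier("1.2.3.999", "X79Q8", "Apple")
without_name = ConceptIdentifier("1.2.3.999", "X79Q8")
```

Carrying the display name costs little, and it gives a reader something to check the code against if the data is ever scrambled.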
Value Sets – making concepts and code
systems work in information models and
Value Sets
• A Value Set represents a uniquely identifiable set of valid
concept identifiers where any concept identifier used within
the CD datatype can be tested to determine whether it is a
member of the Value Set at a specific point in time
– it is this that makes a particular attribute “conformance testable”
• Value Sets exist to constrain the permissible content of a
concept domain for a particular use
– in an information model vocabulary binding
– in analysis
– in UI data collection: in a pick list (drop-down box), etc.
• A Value Set may have a description, but this is not intended to
describe the semantics of the Value Set; a Value Set has no
intrinsic semantics separate from the coded concepts
contained in its expansion
– a value set is useful only in context, not as a stand-alone object
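The “conformance testable” idea, membership of a value set at a specific point in time, can be sketched as below. The `ValueSet` class, its methods, the OIDs and the codes are hypothetical; the sketch simply keeps dated expansions and tests a concept identifier (code system OID plus code) against the expansion in force on the given date.

```python
from datetime import date


class ValueSet:
    """A uniquely identified set of concept identifiers with dated expansions."""

    def __init__(self, oid):
        self.oid = oid
        self._expansions = []  # (effective date, frozenset of (oid, code))

    def add_expansion(self, effective, members):
        self._expansions.append((effective, frozenset(members)))
        self._expansions.sort(key=lambda pair: pair[0])

    def contains(self, code_system_oid, code, on):
        """Test membership against the expansion in force on the date `on`."""
        current = None
        for effective, members in self._expansions:
            if effective <= on:
                current = members  # the latest expansion not after `on` wins
        return current is not None and (code_system_oid, code) in current


# Placeholder identifiers throughout
FRUIT_VS = ValueSet("1.2.3.999.1")
FRUIT_VS.add_expansion(date(2011, 1, 1), [("1.2.3.999", "X79Q8")])
FRUIT_VS.add_expansion(
    date(2012, 1, 1), [("1.2.3.999", "X79Q8"), ("1.2.3.999", "X79Q9")]
)
```

The same code can therefore be valid in a record created in 2012 and invalid in one created in 2011, which is why the value set version is worth recording with the data.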
Looking at the ISO 21090 CD datatype
ISO 21090 Concept Descriptor Datatype
• code
• codeSystem (identified using an OID)
• codeSystemName
• codeSystemVersion
• displayName
• valueSet (identified using an OID)
• valueSetVersion
Concept Descriptor Attributes
code - the (machine readable) concept representation
Note – the cardinality is 0..1 to allow for NULL FLAVOURS
codeSystem (OID) – uniquely and machine-readably identifies the code system that
the code comes from
codeSystemName – the human readable name of the code system (e.g. “MedDRA”)
codeSystemVersion – the version of the code system that the code comes from
codingRationale – information about how/why the code was selected – the reason
the concept has been provided – rarely if ever used
displayName – the human readable description of the concept - as it exists in the
code system – the Term Name
originalText – the piece of text in a document or report that the concept has been
selected to represent (it shows the meaning the user intended to communicate) - this
might be used in an ICSR, for example
source – if any translation (mapping) has occurred, this gives the source code
translation – a set of other concept descriptors, each representing a
translation of this code into equivalent codes within the same code system or into
corresponding concepts from other code systems (could be used for synonyms, or
could be used to describe mapped concepts)
valueSet – the value set that applied when this instance of information was created
valueSetVersion – the version of the value set that applied when this instance of
information was created
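To make the attribute list concrete, here is a rough Python sketch of a record shaped like the Concept Descriptor. It is not the ISO 21090 definition: the field names are Python-style renderings of the attributes above, `null_flavor` and the `problems` checks are assumptions made for this sketch, and the OID is a placeholder.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConceptDescriptor:
    code: Optional[str] = None            # 0..1, to allow for null flavours
    code_system: Optional[str] = None     # OID of the code system
    code_system_name: Optional[str] = None
    code_system_version: Optional[str] = None
    display_name: Optional[str] = None
    original_text: Optional[str] = None
    value_set: Optional[str] = None       # OID of the value set
    value_set_version: Optional[str] = None
    null_flavor: Optional[str] = None     # assumed field: why no code is given

    def problems(self):
        """Return a list of issues found by this sketch's simple checks."""
        issues = []
        if self.code is None and self.null_flavor is None:
            issues.append("either code or null_flavor is needed")
        if self.code is not None and self.code_system is None:
            issues.append("code given without code_system")
        return issues


apple = ConceptDescriptor(
    code="X79Q8", code_system="1.2.3.999", display_name="Apple"
)
```

A code without its code system is flagged because, on its own, a code does not identify a concept globally.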

Using Controlled Vocabularies in HL7 V3 Messages