Terminology standards – enhancing language
ISO/TC 37 → Semantic Interoperability
ISO TC 37 Secretariat
c/o Infoterm
Christian Galinski
Bamako (Mali) 2005-05-06/07
Overview
• UNESCO’s IFAP Area 4
• IFAP UNESCO and multilinguality
• Advocating open access solutions
• Language in industry
• eContent development
• Global semantic interoperability
• Standards for ...
• Terminology standardization
• Terminology? → Content entities
• Terminology ↔ eContent
• Terminology in ISO/TC 37
• + Language resources & LR management
• + Content resources
• Standardization of terminological principles and methods
• ISO/TC 37
• ISO/TC 37/SC 1 to SC 4
• ISO/TC 37 Outlook
• Semantic interoperability – HOW?
ISO/TC 37 – Bamako 2005-06/07
What is terminology?

The description of the specialized vocabulary of an application domain
• Cf. Eugen Wüster: conceptual view, knowledge representation at concept level
• Monolingual or multilingual
• Mainly nouns (incl. multi-word nominal units), some verbs, adjectives and adverbs
• A strong yet practical simplification of lexical description
• Increasing occurrence of non-verbal knowledge representations
IFAP Areas of intervention
What are IFAP’s areas of intervention?
• Area 1: Development of international, regional and national information policies
• Area 2: Development of human resources and capabilities for the information age
• Area 3: Strengthening institutions as gateways for information access
• Area 4: Development of information processing and management tools and systems (Multilingualism) → standards
ISO/TC 37 methodology standards:
• terminology
• language resources (at the level of concepts)
• other content entities (at the level of concepts)
UNESCO and multilinguality

• Promoting wider, more equitable access to information
• Raising awareness of issues of equitable access and multilingualism
• Encouraging Member States (“Recommendation on the promotion of multilingualism and universal access to Cyberspace” / Initiative B@bel) to:
  • develop strong policies which promote and facilitate language diversity on the Internet → Guidelines for Terminology Policies
  • create widely available online tools and applications (such as terminologies, automatic translators, dictionaries) for content in local languages
  • share best practices and information → ISO/TC 37
Advocating open access solutions

“Member States and international organizations should encourage open access solutions including the formulation of technical and methodological standards for information exchange, portability and interoperability, as well as online accessibility of public domain information on global information networks.”
(UNESCO Recommendation on Multilingualism and Access to Cyberspace)

“Governments should promote the development and use of open, interoperable, non-discriminatory and demand-driven standards.” (WSIS Action Plan)
→ Open source software? + Open content?
Language in industry
Exchange of content entities, e.g. an entry in a product catalogue:
[Figure label: 225/55/16 V]
• Name of company (® enterprise)
• Name of product (model) (™ enterprise)
• Generic name of product (e.g. © Harmonized System)
• Class (name under which the product falls) (e.g. © eCl@ss)
• Verbal/textual description (© enterprise)
• Picture (© rights owner)
• Technical data
  • (unified) branch properties (e.g. © OAGi)
  • standardized characteristics (e.g. © DIN)
  • enterprise product specific data (e.g. for collaborative business)
  • enterprise internal data (maybe confidential/secret)
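The catalogue entry on this slide mixes content owned by different parties, some of which must not leave the enterprise. A minimal Python sketch of that idea follows. The class and field names (`CatalogueField`, `rights_owner`, `confidential`) are hypothetical, the company name and placeholder values are invented for illustration, and only the label `225/55/16 V` is taken from the slide.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class CatalogueField:
    """One content entity in a catalogue entry, with the party that owns it."""
    name: str
    value: str
    rights_owner: str          # illustrative labels such as "enterprise" or "DIN"
    confidential: bool = False


@dataclass
class CatalogueEntry:
    fields: list[CatalogueField] = field(default_factory=list)

    def exchangeable(self) -> list[CatalogueField]:
        """Fields that may be exchanged; confidential internal data is withheld."""
        return [f for f in self.fields if not f.confidential]

    def owners(self) -> set[str]:
        """All rights owners whose content the entry reuses."""
        return {f.rights_owner for f in self.fields}


# Invented example data; only "225/55/16 V" appears on the slide.
entry = CatalogueEntry([
    CatalogueField("company name", "Example Tyres Ltd", "enterprise"),
    CatalogueField("technical data: size", "225/55/16 V", "enterprise"),
    CatalogueField("standardized characteristic", "placeholder value", "DIN"),
    CatalogueField("internal cost note", "placeholder value", "enterprise",
                   confidential=True),
])

print([f.name for f in entry.exchangeable()])
print(sorted(entry.owners()))
```

Keeping the rights owner on each field, rather than on the entry as a whole, is one way to make the mixed ownership listed above explicit when entries are exchanged.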
eContent DEVELOPMENT
Workflow management for content development:
• net-based, distributed, cooperative creation of structured content (based on the “single-source” principle)
• complying with:
  • multilingual
  • multimodal
  • multimedia
  • multi-channel output
  • accessibility requirements
Re-use in applications:
• eLearning
• eGovernment
• eHealth
• eBusiness
• other e...s
CO-OPERATION ↑ INTEROPERABILITY
STANDARDIZATION
THE CHALLENGE: (user point-of-view)
• throughout the enterprise/organization
• between enterprises/organizations
• within industry consortia
• between industry consortia
• between different e…s
• between different language communities
→ requested e.g. in e-government
→ requested by the market
→ requested by industry branches
→ ??? (urgently needs harmonization and especially open standards)
→ requested by the user
→ requested by the end user
→ within the standardization world
→ Global Semantic Interoperability
STANDARDS FOR:
hw ↔ sw ↔ methodology standards
• Technology → ITU, ISO, IEC, industry
• Business models → UN/ECE, ISO, industry
• “Language” → ISO/TC 37, research consortia
• Transfers/transactions → ITU, UN/ECE, industry
• Standards* → MoU/MG – why?
• Content → ? Methodology!!! semantic interoperability
• Legal issues → ?
* Standards should be examined as to whether they support, allow or hinder multilinguality and cultural diversity (very important for SMEs) and semantic interoperability at large.
Terminology standardization

Standardization of terminologies
• Terminological data
  • Linguistic and non-linguistic representations
    • Designations: term, abbreviation, graphic symbol, formula, acoustic symbol, etc.
    • Descriptions: definition, explanation, non-linguistic [descriptive] representation, etc.
  • Source-related data
  • Data management related data (field, record, holding)
  • Classification (multiple)
• Terminology-related data: names, phraseology, ...
Standardization of terminological principles and methods
→ generic for many types of content entities
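The kinds of terminological data listed on the preceding slide (designations, descriptions, source-related data, multiple classification) can be pictured as one concept-oriented record. The following Python sketch is only an illustration under stated assumptions: the class names, the identifier `C0001`, the definition text and the source string are hypothetical and are not taken from any ISO/TC 37 standard.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Designation:
    """A term, abbreviation, symbol or formula that designates the concept."""
    text: str
    language: str        # ISO 639-1 alpha-2 code, e.g. "en"
    kind: str = "term"   # e.g. "term", "abbreviation", "graphic symbol"


@dataclass
class ConceptEntry:
    """One concept with its designations, descriptions and administrative data."""
    concept_id: str
    classification: list[str] = field(default_factory=list)    # multiple allowed
    designations: list[Designation] = field(default_factory=list)
    definitions: dict[str, str] = field(default_factory=dict)  # language -> text
    source: str = ""                                            # source-related data

    def terms(self, language: str) -> list[str]:
        """All designations recorded for one language."""
        return [d.text for d in self.designations if d.language == language]


entry = ConceptEntry(
    concept_id="C0001",  # hypothetical identifier
    classification=["terminology work"],
    designations=[
        Designation("term", "en"),
        Designation("terme", "fr"),
        Designation("Benennung", "de"),
        Designation("Terminus", "de"),
    ],
    definitions={"en": "illustrative definition text"},
    source="hypothetical source record",
)

print(entry.terms("de"))
```

Because the record is organized around the concept rather than around a single word, synonyms and equivalents in further languages can be added without duplicating the definition or the source data.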
Terminology? → content entities
Terminology? → knowledge representations
• Nomenclature, taxonomy, typology, partonomy, ...
• Glossary, vocabulary, ...
• Terminological phraseology
• Graphical symbols and other non-linguistic representations?
• Properties, characteristics, attributes, ...
• Ontology
• Names? to be further studied
+ closely related:
• Thesauri, classification schemes, keywords
• Encyclopedic (knowledge) entries
  • Knowledge-enriched terminology entries
  • Names, proper names, ...
• Ontologies, topic maps, ...
→ ONE methodology
Terminology ↔ eContent
• embedded terminology (or combination of terminology + …)
  • Texts: → translation, localization, internationalization…
  • Speech: → communication…
  • Image: → CAD/CAM…
  • Multimedia: → video, presentations…
• knowledge-rich terminology
  • Encyclopedic knowledge: Wikipedia…
  • “Knowledge” management: → incl. true “content management”
    • document management,
    • communication management,
    • information management
• “popularized” terminology
→ “Terminology and other language and content resources”
→ ONE methodology
Terminology today
Given its pervasive occurrence in all (written or spoken) domain communication, terminology today has to be considered an economic factor, especially in
• product data description and management (incl. eCatalogues and product classification)
• quality management
• inter-cultural aspects of management and marketing
• translation and localization
• information, documentation, software development
• knowledge transfer, teaching and training, …
→ Multilinguality and cultural diversity
→ terminology science as a field of fundamental research as well as applied R&D
→ impact on standardization
Terminology in ISO/TC 37
Multifunctional nature of terminology:
• Terminology as knowledge representation
• Terminologies as means of domain communication
• Terminologies as means of access to other kinds of information (objects)
• Terminologies as means of knowledge ordering at micro-level
+ Language resource management


Language resources:
• Text corpora → tagging (on the basis of grammar models)
• Lexicographical data
• Words
• Collocations
• Morphology
• Terminology
• Speech data
LR management:
• Input / import
• Metadata (incl. bundling/bindings etc.)
• Data modelling & metamodel(s)
• Exchange / interoperability
• etc.
+ other kinds of content entities
Textual & non-linguistic types of content:
• Audio information (e.g. read-out written content)
• Audiovisual information (e.g. sign language)
• Multimedia information
• Haptic information (e.g. in “intelligent cars”)
• …
Increasingly, different (technical) types of content co-occur, are embedded in each other or are combined with each other, e.g. in traffic telematics.
ISO/TC 37 – Standardization of terminological principles and methods
• Fundamental principles
• Vocabulary of terminology
• Terminography
• Language resource management
• Terminology work (especially systematic terminology work)
• Applications based on terminology methods
• Content management? → eContent → mContent
  • Multilingual, multimodal, multimedia, universal accessibility, multi-channel
  • Re-usability → interoperability/ies
  • Resource-sharing → peer2peer
ISO/TC 37




Old title: Terminology and other language resources
Old scope: Standardization of principles, methods and applications relating to terminology and other language resources
New title: Terminology and other language and content resources
New scope: Standardization of principles, methods and applications relating to terminology and other language and content resources in the contexts of multilingual communication and cultural diversity
As is the case with terminologies, language resources in general have to be considered as multilingual, multimedia and multimodal from the outset.
→ Generic fundamental standards for all activities involving language
ISO/TC 37/SC 1 (1)




Title: Principles and methods
Old scope: Standardization of basic principles and methods for developing scientific and technical terminologies and other language resources
New scope: ??? still under discussion
ISO/TC 37/SC 1 prepares the meta-standards for the documents prepared by ISO/TC 37/SCs 2, 3 and 4, which cannot be consistent and coherent without these standards. The same applies to the documentation of content management in organizations.
ISO/TC 37/SC 1 (2)
The following standards are under the direct responsibility of ISO/TC 37/SC 1:
• ISO 704:2000 Terminology work – Principles and methods
• ISO 860:1996 Terminology work – Harmonization of concepts and terms
• ISO 1087-1:2000 Terminology work – Vocabulary – Part 1: Theory and application
The following standards are under preparation:
• ISO/CD 704 Terminology work – Principles and methods
• ISO/CD 860 Terminology work – Harmonization of concepts and terms
• ISO/PWI 1087-1 Terminology work – Vocabulary – Part 1: Theory and application
• ISO/WD 22134 Practical guide for socioterminology
ISO/TC 37/SC 2 (1)



Title: Terminography and lexicography
New scope: Standardization of terminological and lexicographical working methods, procedures, coding systems, workflows, and cultural diversity management, as well as related certification schemes
Tens of thousands of terminology commissions, committees and other terminological entities (especially terminology standardizing SCs and WGs within the standardization framework) are using ISO/TC 37/SC 2 standards. This indirectly improves the overall degree of re-usability and interoperability of the resulting data and documents.
ISO/TC 37/SC 2 (2)
The following standards are under the direct responsibility of ISO/TC 37/SC 2:
• ISO 639-1:2002 Codes for the representation of names of languages – Part 1: Alpha-2 code
• ISO 639-2:1998 Codes for the representation of names of languages – Part 2: Alpha-3 code
• ISO 1951:1997 Lexicographical symbols and typographical conventions for use in terminography
• ISO 10241:1992 International terminology standards – Preparation and layout
• ISO 12199:2000 Alphabetical ordering of multilingual terminological and lexicographical data represented in the Latin alphabet
• ISO 12616:2002 Translation-oriented terminography
• ISO 15188:2001 Project management guidelines for terminology standardization
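ISO 639-1 and ISO 639-2 assign two-letter and three-letter codes to language names. As a minimal sketch of how such codes are typically used in software, the Python snippet below holds a four-language excerpt and converts between the two code sets. The table is deliberately tiny and the function names `to_alpha3` and `to_alpha2` are hypothetical; a real application would load the full code lists from the registration authorities.

```python
from typing import Optional

# Small excerpt only. ISO 639-2 defines, for some languages, both a
# terminological (T) and a bibliographic (B) alpha-3 code.
LANGUAGE_CODES = {
    # alpha-2: (alpha-3 T, alpha-3 B, English name)
    "en": ("eng", "eng", "English"),
    "fr": ("fra", "fre", "French"),
    "de": ("deu", "ger", "German"),
    "bm": ("bam", "bam", "Bambara"),
}


def to_alpha3(alpha2: str, bibliographic: bool = False) -> str:
    """Map an alpha-2 code to alpha-3; raises KeyError if not in the excerpt."""
    terminological, biblio, _name = LANGUAGE_CODES[alpha2.lower()]
    return biblio if bibliographic else terminological


def to_alpha2(alpha3: str) -> Optional[str]:
    """Map either alpha-3 variant back to alpha-2, or None if unknown here."""
    code = alpha3.lower()
    for alpha2, (terminological, biblio, _name) in LANGUAGE_CODES.items():
        if code in (terminological, biblio):
            return alpha2
    return None


print(to_alpha3("fr"), to_alpha3("fr", bibliographic=True))
print(to_alpha2("ger"), to_alpha2("bam"))
```

Accepting both alpha-3 variants on input while emitting only one on output is a common way to stay interoperable with data that was tagged under either convention.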
ISO/TC 37/SC 2 (3)
The following standards are under preparation:
• ISO/CD 639-3 Codes for the representation of names of languages – Part 3: Alpha-3 code for comprehensive coverage of languages
• ISO/WD 639-4 Codes for the representation of names of languages – Part 4: Implementation guidelines and general principles for language coding
• ISO/WD 639-5 Codes for the representation of names of languages – Part 5: Alpha-3 code for language families and groups
• ISO/CD 639-6 Codes for the representation of names of languages – Part 6: Extension coding for language variation
• ISO/DIS 1951 Presentation/representation of entries in dictionaries
• ISO/CD 10241-1 Terminological entries in standards – Part 1: General requirements
• ISO/AWI 10241-2 Terminological entries in standards
• ISO 12615 Bibliographic references and source identifiers for terminology
• ISO/PWI TR 22128 Quality assurance guidelines for terminology products
• ISO/PWI 22130 Additional language coding
• ISO/NP 23185 Assessment and benchmarking of terminological holdings
ISO/TC 37/SC 3 (1)



Old title: Computer applications for terminology
New title: Terminology management systems and content interoperability
New scope: Standardization of principles and requirements for semantic interoperability, terminology and content management systems, and knowledge ordering tools
Software developers use the documents of ISO/TC 37/SC 3 to design terminology management systems (TMS) or terminology management modules for integration into content management systems as well as information and knowledge management systems.
In this way the terminological principles and methods (provided by ISO/TC 37/SC 1) are directly integrated as ‘defaults’ into concrete system design for handling all kinds of information.
ISO/TC 37/SC 3 (2)
The following standards are under the direct responsibility of ISO/TC 37/SC 3:
• ISO 1087-2:2000 Terminology work – Vocabulary – Part 2: Computer applications
• ISO 6156:1987 Magnetic tape exchange format for terminological/lexicographical records (MATER) – withdrawn
• ISO 12200:1999 Computer applications in terminology – Machine-readable terminology interchange format (MARTIF) – Negotiated interchange
• ISO 12620:1999 Computer applications in terminology – Data categories
• ISO 16642:2003 Computer applications in terminology – Terminological markup framework


ISO/TC 37/SC 3 (3)
The following standards are under preparation:
• ISO/PWI TR 12618 Computational aids in terminology – Design, implementation and use of terminology management systems
• ISO/CD 12620-1 Computer applications in terminology – Data categories – Part 1: Model for description and procedures for maintenance of data category registries for language resources
• ISO/CD 12620-2 Computer applications in terminology – Data categories – Part 2: Terminological data categories
ISO/TC 37/SC 4 (1)


Title: Language resource management
Scope: Standardization of specifications for computer-assisted language resource management
Given the fact that
• linguistic infrastructures are being established or reinforced as part of the rapidly evolving information and communication society;
• professional activities involving language resource sharing and standardization are increasing in diverse areas: governmental or non-governmental organizations, public or private institutions, educational institutions, commercial enterprises, etc.;
• both globalization and localization necessitate multilingual communication;
there is an increasing need for new standardization as well as urgent recognition of existing de facto standards and their transformation into International Standards.

ISO/TC 37/SC 4 (2)
The following standards are under preparation:
• ISO/AWI 21829 Terminology for language resources
• ISO/CD 24610-1 Language resource management – Feature structures – Part 1: Feature structure representation
• ISO/WD 24611 Language resource management – Morphosyntactic annotation framework
• ISO/WD 24612 Language resource management – Linguistic annotation framework
• ISO/WD 24613 Language resource management – Lexical markup framework
• ISO/AWI 24614-1 Word segmentation of written texts for monolingual and multilingual information processing – Part 1: General principles and methods
• ISO/AWI 24614-2 Word segmentation of written texts for monolingual and multilingual information processing – Part 2: Word segmentation for Chinese, Japanese and Korean
• ISO/NP 24614-3 Word segmentation of written texts for monolingual and multilingual information processing – Part 3: Word segmentation for other languages
State-of-the-art
[Diagram: layers linking METHODOLOGY to APPLICATIONS]
• (family of) metamodels*: ISO 16642
• Datamodels**: ISO 12200; eBusiness datamodels; datamodels of other e...s
• Data categories***: ISO 12620; domain data dictionaries (DDDs)
Basic principles and requirements concerning multilingual e/m-content development, data categories/metadata, data modelling, rules for repositories (maintained in MAs/RAs/Reg’s)
* ISO 16642 TMF; ISO 10303-11 EXPRESS; ISO 10303-21 SDAI; …
** ISO 12200 MARTIF; ISO 13584-42 PLIB ~ IEC 61360-2
*** ISO 12620 Data categories; ISO 13584-511 Fastener dictionary; IEC 61360-4 Core dictionary; …
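The layering on this slide (a metamodel, data models built on it, and data categories selected for each data model) can be illustrated with a small validator. The Python sketch below is an assumption-laden simplification: it keeps only three nested structural levels, and the level names, the category names and their assignment to levels are hypothetical choices for illustration, not the content of ISO 16642 or ISO 12620.

```python
# Simplified metamodel: three nested structural levels of an entry.
LEVELS = ("entry", "language_section", "term_section")

# Hypothetical data category selection: which data categories this
# particular data model allows at which level.
DATA_CATEGORY_SELECTION = {
    "entry": {"subjectField"},
    "language_section": {"language", "definition"},
    "term_section": {"term", "partOfSpeech"},
}


def validate(node: dict, depth: int = 0) -> list:
    """Return the problems found in a nested entry; empty list if it conforms."""
    if depth >= len(LEVELS):
        return [f"nesting deeper than {LEVELS[-1]}"]
    level = LEVELS[depth]
    allowed = DATA_CATEGORY_SELECTION[level]
    problems = [
        f"{level}: data category '{name}' not allowed"
        for name in node.get("data", {})
        if name not in allowed
    ]
    for child in node.get("children", []):
        problems.extend(validate(child, depth + 1))
    return problems


# Invented sample entry; the French term section carries a misplaced category.
entry = {
    "data": {"subjectField": "standardization"},
    "children": [
        {
            "data": {"language": "en", "definition": "illustrative definition"},
            "children": [{"data": {"term": "terminology", "partOfSpeech": "noun"}}],
        },
        {
            "data": {"language": "fr"},
            "children": [{"data": {"term": "terminologie", "definition": "misplaced"}}],
        },
    ],
}

for problem in validate(entry):
    print(problem)
```

Two data models that share the metamodel but differ in their category selection could both be checked by swapping `DATA_CATEGORY_SELECTION`, which is the kind of controlled variation the layering is meant to allow.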

Semantic interoperability standards
• Content-related requirements
• Workflow methodology
• Metadata
• Metadata repositories
• Data modelling principles and requirements
• Micro data models
• Metamodels
• Content repositories
• Federation of repositories
• …
CONFERENCES
• Terminology Summer School – Cologne (Germany) 2005-07-14/23
• TAMA 2005 “Terminology in Advanced Management Applications” – Wiesbaden (Germany) 2005-11-09
• TKE 2005 “Terminology and Knowledge Engineering” – Copenhagen (Denmark) 2005-08-15/19
• OFMR 2006 “Open Forum on Metadata Registries” – Japan 2006-03-20/22
Thank you for your attention
ISO/TC 37 Secretariat:
Secretary: Christian Galinski
Chairman: Håvard Hjulstad (SN)
ISO/TC 37
c/o Infoterm – International Information
Centre for Terminology
ADDRESS:
Aichholzgasse 6/12
A-1120 Vienna – Austria
Tel: +43-1-817 44 99
Fax: +43-1-817 44 99-44
[email protected]
http://www.infoterm.info