Working with Natural
Language Text: Tools and
Techniques
Nestor Rychtyckyj
Advanced & Manufacturing
Engineering Systems
Ford Motor Company
1
Agenda
• Introduction
• Description of problem – Why is language
so important?
• Dealing with Natural Language Text
• Application Examples
• Machine Translation
• Future Directions
• Conclusions
2
Natural Language Text is
“everywhere”
• Internet
• Web sites
• Blogs
• Customer Feedback
• Dealer Feedback
• Lessons Learned
• Corporate Knowledge
• Warranty Claims
• Internal documentation
• Spoken Dialog systems
3
Dealing With Text Information
• Search Engines (Google, askjeeves.com)
• Excel
• Commercial Text Mining Tools (Wordstat, SAS
Text Miner, SMART Text Miner, etc.)
• Open Source tools (Wordnet, Senseclusters,
etc.)
• Controlled Languages
• Ontologies
• Natural Language Processing
• Semantic Web
4
Present Status
• Mostly key-word based
• Very little intelligence, no background knowledge
or context
• Limited natural language dialog interpretation
• Most of the processing is left to the human user
• Difficult to build computer systems that can
retrieve information in an “intelligent” manner
5
Future State
• Semantic Web – information on the web is
organized using structured tagging based on
XML, RDF, OWL, SWRL
• machine-processable data on the web
• standard interface to data
• rich knowledge representations through
ontologies
• Allows for the development of systems that can
retrieve information in an intelligent manner
6
Semantic Web Architecture
Source: Tim Berners-Lee, 2000
7
Artificial Intelligence (AI)
• Study of how to build human-level intelligence into
computer applications
• Uses learning, representation of human knowledge,
understanding of language, vision, speech, etc.
• Applies the built-in knowledge using inference and
reasoning
• Has been very successful in limited problem domains – less
so for general applications
• Integrated into many application areas including
manufacturing, planning, search, speech recognition,
financial analysis, games, customer analysis,
commercial fishing, etc.
8
Current use of AI in Manufacturing
at Ford
• AI applications for manufacturing
• Bring appropriate knowledge about
manufacturing to the proper people at the right
time
• Improve manufacturing efficiency
• Reduce workplace injuries through better up-front ergonomics analysis
• Make assembly build instructions available to
operators in other languages
• Develop common framework for representing
knowledge and exchanging it between different
systems
9
Knowledge Sources in
Manufacturing
• Process Build Information
• Required Tooling
• Part Information
• Ergonomics Analysis
• Plant Layout Information
• Assembly Visualization
• Safety Concerns
• Manufacturing “Best Practices”
10
Global Study Process Allocation
System (GSPAS)
• The allocation system is used to assign
manufacturing processes to plant operation
resources.
• Process sheets use STANDARD LANGUAGE
(159 verbs)
• Examples: insert, select, grasp, load …
11
Global Study Process Allocation
System (GSPAS)
• Global System to handle Manufacturing Costing,
Process and Labor Management for vehicle
assembly.
• Standard Language and AI are integral parts of
GSPAS.
• Launched in North America and Europe in 1998
to support the Focus program.
• Currently deployed for almost all car and truck
manufacturing at Vehicle Operations assembly
plants world-wide.
12
Step by Step Instructions
• Process sheets specify the operations, tasks, parts and
tools required to support the production of a vehicle.
13
Standard Language
• Controlled language where the grammar and
syntax are restricted.
• Developed at Ford Body & Assembly to describe
the vehicle assembly process.
• Contains information about tools, parts and work
required to build a vehicle.
• Contains over 5000 words, 1000 abbreviations
that can be used by the process engineers.
• Standard Language is checked by an Artificial
Intelligence (AI) system.
14
Examples of Standard
Language
1. ALIGN-AND-SEAT DOOR TRIM PROTECTOR
2. FIRMLY PRESS SEALER INTO JOINT TO
AFFECT A POSITIVE SEAL
3. APPLY DAUB OF SEALER TO THE JOINT OF
THE CENTER FLOOR PAN AND FRONT
FLOOR PAN AT ROCKER PANEL
4. PUSH SEAT REARWARD TO EXPOSE
FRONT ATTACHMENTS
15
Standard Language Rules
• Imperative form
• Sentence must start with verb clause followed by
noun phrase.
• Only one Standard Language (main action) verb
per sentence.
• Some prepositions have special meaning
(“using”, “with”).
• Size modifiers may follow nouns (“bumper
large”).
• Free form allowed for certain verbs (“verify that...”)
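The first rules above lend themselves to automated checking. The following is a minimal sketch, not the Ford AI system: the verb set is a small hypothetical sample (the real list is not reproduced in these slides), `check_sentence` is an invented name, and adverbs or compound verbs such as ALIGN-AND-SEAT at the start of a sentence are not handled.

```python
import re
from typing import List

# Hypothetical sample of Standard Language verbs; the real list is not shown here.
SL_VERBS = {"ALIGN", "APPLY", "GRASP", "INSERT", "LOAD", "LOOSEN",
            "OBTAIN", "PUSH", "SECURE", "SELECT"}
# Verbs after which free-form text is allowed (assumed example).
FREE_FORM_VERBS = {"VERIFY"}


def check_sentence(sentence: str) -> List[str]:
    """Return a list of rule violations for one Standard Language sentence."""
    # Embedded comments in braces are ignored by this check.
    text = re.sub(r"\{[^}]*\}", " ", sentence.upper())
    words = re.findall(r"[A-Z0-9-]+", text)
    if not words:
        return ["empty sentence"]
    first = words[0]
    if first in FREE_FORM_VERBS:
        return []  # free-form text may follow these verbs
    problems = []
    if first not in SL_VERBS:
        problems.append(f"sentence must start with a verb, found {first!r}")
    extra = [w for w in words[1:] if w in SL_VERBS]
    if extra:
        problems.append(f"more than one main action verb: {extra}")
    return problems


print(check_sentence("SECURE BUMPER BRACKET {FOR LHS ONLY} TO VEHICLE BODY USING POWER TOOL"))
print(check_sentence("BRACKET SECURE TO BODY"))
```

A production checker would work from a full parse rather than a word list, since words like LOAD can also appear as nouns.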
16
Standard Language Process
Sheet
Process Sheet Written in Standard Language from CAP (Focus) deck
TITLE: ASSEMBLE IMMERSION HEATER TO ENGINE
10 OBTAIN ENGINE BLOCK HEATER ASSEMBLY FROM STOCK
20 LOOSEN HEATER ASSEMBLY TURNSCREW USING POWER TOOL
30 APPLY GREASE TO RUBBER O-RING AND CORE OPENING
40 INSERT HEATER ASSEMBLY INTO RIGHT REAR CORE PLUG HOSE
50 ALIGN SCREW HEAD TO TOP OF HEATER
TOOL 20 1 P AAPTCA TSEQ RT ANGLE NUTRUNNER
TOOL 30 1 C COMM TSEQ GREASE BRUSH
Resulting Work Instructions Generated by DLMS For Line 20
LOOSEN HEATER ASSEMBLY TURNSCREW USING POWER TOOL
005 GRASP POWER TOOL (RT ANGLE NUTRUNNER) <01M4G1>
010 POSITION POWER TOOL (RT ANGLE NUTRUNNER) <01M4P2>
015 ACTIVATE POWER TOOL (RT ANGLE NUTRUNNER) <01M1P0>
020 REMOVE POWER TOOL (RT ANGLE NUTRUNNER) <01M4P0>
025 RELEASE POWER TOOL (RT ANGLE NUTRUNNER) <01M4P0>
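The expansion of line 20 can be pictured as filling in a step template with the tool named on the process sheet. This is only a sketch: the steps and codes are copied from the example output above, while the template approach and the name `expand_power_tool_line` are assumptions, since the slides do not describe how DLMS derives them.

```python
# Step template copied from the example output above for a power-tool element.
# How DLMS actually selects steps and codes is not described in the source.
POWER_TOOL_TEMPLATE = [
    ("GRASP", "01M4G1"),
    ("POSITION", "01M4P2"),
    ("ACTIVATE", "01M1P0"),
    ("REMOVE", "01M4P0"),
    ("RELEASE", "01M4P0"),
]


def expand_power_tool_line(element, tool_name):
    """Expand one process-sheet element into numbered work instructions."""
    if not element.upper().endswith("USING POWER TOOL"):
        return []  # this sketch only covers the power-tool pattern
    lines = []
    for i, (verb, code) in enumerate(POWER_TOOL_TEMPLATE, start=1):
        # Steps are numbered 005, 010, 015, ... as on the slide.
        lines.append(f"{i * 5:03d} {verb} POWER TOOL ({tool_name}) <{code}>")
    return lines


for line in expand_power_tool_line(
        "LOOSEN HEATER ASSEMBLY TURNSCREW USING POWER TOOL", "RT ANGLE NUTRUNNER"):
    print(line)
```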
17
Natural Language Parsing
Sentence: “Secure bracket using multiple motor nutrunner”
[Figure: parse tree. Verb Phrase (Verb: Secure); Noun Phrase (Noun: Bracket); Prepositional Phrase (Preposition: Using; Noun Phrase (Noun))]
18
Process for Natural Language
Processing
• Parse the text (sentence by sentence) into parse
tree structure
• Bypass/ignore common words (articles, common
terms)
• Stemming (get the root of the word)
• Word lookup (synonyms, misspellings,
acronyms)
• Word understanding (deeper-level ontologies)
• Controlled languages with automated checking
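The word-level steps above (skipping common words, stemming, word lookup) can be sketched in a few lines. Everything here is illustrative: the stop-word list and the naive suffix-stripping stemmer are assumptions, and the lookup table holds only abbreviations that appear elsewhere in these slides. Lookup runs before stemming in this sketch so that abbreviations are not mangled by the stemmer.

```python
STOP_WORDS = {"THE", "A", "AN", "OF"}  # assumed, illustrative list
# Abbreviation/variant lookup; entries taken from examples in these slides.
LOOKUP = {"ASY": "ASSEMBLY", "LRG": "LARGE", "A.B.S": "ABS"}


def stem(word):
    """Very naive stemmer: handles only simple English plurals."""
    if word.endswith("IES") and len(word) > 4:
        return word[:-3] + "Y"
    if word.endswith("S") and not word.endswith("SS") and len(word) > 3:
        return word[:-1]
    return word


def normalize(sentence):
    """Tokenize, resolve lookups, drop stop words and stem what is left."""
    tokens = [t.strip(",;:()").upper() for t in sentence.split()]
    tokens = [LOOKUP.get(t, t) for t in tokens if t]
    tokens = [t for t in tokens if t not in STOP_WORDS]
    return [stem(t) for t in tokens]


print(normalize("Feed the wire assemblies through hole"))
print(normalize("SECURE ASY TO A.B.S BRACKET"))
```

A real system would replace the stemmer with a proper morphological analyzer and drive the lookup from the ontology rather than a flat table.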
19
Parsing Information in Standard
Language
• Example of Standard Language parsing: “Feed 2 150
mm wire assemblies through hole in liftgate panel”
• (S
    (VP (VERB FEED))
    (NP (SIMPLE-NP (QUANTIFIER 2)
                   (DIM (QUANTIFIER 150) (DIM-UNIT-1 MM))
                   (ADJECTIVE WIRE) (NOUN ASSEMBLY)))
    (S-PP (SPREP THROUGH)
          (NP (SIMPLE-NP (NOUN HOLE)
                         (NPP (N-PREP in)
                              (NP (SIMPLE-NP (ADJECTIVE LIFTGATE) (ADJECTIVE OUTER) (NOUN PANEL))))))))
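A bracketed parse like the one above is easy to load into a program for further processing. The sketch below, using the invented helper names `parse_sexpr` and `leaves`, reads such a string into nested lists and pulls out the words under a given label. It assumes well-formed input and is not the parser used in GSPAS.

```python
def parse_sexpr(text):
    """Parse a bracketed parse tree into nested Python lists."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token == "(":
            node = []
            while tokens[pos] != ")":
                node.append(read())
            pos += 1  # consume the closing ")"
            return node
        return token

    return read()


def leaves(tree, label):
    """Collect the words directly under every node carrying the given label."""
    found = []
    if isinstance(tree, list) and tree:
        if tree[0] == label:
            found.append(" ".join(x for x in tree[1:] if isinstance(x, str)))
        for child in tree[1:]:
            found.extend(leaves(child, label))
    return found


# Shortened version of the slide's parse, for illustration.
tree = parse_sexpr(
    "(S (VP (VERB FEED)) (NP (SIMPLE-NP (QUANTIFIER 2) (ADJECTIVE WIRE) (NOUN ASSEMBLY))))")
print(leaves(tree, "VERB"), leaves(tree, "NOUN"))
```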
20
Ontology – used to represent
knowledge
• Individuals
• Classes (with hierarchy); think sets
• Properties (w/ hierarchy); not part of class
• Equivalence
• Property characteristics/restrictions
• Complex classes
21
GSPAS Ontology
[Figure: ontology diagram with nodes Thing, Tools, Parts, Lexical Nodes, Operations, Intervening Concept Nodes and HAMMER]
HAMMER attributes: Size, Part of Speech, Subsystem-id, etc.
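As a rough illustration of how a class hierarchy supports lookup and simple reasoning, the sketch below stores parent links in a dictionary and answers "is-a" questions by walking up the chain. The placement of HAMMER directly under Tools and the attribute values are assumptions made for the example; the actual GSPAS ontology has intervening concept nodes and is not reproduced in these slides.

```python
# Parent links for a toy hierarchy. Placing HAMMER directly under Tools is an
# assumption; the real ontology has intervening concept nodes.
PARENT = {
    "Tools": "Thing",
    "Parts": "Thing",
    "Operations": "Thing",
    "Lexical Nodes": "Thing",
    "HAMMER": "Tools",
}

# Attribute names follow the slide; the values are invented for illustration.
ATTRIBUTES = {"HAMMER": {"part_of_speech": "noun", "size": None}}


def ancestors(concept):
    """Return the chain of classes above a concept, nearest first."""
    chain = []
    while concept in PARENT:
        concept = PARENT[concept]
        chain.append(concept)
    return chain


def is_a(concept, cls):
    """True if the concept is the class itself or sits below it."""
    return concept == cls or cls in ancestors(concept)


print(ancestors("HAMMER"))
print(is_a("HAMMER", "Tools"), is_a("HAMMER", "Parts"))
```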
22
GSPAS Knowledge Base
23
Ergonomics Analysis
• Check the assembly work instructions to
determine what type of physical action is being
described
• Check the assembly work instruction to
determine what object is manipulated
• Check the associated parts and tools for part
weight and tool properties
• Flag potential ergonomics concerns at the
process level and at the work allocation level
• Knowledge can be represented as a business
rule
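One way to picture such a business rule is as a check over the parsed work instructions and the associated part data. This is a hedged sketch only: the `WorkInstruction` record, the set of lifting verbs and the 10 kg limit are all hypothetical and do not come from Ford's ergonomics rules.

```python
from dataclasses import dataclass


@dataclass
class WorkInstruction:
    verb: str              # physical action described by the instruction
    obj: str               # object being manipulated
    part_weight_kg: float  # weight looked up from the associated part data


# Hypothetical rule data; real limits would come from ergonomics experts.
LIFTING_VERBS = {"LIFT", "LOAD", "OBTAIN"}
MAX_LIFT_KG = 10.0


def ergonomic_concerns(instructions):
    """Return (instruction, reason) pairs for steps that break the lifting rule."""
    concerns = []
    for wi in instructions:
        if wi.verb.upper() in LIFTING_VERBS and wi.part_weight_kg > MAX_LIFT_KG:
            reason = f"{wi.verb} of {wi.obj} exceeds {MAX_LIFT_KG} kg"
            concerns.append((wi, reason))
    return concerns


steps = [
    WorkInstruction("LOAD", "BUMPER", 14.0),
    WorkInstruction("ALIGN", "BUMPER", 14.0),
    WorkInstruction("OBTAIN", "CLIP", 0.1),
]
for wi, reason in ergonomic_concerns(steps):
    print(reason)
```

Keeping the rule data separate from the checking code means the limits can be maintained without touching the logic.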
24
Machine Translation
• “The Spirit is willing but the flesh is weak”
• "The vodka is tempting, but the meat's a bit
suspect".
• “The alcohol is arranged, but the meat is weak.”
• “This kind of spirit is wants, but the flesh and
blood is weak.”
• “The spirit is willing, but the flesh is impossible”
• “The spirit puts out the flag and does, the flesh
omits but.”
25
Machine Translation
• Use of computers to translate from one
language to another
• Examples: Babelfish
• Translation accuracy is highly dependent on the
quality of the source text
• Use proper grammar, punctuation, shorter
sentences, active voice to improve quality
• Customize translation systems for each
application domain
26
Problem Description
• Need to translate assembly build instructions from
English to the language used at the assembly plants
• A single vehicle may require several thousand process
sheets to describe the assembly process
• A large number of assembly instructions are frequently
modified
• Large volume of translations precludes the use of human
translators
• Specialized terminology requires technical glossaries
• MT performance can be improved greatly by improving
the source text
27
Application Description
• Machine Translation is integrated into the manufacturing
process planning system known as GSPAS
(Global Study Process Allocation System)
• The translation process is fully automated and does not
require human intervention
• Translation occurs automatically after a process sheet is
validated by the AI system and before it is released to
the assembly plants.
• We currently translate build instructions for 26 different
vehicle lines in 5 languages (we also have a separate
glossary for Mexican Spanish)
• Data is read in from an Oracle database, processed
through the translation system and the output is then
written out to the Oracle database
28
Machine Translation
• Source: Process build instructions in English
• Target: Process build instructions in Spanish, German,
Portuguese, Dutch & Turkish
• Translate both controlled language and embedded free-form text
• Example: SECURE BUMPER BRACKET {FOR LHS
ONLY} TO VEHICLE BODY USING POWER TOOL
• Utilize customized SYSTRAN translation engine,
automotive and Ford-specific terminology glossaries and
embedded tagging
• Future plans include additional parsing and tagging
information to improve translation accuracy
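Because a sentence can mix controlled language with embedded free-form text in braces, a first step is to separate the two so each can be handled differently. The sketch below does this with a regular expression. The function name and the segment labels are assumptions, and it does not show how the customized SYSTRAN engine or its tagging actually work.

```python
import re


def split_segments(sentence):
    """Split a sentence into ('controlled', text) and ('free', text) segments."""
    segments = []
    # The capture group keeps the {...} comments in the split result.
    for piece in re.split(r"(\{[^}]*\})", sentence):
        piece = piece.strip()
        if not piece:
            continue
        if piece.startswith("{") and piece.endswith("}"):
            segments.append(("free", piece[1:-1].strip()))
        else:
            segments.append(("controlled", piece))
    return segments


example = "SECURE BUMPER BRACKET {FOR LHS ONLY} TO VEHICLE BODY USING POWER TOOL"
for kind, text in split_segments(example):
    print(kind, "|", text)
```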
29
Machine Translation
Implementation in GSPAS
• Worked with Systran & Apptek to customize their
translation software for our requirements.
• Developed technical dictionaries that contain Ford
terminology with correct translation for each
language pair.
• Developed and integrated the translation process
into GSPAS.
• Developed a system to check and improve the
source text prior to translation
30
Translation Statistics
• Language pairs being translated:
English/German, English/Spanish,
English/Dutch, English/Portuguese, English/Spanish (Mexican), English/Turkish
• Ford-specific terminology in Standard Language:
over 5000 words, 13,000 noun phrases, over
1000 abbreviations and acronyms.
• Typically translate over 200,000 records each
month
• Over 10,000,000 records already translated.
31
GSPAS Translation Process
32
Standard Language Translation
Issues
• Sentence structure is not grammatical English (ROBOT
APPLY 50 MM TAPE-STRIPE)
• Ford terminology is complex and must be explicitly
translated as an entire phrase (INSULATION
ASSEMBLY BODY PILLAR)
• Use of abbreviations, misspellings, acronyms (ABS,
A.B.S)
• Use of compound verbs (PICK-AND-SPOON)
• Inverted phrase structure with modifiers (BODY PANEL
LRG)
• Embedded comments (LOAD BUMPER {LOWER} TO
VEHICLE)
33
Standard Language Translation
• Use of slang (“shotgun”)
• Articles are seldom used (HAMMER HAMMER).
• Need to handle “British” English as well as
“American” English. (terminology, use, spellings)
• Source text is incorrectly written and not
understandable.
• Punctuation is rarely used.
• Standard Language is always evolving and
needs to be maintained.
34
Uses of AI Technology
• Apply natural language processing (NLP) along
with knowledge representation and reasoning to
improve the source text
• Analyze the source text; utilize the ontology to
identify terminology
• Convert the source text to a more “translatable”
form by adding articles, replacing abbreviations,
improving grammar and punctuation
• Utilize XML tagging and ontology lookup to
improve the structure of free-form source text
35
Improving Translation Quality
• Process the source text prior to translation
(Standard Language pre-processor).
• Add articles before the nouns.
• Adjust the word order to deal with size modifiers
coming after nouns.
• Replace acronyms, synonyms with original
expanded text (ASY -> ASSEMBLY)
• Verify that punctuation is correct.
• Pre-process the embedded comments to
improve translation quality.
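Two of the steps above, expanding abbreviations and moving size modifiers in front of the noun phrase, can be sketched as simple token rewriting. This is an illustration under stated assumptions, not the Standard Language pre-processor itself: the word lists are tiny samples, the first word is assumed to be the verb, noun phrases are assumed to end at a preposition, and articles and embedded comments are not handled.

```python
# Small illustrative tables; ASY and LRG appear as examples in these slides.
ABBREVIATIONS = {"ASY": "ASSEMBLY", "LRG": "LARGE"}
SIZE_MODIFIERS = {"LARGE", "SMALL"}                           # assumed sample
PREPOSITIONS = {"TO", "FROM", "INTO", "USING", "WITH", "AT"}  # assumed sample


def preprocess(sentence):
    """Expand abbreviations and move trailing size modifiers before the noun phrase."""
    tokens = [ABBREVIATIONS.get(t, t) for t in sentence.upper().split()]
    out = []
    phrase_start = 1  # the noun phrase starts after the leading verb
    for tok in tokens:
        if tok in SIZE_MODIFIERS and len(out) > phrase_start:
            # Modifier follows its noun phrase: move it to the phrase start.
            out.insert(phrase_start, tok)
        else:
            out.append(tok)
            if tok in PREPOSITIONS:
                phrase_start = len(out)  # the next noun phrase begins here
    return " ".join(out)


print(preprocess("LOAD BODY PANEL LRG TO VEHICLE"))
print(preprocess("OBTAIN ASY FROM STOCK"))
```

With phrase boundaries taken from a real parse instead of a preposition list, the same reordering idea carries over to longer sentences.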
36
Issues with Machine Translation
Quality
• Localization issues (even with technical
terminology) – Spanish in Spain, Mexico,
Argentina, etc.
• Ensure that system correctly displays
special characters (umlaut, accents etc.)
• Have additional space available on screen
as target languages require more room
than English.
37
Conclusions
• Machine Translation is a cost-effective way to
translate information with high quality if you are
willing to customize the application to your
requirements
• Machine Translation is not an “out of the box”
solution
• Machine Translation accuracy can be greatly
improved by controlling and improving the
quality of the source text
38
Where are we going?
• Intelligent search w/ context and understanding
• Sharing of knowledge through ontologies
• Growth of user-defined knowledge
(folksonomies)
• Intelligent Dialog Systems – integration of
speech recognition w/ intelligent engines
(“Sync”)
• Automate the process of information retrieval
39
Questions
?????
40