Search for the Enterprise:
Creating Findability
Jean Bedord
Findability and Search Consulting
EContent Strategies
Gilbane Conference San Francisco
June 18-20, 2008
In the Beginning
• Research reports at
Lockheed - 40 years ago
• First search service,
Dialog Information
Services
Same Problem
• More content, more data, more digital assets…
Findability is Answers
• Not so easy…
• Digital content needs metadata
Metadata Varies by Asset…
• Books - Cataloging
• Author’s Name
• Title of the Book
• ISBN, etc.
• Library of Congress
or Dewey Decimal
number
• Subject Descriptors
Periodical Databases
• Publications
• Title of article
• Author’s Name
• Publication Name, ISSN
• Publication date
• Abstract / Full Text
• Subject Indexing
Collection Standardization?
• Music – Cataloging
• Title of work
• Lyric writer
• Song writer
• Performer
• Instrument
• Categorization of musical style
Visual Images / Video
• Pictures – Archives / Flickr / YouTube
• Subject
• Title of the Work
• Photographer
• Location & Date
• User Generated Tags
• Generally poor search implementation
• Rights to use / remix
Corporate Assets
• Intellectual Property
• Patents
• Engineering drawings
• Manufacturing processes
• Customer Information
• Purchasing History
• Agreements
• Product Information
Vocabulary Mismatch
• Missing Metadata
• Categorization by domain
EXPERT
The Problem:
• Provide answers for the nonspecialist
Missing Metadata
• Full text indexing and search is default
• Adding subject terms not obvious
• Microsoft Office
• Adobe
• Ideal – add standard descriptive terms
as part of work flow
The Synonym Problem
• Different words
• Different languages
• Different contexts
• Order of words
Failed searches
Intent? Choosing search terms
• Expert vs. non-expert
• Native language
• Regional dialects
• Academic training
• Industry / Technology
• Generation
Vocabularies – Key to Metadata
• Folksonomies
• No consistency, limited
• Controlled vocabularies
• Better consistency
• Lack relationships
• Taxonomy / thesaurus / ontology
• Relationships
• Key is flexibility with multiple related terms
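A minimal sketch of what "relationships" adds over a flat controlled vocabulary: each concept carries a preferred term plus synonyms, broader terms, and related terms, and any of its terms resolves to the same concept. The class names and the "legal services" broader term are illustrative assumptions, not taken from any specific product.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """One thesaurus entry: a preferred term and its relationships."""
    preferred: str
    synonyms: set = field(default_factory=set)
    broader: set = field(default_factory=set)
    related: set = field(default_factory=set)


class Thesaurus:
    """Hypothetical in-memory thesaurus keyed by preferred term."""

    def __init__(self):
        self._concepts = {}
        self._index = {}  # lowercased term -> preferred term

    def add(self, concept):
        self._concepts[concept.preferred] = concept
        # Index the preferred term and every synonym to the same concept.
        for term in {concept.preferred, *concept.synonyms}:
            self._index[term.lower()] = concept.preferred

    def lookup(self, term):
        """Return the concept for any known term, or None if unknown."""
        preferred = self._index.get(term.strip().lower())
        return self._concepts.get(preferred) if preferred else None


thesaurus = Thesaurus()
thesaurus.add(Concept(
    "attorneys",
    synonyms={"lawyers", "law firm"},
    broader={"legal services"},  # assumed broader term, for illustration only
))
```

Looking up "Lawyers" or "law firm" returns the same concept as "attorneys", which is the flexibility a folksonomy or flat term list does not provide.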
Multiple Search Engines
• Enterprises – 4 to 5
search software vendors
• Departmental solutions
• Ideal – same metadata
structure
• Workable – shared core
of descriptors
WAND INC. Case Study
• Product and Service Taxonomies
• Uses numeric codes for preferred terms, e.g.
148087 is Agricultural Equipment
• Synonyms, related terms, broader terms
• Terms translated into 11 languages
• Harmonized with SIC, NAICS, Harmonized
Codes and Yellow Page headings
• Over 82,000 terms with 1.3M associated
attributes covering multiple industries
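The numeric-code approach can be sketched roughly as follows: every alternate term, in any language, resolves to one language-neutral code, and content is tagged with the code rather than with a word. Only the code 148087 and its label come from the slide; the alternate terms and the function name are hypothetical and do not reflect WAND's actual data or API.

```python
# Preferred terms keyed by numeric code (this one entry is from the slide).
PREFERRED = {148087: "Agricultural Equipment"}

# Hypothetical synonyms and translations, for illustration only.
ALTERNATE_TERMS = {
    "farm equipment": 148087,
    "farm machinery": 148087,
    "equipo agrícola": 148087,
}


def code_for(term):
    """Resolve a user term to its numeric code, or None if unknown."""
    key = " ".join(term.lower().split())  # lowercase, collapse whitespace
    for code, label in PREFERRED.items():
        if label.lower() == key:
            return code
    return ALTERNATE_TERMS.get(key)
```

Because matching happens on the code, adding a language or a synonym means adding a row to the lookup table, not re-tagging the content.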
WAND Current Applications
• Insurance Adjustors – Replacement
of Consumer Electronics
• Intellectual / Business Assets for
Sale & Licensing
• Yellow Pages
• Vertical Business Directories
Yellow Page Business Model
• Businesses pay to be included
• Need high consumer web traffic
• Approximately 50% search failure rate
• Failed searches problematic
• Buyer frustration
• Advertiser loses value
Search Problems
• Plurals – similar results
• book / books
• Spacing – very different results
• video conferencing / videoconferencing
• text books / textbooks
• Morphological variations – results vary
• wedding planning catering / wedding planning caterer
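One way to blunt the plural and spacing problems is to compare queries on a normalized key rather than on the raw string. The sketch below is an assumption about how that might look, not how any directory actually does it: it lowercases, strips a naive trailing "s" from the last word, and removes spaces. It deliberately does not handle morphological variants such as catering / caterer, which need a stemmer or a vocabulary.

```python
import re


def match_key(query):
    """Build a comparison key that ignores case, spacing, and simple plurals."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return ""
    last = words[-1]
    # Naive plural handling: drop one trailing "s" (but not "ss") on the last word.
    if len(last) > 3 and last.endswith("s") and not last.endswith("ss"):
        words[-1] = last[:-1]
    # Joining without spaces makes "text books" and "textbooks" compare equal.
    return "".join(words)
```

The key is only meant for detecting that two queries are spelling variants of each other; it discards word boundaries, so it should not replace the original query text in the index.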
Search Synonym Problem
• Attorneys / lawyers / law firm
• slr / single lens reflex
• Cellular phones / cell phone / cellulars / mobiles
• Geographic Information Systems / GIS
• Global Positioning Systems / GPS
• Recreational vehicles / RVs
Results varied greatly between synonyms
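A simple mitigation is query expansion with synonym rings: when a query matches any member of a ring, the search runs against every member. The rings below reuse the term pairs listed on this slide; the function name and the exact-match lookup are simplifying assumptions for illustration.

```python
# Each set is a synonym ring built from the examples on the slide.
SYNONYM_RINGS = [
    {"attorneys", "lawyers", "law firm"},
    {"slr", "single lens reflex"},
    {"cellular phones", "cell phone", "cellulars", "mobiles"},
    {"geographic information systems", "gis"},
    {"global positioning systems", "gps"},
    {"recreational vehicles", "rvs"},
]


def expand(query):
    """Return all terms to search for, given the user's query."""
    key = " ".join(query.lower().split())  # lowercase, collapse whitespace
    for ring in SYNONYM_RINGS:
        if key in ring:
            return sorted(ring)
    return [key]  # unknown term: search for it as typed
```

With expansion in place, "attorneys" and "lawyers" retrieve the same result set, so the outcome no longer depends on which synonym the user happened to choose.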
Takeaways
• Metadata varies by asset
• Synonyms are a major cause of search
failure
• Expert vs. non-expert
• Alternate terms
• Vocabularies needed to improve findability!
More on Findability…
• Contact
Jean Bedord
408-257-9221
www.EContentStrategies.com
[email protected]
• Questions?