Incorporating Metadata into Search
User Interfaces
Marti Hearst
UC Berkeley
March 2001
Main Ideas

Search is changing:
  More emphasis on flexibly showing next choices
  Less emphasis on ranking
Web design is changing:
  More emphasis on dynamically determined views
  Less emphasis on pre-determined links
Two key ideas:
  Task-specific design
  Harnessing the power of metadata
Outline

Background
  Search
  Metadata
  Information architecture
Two web-based examples
  Recipes
  Sporting Goods
Two collection-based examples
  Medical text
  Architectural Images
Conclusions
Web Search is Working!
Survey finds high user satisfaction
Study by NPD Group
From http://searchenginewatch.internet.com/reports/npd.html
Why is Web Search Working?


Web search is successful at finding good starting points (home pages)
Evidence:
  Search engines using
    Link analysis
    Page popularity
    Directory categories
  These all find dominant home pages
Consequences


Web search engines are providing source
selection!
So … what happens when the user
reaches the site?
Follow Links
… or …
Search
Following Hyperlinks


Works great when it is clear where to go
next
Frustrating when the desired directions
are undetectable or unavailable
Site Search

This is not getting good reviews
An Analogy
hypertext
text search
Analogy

Hypertext:
  A fixed number of choices of where to go next;
  A glance at the map tells you where you are;
  But may not go where you want to go.
    To get from Topeka to Santa Fe, may have to go through Frostbite Falls
Site Search:
  Can go anywhere;
  But may get stuck, disoriented, in a crevasse!
Goal: An All-Terrain Vehicle

The best of both techniques



A vehicle that magically lays down track to
suggest choices of where you want to go next
based on what you’ve done so far and what
you are trying to do
The tracks follow the lay of the land and go
everywhere, but cross over the crevasses
The tracks allow you to back up easily
How to make an all-terrain vehicle?
Two ideas:
Focus on the task.
Use metadata explicitly.
The Importance of the Task
Data-centric vs. Task-centric
  Searching patent databases vs. Proving non-infringement
  Browsing newsgroups vs. Finding the denial-of-service hacker
  Getting all recent news vs. Anticipating the competition
The Importance of the Task:
Indirect Evidence

How does Web page download time affect usability?
In one study, Jared Spool’s UIE team found (56kbit modem):
  Amazon: 36 sec/page (avg)
  About.com: 8 sec/page (avg)
Users rated the sites:
  Fastest: Amazon
  Slowest: About.com
Why?
The Importance of the Task

Perceived speed


Strong correlation between perceived speed
and whether the users felt they completed
their task
Strong correlation between perceived speed
and whether the users felt they always knew
what to do next (scent).
Metadata

Metadata is:


Data about data
Structures and languages for the description
of information resources and their elements
Thesauri (Categories)

A collection of selected vocabulary
  Broader, narrower, related-to relations
  Describe the content
Medical text
  Anatomy, Disease, Chemicals, Procedures…
Architectural images
  Location, Style, Materials, Period …
Recipes!
  Cuisine, Ingredients, Season, Calories …
These are often hierarchical and faceted
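A minimal sketch of how broader/narrower/related-to relations might be represented. The `Term` class, the helper names, and the sample terms are illustrative assumptions, not taken from any system shown in the talk.

```python
from dataclasses import dataclass, field

@dataclass
class Term:
    """One thesaurus entry (hypothetical structure)."""
    name: str
    broader: "Term | None" = None                  # parent term, if any
    narrower: list = field(default_factory=list)   # child terms
    related: list = field(default_factory=list)    # related-to terms

def add_narrower(parent, name):
    """Create a child term and link it in both directions."""
    child = Term(name, broader=parent)
    parent.narrower.append(child)
    return child

def path_to_root(term):
    """Follow broader links upward and render the hierarchy path."""
    names = []
    while term is not None:
        names.append(term.name)
        term = term.broader
    return " > ".join(reversed(names))

# One facet of a hypothetical recipe thesaurus.
cuisine = Term("Cuisine")
italian = add_narrower(cuisine, "Italian")
tuscan = add_narrower(italian, "Tuscan")
print(path_to_root(tuscan))
```

Each facet (Cuisine, Ingredients, and so on) would be its own tree of this kind; an item can then be assigned terms from several facets at once.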
New interfaces are mixing and
matching thesaurus-style metadata
GeoRegion + Time/Date + Topic + Role
The question: how to do this effectively?
What about Yahoo?
Let’s try to find UCB
Where is UCB?
Yahoo does use some metadata well

Yahoo restaurant guide combines:
  Region
  Topic (restaurants)
  Related Information
    Other attributes (cuisines)
    Other topics related in place and time (movies)

Yellow: geographic region
Green: restaurants & attributes
Red: related in place & time
Combining Information Types

Region
  State
    City
      A&E
        Film
        Theatre
        Music
        Restaurants
          California
          Eclectic
          Indian
          French
Assumed task: looking for evening entertainment
Other Possible Combinations






Region + A&E
City + Restaurant + Movies
City + Weather
City + Education: Schools
Restaurants + Schools
…
Bookstore preview combinations



topic + related topics
topic + publications by same author
topic + books of same type but related topic
Goals for Metadata Usage





Well-integrated with search
Provides useful hints of where to go next
Tailored to task as it develops
Personalized
Dynamic
Recipe Example
soar.berkeley.edu/recipes
www.epicurious.com
Epicurious Metadata Usage

Advantages





Creates combinations of metadata on the fly
Different metadata choices show the same information in
different ways
Previews show how many recipes will result
Easy to back up
Supports several task types



  “Help me find a summer pasta” (ingredient type with event type)
  “How can I use an avocado in a salad?” (ingredient type with dish type)
  “How can I bake sea-bass” (preparation type and ingredient type)
A View of Web Site Structure
(Newman et al. 00)

Information design
  structure, categories of information
Navigation design
  interaction with information structure
Graphic design
  visual presentation of information and navigation (color, typography, etc.)
Courtesy of Mark Newman
Information Architecture vs. UI
(Newman et al. 00)

Information Architecture
  includes management and more responsibility for content
User Interface Design
  includes testing and evaluation
Courtesy of Mark Newman
Recipe Information Architecture

Information design
  Recipes have five types of metadata categories
    Cuisine, Preparation, Ingredients, Dish, Occasion
    Each category has one level of subcategories

Recipe Information Architecture

Navigation design
  Home page:
    show top level of all categories
  Other pages:
    A link on an attribute ANDs that attribute to the current query; results are shown according to a category that is not yet part of the query
    A change-view link does not change the query, but does change which category’s metadata organizes the results
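The navigation rules above can be sketched roughly as follows. This is an illustration only, not Epicurious's implementation: the toy recipes, the three-category subset, and every function name are assumptions.

```python
from collections import defaultdict

# Hypothetical toy collection; a recipe may carry several values per category.
RECIPES = [
    {"title": "Pesto linguine", "Ingredients": {"pasta", "basil"},
     "Cuisine": {"Italian"}, "Dish": {"main"}},
    {"title": "Pasta salad", "Ingredients": {"pasta", "avocado"},
     "Cuisine": {"American"}, "Dish": {"salad"}},
    {"title": "Guacamole", "Ingredients": {"avocado"},
     "Cuisine": {"Mexican"}, "Dish": {"appetizer"}},
]
CATEGORIES = ["Ingredients", "Cuisine", "Dish"]

def and_attribute(query, category, value):
    """Following an attribute link ANDs (category, value) onto the query."""
    return query + [(category, value)]

def matches(recipe, query):
    """A recipe matches when it has every attribute in the query."""
    return all(value in recipe[category] for category, value in query)

def default_view(query):
    """Pick the first category that is not yet part of the query."""
    used = {category for category, _ in query}
    for category in CATEGORIES:
        if category not in used:
            return category
    return None

def grouped_results(query, view=None):
    """Group matching recipes by the view category.

    Passing `view` models a change-view link: the query is untouched and
    only the organizing category changes.
    """
    view = view or default_view(query)
    groups = defaultdict(list)
    for recipe in RECIPES:
        if matches(recipe, query):
            for value in sorted(recipe[view]) if view else ["all"]:
                groups[value].append(recipe["title"])
    return view, dict(groups)

query = and_attribute([], "Ingredients", "pasta")
print(grouped_results(query))               # organized by a category not in the query
print(grouped_results(query, view="Dish"))  # same query, different view
```

Keeping the query and the view as separate values is what lets a change-view link reorganize the results without narrowing them.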

Metadata usage in Epicurious
[Diagram sequence: each Recipe is indexed by the categories Ingredient, Dish, Cuisine, and Prepare. After the user selects an ingredient (I), the results can be grouped by Dish, Cuisine, or Prepare; a further selection adds to the query, and the remaining categories stay available for grouping.]
Metadata Usage in Epicurious



Can choose category types in any order
But categories never more than one level deep
And can never use more than one instance of a category
  Even though items may be assigned more than one of each category type
Items (recipes) are dead-ends
  Don’t link to “more like this”
Not fully integrated with search
Epicurious Metadata Usage
Problem: lacks integration with search
Sporting Goods Example
REI example
REI example -- searching
REI example

REI doesn’t seem to be “conscious” of its metadata
Doesn’t seem to be integrating the product metadata with the text information
  Don’t find search hits in “learn and share”
  Hard-codes relations
    Camping product attribute linked to the interior of a pre-coded page
  No “breadcrumbs”

DSG example

Seems to be doing many things right
  Extensive, dynamic use of metadata and query previews and postviews
But … maybe too much
  Complex relationship between search, information design, and navigation design
  Hitting against some strange edge cases
The FLAMENCO Project
FLexible Access using MEtadata in Novel COmbinations

Main goal:
  Perform systematic studies to determine how metadata should be incorporated into search
Answer questions such as:
  Given a set of user goals and a set of information with certain characteristics (size, inter-connectivity)
    How many metadata combinations to show?
    What level of detail to show?
    How best to preview and postview choices?
The FLAMENCO Project

Focusing on very large collections whose
items are not easily classified


Medical text, image databases
However, much should apply to website
design as well
Evaluation Methodology

Regression Test
  Select a set of tasks
    Use these throughout the evaluation
  Start with a baseline system
    Evaluate using the test tasks
  Add a feature
    Evaluate again
    Compare to baseline
    Only retain those changes that improve results
First: determine appropriate functionality
Later: Incorporate more sophisticated displays
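The regression-test procedure above might look like the loop below. Two assumptions are made here: a real study would score tasks with users, so `toy_score` and its feature effects are invented stand-ins, and each trial is compared against the best system retained so far (which starts as the baseline), which is only one reading of "compare to baseline".

```python
def evaluate(features, tasks, score_task):
    """Average score of a feature set over the fixed task set."""
    return sum(score_task(features, task) for task in tasks) / len(tasks)

def stepwise_evaluation(baseline, candidates, tasks, score_task):
    """Add one feature at a time; keep it only if the score improves."""
    kept = set(baseline)
    best = evaluate(kept, tasks, score_task)
    history = [("baseline", best, True)]
    for feature in candidates:
        trial = kept | {feature}
        score = evaluate(trial, tasks, score_task)
        improved = score > best
        history.append((feature, score, improved))
        if improved:
            kept, best = trial, score
    return kept, history

# Hypothetical stand-in for a user study: each feature has a fixed effect.
TOY_EFFECT = {"previews": 0.2, "breadcrumbs": 0.1, "animated_menus": -0.05}

def toy_score(features, task):
    return 0.5 + sum(TOY_EFFECT.get(f, 0.0) for f in features)

TASKS = ["find a summer pasta", "use an avocado in a salad", "bake sea-bass"]
kept, history = stepwise_evaluation(
    set(), ["previews", "breadcrumbs", "animated_menus"], TASKS, toy_score
)
for name, score, retained in history:
    print(name, round(score, 2), "kept" if retained else "dropped")
```

The task list stays fixed across every step, so a change in score can be attributed to the added feature rather than to a change in what was measured.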
Application to Biomedical Text
Asthma > Steroids
1. A steroid-induced acute psychosis in a child with asthma.
2. Management of steroid-dependent asthma with methotrexate.
Steroids
•Pregnanes
• Pregnadienes (5)
• Prednisone (5)
• Pregnenes
• Budesonide (4)
• Corticosterone (3)
Other Views
• Admin & Dosage (50)
• Drug Effects (20)
• Therapeutic Use (25)
• Risk Factors (4)
• More …
User Preferred
• Musculoskeletal (4)
•Drug Resistance (6)
•All Categories (99)
99 Documents: [Sort by author] [Sort by popularity] [Sort by Steroids] [Cluster]
1. Effect of short-course budesonide on the bone turnover of asthmatic children.
2. Effect of prednisone on response to influenza virus vaccine in asthmatic children.
…
Asthma > Steroids
1. A steroid-induced acute psychosis in a child with asthma.
2. Management of steroid-dependent asthma with methotrexate.
Steroids
•Pregnanes
Pregnadienes (5)
Prednisone (5)
• Pregnenes
Budesonide (4)
Corticosterone (3)
Other Views
• Admin & Dosage (50)
• Drug Effects (20)
• Therapeutic Use (25)
• Risk Factors (4)
• More …
User Preferred
• Musculoskeletal (4)
•Drug Resistance (6)
•All Categories (99)
99 Documents: [Sort by author] [Sort by popularity] [Sort by Steroids] [Cluster]
1. Effect of short-course budesonide on the bone turnover of asthmatic children.
2. Effect of prednisone on response to influenza virus vaccine in asthmatic children.
…
Asthma > Steroids > Admin & Dosage
1. Dosage levels for asthmatic steroids: A survey.
Steroids
•Pregnanes
Pregnadienes (3)
Prednisone (5)
Related Categories
•Inhalators (40)
•Emotional Effects (25)
•Preferred Suppliers (30)
User Preferred
• Musculoskeletal (0)
•Drug Resistance (2)
•All Categories (50)
50 Documents: [Sort by author] [Sort by popularity] [Sort by Dosage] [Cluster]
1. Optimal dosage levels for prednisone in the treatment of childhood asthma.
2. …
Other paths: back up and go forward
Asthma > Steroids
Asthma > Steroids > Budesonide
Asthma > Steroids > Budesonide > Huang
Asthma > Huang > Budesonide
Medical example

Use dynamic previews
  Allow user to select metadata in any order
  At each step, show different types of relevant metadata, based on prior steps and personal history; include # of documents
  Previews restricted to only those metadata types that might be helpful
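A rough sketch of dynamic previews with document counts, under stated assumptions: the document titles are taken from the mock-up, but the term assignments, the function names, and the ranking rule (by count, then alphabetically) are invented for illustration. Personal history is not modeled.

```python
from collections import Counter

# Titles come from the mock-up; the term assignments are hypothetical.
DOCS = [
    {"title": "Effect of short-course budesonide on the bone turnover of asthmatic children.",
     "terms": {"Asthma", "Steroids", "Budesonide", "Musculoskeletal"}},
    {"title": "Effect of prednisone on response to influenza virus vaccine in asthmatic children.",
     "terms": {"Asthma", "Steroids", "Prednisone"}},
    {"title": "Optimal dosage levels for prednisone in the treatment of childhood asthma.",
     "terms": {"Asthma", "Steroids", "Prednisone", "Admin & Dosage"}},
    {"title": "Management of steroid-dependent asthma with methotrexate.",
     "terms": {"Asthma", "Steroids", "Admin & Dosage"}},
]

def select(path, term):
    """Go forward: add a metadata term to the path (any order is allowed)."""
    return path + [term]

def back_up(path):
    """Go back one step by dropping the most recent term."""
    return path[:-1]

def results(path):
    """Documents that carry every term on the path."""
    return [doc for doc in DOCS if all(term in doc["terms"] for term in path)]

def previews(path, limit=5):
    """Terms that would narrow the current results, with document counts."""
    counts = Counter(
        term for doc in results(path) for term in doc["terms"] if term not in path
    )
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:limit]

path = select(select([], "Asthma"), "Steroids")
print(" > ".join(path), len(results(path)), "documents")
for term, count in previews(path):
    print(f"  {term} ({count})")
```

Because the previews are computed from the current result set, a term that matches nothing never appears, and the counts change as the path grows or is backed up.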
Dynamic Metadata Previews

How different from Yahoo & Amazon?
  Dynamically determine what to show next
    Yahoo’s combos are predefined
    Amazon’s are also predefined, and limited to taste and general topic only
A way to seamlessly integrate
  Related topics
  User preferences (personalization)
  Context-sensitivity
Application to Image Search
Image Search: What is the task?

Illustrate my slides?



“Find a crevasse”
Keyword match works pretty well
Find inspiration for an
architectural design?


General similarity: maybe
But more control might be better
How different from medical example?



More open-ended
Easier to scan many images quickly
Terrain metaphor not used here



Not narrowing down a large set
Rather, always viewing more images
A mechanism for “steering” through the
metadata
Our Approach

Architecture task:




Emphasize images over text
Use hypertext-style interface as a reasonable
baseline for comparison
Find out how much choice is too much
Find out whether explicit metadata is better
than implicit more-like-this
SPIRO:
>40,000 art &
architecture images
Detailed metadata
SPIRO Query Form
SPIRO query on Subject: church
A Better Example



Greatbuildings.com
Hyperlinks metadata together
But a small collection


~1000 buildings
~4500 images total
www.greatbuildings.com
Our Approach



Create a system that allows experimentation
with different interfaces
Add functionality in a stepwise fashion
Architecture task:




Emphasize images over text
Use hypertext-style interface as a reasonable
baseline for comparison
Find out how much choice is too much
Find out whether explicit metadata is better than
implicit more-like-this
Summary



Standard search is too flexible
Hyperlinks are too restrictive
Metadata is being mixed and matched in
interesting ways, but how is not well understood



In information structure
In navigation structure
In database design
Summary
Our goals

Systematically determine what works, with the following
emphases:





Task-centric
Integrate metadata with search
Dynamic previews
Easily retrace steps
Develop recommendations that reflect both the task
structure and the richness of the information structure
Conclusions


Search & hypertext are becoming more
interwoven
Metadata is being mixed and matched in
interesting ways, but how is not well understood



In information structure
In navigation structure
In database design
Conclusions
Our goals

Systematically determine what works, with the
following emphases:
Task-centric
 Integrate metadata with search
 Dynamic previews
 Easily retrace steps



Develop recommendations that reflect both the
task structure and the richness of the information
structure
In future: integrate with more sophisticated
displays