XML Data Management
XSLT
Werner Nutt
A Hello World! Stylesheet
Top-level: <xsl:stylesheet> element
with a version="1.0" attribute
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:output method="xml" encoding="utf-8" />
<xsl:template match="/">
<hello>world</hello>
</xsl:template>
</xsl:stylesheet>
Declarations
(all elements except
the <xsl:template> ones),
in this case just an <xsl:output>
Template rules
in this case a template
that applies to the root node
Invocation of an XSLT Stylesheet
An XSLT stylesheet may be invoked:
• Programmatically, through an XSLT library
• Through a command line interface.
• In a Web Publishing context, by including a styling processing
instruction in the XML document
<?xml version="1.0"?>
<?xml-stylesheet href="blabla.xsl"
type="application/xml"?>
<doc>
<blublu/>
</doc>
– the transformation can be processed on the server side
by a PHP, ASP, JSP, ... script
– or on the client side through the XSLT engines
integrated into most browsers.
Web Publishing with XSLT
[Figure: XML Document, XSLT Stylesheet, Network, HTML. The stylesheet is applied either before the network (server side) or after it (client side).]
Stylesheet Output
<xsl:output method="html"
encoding="iso-8859-1"
doctype-public="-//W3C//DTD HTML 4.01//EN"
doctype-system="http://www.w3.org/TR/html4/strict.dtd"
indent="yes" />
• method is either xml (default), html or text
• encoding is the desired encoding of the result
• doctype-public and doctype-system make it possible
to add a document type declaration to the resulting document
• indent specifies whether the resulting XML document will
be indented (default is no)
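As a small illustration of the method attribute, the following sketch emits plain text instead of XML. The input structure (recipe elements with a title child) is borrowed from the recipe examples later in these slides and is an assumption at this point.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- method="text": only text is serialized, no markup and no XML declaration -->
  <xsl:output method="text" encoding="utf-8"/>

  <xsl:template match="/">
    <xsl:for-each select="//recipe/title">
      <xsl:value-of select="."/>
      <!-- explicit line break after each title -->
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

With method="text" the doctype-* and indent attributes have no effect, since no markup is written.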
Handling Whitespace
<xsl:strip-space elements="*" />
<xsl:preserve-space elements="para poem" />
Both elements require a whitespace-separated list of name tests
as the value of their elements attribute.
• <xsl:strip-space> specifies the set of nodes whose
whitespace-only text child nodes will be removed
• <xsl:preserve-space> allows for exceptions to this list
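A minimal sketch of how the two declarations interact. The element names doc and poem are hypothetical; the input is assumed to be a doc element whose children are separated by line breaks and that contains a poem element holding only blanks.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="no"/>

  <!-- Strip whitespace-only text nodes inside every element ... -->
  <xsl:strip-space elements="*"/>
  <!-- ... except directly inside poem elements -->
  <xsl:preserve-space elements="poem"/>

  <xsl:template match="/">
    <!-- Count the text-node children that survive stripping -->
    <counts doc="{count(doc/text())}"
            poem="{count(doc/poem/text())}"/>
  </xsl:template>
</xsl:stylesheet>
```

Under these assumptions the doc count should be 0 (the line breaks between children are stripped), while the poem count should be 1 (its whitespace-only text node is preserved).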
The <xsl:template> Element
<xsl:template match="book">
The book title is:
<xsl:value-of select="title" />
<h2>Authors list</h2>
<ul>
<xsl:apply-templates select="authors/name" />
</ul>
</xsl:template>
A template consists of
• A pattern: an XPath expression (restricted) which
determines the nodes to which the template applies.
The pattern is the value of the “match” attribute.
• A body: an XML fragment (well-formed!) which is inserted in
the output document when the template is instantiated
XPath Patterns in XSLT
The XPath expression of the “match” attribute describes the nodes
that can be the target of a template instantiation.
Those expressions are called patterns and must satisfy:
• A pattern always denotes a node set.
Example: <xsl:template match='1'> is incorrect.
• Checking whether a node is matched must be easy
Example: <xsl:template match='preceding::*[12]'> is meaningful,
but difficult to evaluate.
Pattern syntax:
• A pattern is a valid XPath expression which uses only the child
and attribute (@) axes, and the abbreviation //. Predicates are allowed.
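The following sketch lists a few patterns that respect these restrictions. The element and attribute names (recipe, title, ingredient, amount, name) are taken from the recipe examples in these slides; the output element names are made up for illustration.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Child steps only: a title whose parent is a recipe -->
  <xsl:template match="recipe/title">
    <recipe-title><xsl:value-of select="."/></recipe-title>
  </xsl:template>

  <!-- The // abbreviation plus a predicate: an ingredient at any depth
       below a recipe that carries an amount attribute -->
  <xsl:template match="recipe//ingredient[@amount]">
    <measured><xsl:value-of select="@name"/></measured>
  </xsl:template>

  <!-- Attribute step: the name attribute of an ingredient -->
  <xsl:template match="ingredient/@name">
    <ingredient-name><xsl:value-of select="."/></ingredient-name>
  </xsl:template>

  <!-- Not a pattern: match="preceding::*[12]" uses an axis other than
       child or attribute in a step and is rejected by the processor -->
</xsl:stylesheet>
```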
Content of a Template Body
• Literal elements and text
Example: <h2>Authors</h2>.
Creates in the output document an element h2,
with a text child node 'Authors'.
• Values and elements from the input document
Example: <xsl:value-of select='title'/>.
Inserts in the output document the string value
of the result of the XPath expression title.
• Calls to other templates
Example: <xsl:apply-templates select='authors'/>.
Applies a template to each node in the node set result
of the XPath expression authors.
Only the basics of XSLT programming!
Many advanced features (modes, priorities, loops and tests)
lie beyond this core description.
Instantiation of an <xsl:template>
Main principles:
• Literal elements (those that don’t belong to the XSLT
namespace) and text are simply copied to the output
document.
• Context node: A template is always instantiated in the
context of a node from the input document.
• XPath expressions: all the (relative) XPath expressions
found in the template are evaluated with respect to the
context node (an exception: <xsl:for-each>).
• Calls with <xsl:apply-templates>: find and instantiate a
template for each node selected
by the XPath expression select.
• Template call substitution: any call to other templates
is eventually replaced by the instantiation
of these templates.
The <xsl:apply-templates> Element
<xsl:apply-templates
select="authors/name"
mode="top"
/>
• select: an XPath expression which, if relative,
is interpreted with respect to the context node.
Note: the default value is child::node(),
i.e., select all the children of the context node
• mode: a label which can be used to specify
which kind of template is required.
• priority: not an attribute of <xsl:apply-templates> itself;
it is set on <xsl:template> and gives a priority level
in case several templates match the same node.
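A minimal sketch of the mode attribute: the same recipe elements are processed twice, once for a table of contents and once for the main listing. The mode name toc and the HTML layout are illustrative assumptions.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <xsl:template match="/">
    <html>
      <body>
        <!-- First pass: only templates declared with mode="toc" are candidates -->
        <ul><xsl:apply-templates select="//recipe" mode="toc"/></ul>
        <!-- Second pass: only templates without a mode are candidates -->
        <xsl:apply-templates select="//recipe"/>
      </body>
    </html>
  </xsl:template>

  <xsl:template match="recipe" mode="toc">
    <li><xsl:value-of select="title"/></li>
  </xsl:template>

  <xsl:template match="recipe">
    <h1><xsl:value-of select="title"/></h1>
  </xsl:template>
</xsl:stylesheet>
```

The two recipe templates do not conflict, because a template is only considered when its mode equals the mode of the call.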
The <xsl:apply-templates> Mechanism
<xsl:template match="book">
<ul>
<xsl:apply-templates select="authors/name" />
</ul>
</xsl:template>
<xsl:template match="name">
<li><xsl:value-of select="." /></li>
</xsl:template>
<book>
...
<authors>
<name>Jim</name>
<name>Joe</name>
</authors>
</book>
<ul>
<li>Jim</li>
<li>Joe</li>
</ul>
The Execution Model of XSLT
An XSLT stylesheet consists of a set of templates.
The transformation of an input document I proceeds as follows:
1. The engine considers the root node r of I,
and selects the template that applies to r.
2. The template body is copied in the output document O.
3. Next, the engine considers all the <xsl:apply-templates> in O,
and evaluates their select XPath expressions,
taking r as context node.
4. For each node result of the XPath evaluation,
a template is selected,
and its body replaces the <xsl:apply-templates> call.
5. The process iterates,
as new <xsl:apply-templates> are inserted in O.
6. The transformation terminates when O is free of xsl: instructions.
The Execution Model: Illustration
“Return all Titles of Recipes”
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0"
      encoding="iso-8859-1" indent="yes"/>
  <xsl:template match="/">
    <titles>
      <!-- Iterate over all title elements -->
      <xsl:for-each select="//title">
        <!-- Copy each title -->
        <xsl:copy-of select="./self::*"/>
      </xsl:for-each>
    </titles>
  </xsl:template>
</xsl:stylesheet>
A Variant That Does Not Copy the Content
…
<xsl:template match="/">
<titles>
<xsl:for-each select="//title">
<xsl:copy/>
</xsl:for-each>
</titles>
</xsl:template>
…
• copy-of: produces a deep copy, i.e., copies a subtree
• copy: produces a shallow copy, i.e., copies one node
(plus optionally namespace info for elements)
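The difference can be seen side by side in the following sketch. The input, a single title element with an attribute and mixed content, is a made-up example.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="no"/>

  <!-- Assumed input: <title lang="en">Ricotta <em>Pie</em></title> -->
  <xsl:template match="/">
    <result>
      <xsl:apply-templates select="title"/>
    </result>
  </xsl:template>

  <xsl:template match="title">
    <!-- Deep copy: the element with its attributes and whole subtree,
         i.e. <title lang="en">Ricotta <em>Pie</em></title> -->
    <deep><xsl:copy-of select="."/></deep>
    <!-- Shallow copy: the element node only, i.e. <title/> -->
    <shallow><xsl:copy/></shallow>
  </xsl:template>
</xsl:stylesheet>
```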
Identity Map
Deep copy
<xsl:template match="/">
<xsl:copy-of select="*"/>
</xsl:template>
Shallow copy
<xsl:template match=" node() | @*">
<xsl:copy>
<xsl:apply-templates select="node() | @*"/>
</xsl:copy>
</xsl:template>
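The shallow-copy identity map is typically used as a base that more specific templates override. A minimal sketch, assuming the input contains comment elements (as the recipe documents in these slides do) that should be dropped while everything else is copied unchanged:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity rule: copy every node and attribute, then recurse -->
  <xsl:template match="node() | @*">
    <xsl:copy>
      <xsl:apply-templates select="node() | @*"/>
    </xsl:copy>
  </xsl:template>

  <!-- Override: the name test "comment" is more specific than node(),
       so this empty template wins and comment elements produce no output -->
  <xsl:template match="comment"/>
</xsl:stylesheet>
```

The deep-copy variant cannot be customized in this way, since <xsl:copy-of> never calls templates for the nodes it copies.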
Shallow Copy, with Explicit Text Copy
<xsl:template match="/">
  <titles>
    <xsl:for-each select="//title">
      <!-- Shallow copy of title -->
      <xsl:copy>
        <xsl:for-each select="text()">
          <!-- Shallow copy of title content -->
          <xsl:copy/>
        </xsl:for-each>
      </xsl:copy>
    </xsl:for-each>
  </titles>
</xsl:template>
Element Construction
<xsl:template match="/">
  <!-- Dynamic element constructor -->
  <xsl:element name="titles">
    <xsl:for-each select="//title">
      <xsl:copy>
        <xsl:for-each select="text()">
          <xsl:copy/>
        </xsl:for-each>
      </xsl:copy>
    </xsl:for-each>
  </xsl:element>
</xsl:template>
Copying Attributes
<xsl:template match="/">
  <xsl:element name="recipe-ingredients">
    <xsl:for-each select="//recipe[1]/ingredient/@name">
      <xsl:element name="ingredient">
        <!-- Copying attributes leads to element fusion -->
        <xsl:copy/>
      </xsl:element>
    </xsl:for-each>
  </xsl:element>
</xsl:template>
Constructing Attributes
<xsl:template match="/">
  <xsl:element name="recipe-ingredients">
    <xsl:for-each select="//recipe[1]/ingredient/@name">
      <xsl:element name="ingredient">
        <!-- Dynamic attribute constructor -->
        <xsl:attribute name="name" select="."/>
      </xsl:element>
    </xsl:for-each>
  </xsl:element>
</xsl:template>
• It’s called “dynamic” because element and attribute names
can be computed on the fly
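A minimal sketch of names computed on the fly: every child element of the first recipe is renamed by prefixing its name, and its attributes are renamed in the same way. The prefixes my- and orig- are made up for illustration.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <my-recipe>
      <xsl:apply-templates select="recipes/recipe[1]/*"/>
    </my-recipe>
  </xsl:template>

  <xsl:template match="recipe/*">
    <!-- The name attribute is an attribute value template:
         name() is evaluated for the current element -->
    <xsl:element name="my-{name()}">
      <!-- Attributes must be created before any child content -->
      <xsl:for-each select="@*">
        <xsl:attribute name="orig-{name()}">
          <xsl:value-of select="."/>
        </xsl:attribute>
      </xsl:for-each>
      <xsl:value-of select="."/>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
```

With literal result elements such as <my-title> this would not be possible, because their names are fixed when the stylesheet is written.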
Computing Attribute Values
<xsl:template match="/">
  <xsl:element name="recipe-ingredients">
    <xsl:for-each select="//recipe[1]/ingredient/@name">
      <ingredient name="{.}"/>
    </xsl:for-each>
  </xsl:element>
</xsl:template>
Expressions in {braces}
will be evaluated
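An attribute value may mix literal text with several expressions. A minimal sketch, assuming ingredient elements with amount, unit and name attributes as in the recipe examples; the output element item is hypothetical.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <items>
      <xsl:apply-templates select="//ingredient[@amount]"/>
    </items>
  </xsl:template>

  <xsl:template match="ingredient">
    <!-- Each {...} is replaced by the string value of its expression;
         text outside the braces is copied as is.
         Doubled braces {{ and }} produce literal braces. -->
    <item label="{@amount} {@unit} of {@name}"
          nested="{count(ingredient)}"
          note="{{not evaluated}}"/>
  </xsl:template>
</xsl:stylesheet>
```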
Nested Iteration with <xsl:for-each>
<xsl:template match="/">
  <my-recipes>
    <xsl:for-each select=".//recipe">
      <!-- Turn recipe titles into attributes -->
      <my-recipe title="{title}">
        <xsl:for-each select="ingredient">
          <!-- Turn ingredient names into strings -->
          <my-ingredient>
            <xsl:value-of select="@name"/>
          </my-ingredient>
        </xsl:for-each>
      </my-recipe>
    </xsl:for-each>
  </my-recipes>
</xsl:template>
As Before, with 2 Levels of Ingredients
<xsl:template match="/">
  <my-recipes>
    <xsl:for-each select=".//recipe">
      <my-recipe title="{title}">
        <!-- Level 1 -->
        <xsl:for-each select="ingredient">
          <my-ingredient>
            <xsl:value-of select="@name"/>
            <!-- Level 2 -->
            <xsl:for-each select="ingredient">
              <my-ingredient>
                <xsl:value-of select="@name"/>
              </my-ingredient>
            </xsl:for-each>
          </my-ingredient>
        </xsl:for-each>
      </my-recipe>
    </xsl:for-each>
  </my-recipes>
</xsl:template>
Nested Iteration with Template Calls
<!-- Root calls recipes -->
<xsl:template match="/">
  <xsl:apply-templates select="recipes"/>
</xsl:template>
<!-- Recipes calls recipe -->
<xsl:template match="recipes">
  <my-recipes>
    <xsl:apply-templates select="recipe"/>
  </my-recipes>
</xsl:template>
Nested Iteration with Template Calls (cont'd)
<!-- Recipe calls ingredient -->
<xsl:template match="recipe">
  <my-recipe title="{title}">
    <xsl:apply-templates select="ingredient"/>
  </my-recipe>
</xsl:template>
<!-- Ingredient calls ingredient -->
<xsl:template match="ingredient">
  <ingredient>
    <name>
      <xsl:value-of select="@name"/>
    </name>
    <xsl:apply-templates select="ingredient"/>
  </ingredient>
</xsl:template>
Sorted List of All Ingredients
<xsl:template match="/">
<result>
<xsl:apply-templates select="/recipes"/>
</result>
</xsl:template>
<xsl:template match="recipes">
<xsl:for-each select="recipe//ingredient">
<xsl:sort select="@name" />
<ingredient>
<xsl:value-of select="@name"/>
</ingredient>
</xsl:for-each>
</xsl:template>
<xsl:sort> can be nested
into <xsl:for-each>
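Several <xsl:sort> elements may be given; the first is the primary key, the next ones break ties. A minimal sketch, assuming each recipe has a title child and a nutrition child with a calories attribute, as in the recipe examples of these slides:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <result>
      <xsl:for-each select="recipes/recipe">
        <!-- Primary key: calories as numbers, highest first.
             Without data-type="number" the values are compared as strings -->
        <xsl:sort select="nutrition/@calories"
                  data-type="number" order="descending"/>
        <!-- Secondary key: title, alphabetically -->
        <xsl:sort select="title"/>
        <recipe calories="{nutrition/@calories}">
          <xsl:value-of select="title"/>
        </recipe>
      </xsl:for-each>
    </result>
  </xsl:template>
</xsl:stylesheet>
```

The <xsl:sort> elements must come first inside <xsl:for-each>, before any output is produced.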
Sorting: Does This Work?
<xsl:template match="/">
  <result>
    <xsl:apply-templates select="//ingredient"/>
  </result>
</xsl:template>
<!-- We want to sort all ingredients by name -->
<xsl:template match="ingredient">
  <xsl:for-each select=".">
    <xsl:sort select="@name" />
    <ingredient>
      <xsl:value-of select="@name"/>
    </ingredient>
  </xsl:for-each>
</xsl:template>
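The stylesheet above does not sort: the ingredient template is instantiated once per ingredient, in document order, and each <xsl:for-each select="."> then sorts a set containing a single node. One possible repair, shown here as a sketch, is to move the <xsl:sort> into the <xsl:apply-templates> call, so that the whole node set is ordered before the templates are applied.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <result>
      <!-- The selected ingredients are sorted first,
           then processed in sorted order -->
      <xsl:apply-templates select="//ingredient">
        <xsl:sort select="@name"/>
      </xsl:apply-templates>
    </result>
  </xsl:template>

  <xsl:template match="ingredient">
    <ingredient>
      <xsl:value-of select="@name"/>
    </ingredient>
  </xsl:template>
</xsl:stylesheet>
```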
Exercise: Restructuring Recipes
Return a list, inside an element <recipes>, of recipes,
containing for every recipe the recipe’s title element and
an element with the number of calories.
Use different approaches:
(a) express iteration by recursive calls of templates
(b) express iteration by <xsl:for-each> elements.
Create new elements
(a) by explicit construction, that is by writing the tags into the
code,
(b) by dynamic construction, that is, by using <xsl:element>
and <xsl:attribute> elements,
(c) by shallow and deep copying, wherever the latter is
possible.
Ordered Output
Using iteration by recursion, return a similar list,
alphabetically ordered according to title.
Using iteration by means of <xsl:for-each>,
return a similar list, ordered according to calories
in descending order.
Element and Attribute Construction
Return a similar list, with title as attribute and
calories as content.
Return a list, inside an element <recipes>, of recipes,
where each recipe contains the title and the top level
ingredients, while dropping the lower level ingredients.
XSLT for Recipes (1/6)
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html"
encoding="iso-8859-1" indent="yes"/>
<xsl:template match="/">
<html>
<head>
<title>My Best Recipes</title>
</head>
<body> <table border="1">
<xsl:apply-templates select="recipes/recipe"/>
</table>
</body>
</html>
</xsl:template>
XSLT for Recipes (2/6)
<xsl:template match="recipe">
<tr>
<td>
<h1><xsl:value-of select="title"/></h1>
<ul><xsl:apply-templates
select="ingredient"/></ul>
<xsl:apply-templates select="preparation"/>
<xsl:apply-templates select="comment"/>
<xsl:apply-templates select="nutrition"/>
</td>
</tr>
</xsl:template>
XSLT for Recipes (3/6)
<xsl:template match="ingredient">
<xsl:choose>
<xsl:when test="@amount">
<li>
<xsl:if test="@amount!='*'">
<xsl:value-of select="@amount"/>
<xsl:text> </xsl:text>
<xsl:if test="@unit">
<xsl:value-of select="@unit"/>
<xsl:if test="number(@amount)>number(1)">
<xsl:text>s</xsl:text>
</xsl:if>
<xsl:text> of </xsl:text>
</xsl:if>
</xsl:if>
<xsl:text> </xsl:text>
<xsl:value-of select="@name"/>
</li>
</xsl:when>
XSLT for Recipes (4/6)
<xsl:otherwise>
<li><xsl:value-of select="@name"/></li>
<ul><xsl:apply-templates select="ingredient"/></ul>
<xsl:apply-templates select="preparation"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
XSLT for Recipes (5/6)
<xsl:template match="preparation">
<ol><xsl:apply-templates select="step"/></ol>
</xsl:template>
<xsl:template match="step">
<li><xsl:value-of select="node()"/></li>
</xsl:template>
<xsl:template match="comment">
<ul>
<li type="square">
<xsl:value-of select="node()"/>
</li>
</ul>
</xsl:template>
XSLT for Recipes (6/6)
<xsl:template match="nutrition">
<table border="2">
<tr>
<th>Calories</th><th>Fat</th><th>Carbohydrates</th><th>Protein</th>
<xsl:if test="@alcohol"><th>Alcohol</th></xsl:if>
</tr>
<tr>
<td align="right"><xsl:value-of select="@calories"/></td>
<td align="right"><xsl:value-of select="@fat"/></td>
<td align="right"><xsl:value-of
select="@carbohydrates"/></td>
<td align="right"><xsl:value-of
select="@protein"/></td>
<xsl:if test="@alcohol">
<td align="right"><xsl:value-of select="@alcohol"/></td>
</xsl:if>
</tr>
</table>
</xsl:template>
A Different View
<xsl:template match="/">
<nutrition>
<xsl:apply-templates select="recipes/recipe"/>
</nutrition>
</xsl:template>
<xsl:template match="recipe">
  <dish name="{title/text()}"
      calories="{nutrition/@calories}"
      fat="{nutrition/@fat}"
      carbohydrates="{nutrition/@carbohydrates}"
      protein="{nutrition/@protein}"
      alcohol="{if (nutrition/@alcohol)
                then nutrition/@alcohol else '0%'}"/>
</xsl:template>
The Output
<?xml version="1.0" encoding="iso-8859-1"?>
<nutrition>
<dish name="Beef Parmesan with Garlic Angel Hair Pasta"
calories="1167" fat="23" carbohydrates="45"
protein="32" alcohol="0%"/>
<dish name="Ricotta Pie"
calories="349" fat="18" carbohydrates="64"
protein="18" alcohol="0%"/>
<dish name="Linguine alla Pescatora"
calories="532" fat="12" carbohydrates="59"
protein="29" alcohol="0%"/>
<dish name="Zuppa Inglese"
calories="612" fat="49" carbohydrates="45"
protein="4" alcohol="2"/>
<dish name="Cailles en Sarcophages"
calories="1892" fat="33"
carbohydrates="28" protein="39" alcohol="0%"/>
</nutrition>
A Sorted List of Ingredients w/o Duplicates
<xsl:template match="recipes">
  <xsl:for-each select="recipe//ingredient">
    <xsl:sort select="@name" />
    <!-- We ensure that only ingredients are output
         that have not appeared before -->
    <xsl:if test="not(@name = preceding::ingredient/@name)">
      <ingredient>
        <xsl:copy-of select="@name"/>
      </ingredient>
    </xsl:if>
  </xsl:for-each>
</xsl:template>
This test for duplicates can be expensive!
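In XSLT 2.0 the distinct-values() function offers a shorter formulation that avoids scanning the preceding axis for every ingredient. The following is a sketch of this alternative, not the method developed on the next slides; it iterates over attribute values (strings) rather than over ingredient nodes.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <result>
      <xsl:apply-templates select="/recipes"/>
    </result>
  </xsl:template>

  <xsl:template match="recipes">
    <!-- distinct-values() returns each name once, as an atomic value -->
    <xsl:for-each select="distinct-values(recipe//ingredient/@name)">
      <xsl:sort select="."/>
      <!-- Inside the loop the context item is a string, not a node -->
      <ingredient name="{.}"/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Since the context item is no longer a node, path expressions such as ancestor::recipe cannot be used inside this loop.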
Duplicate Eliminations à la Muench *
Step 1: Construct keys (= indices for node sets)
<xsl:key name="ingredients-by-name"
    match="ingredient"
    use="@name"/>
<xsl:key name="recipes-by-title"
    match="recipe"
    use="title"/>
<xsl:key> is a top-level element that declares a named key
that can be used in the style sheet with the key() function.
• name: name of the key
• match: the node set to be indexed
• use: the index key values
Note: A key does not have to be unique!
* Invented by Steve Muench, called the “Muenchian Method” in the XSLT world
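Before using keys for duplicate elimination, it may help to see a plain lookup. The sketch below retrieves all ingredient elements indexed under one value and copies the titles of the recipes they belong to. The lookup value 'sugar' and the output element name are assumptions made for illustration.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:key name="ingredients-by-name"
      match="ingredient"
      use="@name"/>

  <xsl:template match="/">
    <recipes-with-sugar>
      <!-- key(name, value) returns all nodes indexed under that value,
           here every ingredient whose name attribute equals 'sugar' -->
      <xsl:for-each select="key('ingredients-by-name', 'sugar')">
        <xsl:copy-of select="ancestor::recipe/title"/>
      </xsl:for-each>
    </recipes-with-sugar>
  </xsl:template>
</xsl:stylesheet>
```

Because a key value need not be unique, a recipe that lists the same ingredient twice contributes its title twice in this sketch.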
Duplicate Eliminations à la Muench
Step 2: Iterate over the recipes …
<xsl:template match="/">
<result>
<xsl:apply-templates select="/recipes"/>
</result>
</xsl:template>
Duplicate Elimination à la Muench
<xsl:template match="recipes">
  <!-- select those ingredients that occur as the first element
       in their index group; the others are redundant … -->
  <xsl:for-each select="recipe//ingredient
      [count(. | key('ingredients-by-name',
                     @name)[1]) = 1]">
    <!-- sort ingredients by name,
         then retrieve recipes from their index -->
    <xsl:sort select="@name" />
    <ingredient>
      <xsl:copy-of select="@name"/>
      <xsl:for-each select="key('recipes-by-title',
                                ancestor::recipe/title)">
        <xsl:copy>
          <xsl:copy-of select="title"/>
        </xsl:copy>
      </xsl:for-each>
    </ingredient>
  </xsl:for-each>
</xsl:template>
Grouping in XSLT 2.0
<xsl:template match="/">
<uses>
<xsl:for-each-group select="//ingredient"
group-by="@name">
<xsl:sort select="@name"/>
<use name="{current-grouping-key()}"
count="{count(current-group())}"/>
</xsl:for-each-group>
</uses>
</xsl:template>
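Besides current-grouping-key(), the members of each group are available through current-group(). A sketch that lists, for every ingredient name, the recipes using it; the output element names are made up, and the ancestor::recipe/title path assumes the recipe structure used throughout these slides.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <uses>
      <xsl:for-each-group select="//ingredient" group-by="@name">
        <xsl:sort select="current-grouping-key()"/>
        <use name="{current-grouping-key()}">
          <!-- current-group() holds all ingredients sharing this name -->
          <xsl:for-each select="current-group()">
            <recipe>
              <xsl:value-of select="ancestor::recipe/title"/>
            </recipe>
          </xsl:for-each>
        </use>
      </xsl:for-each-group>
    </uses>
  </xsl:template>
</xsl:stylesheet>
```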
countries.dtd
<!ELEMENT countries (country*)>
<!ELEMENT country (city*, language*)>
<!ATTLIST country
  name       CDATA #REQUIRED
  population CDATA #REQUIRED
  area       CDATA #REQUIRED>
<!ELEMENT city (name, population)>
<!ELEMENT language (#PCDATA)>
<!ATTLIST language
percentage CDATA #REQUIRED>
<!ELEMENT name (#PCDATA)>
<!ELEMENT population (#PCDATA)>
Queries About Countries: Example 1
Restructure the document by
• listing countries according to population,
• cities within each country according to population, and
• languages within each country according to percentage.
Restructuring the Countries Document (1)
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0"
encoding="iso-8859-1" indent="yes"/>
<xsl:template match="/">
<xsl:apply-templates select="countries"/>
</xsl:template>
<xsl:template match="countries">
<xsl:copy>
<xsl:apply-templates select="country">
<xsl:sort select="@population" order="descending"
data-type="number"/>
</xsl:apply-templates>
</xsl:copy>
</xsl:template>
Restructuring the Countries Document (2)
<xsl:template match="country">
<xsl:copy>
<xsl:copy-of select="@*"/>
<xsl:apply-templates select="city">
<xsl:sort select="population"
order="descending"
data-type="number"/>
</xsl:apply-templates>
<xsl:apply-templates select="language">
<xsl:sort select="@percentage"
order="descending"
data-type="number"/>
</xsl:apply-templates>
</xsl:copy>
</xsl:template>
Restructuring the Countries Document (3)
<xsl:template match="city">
<xsl:copy-of select="."/>
</xsl:template>
<xsl:template match="language">
<xsl:copy-of select="."/>
</xsl:template>
</xsl:stylesheet>
Compare the XQuery
let $doc := doc("countries.xml")
let $cs := $doc//country
return
  element countries
    {for $c in $cs
     order by number($c/@population) descending
     return
       element country
         {$c/@*,
          for $city in $c/city
          order by number($city/population) descending
          return
            $city,
          for $l in $c/language
          order by number($l/@percentage) descending
          return
            $l
         }
    }
Languages Spoken in Countries
Return a list of language elements, alphabetically sorted,
where each language element contains
– a list of country elements,
– such that the language is spoken in the country,
– together with the number of speakers of the language
in that country.
Difficulties:
• eliminate duplicates among languages
• retrieve the countries where the language is spoken
→ how can we remember that language?
Languages Spoken in Countries: XQuery
let $doc := doc("countries.xml")
let $ls := distinct-values($doc//language)
let $cs := $doc//country
return
  <languages>
    {(: language is remembered in variable $l :)
     for $l in $ls order by $l
     return
       <language>
         {attribute name {$l}}
         {for $c in $cs[language=$l] order by $c/@name
          return
            <country>
              {$c/@name}
              {attribute speakers {xs:int(($c/@population div 100)
                 * $c/language[.=$l]/@percentage) }}
            </country>}
       </language>}
  </languages>
Languages Spoken in Countries: XSLT
<!-- named template -->
<xsl:template name="top" match="/">
<xsl:element name="languages">
<xsl:apply-templates select=".//language">
<xsl:sort select="text()"
order="ascending"
data-type="text"/>
</xsl:apply-templates>
</xsl:element>
</xsl:template>
Languages Spoken in Countries: XSLT
<xsl:template name="language" match="language">
  <!-- eliminate duplicates;
       why not "not( . = preceding::language)"? -->
  <xsl:if test="not(text() = preceding::language)">
    <xsl:copy>
      <xsl:attribute name="name" select="text()"/>
      <!-- calling a named template -->
      <xsl:call-template name="country-with-language">
        <!-- adding a parameter to the call -->
        <xsl:with-param
            name="language" select="."/>
      </xsl:call-template>
    </xsl:copy>
  </xsl:if>
</xsl:template>
Languages Spoken in Countries: XSLT
<xsl:template name="country-with-language">
  <!-- makes the parameter of the call available -->
  <xsl:param name="language"/>
  <!-- context node = context node at call (no matching);
       $language is a reference to the parameter -->
  <xsl:for-each
      select="//country[language=$language]">
    <xsl:copy>
      <xsl:copy-of
          select="@name"/>
      <xsl:attribute name="speakers"
          select="format-number(
                    (@population div 100)
                    * language[.=$language]/@percentage,'0')"/>
    </xsl:copy>
  </xsl:for-each>
</xsl:template>
With format-number we can specify number formats,
e.g., '0' indicates digit notation.
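A few further format patterns, as a sketch. The numbers are arbitrary sample values and the output element names are made up; the results given in the comments assume the default decimal format.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <formats>
      <!-- '0' always shows a digit, '#' shows a digit only if needed,
           ',' is the grouping separator: yields 1,234.57 -->
      <grouped>
        <xsl:value-of select="format-number(1234.567, '#,##0.00')"/>
      </grouped>
      <!-- Leading zeros pad to three digits: yields 007 -->
      <padded>
        <xsl:value-of select="format-number(7, '000')"/>
      </padded>
      <!-- '%' multiplies by 100 and appends the sign: yields 25% -->
      <percent>
        <xsl:value-of select="format-number(0.25, '0%')"/>
      </percent>
    </formats>
  </xsl:template>
</xsl:stylesheet>
```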