Native XML
Support in
DB2 9 for z/OS
Phil Grainger
CA
Agenda
> Introduction
> What exactly IS XML?
> DB2 9 XML storage
> DB2 9 XML processing
> Further thoughts on XML and DB2
> Bibliography
2
What IS XML?
> eXtensible Markup Language
> Self describing data storage/transport
> Vendor and platform independent
 Eg RSS feeds
 Podcasts
> Can contain structured, unstructured or a mix of data
3
An example of XML
> XML consists of a series of nodes which form a hierarchy
> Neither the names nor the contents of the nodes are
predefined

This is why it’s termed “extensible”
> A node is enclosed between <nodename> and
</nodename> tags

Windows Word has a neat way of showing this
> I’ll use some XML borrowed from the Sky television news
feed

RSS feeds are a great example of XML usage
> Please note that the screenshots are only the FIRST PART
of the XML

4
So some </end> tags are missing
An example of “raw” XML
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
<channel>
<title>Sky News | Strange News | First For Breaking News</title>
<link>http://news.sky.com/skynews/strangebuttrue</link>
<image>http://static.sky.com/images/skynews/rss/rss.gif
<title>Sky News</title>
<url>http://static.sky.com/images/skynews/rss/rss.gif</url>
<link>http://news.sky.com/</link>
</image>
<description>Sky News Strange But True</description>
<language>en-us</language>
<copyright>Copyright 2007, BSKYB. All Rights Reserved.</copyright>
<lastBuildDate>Thu, 02 Aug 2007 11:22:30 GMT</lastBuildDate>
<category>Sky News</category>
<ttl>60</ttl>
<channelLinks>
<link name="Uk News" url="http://news.sky.com/skynews/uknews"/>
<link name="World" url="http://news.sky.com/skynews/worldnews"/>
<link name="Money" url="http://news.sky.com/skynews/money"/>
<link name="Business" url="http://news.sky.com/skynews/business"/>
</channelLinks>
<item>
<title><![CDATA[World's Cheekiest Burglar Hunted On Facebook]]></title>
<link>http://news.sky.com/skynews/article/0,,30100-1278207,00.html?f=rss</link>
<description><![CDATA[A disgruntled homeowner has fallen victim to possibly the
cheekiest burglar in the world - and has now turned to social networking website Facebook to track him
down.]]></description>
<enclosure url="" length="123456" type="image/gif" height="45"
width="95"><![CDATA[]]></enclosure>
</item>
5
Or as Microsoft Word shows it
6
Or an alternative Word view
> Showing how all the elements and
nodes are related
> Also this makes the hierarchical
nature of XML even more obvious
7
So we could draw it as a hierarchy!
Hey
Does this look
like IMS to you?
So, who
defines the
format?
rss
In our
example
this is
repeating
channel
ttile
8
link
image
description
title
url
link
language
copyright
lastbuilddate
category
ttl
*
channellinks
link
item
title
link
description
enclosure
XML Schemas
> We’ve already seen that XML is infinitely extensible
> Does this mean anarchy?
> It can
> But there are also things called “XML Schemas”
 A schema defines what can appear in an XML document
 A well formed XML document can still violate an XML
schema
9
DB2 9 for z/OS support for XML
> DB2 9 provides a new XML datatype
> An entire XML document can be stored in a single XML
column
 So one column of one row in a table has one complete
document
10
DB2 9 for z/OS support for XML
There are some limitations, for example:
> Only well-formed documents are allowed
 All tags must have end tags
 Elements must be nested correctly
 Attributes must have values (enclosed by “ or ‘)
 Tags are case sensitive </A> doesn’t end <a>
> An XML schema can optionally be applied with the
DSN_XMLValidate() function
 Providing you have defined a schema to DB2
 In database DNSXRS (part of the catalog)
11
DB2 9 for z/OS support for XML
> Documents are not stored as strings
 So not comparable with any string data type
> But are manipulated by various XML expressions and
functions
 Including an XML predicate “function”
12
DB2 9 for z/OS support for XML
> Storage of XML data is a little like LOB storage
> When you create a table with an XML column, you get
some other things as well

A hidden column called DB2_GENERATED_DOC_ID_FOR_XML

A unique index on this column

An table space to store the XML data

A table in the above table space

An XML index for the above table
> Luckily DB2 creates all of these things for us!
13
DB2 9 for z/OS support for XML
CREATE
TABLE GRAPH02.XML_TABLE2
( KEY_COULMN INTEGER NOT NULL,
XML_COLUMN XML NOT NULL);
> Wasn’t THAT easy!
> Note that there is no length specification
 Maximum XML size is the same as the max
LOB size
 Currently 2GB
14
DB2 9 for z/OS support for XML
> There are some things you can’t do with an XML column
 Sort
 Group
 Most predicates
 Primary, foreign or unique key
> Also, no host languages have XML data type manipulation
support
 Yet …….
 So XML data has to be manipulated as string data
15
Processing XML data
> Inserting data into an XML column is simply a matter of
issuing an INSERT statement
 Or a LOAD
> The XML statement MUST conform to DB2 standards
 And must be “well formed”
 Look out for SQLCODE -20398 which says you have an
error somewhere
– A byte offset IS given, but this is after DB2 has converted
the XML to UTF-8
– So may not exactly match where the error is
16
Processing XML data
> And you can optionally apply a schema too remember
 Bear in mind that there WILL be an overhead to applying a
schema
17
XML functions
> There are a number of functions for manipulating XML
data
> Be careful though, not all the functions starting “XML” are
for manipulating XML data
 Many are for CREATING XML data from relational data
18
XMLDOCUMENT()
> For creating XML documents from relational data
 or from parts of other XML documents
> At it’s simplest, it returns the same as a basic SELECT
from the table
 But can produce XML documents with all the necessary
headers
19
XMLSERIALIZE()
> Converts XML data into textual data
 Can include/exclude XML declarations
 Converts to LOB, BLOB, CLOB or DBCLOB
20
XPath
> Before we can talk about working with these XML data
types, we need to talk about XPath
> XPATH notation allows you to navigate the XML document
> You can use XPATH to return subsets of your documents
21
XPath
> There is not time here for an in-depth XPATH discussion
 But, for example ….
 DB2 needs to know, when we refer to a node name, which
specific one we mean
 /rss/channel/item/title
would allow us to work with the /title/ nodes in our XML
data
– In our case the <item> node is also a repeating node
22
XPath
> XPath can be used in SELECT lists
 Using XMLQUERY functions, for example
> In predicates
 Using the new XMLEXISTS predicate
 Returns TRUE or FALSE depending on XPath expression
23
Let’s start simple - XMLQUERY()
> Returns a portion of an XML document matching a query
> Also returns all the subsidiary nodes
SELECT KEY_COLUMN,
XMLSERIALIZE(XMLQUERY('/rss/channel/item
[title="Worlds Cheekiest Burglar Hunted On Facebook"]'
PASSING XML_COLUMN)
AS CLOB(2K))
FROM GRAPH02.XML_TABLE
> Could return
24
XMLQUERY()
> KEY_COLUMN followed by textual XML
1
<item><title>Worlds Cheekiest Burglar Hunted On
Facebook</title><link>http://news.sky.com/skynews/art
icle/0,,301001278207,00.html?f=rss</link><description>A
disgruntled homeowner has fallen victim to possibly
the cheekiest burglar in the world - and has now
turned to social networking website Facebook to track
him down.</description><enclosure url=""
length="123456" type="image/gif" height="45"
width="95"/></item>
> This is ONE of a repeating set of nodes from
one document
25
XMLQUERY()
> HOWEVER, if the XMLQUERY() returns <null>
 As it will if it can’t find the text in your document
> A row will still be returned for each row in the table
 KEY_COLUMN value and <null>
> We also need a way to specify predicates on the XML
data
26
A bit more complex - XMLEXISTS()
> XMLEXISTS() returns TRUE or FALSE depending on
whether an XPath expression finds a result
> So we expand our query into:
SELECT KEY_COLUMN,
XMLSERIALIZE(XMLQUERY('/rss/channel/item
[title="Worlds Cheekiest Burglar Hunted On Facebook"]'
PASSING XML_COLUMN)
AS CLOB(2K))
FROM GRAPH02.XML_TABLE
WHERE XMLEXISTS('/rss/channel/item
[title="Worlds Cheekiest Burglar Hunted On Facebook"]'
PASSING XML_COLUMN)
> Now, rows will only be returned where the XPath in
XMLEXISTS() finds data
27
Searching
> You can see though that the arguments to XMLQUERY
and XMLEXISTS have to be an EXACT match for the
content we are searching for
> What if we want to do a wildcarded sort of search
> XPath has no concept of “%” or “_”, but it does have a
series of functions that may help
> One useful one is contains
 Like this:
28
Searching
SELECT KEY_COLUMN,
XMLSERIALIZE(XMLQUERY('/rss/channel
[contains(title,“Facebook")]'
PASSING XML_COLUMN)
AS CLOB(8K))
FROM GRAPH02.XML_TABLE
WHERE XMLEXISTS('/rss/channel
[contains(title,“Facebook")]'
PASSING XML_COLUMN)
29
Searching
>This should be clear, but we are
 Wanting data returned that has
“Facebook” in the /rss/channel/title node
 ONLY for rows that contain “Facebook” in
an /rss/channel/title node
30
Not just SELECT
> Of course, we could also say something like
DELETE
FROM GRAPH02.XML_TABLE
WHERE XMLEXISTS('/rss/channel
[contains(title,“Favebook")]'
PASSING XML_COLUMN)
> Delete all the rows that contain “Facebook” in a title node
31
XML indexes
> Using XPath notation, you can create indexes
on your XML column
CREATE UNIQUE INDEX XML_INDEX
ON GRAPH02.XML_TABLE(XML_COLUMN)
GENERATE KEY USING XMLPATTERN
'/rss/channel/item/title'
AS SQL VARCHAR(128)
> Yes, this IS a unique index
 And it DOES constrain the content of the
node specified to VARCHAR(128)
32
XML indexes
> So you can also uses indexes to constrain the
CONTENT of nodes
> In the previous example, we said
GENERATE KEY USING XMLPATTERN
'/rss/channel/item/title'
AS SQL VARCHAR(128)
> Any attempt to insert a document with a /title/ longer
than 128 characters will fail
33
XML indexes
> Here’s something unusual
> I have two rows in my table
 SELECT COUNT(*) does indeed return 2
> So why does REBUILD INDEX say
DSNUCRUL - UNLOAD PHASE STATISTICS - NUMBER OF
RECORDS PROCESSED=34 ?




34
Because each row has MULTIPLE index keys!
Look again at the CREATE INDEX XPath statement
We’re indexing INTO an XML document
Each document (in this case) has 17 occurrences of
/rss/channel/item/title
More on XPath
> XPath arguments are case sensitive
> Be VERY careful about how you code them!
> Also, errors in XPath specifications can be hard to debug
 Syntax errors are easy DB2 tells you
 XML errors aren’t so simple
– Spelling errors
– Capitalization errors
– Etc.
35
More on XPath
> Why does this not return any data?
SELECT KEY_COLUMN,
XMLSERIALIZE(XMLQUERY('/rss/chanel/item
[title="ITV Profits Take A Dive In First Half"]'
PASSING XML_COLUMN)
AS CLOB(2K))
FROM GRAPH02.XML_TABLE
WHERE XMLEXISTS('/rss/chanel/item
[title="ITV Profits Take A Dive In First Half"]'
PASSING XML_COLUMN)
36
More on Xpath
> It’s because we spelled “channel” with one
“n”
 NO error is returned even though the node “chanel”
does not exist in the XML
> Just because no node of that name exists
TODAY, that does not mean one will not be
there tomorrow
37
Further thoughts on XML
> Firstly, remember that the XML data is effectively free
form
> What is in your XML column could be ANY valid XML data
 Each row in the table does not have to contain similar data
for the XML column
38
Further thoughts on XML
> My examples just happen to contain two almost
identical rows
 But I could add a third, very different, document
 From a business perspective, this would not be
sensible
 But DB2 would allow it
39
New Features by APAR
> PK51571, 51572 and 51573
XMLTABLE() and XMLCAST()
40
XMLTABLE()
> Turns a “repeating group” in an XML document into
rows in a “table”
SELECT X.*
FROM GRAPH02.XML_TABLE G,
XMLTABLE('/rss/channel/item' PASSING
G.XML_COLUMN
COLUMNS "SEQ" FOR ORDINALITY,
"TITLE"
AS X;
41
CHAR(64) PATH 'title')
XMLTABLE()
> Returns
1
Worlds Cheekiest Burglar Hunted On Facebook
2
Deadly Petrol Roller Skates Seized
3
Why Panda Poo Will Play A Part In Olympics
4
Surgeons Operate By Mobile Phone Light
5
Great White Shark Seen In Cornwall
6
Lightning Strike No Flash In The Pan For Survivor
7
Spooky Scamp Has Skill For Sniffing Death
42
XMLTABLE() in a View
CREATE VIEW XML_VIEW AS
SELECT X.*
FROM GRAPH02.XML_TABLE G,
XMLTABLE('/rss/channel/item' PASSING
G.XML_COLUMN
COLUMNS "SEQ" FOR ORDINALITY,
"TITLE"
AS X
43
CHAR(64) PATH 'title')
XMLTABLE() in a View
SELECT * FROM
XML_VIEW
WHERE TITLE LIKE '%Facebook%‘
> Now we can use wildcards and column names to access
our XML data!!
> Do be careful of performance though
 This will require materialisation of the data
 BEFORE the predicate can be applied
44
New Features by APAR
> PK55585 and PK55831 (still open)
13 new XPATH functions
e.g. fn.lower-case, fn.upper-case, fn.matches,
fn.position, fn.replace & fn.tokenize
> PK47594 and PK58766
XML Load performance improvement
45
Questions??
Bibliography
Bibliography
> Look out for
 GC18-9856
“DB2 Version 9.1 for z/OS – What’s New”
 SG24-7330
“DB2 9 for z/OS Technical Overview”
 SG24-7239
“Enhancing SAP by Using DB2 9 for z/OS ”
 SC18-9858
“DB2 Version 9.1 for z/OS – XML Guide”
> SG24-7315 “DB2 9 pureXML Guide” is for DB2 LUW
 NOT for z/OS
48
Descargar

Slide 1