XML Programming in .NET
Open XML Developer Workshop
Disclaimer
The information contained in this slide deck represents the current view of Microsoft Corporation on the issues discussed as of the date of
publication. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the
part of Microsoft, and Microsoft cannot guarantee the accuracy of any information presented after the date of publication.
This slide deck is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE
INFORMATION IN THIS DOCUMENT.
Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under copyright, no part of this slide
deck may be reproduced, stored in or introduced into a retrieval system, or transmitted in any form or by any means (electronic,
mechanical, photocopying, recording, or otherwise), or for any purpose, without the express written permission of Microsoft
Corporation.
Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this
slide deck. Except as expressly provided in any written license agreement from Microsoft, the furnishing of this slide deck does not give
you any license to these patents, trademarks, copyrights, or other intellectual property.
Unless otherwise noted, the example companies, organizations, products, domain names, e-mail addresses, logos, people, places and events
depicted herein are fictitious, and no association with any real company, organization, product, domain name, email address, logo,
person, place or event is intended or should be inferred.
© 2006 Microsoft Corporation. All rights reserved.
Microsoft, 2007 Microsoft Office System, .NET Framework 3.0, Visual Studio, and Windows Vista are either registered trademarks or
trademarks of Microsoft Corporation in the United States and/or other countries.
The names of actual companies and products mentioned herein may be the trademarks of their respective owners.
Overview
Approaches to working with Open XML in .NET
Introduction to XML in .NET
Reading and Writing XML
Working with XML Namespaces
Validating XML
Querying XML
Comparing Open XML Documents
Working with Open XML in .NET: Approaches
Option 1: Open XML SDK 2.0
Provides Strongly Typed .NET Classes and Objects for interacting with
Open XML document parts and content.
Designed to make coding against the XML content much easier than
the traditional W3C XML DOM programming model.
Offers validation of Open XML documents against different variations
of the Open XML format.
The recommended approach if you just want to dive straight in to
developing for Open XML documents.
Covered later in workshop modules ??, ?? and ??
Resources
Welcome to the Open XML SDK 2.0 for Microsoft Office
http://msdn.microsoft.com/en-us/library/bb448854%28office.14%29.aspx
Open XML 2.0 SDK Lab at http://www.openxmldeveloper.org
Working with Open XML in .NET: Approaches
Option 2: Using XML in .NET
A more portable approach, applicable to many platforms and
languages (e.g. Java).
Requires a more intimate knowledge of the Open XML document
format syntax.
Provides a better understanding of how things work "under the hood".
Using XML Reader/Writer (XML Streams), XML DOM.
Validation using XML Schema Definition files.
Covered here in this module
Widely used W3C standards for XML
XML 1.0
Namespaces
XSLT
XPath
XML Schema
Lesson: Reading and Writing XML
XML Streams
Working with the DOM
Working with XML Streams
Input (Stream, File, String) -> XmlReader -> Your Code Here -> XmlWriter -> Output (Stream, File, String)
XmlReader: Read(), ReadStartElement(), ReadElementContentAsInt()
XmlWriter: WriteStartElement(), WriteValue(), WriteComment()
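The stream based approach
Reading XML with an XmlReader
<order>
  <orderItem>
    <quantity>10</quantity>
    <unitPrice>34.99</unitPrice>
  </orderItem>
</order>
// Namespace URI of the order elements; empty because the sample above declares none
string xmlNamespace = "";
XmlReaderSettings settings = new XmlReaderSettings();
settings.IgnoreComments = true;
using (XmlReader reader = XmlReader.Create("sample.xml", settings))
{
    while (reader.Read())
    {
        if (reader.IsStartElement() && reader.LocalName == "orderItem")
        {
            reader.ReadToDescendant("quantity", xmlNamespace);
            int quantity = reader.ReadElementContentAsInt();
            // ReadElementContentAsInt() moves past </quantity>; IsStartElement() skips
            // whitespace, so only search further if the reader is not already on <unitPrice>
            if (!(reader.IsStartElement() && reader.LocalName == "unitPrice"))
                reader.ReadToNextSibling("unitPrice", xmlNamespace);
            decimal unitPrice = reader.ReadElementContentAsDecimal();
            Console.WriteLine("Total: " + quantity * unitPrice);
        }
    }
}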
The stream based approach
Writing XML with an XmlWriter
// Namespace URI of the order elements; empty so the output matches the XML shown after the code
string xmlNamespace = "";
XmlWriterSettings settings = new XmlWriterSettings();
settings.CloseOutput = true;
settings.Indent = true;
using (XmlWriter writer = XmlWriter.Create(Console.Out, settings))
{
    writer.WriteStartDocument(true); // standalone
    writer.WriteStartElement("order", xmlNamespace);
    writer.WriteStartElement("orderItem", xmlNamespace);
    writer.WriteElementString("quantity", xmlNamespace,
        XmlConvert.ToString(10));
    // XmlConvert is culture-invariant; Convert.ToString(34.99) could emit "34,99"
    writer.WriteElementString("unitPrice", xmlNamespace,
        XmlConvert.ToString(34.99m));
    writer.WriteEndElement();
    writer.WriteEndElement();
}
<order>
  <orderItem>
    <quantity>10</quantity>
    <unitPrice>34.99</unitPrice>
  </orderItem>
</order>
See Demos\01-XML Programming\Demo 1 – Stream based XML
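As a minimal round-trip sketch of the two streaming slides, the code below writes the sample order into an in-memory string with an XmlWriter and reads it back with an XmlReader. The class and method names (StreamRoundTrip, WriteOrder, ReadTotal) are illustrative assumptions and are not taken from the workshop demos; no XML namespace is used, matching the sample.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class StreamRoundTrip
{
    // Writes a one-item order to a string using XmlWriter (hypothetical helper).
    public static string WriteOrder(int quantity, decimal unitPrice)
    {
        var settings = new XmlWriterSettings { OmitXmlDeclaration = true };
        var buffer = new StringBuilder();
        using (XmlWriter writer = XmlWriter.Create(buffer, settings))
        {
            writer.WriteStartElement("order");
            writer.WriteStartElement("orderItem");
            // XmlConvert produces culture-invariant XML lexical forms
            writer.WriteElementString("quantity", XmlConvert.ToString(quantity));
            writer.WriteElementString("unitPrice", XmlConvert.ToString(unitPrice));
            writer.WriteEndElement(); // orderItem
            writer.WriteEndElement(); // order
        }
        return buffer.ToString();
    }

    // Reads the first order item back and returns quantity * unitPrice.
    public static decimal ReadTotal(string xml)
    {
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            reader.ReadToFollowing("quantity");
            int quantity = reader.ReadElementContentAsInt();
            // The typed read leaves the reader on the node after </quantity>,
            // which may already be <unitPrice> when there is no whitespace.
            if (!(reader.NodeType == XmlNodeType.Element && reader.LocalName == "unitPrice"))
                reader.ReadToFollowing("unitPrice");
            decimal unitPrice = reader.ReadElementContentAsDecimal();
            return quantity * unitPrice;
        }
    }

    public static void Main()
    {
        string xml = WriteOrder(10, 34.99m);
        Console.WriteLine(xml);
        Console.WriteLine("Total: " + XmlConvert.ToString(ReadTotal(xml)));
    }
}
```

Writing to a StringBuilder instead of Console.Out keeps the sketch testable; the same XmlWriter calls apply to a file or stream target.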
Working with the DOM
DOM is the W3C programming interface for XML
DOM models an XML source as an in-memory tree of nodes
You can use the DOM to
Navigate and search
Modify content
<city>
  <name lcid="en-US">
    Seattle
  </name>
</city>
Node tree for the sample: DocumentNode > ElementNode <city> > ElementNode <name> > AttributeNode lcid="en-US" and TextCharacterData Seattle
Working with the DOM
Loading from and saving to an XML source
The XmlDocument class is the core structure
For data store            Use this method
String                    LoadXml()
Stream                    Load() and Save()
File                      Load() and Save()
XmlReader / XmlWriter     Load() and Save()
XmlDocument document = new XmlDocument();
using (FileStream fs =
new FileStream("sample.xml", FileMode.Open, FileAccess.Read))
{
document.Load(fs);
}
// save using a streaming writer with indentation
using (XmlTextWriter writer = new XmlTextWriter(Console.Out))
{
writer.Formatting = Formatting.Indented;
document.Save(writer);
}
Working with the DOM
Navigating through XML
The XmlNode class serves as the base class for the various DOM node types
XmlElement and XmlAttribute are examples
<?xml version="1.0"?>
<book isbn="123456789">
  <title>XML.NET</title>
  <price>19.99</price>
</book>
// doc.FirstChild would be the XML declaration, so start from DocumentElement
XmlElement book = doc.DocumentElement;
XmlNode priceNode = book.ChildNodes[1];
// GetAttributeNode() is defined on XmlElement, not on XmlNode
XmlAttribute isbnNode = book.GetAttributeNode("isbn");
string price = priceNode.FirstChild.Value;
string isbn = isbnNode.Value;
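A self-contained variant of the navigation slide, offered as a sketch: it loads the book sample from a string and reads the attribute and child elements by name rather than by position. The DomNavigation class and ReadBook method are illustrative names, not part of the workshop material.

```csharp
using System;
using System.Xml;

public static class DomNavigation
{
    // Returns { isbn, title, price } for a <book> document (hypothetical helper).
    public static string[] ReadBook(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        // DocumentElement is the root element, skipping the XML declaration
        XmlElement book = doc.DocumentElement;

        string isbn = book.GetAttribute("isbn");
        // The string indexer returns the first child element with that name
        string title = book["title"].InnerText;
        string price = book["price"].InnerText;
        return new[] { isbn, title, price };
    }

    public static void Main()
    {
        string xml = "<?xml version=\"1.0\"?>" +
                     "<book isbn=\"123456789\"><title>XML.NET</title><price>19.99</price></book>";
        Console.WriteLine(string.Join(" | ", ReadBook(xml)));
    }
}
```

Looking children up by name is less fragile than ChildNodes[1], which depends on element order and on whether whitespace nodes are preserved.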
Working with the DOM
Manipulating XML
The XmlDocument class serves as a factory for new XML elements
// first create the nodes
XmlDocument doc = new XmlDocument();
XmlElement bookNode = doc.CreateElement("book");
XmlNode priceNode = doc.CreateElement("price");
// set the element text (setting Value on an element throws InvalidOperationException)
priceNode.InnerText = "19.99";
// set an attribute
bookNode.SetAttribute("title", "XML is Cool");
// add the nodes together
bookNode.AppendChild(priceNode);
doc.AppendChild(bookNode);
<book title="XML is Cool">
  <price>19.99</price>
</book>
Working with the DOM
Manipulating XML
The XmlDocument allows creation of all DOM node types
XmlComment comment = doc.CreateComment("Some comment");
XmlProcessingInstruction pi = doc.CreateProcessingInstruction(
    "xml-stylesheet", "type='text/xsl' href='style.xsl'");
To remove nodes, use the parent XmlNode
bookNode.RemoveChild(priceNode);
bookNode.RemoveAttribute("title");
Importing XML from another XmlDocument
XmlDocument other; // assumed to be loaded elsewhere
// ImportNode returns a deep copy owned by doc; it is not attached until appended
XmlNode node = doc.ImportNode(other.DocumentElement, true);
bookNode.AppendChild(node);
See Demos\01-XML Programming\Demo 2 – DOM based XML
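The create, import and remove operations from the two manipulation slides can be combined into one runnable sketch. The DomManipulation class, the Build method and the <author> element are illustrative assumptions; only the XmlDocument calls themselves come from the slides.

```csharp
using System;
using System.Xml;

public static class DomManipulation
{
    // Builds a small document, imports a node from a second document,
    // removes the price element again, and returns the resulting XML.
    public static string Build()
    {
        var doc = new XmlDocument();
        XmlElement book = doc.CreateElement("book");
        book.SetAttribute("title", "XML is Cool");

        XmlElement price = doc.CreateElement("price");
        price.InnerText = "19.99";          // element text goes through InnerText
        book.AppendChild(price);
        doc.AppendChild(book);

        // A second document to import from (hypothetical content)
        var other = new XmlDocument();
        other.LoadXml("<author>Jane</author>");

        // Nodes belong to the document that created them, so copy first
        XmlNode imported = doc.ImportNode(other.DocumentElement, true);
        book.AppendChild(imported);

        // Removal always goes through the parent node
        book.RemoveChild(price);
        return doc.OuterXml;
    }

    public static void Main()
    {
        Console.WriteLine(Build());
    }
}
```

Appending other.DocumentElement directly, without ImportNode, fails because the node belongs to a different document.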
Lesson: Working with XML namespaces
Basics of XML Namespaces
The XmlNamespaceManager class
Basics of XML namespaces
Many documents use the same element names
“order” for instance
To differentiate the elements you apply namespaces
"order" in the namespace http://mycompany.com/myOrderSchema
becomes
"http://mycompany.com/myOrderSchema#order"
Prefixes are used for abbreviation
xmlns:myO="http://mycompany.com/myOrderSchema"
<p:presentation xmlns:p="http://schemas.openxml.../presentationml/2006/main">
<p:sldMasterIdLst />
</p:presentation>
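To illustrate that an element is identified by its namespace URI plus local name, and that the prefix is only an abbreviation, here is a small sketch. The NamespaceBasics class and Describe method are illustrative names; the namespace URI is the one used on the slide.

```csharp
using System;
using System.Xml;

public static class NamespaceBasics
{
    // Returns "{namespaceURI}localName" for the root element (hypothetical helper).
    public static string Describe(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        XmlElement root = doc.DocumentElement;
        return "{" + root.NamespaceURI + "}" + root.LocalName;
    }

    public static void Main()
    {
        // Same element, written with a prefix and with a default namespace
        Console.WriteLine(Describe("<myO:order xmlns:myO='http://mycompany.com/myOrderSchema'/>"));
        Console.WriteLine(Describe("<order xmlns='http://mycompany.com/myOrderSchema'/>"));
        // An unqualified element has an empty namespace URI
        Console.WriteLine(Describe("<order/>"));
    }
}
```

The first two calls describe the same qualified name even though one document uses a prefix and the other a default namespace; the third is a different element.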
The XmlNamespaceManager class
The XmlNamespaceManager stores namespaces and
prefixes
The prefixes in the source document do not have to match
the ones registered in the namespace manager; only the namespace URIs must match
Create and use a namespace manager with all queries
which use XML namespaces
XmlDocument document;
XmlNamespaceManager mgr = new XmlNamespaceManager(document.NameTable);
mgr.AddNamespace("myPf", "http://xmlnamespaceInDocument");
document.SelectSingleNode("//myPf:someElement", mgr);
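A runnable sketch of the namespace manager in use, assuming a small order document: the query prefix myPf differs from the prefix used in the document, and the match still succeeds because both map to the same URI. NamespaceQuery and FindQuantity are illustrative names.

```csharp
using System;
using System.Xml;

public static class NamespaceQuery
{
    const string OrderNs = "http://mycompany.com/myOrderSchema";

    // Returns the text of the first quantity element in OrderNs, or null if none.
    public static string FindQuantity(string xml)
    {
        var document = new XmlDocument();
        document.LoadXml(xml);

        var mgr = new XmlNamespaceManager(document.NameTable);
        mgr.AddNamespace("myPf", OrderNs);   // query prefix, independent of the document

        XmlNode node = document.SelectSingleNode("//myPf:quantity", mgr);
        return node == null ? null : node.InnerText;
    }

    public static void Main()
    {
        string prefixed = "<o:order xmlns:o='" + OrderNs + "'><o:quantity>10</o:quantity></o:order>";
        string defaulted = "<order xmlns='" + OrderNs + "'><quantity>10</quantity></order>";
        Console.WriteLine(FindQuantity(prefixed));
        Console.WriteLine(FindQuantity(defaulted));
    }
}
```

An XPath of "//quantity" without a prefix would find nothing in either document, because unprefixed names in XPath 1.0 always refer to elements in no namespace.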
Lesson: Validating XML
Overview of Validation Technologies
The Schema Object Model
Validating with the XmlReader
Validating an XmlDocument
The XmlSchemaValidator
Overview of Validation Technologies
Various XML validation technologies exist today:
Document Type Definitions (DTDs)
XML-Data Reduced (XDR) schemas
XML Schema Definition language (XSD)
Regular Language for XML Next Generation (RELAX NG)
XML-Schema is the standard validation technology today
Supported through the Schema Object Model
The Schema Object Model
Represented by the XmlSchema class
There are two collection classes for XmlSchema objects
The XmlSchemaCollection, which is now obsolete
The XmlSchemaSet class is the new/improved approach
The XmlSchemaSet
Improved standards compliance and performance
Forms one big ‘logical’ schema from all contained items
Support for duplicate target namespaces
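As a sketch of the XmlSchemaSet points above, the code below adds two schemas that share one target namespace and compiles them into a single logical schema. The schema texts, the urn:example:orders namespace, and the SchemaSetDemo names are assumptions made for illustration.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class SchemaSetDemo
{
    public const string OrderXsd =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema' targetNamespace='urn:example:orders'>" +
        "<xs:element name='order' type='xs:string'/></xs:schema>";

    public const string InvoiceXsd =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema' targetNamespace='urn:example:orders'>" +
        "<xs:element name='invoice' type='xs:string'/></xs:schema>";

    // Adds every schema text to one set, compiles it, and counts global elements.
    public static int CountGlobalElements(params string[] schemaTexts)
    {
        var set = new XmlSchemaSet();
        foreach (string text in schemaTexts)
        {
            using (XmlReader reader = XmlReader.Create(new StringReader(text)))
            {
                // null means: use the targetNamespace declared in the schema itself
                set.Add(null, reader);
            }
        }
        set.Compile();   // global components are populated after compilation
        return set.GlobalElements.Count;
    }

    public static void Main()
    {
        Console.WriteLine(CountGlobalElements(OrderXsd, InvoiceXsd));
    }
}
```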
Validating with the XmlReader
Validation is performed implicitly while reading, once the settings passed to the
XmlReader.Create factory method specify ValidationType.Schema
// Create the XML Schema
XmlSchema schema = null;
using (XmlReader reader = XmlReader.Create("sample.xsd"))
{
schema = XmlSchema.Read(reader, null);
}
// Create the settings for the XmlReader
XmlReaderSettings settings = new XmlReaderSettings();
settings.ValidationType = ValidationType.Schema;
// add it to the XmlSchemaSet of the settings class
settings.Schemas.Add(schema);
// Now read with the reader which also performs validation
using (XmlReader reader = XmlReader.Create("sample.xml", settings))
{
while (reader.Read()) ;
}
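The reader-based validation above throws an XmlSchemaValidationException on the first error because no handler is registered. A hedged variant that collects all errors through a ValidationEventHandler could look like this; the inline schema and the ReaderValidation names are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class ReaderValidation
{
    // Illustrative schema for the order sample used earlier in the module
    public const string Xsd = @"
<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>
  <xs:element name='order'>
    <xs:complexType><xs:sequence>
      <xs:element name='orderItem' maxOccurs='unbounded'>
        <xs:complexType><xs:sequence>
          <xs:element name='quantity' type='xs:int'/>
          <xs:element name='unitPrice' type='xs:decimal'/>
        </xs:sequence></xs:complexType>
      </xs:element>
    </xs:sequence></xs:complexType>
  </xs:element>
</xs:schema>";

    // Reads the whole document and returns the validation messages.
    public static List<string> Validate(string xml)
    {
        var errors = new List<string>();
        var settings = new XmlReaderSettings();
        settings.ValidationType = ValidationType.Schema;
        using (XmlReader schemaReader = XmlReader.Create(new StringReader(Xsd)))
        {
            settings.Schemas.Add(null, schemaReader);
        }
        // With a handler attached, errors are reported instead of thrown
        settings.ValidationEventHandler +=
            (sender, e) => errors.Add(e.Severity + ": " + e.Message);

        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read()) { }   // validation happens as nodes are read
        }
        return errors;
    }

    public static void Main()
    {
        string bad = "<order><orderItem><quantity>ten</quantity><unitPrice>34.99</unitPrice></orderItem></order>";
        foreach (string message in Validate(bad))
            Console.WriteLine(message);
    }
}
```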
Validating an XmlDocument
The XmlDocument acts as a container for XML schemas
// Create the XMLSchema object
XmlSchema schema = null;
using (XmlReader reader = XmlReader.Create("sample.xsd"))
{
schema = XmlSchema.Read(reader, null);
}
// Load a document and attach the schema
XmlDocument document = new XmlDocument();
document.Load("sample.xml");
document.Schemas.Add(schema);
// Validate it using the attached schemas
document.Validate(OnValidateSchema);
See Demos\01-XML Programming\Demo 3 – Validating Documents
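The slide passes OnValidateSchema to Validate() without showing it. One possible shape for such a handler, as a sketch only: it records each problem in a list. The schema text, the Problems list and the DocumentValidation class are assumptions, not the demo's actual code.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class DocumentValidation
{
    const string Xsd =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>" +
        "<xs:element name='price' type='xs:decimal'/></xs:schema>";

    static readonly List<string> Problems = new List<string>();

    // Matches the ValidationEventHandler delegate expected by XmlDocument.Validate
    static void OnValidateSchema(object sender, ValidationEventArgs e)
    {
        Problems.Add(e.Severity + ": " + e.Message);
    }

    // Loads the XML, attaches the schema, validates, and returns the problem count.
    public static int CountProblems(string xml)
    {
        Problems.Clear();
        var document = new XmlDocument();
        document.LoadXml(xml);
        using (XmlReader reader = XmlReader.Create(new StringReader(Xsd)))
        {
            document.Schemas.Add(null, reader);
        }
        document.Validate(OnValidateSchema);
        return Problems.Count;
    }

    public static void Main()
    {
        Console.WriteLine(CountProblems("<price>19.99</price>"));
        Console.WriteLine(CountProblems("<price>cheap</price>"));
    }
}
```

Unlike reader-based validation, this checks a document that is already in memory, so it can be repeated after the DOM has been modified.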
Open XML schemas: best practice
The Ecma spec contains 89 separate schemas
There are circular references between those schemas
XmlSchemaSet can handle circular references
BUT … only when they’re all in one Add() method call
The solution: use the provided all.xsd file, which imports
all of the Open XML schemas into a single schema
HOL Solution\Solution\Schemas\all.xsd
Advanced Validation with Open XML SDK 2.0
Open XML SDK 2.0 supports semantic validation
Use the OpenXmlValidator Class
Gives detailed error messages with semantic meaning.
E.g. Element 'DocumentFormat.OpenXml.Wordprocessing.Footnote' referenced by '…' does not exist in part '/word/footnotes.xml'. The reference value is '3'.
Lesson: Querying XML
The Query Process
Querying the DOM
Working with XML Namespaces
Querying with LINQ to XML
The Query Process
1. Load XPathDocument
2. Create XPathNavigator: XPathDocument.CreateNavigator( )
3. Create the Query: an XPathExpression for XPathNavigator.Select( ) or Evaluate( )
4. Compile XPathExpression: Compile( )
5. Check the ReturnType: XPathExpression.ReturnType is Boolean, String, Double or NodeSet
A NodeSet is returned as an XPathNodeIterator; step through it with MoveNext( ) or For Each
XPathNavigator movement: MoveToRoot( ), MoveToNext( ), MoveToPrevious( ), MoveToFirstChild( ), MoveToParent( )
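The five steps of the query process can be sketched in code as follows. The QueryProcess class, the Run method and the sample order document are illustrative assumptions; the XPath classes and members are the ones named in the steps.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.XPath;

public static class QueryProcess
{
    public const string Sample =
        "<order>" +
        "<orderItem><quantity>10</quantity><unitPrice>34.99</unitPrice></orderItem>" +
        "<orderItem><quantity>3</quantity><unitPrice>5.00</unitPrice></orderItem>" +
        "</order>";

    // Runs an XPath query and formats the result according to its return type.
    public static string Run(string xml, string xpath)
    {
        var document = new XPathDocument(new StringReader(xml));   // 1. load
        XPathNavigator navigator = document.CreateNavigator();     // 2. navigator
        XPathExpression expression = navigator.Compile(xpath);     // 3. + 4. create and compile

        switch (expression.ReturnType)                             // 5. check the return type
        {
            case XPathResultType.NodeSet:
            {
                var values = new List<string>();
                XPathNodeIterator iterator = navigator.Select(expression);
                while (iterator.MoveNext())
                    values.Add(iterator.Current.Value);
                return "NodeSet: " + string.Join(",", values.ToArray());
            }
            case XPathResultType.Number:
                return "Number: " + XmlConvert.ToString((double)navigator.Evaluate(expression));
            case XPathResultType.Boolean:
                return "Boolean: " + ((bool)navigator.Evaluate(expression) ? "true" : "false");
            case XPathResultType.String:
                return "String: " + (string)navigator.Evaluate(expression);
            default:
                return "Other";
        }
    }

    public static void Main()
    {
        Console.WriteLine(Run(Sample, "/order/orderItem/quantity"));
        Console.WriteLine(Run(Sample, "count(/order/orderItem)"));
    }
}
```

Checking ReturnType first decides between Select(), which yields an iterator, and Evaluate(), which yields a boxed double, bool or string.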
Querying the DOM
The XmlDocument has support for XPath in various places
Create a new XPathNavigator using the CreateNavigator() method
Use the XmlDocument SelectNodes(xpath) or
SelectSingleNode(xpath)
Use an XPathDocument for optimized queries
XmlNode node = document.SelectSingleNode("/order/orderItems");
XPathNavigator navigator = node.CreateNavigator();
foreach (XPathNavigator itemNavigator in
navigator.Select("orderItem"))
{
}
Automatic prefix mapping
All documents contain their own prefix mappings; the
System.Xml 2.0+ library allows you to use these in your
queries
Pass in an IXmlNamespaceResolver when using the
XPathNavigator
The XML namespace needs to be in scope (i.e., you must
have read past it in the XPathNavigator):
XPathNavigator navigator = document.CreateNavigator();
navigator.MoveToChild(XPathNodeType.Element);
foreach (XPathNavigator itemNavigator in
navigator.Select("o:orderItem", navigator))
{
}
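A runnable sketch of automatic prefix mapping, under the assumption of a small order document that declares the prefix o on its root element. The navigator itself is passed as the IXmlNamespaceResolver, so the query reuses the document's own prefix. PrefixMapping, CountItems and the urn:example:orders namespace are illustrative.

```csharp
using System;
using System.IO;
using System.Xml.XPath;

public static class PrefixMapping
{
    public const string Sample =
        "<o:order xmlns:o='urn:example:orders'>" +
        "<o:orderItem/><o:orderItem/>" +
        "</o:order>";

    // Counts o:orderItem children of the root, resolving "o" from the document.
    public static int CountItems(string xml)
    {
        var document = new XPathDocument(new StringReader(xml));
        XPathNavigator navigator = document.CreateNavigator();

        // Move onto the root element so that its xmlns:o declaration is in scope
        navigator.MoveToChild(XPathNodeType.Element);

        // XPathNavigator implements IXmlNamespaceResolver
        XPathNodeIterator items = navigator.Select("o:orderItem", navigator);
        return items.Count;
    }

    public static void Main()
    {
        Console.WriteLine(CountItems(Sample));
    }
}
```

This only works while the query prefix matches the one the document happens to use, which is why registering prefixes explicitly is usually the more robust choice.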
Best practice: use IXmlNamespaceResolver
Sample syntax (from demo lab):
XPathNavigator navigator = document.CreateNavigator();
XmlNamespaceManager namespaceManager = new
XmlNamespaceManager(navigator.NameTable);
namespaceManager.AddNamespace("w",
"http://schemas.openxmlformats.org/wordprocessingml/2006/main");
XPathExpression searchPhrase =
XPathExpression.Compile(@"//w:p[w:r/w:t[contains(text(),""" +
toolStripTextBox1.Text + @""")]]", namespaceManager);
See Demos\01-XML Programming\Demo 4 – Querying Documents\XPath
Querying with LINQ to XML
An improved XML DOM programming interface which
utilizes .NET 3.5’s Language-Integrated Query (LINQ)
Framework for querying.
Uses XLINQ classes (XDocument, XNode, XElement) and
returns query results as an IEnumerable collection.
XDocument document = XDocument.Load(mainPart.GetStream());
XNamespace w = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
var searchQuery = document.Descendants(w + "t")
    .Where(elem => elem.Value
        .IndexOf(toolStripTextBox1.Text,
            StringComparison.CurrentCultureIgnoreCase) >= 0);
See Demos\01-XML Programming\Demo 4 – Querying Documents\LINQToXML
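A self-contained sketch of the LINQ to XML query, assuming a tiny WordprocessingML fragment supplied as a string rather than a document part from a package. LinqSearch, FindRuns and the sample text are illustrative names and data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class LinqSearch
{
    static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    public const string Sample =
        "<w:document xmlns:w='http://schemas.openxmlformats.org/wordprocessingml/2006/main'>" +
        "<w:body>" +
        "<w:p><w:r><w:t>Hello World</w:t></w:r></w:p>" +
        "<w:p><w:r><w:t>Goodbye</w:t></w:r></w:p>" +
        "</w:body></w:document>";

    // Returns the text of every w:t element that contains the phrase (case-insensitive).
    public static List<string> FindRuns(string xml, string phrase)
    {
        XDocument document = XDocument.Parse(xml);
        return document.Descendants(W + "t")
            .Where(t => t.Value.IndexOf(phrase, StringComparison.OrdinalIgnoreCase) >= 0)
            .Select(t => t.Value)
            .ToList();
    }

    public static void Main()
    {
        foreach (string text in FindRuns(Sample, "hello"))
            Console.WriteLine(text);
    }
}
```

No namespace manager is needed here: the XNamespace value combined with a local name forms the qualified XName directly.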
Lesson: Comparing Open XML Documents
Comparing Open XML Documents.
Comparing using the Productivity Tool.
Comparing Open XML Documents
Often the desired look for a document is known, but it
may be difficult to determine which XML tags/attributes
to tweak in order to achieve it.
Make the changes within Office 2010 itself and use a
comparison with the original document to see the
changes to the XML that were made.
Comparing using the Productivity Tool
Available as part of the Open XML SDK 2.0 Productivity Tool.
Highlights additions and deletions between the XML content of the two Open XML documents.
Can also generate the SDK .NET code to turn the first
document into the second document.
Review
Introduction to XML in .NET
Reading and Writing XML
Working with XML Namespaces
Validating XML
Querying XML (XPath and LINQ)
Comparing Open XML documents.
Go through LAB 01 – XML Programming in .NET
Optional: demo labs 1-4