Office Open XML Architecture
A developer’s introduction to the file formats
Open XML Developer Workshop
Disclaimer
The information contained in this slide deck represents the current view of Microsoft Corporation on the issues discussed as of the date of
publication. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the
part of Microsoft, and Microsoft cannot guarantee the accuracy of any information presented after the date of publication.
This slide deck is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE
INFORMATION IN THIS DOCUMENT.
Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under copyright, no part of this slide
deck may be reproduced, stored in or introduced into a retrieval system, or transmitted in any form or by any means (electronic,
mechanical, photocopying, recording, or otherwise), or for any purpose, without the express written permission of Microsoft
Corporation.
Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this
slide deck. Except as expressly provided in any written license agreement from Microsoft, the furnishing of this slide deck does not give
you any license to these patents, trademarks, copyrights, or other intellectual property.
Unless otherwise noted, the example companies, organizations, products, domain names, e-mail addresses, logos, people, places and events
depicted herein are fictitious, and no association with any real company, organization, product, domain name, email address, logo,
person, place or event is intended or should be inferred.
© 2006 Microsoft Corporation. All rights reserved.
Microsoft, 2007 Microsoft Office System, .NET Framework 3.0, Visual Studio, and Windows Vista are either registered trademarks or
trademarks of Microsoft Corporation in the United States and/or other countries.
The names of actual companies and products mentioned herein may be the trademarks of their respective owners.
Objectives
• In this module, we will learn about the architecture of
the Office Open XML formats.
• Primary focus is on concepts that apply to all three main
document types.
• Details specific to word processing documents,
spreadsheets, or presentations will be covered in
separate modules for each of those document types.
Evolution of Document Authoring
Old approach: linear, static
• Temporary electronic document, permanent paper document
• Face-to-face collaboration using paper documents; requires physical presence
• Binary formats optimized for the high cost of storage and bandwidth; proprietary
New approach: dynamic, interactive
• Permanent electronic document, temporary paper document
• Digital collaboration using electronic documents; participants in many locations
• XML-based formats optimized for flexibility, reusability, and maintainability; open standards
[Figure: document workflow steps: Generate, Create, Print, Archive, Receive, Edit, Send]
Document Formats and Applications
• Formats describe information
  • Define content appearance
  • Structure content for business processes
  • Enable machines (software) to use information
• Software applications use information
  • Provide functionality for authoring, organizing, developing, representing, evaluating, reviewing, collaborating, validating, calculating, protecting, and printing information
• Formats can influence application design, and vice versa
  • Open Document Format (ODF) and OpenOffice functionality
  • Office Open XML and Microsoft Office functionality
Features of Office Open XML
Open
• Open XML Formats standardization: Ecma, ISO/IEC in process
• XML-based formats for predictable long-term interoperability
• Royalty-free licenses enable broad access to technology
Forward-looking
• ZIP compression of the format reduces file sizes
• Segmented storage improves data recovery and programmatic access
• Full accessibility support
Compatible
• Full support for Microsoft Office functionality
• Compatibility with Office 2000, XP, and 2003
• Bulk document conversion tools available
Demo: Open XML In Action
1. Generate Document
2. Download
3. Edit Document
4. Upload
5. Publish to Web
6. View in Browser
Desktop (Windows OS): Word 2007, IE
Server (VM, Linux OS): Web Server, Tomcat JSP, Application, DB
Levels of Interoperability
Reference and Custom-defined Schemas
XML Reference Schemas
Display-oriented
(Bold, Italics, Tables,
Paragraphs, Styles,…)
Document Format
Enable Archival and File
Formats Interoperability
Custom-defined Schemas
Data-oriented
(e.g.: Price, Invoice)
business information
Enable System Integration
Levels of Interoperability
Technical Interoperability
XML Reference Schemas
Display-oriented
(for example, Bold, Italics,
Tables, Paragraphs, Styles)
Document Format
Enable Archival and File
Formats Interoperability
<w:p>
  <w:r>
    <w:rPr><w:b /></w:rPr>
    <w:t>John Doe</w:t>
  </w:r>
  <w:r>
    <w:rPr><w:i /></w:rPr>
    <w:t>Health Agency</w:t>
  </w:r>
</w:p>
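The display-oriented markup above can be read by any XML parser, which is the point of technical interoperability. The following is a minimal sketch, assuming Python's standard library and the WordprocessingML namespace from the published Ecma standard; the function name `describe_runs` is made up for illustration, and the namespace declaration is added so the slide's fragment parses on its own.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace (declared here so the fragment is standalone).
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

# The paragraph from the slide: one bold run and one italic run.
PARAGRAPH = f"""
<w:p xmlns:w="{W}">
  <w:r><w:rPr><w:b/></w:rPr><w:t>John Doe</w:t></w:r>
  <w:r><w:rPr><w:i/></w:rPr><w:t>Health Agency</w:t></w:r>
</w:p>
"""

def describe_runs(paragraph_xml):
    """Return (text, is_bold, is_italic) for each run of a w:p element.

    Simplification: the mere presence of w:b or w:i is treated as "on";
    an explicit w:val attribute that switches the property off is ignored.
    """
    paragraph = ET.fromstring(paragraph_xml.strip())
    runs = []
    for run in paragraph.findall("w:r", NS):
        text = "".join(t.text or "" for t in run.findall("w:t", NS))
        bold = run.find("w:rPr/w:b", NS) is not None
        italic = run.find("w:rPr/w:i", NS) is not None
        runs.append((text, bold, italic))
    return runs

print(describe_runs(PARAGRAPH))
```

The parser recovers how the text looks (bold, italic) but nothing about what it means, which is the gap that custom-defined schemas address.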
Levels of Interoperability
Semantic interoperability
<ConferenceReport>
  <Date>3/24/2004</Date>
  <Attendees>
    <Attendee Name="John Doe">
      <Department>
        Health Agency
      </Department>
      <Potential>
        <Sales>100</Sales>
        <Growth>25%</Growth>
        …
    </Attendee>
Custom-defined Schemas
Data-oriented
(for example, Price, Invoice)
business information
Enable System Integration
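With data-oriented markup, a back-end system can pull business values directly by element name. The sketch below assumes Python's standard library; the sample is a hypothetical completion of the slide's fragment (closing tags are supplied only so it parses, and the elided content is not reconstructed), and `attendee_records` is an illustrative name.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed version of the slide's fragment. The closing tags
# are an assumption added for parsing; the slide elides the remaining content.
REPORT = """
<ConferenceReport>
  <Date>3/24/2004</Date>
  <Attendees>
    <Attendee Name="John Doe">
      <Department>Health Agency</Department>
      <Potential>
        <Sales>100</Sales>
        <Growth>25%</Growth>
      </Potential>
    </Attendee>
  </Attendees>
</ConferenceReport>
"""

def attendee_records(report_xml):
    """Extract one dictionary of business data per Attendee element."""
    root = ET.fromstring(report_xml.strip())
    records = []
    for attendee in root.findall("Attendees/Attendee"):
        records.append({
            "name": attendee.get("Name"),
            "department": (attendee.findtext("Department") or "").strip(),
            "sales": int(attendee.findtext("Potential/Sales")),
            "growth": attendee.findtext("Potential/Growth"),
        })
    return records

print(attendee_records(REPORT))
```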
Word: a 24-year evolution
Versions
Multi-Tool Word 1983 (Xenix)
Word 1.0 1983 (DOS)
Word for Mac 1985 (Mac)
Word 3.0 1987
Word for Windows 1.0 1989 (Windows)
Word 5.5 1991
Word 5.1 1992
Word 5.1 1992 (UNIX)
Word for OS/2 1992
Word 6.0 1993
Office 97
Office 2000
Word v.X 2001 (OS X)
Office XP
Office 2003
Office 2007 (Word 12)
File formats
.DOC
.RTF 1990 (by DEC 1987)
XML 2003
XML in Office: a 10-year evolution
Office 97: Binary formats
Office 2000: XML Document Properties
Office XP: Spreadsheet XML
Office 2003: WordprocessingML, SpreadsheetML, Custom schemas
2007 Office system: PresentationML, XML-based formats
User View of Open XML Files
Single file
Compact
Corruption resistant
Segmented architecture
Corruption of any part would not prohibit opening
Separation of macro-enabled content
Macro-enabled extensions end with “m” instead of “x” (e.g., .docm)
VBA, Excel Macro-Sheets, PowerPoint Action Commands
Enforced at runtime by 2007 Office programs
Programmer View of Open XML Files
ZIP Archive
Document Parts
XML Parts
Binary Parts
Typed (RFC 2616)
Relationships
Connections between parts
Content Type Stream
A specially-named stream
Defines mappings from part names to content types
Not itself a part, not URI addressable
Folder structure for convenience only
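The mapping from part names to content types can be illustrated with a small lookup. This is a sketch under stated assumptions: it uses Python's standard library, the sample stream is a made-up minimal `[Content_Types].xml`, and `content_type_of` is a hypothetical helper that checks Override entries (by part name) before Default entries (by extension).

```python
import xml.etree.ElementTree as ET

CT_NS = {"ct": "http://schemas.openxmlformats.org/package/2006/content-types"}

# Illustrative content type stream for a small word processing package.
CONTENT_TYPES = """<?xml version="1.0" encoding="UTF-8"?>
<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">
  <Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
  <Default Extension="xml" ContentType="application/xml"/>
  <Override PartName="/word/document.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>
</Types>"""

def content_type_of(part_name, content_types_xml):
    """Resolve a part name to its content type, or None if nothing matches."""
    root = ET.fromstring(content_types_xml.encode("utf-8"))
    # An Override for the exact part name wins over any Default.
    for override in root.findall("ct:Override", CT_NS):
        if override.get("PartName").lower() == part_name.lower():
            return override.get("ContentType")
    # Otherwise fall back to the Default registered for the extension.
    extension = part_name.rsplit(".", 1)[-1].lower() if "." in part_name else ""
    for default in root.findall("ct:Default", CT_NS):
        if default.get("Extension").lower() == extension:
            return default.get("ContentType")
    return None

print(content_type_of("/word/document.xml", CONTENT_TYPES))
print(content_type_of("/_rels/.rels", CONTENT_TYPES))
```

Because the type comes from this stream rather than from the folder layout, a consumer should not rely on where a part happens to sit inside the ZIP archive.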
Hello World
Creating the minimal WordprocessingML document:
• 3 parts: document body, content types, relationships
• Each part is simple XML (text)
• Parts are packaged in a ZIP archive
• Result: a well-formed Open XML document
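The steps above can be sketched with nothing more than a ZIP library. The following is one possible minimal version, assuming Python's standard `zipfile` module and the namespaces of the published Ecma standard; the entry names `word/document.xml` and `_rels/.rels` follow common practice, and `build_hello_world` is an illustrative name rather than part of any SDK.

```python
import io
import zipfile

# Content type stream: maps the .rels extension and the body part to their types.
CONTENT_TYPES = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
    '<Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    '<Override PartName="/word/document.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>'
    '</Types>'
)

# Package-level relationship: points from the package to the main document part.
RELS = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
    '<Relationship Id="rId1" '
    'Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument" '
    'Target="word/document.xml"/>'
    '</Relationships>'
)

# Document body: a single paragraph with a single run of text.
DOCUMENT = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">'
    '<w:body><w:p><w:r><w:t>Hello World</w:t></w:r></w:p></w:body>'
    '</w:document>'
)

def build_hello_world():
    """Return the bytes of a minimal WordprocessingML package."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("[Content_Types].xml", CONTENT_TYPES)
        package.writestr("_rels/.rels", RELS)
        package.writestr("word/document.xml", DOCUMENT)
    return buffer.getvalue()

package_bytes = build_hello_world()
print(len(package_bytes) > 0)
```

Writing `package_bytes` to a file with a `.docx` extension should give a document that a word processor supporting the format can open.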
Ecma Office Open XML Specifications
Markup Languages
WordprocessingML
SpreadsheetML
PresentationML
Vocabularies
DrawingML
Custom XML
Bibliography
VML (legacy)
Metadata
Equations
Open Packaging Conventions
Relationships
Content Types
Digital Signatures
Core Technologies
ZIP
XML + Unicode
Ecma Office Open XML Specifications
Markup Languages
WordprocessingML (Module 03, 04)
SpreadsheetML (Module 06, 07A)
PresentationML (Module 08)
Vocabularies
DrawingML (Module 07B)
VML (legacy)
Custom XML (Module 05)
Bibliography
Metadata
Equations
Open Packaging Conventions (Module 02)
Relationships (Module 02)
Content Types (Module 02)
Digital Signatures
Core Technologies (Module 01, 09)
ZIP
XML + Unicode
Office Open XML File Format Extensions
              Macro-Free              Macro-Enabled
              Document    Template    Document    Template
Word          docx        dotx        docm        dotm
PowerPoint    pptx        potx        pptm        potm
Excel         xlsx        xltx        xlsm        xltm
Open Packaging Conventions
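The extension table lends itself to a simple lookup, for example to decide whether an incoming file may carry macros before opening it. This is a minimal sketch in Python; `EXTENSIONS` and `classify` are illustrative names, and the check looks only at the file name, not at the package contents.

```python
import os

# Extension -> (kind, macro_enabled), transcribed from the table above.
EXTENSIONS = {
    "docx": ("document", False), "dotx": ("template", False),
    "docm": ("document", True),  "dotm": ("template", True),
    "pptx": ("document", False), "potx": ("template", False),
    "pptm": ("document", True),  "potm": ("template", True),
    "xlsx": ("document", False), "xltx": ("template", False),
    "xlsm": ("document", True),  "xltm": ("template", True),
}

def classify(filename):
    """Return (kind, macro_enabled) for a known extension, else None."""
    extension = os.path.splitext(filename)[1].lstrip(".").lower()
    return EXTENSIONS.get(extension)

print(classify("Report.DOCM"))
print(classify("budget.xltx"))
print(classify("notes.txt"))
```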
Developer Scenario: Styling Content
Enforce organizational standards for document formatting.
Document
Standardized
look and feel
Style part
Developer Scenario: Content Inspection
Remove confidential information, tracked changes or
metadata from outbound documents:
Open XML
Processing
Remove macros, inappropriate language, or other content
from inbound documents:
Open XML
Processing
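As one example of such processing, tracked changes can be detected by scanning the document body for revision markup. The sketch below assumes Python's standard library and the WordprocessingML elements `w:ins` and `w:del`; the sample document and the name `count_tracked_changes` are made up for illustration, and a real inspector would cover more revision types.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Illustrative document body with one tracked insertion and one tracked deletion.
DOCUMENT = f"""
<w:document xmlns:w="{W}">
  <w:body>
    <w:p>
      <w:r><w:t>Budget: </w:t></w:r>
      <w:ins w:id="1" w:author="A. Reviewer"><w:r><w:t>120</w:t></w:r></w:ins>
      <w:del w:id="2" w:author="A. Reviewer"><w:r><w:delText>100</w:delText></w:r></w:del>
    </w:p>
  </w:body>
</w:document>
"""

def count_tracked_changes(document_xml):
    """Count tracked insertions (w:ins) and deletions (w:del) in a document body."""
    root = ET.fromstring(document_xml.strip())
    insertions = sum(1 for _ in root.iter(f"{{{W}}}ins"))
    deletions = sum(1 for _ in root.iter(f"{{{W}}}del"))
    return {"insertions": insertions, "deletions": deletions}

print(count_tracked_changes(DOCUMENT))
```

An outbound filter could refuse to send any document for which either count is nonzero.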
Developer Scenario: Consuming Documents
Create expense reports as spreadsheet documents, which
are loaded into a back-end system on the server:
Authoring environment
(Microsoft Office, etc.)
Open XML
Processing
Back-end system
(LOB/CRM/etc.)
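On the server side, the processing step can be as small as reading cell values out of a worksheet part. This sketch assumes Python's standard library and a made-up expense sheet that uses inline strings; files written by spreadsheet applications typically store text in a shared strings part instead, which this example does not handle. `read_cells` is an illustrative name.

```python
import xml.etree.ElementTree as ET

S = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
NS = {"s": S}

# Illustrative worksheet part: expense label in column A, amount in column B.
SHEET = f"""
<worksheet xmlns="{S}">
  <sheetData>
    <row r="1">
      <c r="A1" t="inlineStr"><is><t>Taxi</t></is></c>
      <c r="B1"><v>23.5</v></c>
    </row>
    <row r="2">
      <c r="A2" t="inlineStr"><is><t>Hotel</t></is></c>
      <c r="B2"><v>180</v></c>
    </row>
  </sheetData>
</worksheet>
"""

def read_cells(sheet_xml):
    """Map cell references (such as "B1") to inline text or numeric values."""
    root = ET.fromstring(sheet_xml.strip())
    cells = {}
    for cell in root.iterfind("s:sheetData/s:row/s:c", NS):
        if cell.get("t") == "inlineStr":
            cells[cell.get("r")] = cell.findtext("s:is/s:t", default="", namespaces=NS)
        else:
            value = cell.findtext("s:v", namespaces=NS)
            if value is not None:
                cells[cell.get("r")] = float(value)
    return cells

cells = read_cells(SHEET)
print(cells)
print(cells["B1"] + cells["B2"])
```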
Developer Scenario: Document Assembly
Create sales reports from financial and forecast data stored
in a CRM system:
Web client or rich client allows user to select or enter content criteria
Manual entries
Existing content
Calculated data
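Document assembly comes down to generating the body markup from data. The following is a minimal sketch in Python, assuming the records have already been fetched from the back-end system; the sample rows, `paragraph`, and `assemble_document` are all hypothetical, and the output is only the main document part, which would still need to be packaged.

```python
from xml.sax.saxutils import escape

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

def paragraph(text, bold=False):
    """Build one w:p element containing a single run; text is XML-escaped."""
    properties = "<w:rPr><w:b/></w:rPr>" if bold else ""
    return f"<w:p><w:r>{properties}<w:t>{escape(text)}</w:t></w:r></w:p>"

def assemble_document(title, rows):
    """Return document body XML with a bold title and one paragraph per row."""
    paragraphs = [paragraph(title, bold=True)]
    for label, amount in rows:
        paragraphs.append(paragraph(f"{label}: {amount}"))
    body = "".join(paragraphs)
    return f'<w:document xmlns:w="{W}"><w:body>{body}</w:body></w:document>'

# Made-up forecast data standing in for values read from a CRM system.
document_xml = assemble_document("Sales Report", [("North", 120), ("R&D <East>", 80)])
print(document_xml)
```

Escaping the text matters here because data from a back-end system may contain characters such as `&` or `<` that would otherwise break the XML.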
Developer Scenario: Custom XML Markup
Tagging document content with custom semantics for
processing by a back-end system.
Authoring environment
Open XML
Processing
Custom XML Data Store
Custom-defined XML part
Stored separately from document body
Any XML can be stored
Document properties
WSS meta-data
Custom XML (with or without XML schema)
[Figure labels: Doc/Template, Doc Parts, XML, External App]
External applications can easily read or write the custom
XML part
True separation of data and presentation
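Because the custom XML is a separate entry in the ZIP archive, an external application can read or replace it without touching the document body. The sketch below assumes Python's standard `zipfile` module; the part name `customXml/item1.xml` is an assumption for illustration, the demo package is a stand-in rather than a complete document, and all function names are hypothetical.

```python
import io
import zipfile

CUSTOM_PART = "customXml/item1.xml"  # assumed part name, for illustration only

def make_demo_package():
    """Build a stand-in ZIP with a body entry and a custom XML entry."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("word/document.xml", "<body-placeholder/>")
        package.writestr(CUSTOM_PART, "<Invoice><Price>10</Price></Invoice>")
    return buffer.getvalue()

def read_part(package_bytes, part_name):
    """Return the text of one entry in the package."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        return package.read(part_name).decode("utf-8")

def replace_part(package_bytes, part_name, new_xml):
    """Copy the package, swapping in new content for a single entry."""
    output = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as source, \
         zipfile.ZipFile(output, "w", zipfile.ZIP_DEFLATED) as target:
        for name in source.namelist():
            data = new_xml.encode("utf-8") if name == part_name else source.read(name)
            target.writestr(name, data)
    return output.getvalue()

package = make_demo_package()
updated = replace_part(package, CUSTOM_PART, "<Invoice><Price>12</Price></Invoice>")
print(read_part(updated, CUSTOM_PART))
```

The package is copied rather than edited in place because `zipfile` cannot overwrite an existing entry.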
XML Data Binding
Link content controls to nodes in the XML data store
Mappings use standard XPath expressions
Office offers built-in support for mapping to metadata
Developers can bind custom XML to content controls
2-way binding between user changes and custom XML
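The idea of an XPath mapping can be illustrated outside Office with a few lines of Python. This is a rough sketch, not how Office implements binding: the customer data is made up, `resolve` is a hypothetical helper, and it handles only simple absolute paths with optional positional predicates and no namespaces, because the standard library's ElementTree supports a limited XPath subset.

```python
import xml.etree.ElementTree as ET

# Made-up custom XML part with two customer records.
CUSTOM_XML = (
    "<Customers>"
    "<Customer><Name>Contoso</Name><City>Seattle</City></Customer>"
    "<Customer><Name>Fabrikam</Name><City>Austin</City></Customer>"
    "</Customers>"
)

def resolve(root, xpath):
    """Find the node addressed by an absolute path such as /Customers/Customer[2]/Name."""
    steps = xpath.strip("/").split("/")
    # The first step must name the root element (any predicate is ignored).
    if steps[0].split("[")[0] != root.tag:
        return None
    if len(steps) == 1:
        return root
    # ElementTree evaluates the remaining steps relative to the root.
    return root.find("/".join(steps[1:]))

root = ET.fromstring(CUSTOM_XML)
node = resolve(root, "/Customers/Customer[2]/Name")
print(node.text)  # value a bound control would display

# A user edit in the bound control would be written back to the same node.
node.text = "Fabrikam, Inc."
print(ET.tostring(root, encoding="unicode"))
```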
Open XML Interoperability
Linux
  ZIP library: Minizip, zLib
  XML library: Apache Xerces
Java
  ZIP library: J2SE java.util.zip
  XML library: JAXP
Microsoft (.NET and COM)
  ZIP library: .NET Framework 3.0 System.IO.Packaging*, Microsoft SDK for Open XML Formats**, Xceed .NET controls, Xceed ActiveX controls
  XML library: .NET Framework 3.0 System.Xml, MSXML
* Includes abstractions for OPC concepts
** Includes classes for package parts (strongly typed parts)
The Ecma Spec
Where to get the final draft
OpenXmlDeveloper.org home page has latest link
Organization of the spec
1. Fundamentals
2. Open Packaging Conventions
3. Primer
4. Markup Language Reference
5. Markup Compatibility and Extensibility
Reference Schemas (XSD, RelaxNG)
The Ecma Spec: Where To Start
Where to get the final draft
OpenXmlDeveloper.org home page has latest link
Organization of the spec
Read 1st
1. Fundamentals
2. Open Packaging Conventions
Read 2nd
3. Primer
4. Markup Language Reference
5. Markup Compatibility and Extensibility
Reference Schemas (XSD, RelaxNG)
Reference materials
OpenXmlDeveloper.org
Formed by 40 companies to share developer
information about the Office Open XML file formats
Articles with full source code for C#, VB, Java, XSLT
Forums for posting technical questions