Introduction to Python, COM and PythonCOM
Mark Hammond
Skippi-Net, Melbourne, Australia
[email protected]
The plan
• Section I - Intro to Python
– 30-45 mins
• Section II - Intro to COM
– 30-45 mins
• Section III - Python and COM
– The rest!
Section I
Introduction to Python
What Is Python?
• Created in 1990 by Guido van Rossum
– While at CWI, Amsterdam
– Now hosted by the Corporation for National Research
Initiatives (CNRI), Reston, VA, USA
• Free, open source
– And with an amazing community
• Object oriented language
– “Everything is an object”
Why Python?
• Designed to be easy to learn and master
– Clean, clear syntax
– Very few keywords
• Highly portable
– Runs almost anywhere - high end servers and
workstations, down to Windows CE
– Compiled to machine independent byte-codes
• Extensible
– Designed to be extensible using C/C++, thereby
allowing access to many external libraries
Most obvious and notorious features
• Clean syntax plus high-level data types
– Leads to fast coding
• Uses white-space to delimit blocks
– Humans generally do, so why not the language?
– Try it, you will end up liking it
• Variables do not need declaration
– Although not a type-less language
• We are using Pythonwin
– Only available on Windows
– GUI toolkit using Tkinter available for most platforms
– Standard console Python available on all platforms
• Has interactive mode for quick testing of code
• Includes debugger and Python editor
Interactive Python
• Starting Python.exe, or any of the GUI
environments, presents an interactive mode
– The >>> prompt indicates the start of a statement or expression
– If incomplete, the ... prompt indicates second and
subsequent lines
– All expression results are printed back to the interactive window
Variables and Types (1 of 3)
• Variables need no declaration
• >>> a=1
• As a variable assignment is a statement, there
is no printed result
• >>> a
• Variable name alone is an expression, so the
result is printed
Variables and Types (2 of 3)
• Variables must be created before they can be used
• >>> b
Traceback (innermost last):
File "<interactive input>", line
1, in ?
NameError: b
• Python uses exceptions - more detail later
Variables and Types (3 of 3)
• Objects always have a type
• >>> a = 1
>>> type(a)
<type 'int'>
>>> a = "Hello"
>>> type(a)
<type 'string'>
>>> type(1.0)
<type 'float'>
Assignment versus Equality
• Assignment performed with single =
• Equality testing done with double = (==)
– Sensible type promotions are defined
– Identity tested with is operator.
• >>> 1==1
>>> 1.0==1
>>> "1"==1
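A minimal sketch of the difference between equality (==) and identity (is). It is written for a current Python 3 interpreter, so print is a function and comparisons show True/False; the list names a, b and c are made up for the illustration.

```python
# Equality compares values; identity asks whether two names refer to one object.
a = [1, 2]
b = [1, 2]
c = a

print(1 == 1)      # True
print(1.0 == 1)    # True: the int is promoted to float for the comparison
print("1" == 1)    # False: there is no promotion between str and int
print(a == b)      # True: same contents
print(a is b)      # False: two distinct list objects
print(a is c)      # True: both names refer to the same object
```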
Simple Data Types
• Strings
– May hold any data, including embedded NULLs
– Declared using either single, double, or triple quotes
– >>> s = "Hi there"
>>> s
'Hi there'
>>> s = "Embedded 'quote'"
>>> s
"Embedded 'quote'"
Simple Data Types
– Triple quotes useful for multi-line strings
– >>> s = """ a long
... string with "quotes" or anything else"""
>>> s
' a long\012string with "quotes" or anything else'
>>> len(s)
Simple Data Types
• Integer objects implemented using C longs
– As in C, dividing two integers gives an integer; Python rounds down to the floor
– >>> 5/2
• Float types implemented using C doubles
• Long Integers have unlimited size
– Limited only by available memory
– >>> long = 1L << 64
>>> long ** 5
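A short sketch of the integer behaviour described above, assuming a current Python 3 interpreter. There, floor division is spelled // (plain / returns a float), and ordinary integers already have unlimited size, so no L suffix is needed. The name big is chosen for the example.

```python
# Floor division rounds toward negative infinity.
print(5 // 2)      # 2
print(-5 // 2)     # -3

# In Python 3 the / operator always produces a float.
print(5 / 2)       # 2.5

# Integers grow as needed; only available memory limits them.
big = 1 << 64
print(big ** 5 == 2 ** 320)   # True
```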
High Level Data Types
• Lists hold a sequence of items
– May hold any object
– Declared using square brackets
• >>> l = []  # An empty list
>>> l.append(1)
>>> l.append("Hi there")
High Level Data Types
• >>> l
[1, 'Hi there']
>>> l = ["Hi there", 1, 2]
>>> l
['Hi there', 1, 2]
>>> l.sort()
>>> l
[1, 2, 'Hi there']
High Level Data Types
• Tuples are similar to lists
– Sequence of items
– Key difference is they are immutable
– Often used in place of simple structures
• Automatic unpacking
• >>> point = 2,3
>>> x, y = point
>>> x
High Level Data Types
• Tuples are particularly useful to return
multiple values from a function
• >>> x, y = GetPoint()
• As Python has no concept of byref
parameters, this technique is used widely
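As a sketch of the technique, here is one possible GetPoint(). The slides do not show its body, so the coordinates below are invented, and the code targets Python 3.

```python
def GetPoint():
    """Return an (x, y) pair packed into a tuple (values are illustrative)."""
    return 2, 3

# The caller unpacks the tuple straight into two variables,
# which takes the place of by-reference output parameters.
x, y = GetPoint()
print(x, y)   # 2 3
```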
High Level Data Types
• Dictionaries hold key-value pairs
– Often called maps or hashes. Implemented using hash tables
– Keys may be any immutable object, values may
be any object
– Declared using braces
• >>> d={}
>>> d[0] = "Hi there"
>>> d["foo"] = 1
High Level Data Types
• Dictionaries (cont.)
• >>> len(d)
>>> d[0]
'Hi there'
>>> d = {0 : "Hi there", 1 :
>>> len(d)
• Blocks are delimited by indentation
– Colon used to start a block
– Tabs or spaces may be used
– Mixing tabs and spaces works, but is discouraged
• >>> if 1:
...     print "True"
• Many people hate this when they first see it
– Almost all Python programmers come to love it
• Humans use indentation when reading code to
determine block structure
– Ever been bitten by the C code?:
• if (1)
• The for statement loops over sequences
• >>> for ch in "Hello":
...     print ch
• Built-in function range() used to build
sequences of integers
• >>> for i in range(3):
...     print i
• while statement for more traditional loops
• >>> i = 0
>>> while i < 3:
...     print i
...     i = i + 1
• Functions are defined with the def statement
• >>> def foo(bar):
...     return bar
• This defines a trivial function named foo that
takes a single parameter bar
• A function definition simply places a function
object in the namespace
• >>> foo
<function foo at fac680>
• And the function object can obviously be called
• >>> foo(3)
• Classes are defined using the class statement
• >>> class Foo:
...     def __init__(self):
...         self.member = 1
...     def GetMember(self):
...         return self.member
• A few things are worth pointing out in the
previous example:
– The constructor has a special name __init__,
while a destructor (not shown) uses __del__
– The self parameter is the instance (i.e., the this
in C++). In Python, the self parameter is explicit
(cf. C++, where it is implicit)
– The name self is not required - simply a convention
• Like functions, a class statement simply adds
a class object to the namespace
• >>> Foo
<class __main__.Foo at 1000960>
• Classes are instantiated using call syntax
• >>> f=Foo()
>>> f.GetMember()
• Most of Python’s power comes from modules
• Modules can be implemented either in
Python, or in C/C++
• import statement makes a module available
• >>> import string
>>> string.join( ["Hi", "there"] )
'Hi there'
• Python uses exceptions for errors
– try / except block can handle exceptions
• >>> try:
...     1/0
... except ZeroDivisionError:
...     print "Eeek"
• try / finally block can guarantee execution
of code even in the face of exceptions
• >>> try:
...     1/0
... finally:
...     print "Doing this anyway"
Doing this anyway
Traceback (innermost last): File "<interactive
input>", line 2, in ?
ZeroDivisionError: integer division or modulo
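A small sketch combining both forms, assuming Python 3, where except and finally may appear on the same try statement (later Python versions allow this; the slides show them separately). The helper name safe_divide and its log list are invented for the example.

```python
def safe_divide(a, b):
    """Divide a by b, recording which handlers ran."""
    log = []
    try:
        result = a / b
    except ZeroDivisionError:
        # Runs only when the division fails.
        log.append("Eeek")
        result = None
    finally:
        # Runs whether or not an exception occurred.
        log.append("Doing this anyway")
    return result, log

print(safe_divide(4, 2))   # (2.0, ['Doing this anyway'])
print(safe_divide(1, 0))   # (None, ['Eeek', 'Doing this anyway'])
```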
• Number of ways to implement threads
• Highest level interface modelled after Java
• >>> import threading
>>> class DemoThread(threading.Thread):
...     def run(self):
...         for i in range(3):
...             print i
>>> t = DemoThread()
>>> t.start()
>>> t.join()
1 <etc>
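The same idea as a self-contained Python 3 script. The seen attribute is an addition for this sketch, so that the result can be inspected after the thread finishes.

```python
import threading

class DemoThread(threading.Thread):
    def __init__(self):
        super().__init__()
        self.seen = []          # illustrative: records what run() produced

    def run(self):
        # run() executes in the new thread once start() is called.
        for i in range(3):
            self.seen.append(i)
            print(i)

t = DemoThread()
t.start()    # begin executing run() in a separate thread
t.join()     # wait for the thread to finish
```

After join() returns, t.seen holds [0, 1, 2] and the thread is no longer alive.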
Standard Library
• Python comes standard with a set of modules,
known as the “standard library”
• Incredibly rich and diverse functionality
available from the standard library
– All common internet protocols, sockets, CGI, OS
services, GUI services (via Tcl/Tk), database,
Berkeley style databases, calendar, Python parser,
file globbing/searching, debugger, profiler,
threading and synchronisation, persistency, etc
External library
• Many modules are available externally
covering almost every piece of functionality
you could ever desire
– Imaging, numerical analysis, OS specific
functionality, SQL databases, Fortran interfaces,
XML, Corba, COM, Win32 API, etc
• Way too many to give the list any justice
Python Programs
• Python programs and modules are written as
text files with traditionally a .py extension
• Each Python module has its own discrete namespace
• Python modules and programs are
differentiated only by the way they are called
– .py files executed directly are programs (often
referred to as scripts)
– .py files referenced via the import statement are modules
Python Programs
• Thus, the same .py file can be a
program/script, or a module
• This feature is often used to provide
regression tests for modules
– When module is executed as a program, the
regression test is executed
– When module is imported, test functionality is not executed
Python “Protocols”
• Objects can support a number of different protocols
• This allows your objects to be treated as:
– A sequence - i.e., indexed or iterated over
– A mapping - obtain/assign keys or values
– A number - perform arithmetic
– A container - perform dynamic attribute fetching
and setting
– Callable - allow your object to be “called”
Python “Protocols”
• Sequence and container example
• >>> class Protocols:
...     def __getitem__(self, index):
...         return index ** 2
...     def __getattr__(self, attr):
...         return "A big " + attr
>>> p=Protocols()
>>> p[3]
>>> p.Foo
'A big Foo'
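A slightly extended sketch of the same class for Python 3, adding the callable and number protocols. The __call__ and __add__ bodies are invented for illustration, and the guard on double-underscore names is an addition so that lookups of special attributes fail normally.

```python
class Protocols:
    def __getitem__(self, index):
        # Sequence/mapping protocol: p[3]
        return index ** 2

    def __getattr__(self, attr):
        # Container protocol: called only when normal attribute lookup fails.
        if attr.startswith("__"):
            raise AttributeError(attr)
        return "A big " + attr

    def __call__(self, *args):
        # Callable protocol: p(1, 2)
        return len(args)

    def __add__(self, other):
        # Number protocol: p + 1
        return other + 1

p = Protocols()
print(p[3])       # 9
print(p.Foo)      # A big Foo
print(p(1, 2))    # 2
print(p + 1)      # 2
```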
More Information on Python
• Can’t do Python justice in this short time
– But hopefully have given you a taste of the language
• Comes with extensive documentation,
including an excellent tutorial and library reference
– Also a number of Python books available
• Visit for more details
Section II
Introduction to COM
What is COM?
• Acronym for Component Object Model, a
technology defined and implemented by Microsoft
• Allows “objects” to be shared among many
applications, without applications knowing
the implementation details of the objects
• A broad and complex technology
• We can only provide a brief overview here
What was COM
• COM can trace its lineage back to DDE
• DDE was expanded to Object Linking and
Embedding (OLE)
• VBX (Visual Basic Extensions) enhanced
OLE technology for visual components
• COM was finally derived as a general
purpose mechanism
– Initially known as OLE2
COM Interfaces
• COM relies heavily on interfaces
• An interface defines functionality, but not implementation
– Each object (or more correctly, each class)
defines implementation of the interface
– Each implementation must conform to the interface
• COM defines many interfaces
– But often does not provide implementation of
these interfaces
COM Interfaces
• Interfaces do not support properties
– We will see how COM properties are typically
defined later
• Interfaces are defined using a vtable scheme
similar to how C++ defines virtual methods
• All interfaces have a unique ID (an IID)
– Uses a Universally Unique Identifier (UUID)
– UUIDs used for many COM IDs, including IIDs
• COM defines the concept of a class, used to
create objects
– Conceptually identical to a C++ or Python class
– To create an object, COM locates the class
factory, and asks it to create an instance
• Classes have two identifiers
– Class ID (CLSID) is a UUID, so looks similar to
an IID
– ProgID is a friendly string, and therefore not
guaranteed unique
IUnknown interface
• Base of all COM interfaces
– By definition, all interfaces also support the
IUnknown interface
• Contains only three methods
– AddRef() and Release() for managing COM lifetimes
• COM lifetimes are based on reference counts
– QueryInterface() for obtaining a new interface
from the object
Creating objects, and obtaining interfaces
• To create an object, the programmer specifies
either the ProgID, or the CLSID
• This process always returns the requested interface
– or fails!
• New interfaces are obtained by using the
IUnknown::QueryInterface() method
– As each interface derives from IUnknown, each
interface must support QI
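To make the three IUnknown methods concrete, here is a toy model in plain Python 3. It is not real COM: the class ToyUnknown is invented, interface IDs are plain strings rather than UUIDs, and the "interfaces" are placeholder values. It only illustrates reference-counted lifetime and interface lookup.

```python
class ToyUnknown:
    """Toy stand-in for an object exposing IUnknown-style methods."""

    def __init__(self, interfaces):
        self._interfaces = dict(interfaces)
        self._refs = 1          # the creator holds the first reference
        self.alive = True

    def AddRef(self):
        self._refs += 1
        return self._refs

    def Release(self):
        self._refs -= 1
        if self._refs == 0:
            self.alive = False  # a real object would destroy itself here
        return self._refs

    def QueryInterface(self, iid):
        # Either hand back the requested interface or fail.
        if iid not in self._interfaces:
            raise LookupError("E_NOINTERFACE: " + iid)
        self.AddRef()           # the returned interface is a new reference
        return self._interfaces[iid]

obj = ToyUnknown({"IShellLink": "shell link methods",
                  "IPersistFile": "persist file methods"})
pf = obj.QueryInterface("IPersistFile")   # reference count is now 2
obj.Release()                             # drop the QueryInterface reference
obj.Release()                             # drop the creator's reference
print(obj.alive)                          # False
```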
Other standard interfaces
• COM defines many interfaces, for example:
• IStream
– Defines file like operations
• IStorage
– Defines file system like semantics
• IPropertyPage
– Defines how a control exposes a property page
• etc - many many interfaces are defined
– But not many have implementations
Custom interfaces
• COM allows you to define your own interfaces
• Interfaces are defined using an Interface
Definition Language (IDL)
• Tools available to assign unique IIDs for the interface
• Any object can then implement or consume
these interfaces
IDispatch - Automation objects
• IDispatch interface is used to expose dynamic
object models
• Designed explicitly for scripting languages, or
for those languages that can not use normal
COM interfaces
– eg, where the interface is not known at compile
time, or there is no compile time at all
• IDispatch is used extensively
– Microsoft Office, Netscape, Outlook, VB, etc. - almost anything designed to be scripted
IDispatch - Automation objects
• Methods and properties of the object model
can be determined at runtime
– Concept of Type Libraries, where the object
model can be exposed at compile time
• Methods and properties are used indirectly
– GetIDsOfNames() method is used to get an ID for
a method or property
– Invoke() is used to make the call
IDispatch - Automation objects
• Languages usually hide these implementation
details from the programmer
• Example: object.SomeCall()
– Behind the scenes, your language will:
id = GetIDsOfNames("SomeCall")
• Example: object.SomeProp
– id = GetIDsOfNames("SomeProp")
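The two-step pattern can be sketched with a toy object in plain Python 3. This is not real COM: ToyDispatch, call_method and get_property are invented names, and the real GetIDsOfNames() and Invoke() take more parameters than shown. The flag values are the standard DISPATCH_METHOD and DISPATCH_PROPERTYGET constants.

```python
DISPATCH_METHOD = 0x1
DISPATCH_PROPERTYGET = 0x2

class ToyDispatch:
    """Toy stand-in for an automation object (simplified signatures)."""

    def __init__(self):
        self._ids = {"SomeCall": 1, "SomeProp": 2}

    def GetIDsOfNames(self, name):
        # Step 1: resolve a member name to a numeric ID.
        return self._ids[name]

    def Invoke(self, dispid, flags, *args):
        # Step 2: perform the call or property fetch by ID.
        if dispid == 1 and flags == DISPATCH_METHOD:
            return "SomeCall ran with %d argument(s)" % len(args)
        if dispid == 2 and flags == DISPATCH_PROPERTYGET:
            return "the property value"
        raise LookupError("unknown member or wrong flags")

def call_method(obj, name, *args):
    """What a language might do behind the scenes for obj.name(*args)."""
    dispid = obj.GetIDsOfNames(name)
    return obj.Invoke(dispid, DISPATCH_METHOD, *args)

def get_property(obj, name):
    """What a language might do behind the scenes for obj.name."""
    dispid = obj.GetIDsOfNames(name)
    return obj.Invoke(dispid, DISPATCH_PROPERTYGET)

obj = ToyDispatch()
print(call_method(obj, "SomeCall", 10, 20))   # SomeCall ran with 2 argument(s)
print(get_property(obj, "SomeProp"))          # the property value
```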
• The IDispatch interface uses VARIANTs as
its primary data type
• Simply a C union that supports the common
data types
– and many helper functions for conversion etc
• Allows the single Invoke() call to accept
almost any data type
• Many languages hide these details,
performing automatic conversion as necessary
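As a rough illustration of automatic conversion, the sketch below tags a few Python values with VARIANT type codes. The helper to_variant is hypothetical and far simpler than what a real binding does (no range checks, arrays, dates or objects); only the numeric VT_ constants are the standard values.

```python
# Standard VARIANT type tags for a few common types.
VT_I4 = 3      # 32-bit signed integer
VT_R8 = 5      # double-precision float
VT_BSTR = 8    # string
VT_BOOL = 11   # boolean

def to_variant(value):
    """Return a (type_tag, value) pair for a small set of Python types."""
    # bool must be tested first, because bool is a subclass of int.
    if isinstance(value, bool):
        return (VT_BOOL, value)
    if isinstance(value, int):
        return (VT_I4, value)      # assumes the value fits in 32 bits
    if isinstance(value, float):
        return (VT_R8, value)
    if isinstance(value, str):
        return (VT_BSTR, value)
    raise TypeError("no VARIANT mapping in this sketch for %r" % type(value))

print(to_variant(1))        # (3, 1)
print(to_variant(1.5))      # (5, 1.5)
print(to_variant("Hi"))     # (8, 'Hi')
print(to_variant(True))     # (11, True)
```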
Implementation models
• Objects can be implemented in a number of ways
– InProc objects are implemented as DLLs, and
loaded into the calling process
• Best performance, as no marshalling is required
– LocalServer/RemoteServer objects are
implemented as stand-alone executables
• Safer due to process isolation, but slower due to marshalling
• Can be both, and caller can decide, or let
COM choose the best
Distributed COM
• DCOM allows objects to be remote from their caller
• DCOM handles all marshalling across
machines and necessary security
• Configuration tools allow an administrator to
configure objects so that neither the object
nor the caller need any changes
– Although code changes can be used to explicitly
control the source of objects
The Windows Registry
• Information on objects stored in the Windows Registry
– ProgID to CLSID mapping
– Name of DLL for InProc objects
– Name of EXE for LocalServer objects
– Other misc details such as threading models
• Lots of other information also maintained in the registry
– Remoting proxies, object security, etc.
Conclusion for Section II
• COM is a complex and broad beast
• Underlying it all is a fairly simple Interface model
– Although all the bits around the edges combine to
make it very complex
• Hopefully we have given enough framework
to put the final section into some context
Section III
Python and COM
PythonCOM architecture
• Underpinning everything is an extension
module written in C++ that provides the core
COM integration
– Support for native COM interfaces exists in this module
– Python reference counting is married with COM
reference counting
• Number of Python implemented modules that
provide helpers for this core module
Interfaces supported by PythonCOM
• Over 40 standard interfaces supported by the core module
• Extension architecture where additional
modules can add support for their own interfaces
– Tools supplied to automate this process
– Number of extension modules supplied, bringing
total interfaces to over 100
Using Automation from Python
• Automation uses IDispatch to determine
object model at runtime
• Python function
win32com.client.Dispatch() provides
this run-time facility
• Allows a native “look and feel” to these objects
Using Automation - Example
• We will use Excel for this demonstration
• Excel ProgId is Excel.Application
• >>> from win32com.client import Dispatch
>>> xl=Dispatch("Excel.Application")
>>> xl
<COMObject Excel.Application>
Using Automation - Example
• Now that we have an Excel object, we can
call methods and set properties
• But we can’t see Excel running?
– It has a visible property that may explain things
• >>> xl.Visible
• Excel is not visible, let’s make it visible
>>> xl.Visible = 1
Automation - Late vs. Early Bound
• In the example we just saw, we have been
using late bound COM
– Python has no idea what properties or methods
are available
– As we attempt a method or property access,
Python dynamically asks the object
– Slight performance penalty, as we must resolve
the name to an ID (GetIDsOfNames()) before
calling Invoke()
Automation - Late vs. Early Bound
• If an object provides type information via a
Type Library, Python can use early bound COM
• Implemented by generating a Python source
file with all method and property definitions
– Slight performance increase as all names have
been resolved to IDs at generation time, rather
than run-time
Automation - Late vs. Early Bound
• Key differences between the 2 techniques:
– Late bound COM often does not know the
specific types of the parameters
• Type of the Python object determines the VARIANT
type created
• Does not know about ByRef parameters, so no
parameters are presented as ByRef
– Early bound COM knows the types of the parameters
• All Python types are coerced to the correct type
• ByRef parameters work (returned as tuples)
Playing with Excel
• >>> xl.Workbooks.Add()
<COMObject <unknown>>
>>> xl.Range("A1:C1").Value = "Hi", "From", "Python"
>>> xl.Range("A1:C1").Value
((L'Hi', L'From', L'Python'),)
>>> xl.Range("A1:C1").PrintOut()
How did we know the methods?
• Indeed, how did we know to use Excel.Application?
• No easy answer - each application/object
defines their own object model and ProgID
• Documentation is the best answer
– Note that MSOffice does not install the COM
documentation by default - must explicitly select
it during setup
• COM browsers can also help
Native Interfaces from Python
• Examples so far have been using IDispatch
– win32com.client.Dispatch() function
hides the gory details from us
• Now an example of using interfaces natively
– Need a simple example - Windows Shortcuts fits
the bill
• Develop some code that shows information
about a Windows shortcut
Native Interfaces from Python
• 4 main steps in this process
– Import the necessary Python modules
– Obtain an object that implements the IShellLink interface
– Obtain an IPersistFile interface from the object,
and load the shortcut
– Use the IShellLink interface to get information
about the shortcut
Native Interfaces from Python
Step 1: Import the necessary Python modules
– We need the pythoncom module for core COM
– We need the win32com.shell module
• This is a PythonCOM extension that exposes
additional interfaces
• >>> import pythoncom
>>> from win32com.shell import shell
Native Interfaces from Python
Step 2: Obtain an object that implements the
IShellLink interface
• Use the COM function
• Use the published CLSID for the shell
• The shell requires that the object request be
for an InProc object.
• Request the IShellLink interface
Native Interfaces from Python
Step 2: Obtain an object that implements the
IShellLink interface
• >>> sh =
>>> sh
<PyIShellLink at 0x1630b04 with obj at 0x14c9d8>
Native Interfaces from Python
Step 3: Obtain an IPersistFile interface from the
object, and load the shortcut
• Use QueryInterface to obtain the new interface
from the object
• >>> pe=
Native Interfaces from Python
Step 4: Use the IShellLink interface to get
information about the shortcut
• >>> sh.GetWorkingDirectory()
>>> sh.GetArguments()
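The four steps can be gathered into one function. This is a sketch under assumptions, not the code from the slides: it uses the pywin32 calls pythoncom.CoCreateInstance() and win32com.shell as they exist in current releases, it needs Windows with pywin32 installed, and the function name shortcut_info and its path argument are invented. The imports sit inside the function so that the definition itself loads on any platform.

```python
def shortcut_info(path):
    """Return (working_directory, arguments) for the shortcut file at path."""
    # Step 1: import the necessary modules (Windows and pywin32 only).
    import pythoncom
    from win32com.shell import shell

    # Step 2: create a shell link object and ask for IShellLink.
    sh = pythoncom.CoCreateInstance(
        shell.CLSID_ShellLink,            # published CLSID for shell links
        None,                             # no aggregation
        pythoncom.CLSCTX_INPROC_SERVER,   # the shell requires an InProc object
        shell.IID_IShellLink)             # the interface we want returned

    # Step 3: obtain IPersistFile from the same object and load the shortcut.
    pe = sh.QueryInterface(pythoncom.IID_IPersistFile)
    pe.Load(path)

    # Step 4: use IShellLink to read information about the shortcut.
    return sh.GetWorkingDirectory(), sh.GetArguments()
```

On a Windows machine it would be called with the full path of a .lnk file.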
Implementing COM using Python.
• Final part of our tutorial is implementing
COM objects using Python
• 3 main steps we must perform
– Implement a Python class that exposes the functionality
– Annotate the class with special attributes required
by the COM framework.
– Register our COM object
Implementing COM using Python.
Step 1: Implement a Python class that exposes
the functionality
• We will build on our last example, by
providing a COM object that gets information
about a Windows shortcut
• Useful to use from VB, as it does not have
direct access to these interfaces.
• We will design a class that can be initialised
to a shortcut, and provides methods for
obtaining the info.
Implementing COM using Python
Step 1: Implement a Python class that exposes
the functionality
• Code is getting too big for the slides
• Code is basically identical to that presented
before, except:
– The code has moved into a class.
– We store the IShellLink interface as an instance attribute
Implementing COM using Python
Step 1: Implement a Python class that exposes
the functionality
• Provide an Init method that takes the name of
the shortcut.
• Also provide two trivial methods that simply
delegate to the IShellLink interface
• Note Python would allow us to automate the
delegation, but that is beyond the scope of this!
Implementing COM using Python
Step 2: Annotate the class with special attributes
• PythonCOM requires a number of special attributes
– The CLSID of the object (a UUID)
• Generate using
print pythoncom.CreateGuid()
– The ProgID of the object (a friendly string)
• Make one up!
– The list of public methods
• All methods not listed as public are not exposed via COM
Implementing COM using Python
Step 2: Annotate the class with special attributes
• class PyShellLink:
Implementing COM using Python
Step 3: Register our COM object
• General technique is that the object is
registered when the module implementing the
COM object is run as a script.
– Recall previous discussion on modules versus scripts
• Code is trivial
– Call the UseCommandLine() method, passing
the class objects we wish to register
Implementing COM using Python
Step 3: Register our COM object
• Code is simple
– note we are passing the class object, not a class instance
• if __name__=='__main__':
• Running this script yields:
• Registered: Python.ShellDemo
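Putting the three steps together, a sketch of what such a module might look like. The full code is not on the slides, so this is an assumption-laden reconstruction: the attribute names _reg_clsid_, _reg_progid_ and _public_methods_ and the UseCommandLine() call follow pywin32 conventions, the method names other than Init are guessed from the earlier shortcut example, and the CLSID is a placeholder that must be replaced with a value from pythoncom.CreateGuid(). It requires Windows with pywin32 to actually run.

```python
class PyShellLink:
    # Step 2: special attributes read by the PythonCOM framework.
    _reg_clsid_ = "{00000000-0000-0000-0000-000000000000}"  # placeholder only
    _reg_progid_ = "Python.ShellDemo"
    _public_methods_ = ["Init", "GetWorkingDirectory", "GetArguments"]

    # Step 1: the functionality itself.
    def Init(self, name):
        # Imported here so the class can be defined on any platform.
        import pythoncom
        from win32com.shell import shell
        self.sh = pythoncom.CoCreateInstance(
            shell.CLSID_ShellLink, None,
            pythoncom.CLSCTX_INPROC_SERVER, shell.IID_IShellLink)
        self.sh.QueryInterface(pythoncom.IID_IPersistFile).Load(name)

    def GetWorkingDirectory(self):
        return self.sh.GetWorkingDirectory()

    def GetArguments(self):
        return self.sh.GetArguments()

def register():
    """Step 3: register the class object (not an instance) with COM."""
    import win32com.server.register
    win32com.server.register.UseCommandLine(PyShellLink)

# When the file is run as a script on Windows, registration would be
# triggered with:
#     if __name__ == "__main__":
#         register()
```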
Testing our COM object
• Our COM object can be used by any
automation capable language
– VB, Delphi, Perl - even VC at a pinch!
• We will test with a simple VBScript program
– VBScript is free. Could use full blown VB, or
VBA. Syntax is identical in all cases
Testing our COM object
• set shdemo =
WScript.Echo "Working dir is " +
• Yields
• Working dir is L:\src\python-cvs\tools\idle
• Python can make COM simple to work with
– No reference count management
– Many interfaces supported.
– Natural use for IDispatch (automation) objects.
• Simple to use COM objects, or implement
COM objects
More Information
• Python itself
– news:comp.lang.python
• Python on Windows
• Mark Hammond’s Python extensions
