Object Oriented Database Management
Outline
•
•
•
•
•
•
Motivation
Embedding SQL in host language
Object Data Model
Persistent Programming Languages
Object Query Language
Object-orientation in SQL
Motivation of ODBMSs
Application
data structures
Copy and
Relational
representation
translation
Transparent
ODBMS
data transfer
RDBMS
• Complex objects in emerging DBMS applications cannot be
effectively represented as records in relational model.
• Representing information in RDBMSs requires complex and
inefficient conversion into and from the relational model to the
application programming language
• ODBMSs provide a direct representation of objects to DBMSs
overcoming the impedance mismatch problem
Embedded SQL
•
Access to database from a general purpose programming language required since:
– Not all queries can be expressed in SQL --e.g., recursive queries cannot be written
in SQL.
– Non declarative actions -- e.g., printing reports cannot be done from SQL.
•
General purpose language in which SQL is embedded called host language.
•
SQL structures permitted in host language called embedded SQL.
C compiler
SQL+ C
SQL library calls + C
precompiler
.o file
loader
SQL library
object
code
Embedded SQL Compilation
Embedded SQL
• SQL commands embedded in the host programming language
• Data exchanged between host language and DBMS using cursors
• SQL query passed from host language to DBMS which computes
the answer set
• A cursor can be viewed as a pointer into the answer set
• DBMS returns the cursor to the programming language
• Programming language can use the cursor to get a record at a time
access to materialized answer.
Example of Embedded SQL
:dname = “toy”;
raise = 0.1;
EXEC SQL SELECT dnum into :dnum
FROM
Department
WHERE dname= :dname;
EXEC SQL DECLARE Emp CURSOR FOR
SELECT * FROM Employee
WHERE dno = :dnum
FOR UPDATE
EXEC SQL OPEN Emp;
EXEC SQL FETCH Emp INTO :E.ssn, :E.dno, :E.name, :E.sal;
while (SQLCODE == 0) {
EXEC SQL UPDATE WHERE CURRENT OF CURSOR
SET sal = sal * (1 + ::raise);
EXEC SQL FETCH Emp INTO :E.ssn, :E.dno, :E.name, :E.sal;
}
EXEC SQL CLOSE CURSOR Emp
/* SQL embedded in C to read the list of employees who work for
the toy department and give them a 10 percent raise */
Object Oriented Database Management
• Object Oriented databases have evolved along two different paths:
• Persistent Object Oriented Programming Languages: (pure ODBMSs)
– Start with an OO language (e.g., C++, Java, SMALLTALK) which
has a rich type system
– Add persistence to the objects in programming language where
persistent objects stored in databases
• Object Relational Database Management Systems (SQL3 Systems)
– Extend relational DBMSs with the rich type system and user-defined
functions.
– Provide a convenient path for users of relational DBMSs to migrate to
OO technology
– All major vendors (e.g., Informix, Oracle) will/are supporting features
of SQL3.
Object Database Management Group (ODMG)
• Special interest group to develop standards that allow ODBMS
customers to write portable applications
• Standards include:
– Object Model
– Object Specification Languages
• Object Definition Language (ODL) for schema definition
• Object Interchange Format (OIF) to exchange objects between databases
– Object Query Language
• declarative language to query and update database objects
– Language Bindings (C++, Java, Smalltalk)
• Object manipulation language
• Mechanisms to invoke OQL from language
• Procedures for operation on databases and transactions
Object Model
• Object:
– observable entity in the world being modeled
– similar to concept to entity in the E/R model
• An object consists of:
– attributes: properties built in from primitive types
– relationships: properties whose type is a reference to some other object
or a collection of references
– methods: functions that may be applied to the object.
Class
• Similar objects with the same set of properties and describing
similar real-world concepts are collected into a class.
• Class definition:
interface Employee {
attribute string name;
attribute integer salary;
attribute date date-of-birth;
attribute integer empid;
relationship Projects works-for
inverse Projects::team;
age-type age();
}
Interface Projects{
attribute string name;
attribute integer projid;
relationship Employee team
inverse Emplolyee works-for;
int number-of-employees();
}
Class Extents
• For each ODL class, an extent may be declared.
• Extent is the current set of objects belonging to the class.
– Similar notion to the relation in the relational model.
– Queries in OQL refer to the extent of a class and not the class directly.
interface Employee (extent Emp-set)
{ attribute string name;
attribute integer salary;
attribute date date-of-birth;
attribute integer empid;
relationship Projects works-for
inverse Projects::team;
age-type age();
}
Subclasses and Inheritance
• A class can be declared to be a subclass of another class.
• Subclasses inherit all the properties
– attributes
– relationships
– methods
from the superclass.
Interface Married-Employee: Employees {
string spouse-name;
}
• Substitutability: any method of superclass can be invoked over
objects of any subclass (code reuse)
Class Hierarchy
person
employee
staff
student
grad
student
assistant
faculty
RA
TA
undergrad
Multiple Inheritance
• A class may have more than one superclass.
• A class inherits properties fromeach of its superclasses.
• There is a potential of ambiguity -- variable with same name
inherited from two superclasses:
– flag and error
– rename variable
– choose one
Object Identity
• Each object has an identity which it maintains even if some or all
of its attributes change.
• Object identity is a stronger notion of identity than in relational
DBMSs.
• Identity in relational DBMSs is value based (primary key).
• Identity in ODBMSs built into data model
– no user specified identifier is required
• OID is a similar notion as pointer in programming language
• Object identifier (OID) can be stored as attribute in object to refer
to another object.
• References to other objects via their OIDs can result in a
containment hierarchy
• Note: containment hierarchy different from class hierarchy
Containment Hierarchy
bicycle
wheel
tire
rim
spoke
brake
lever
gear
frame
pad
Links in containment hierarchy should be read as is-part-of instead of is-a
Persistence
• Objects created may have different lifetimes:
– transient: allocated memory managed by the programming language
run-time system.
• E.g., local variables in procedures have a lifetime of a procedure execution
• global variables have a lifetime of a program execution
– persistent: allocated memory and stored managed by ODBMS runtime
system.
• Classes are declared to be persistence-capable or transient.
• Different languages have different mechanisms to make objects
persistent:
– creation time: Object declared persistent at creation time (e.g., in C++
binding) (class must be persistent-capable)
– persistence by reachability: object is persistent if it can be reached from
a persistent object (e.g., in Java binding) (class must be persistentcapable).
Persistent Object-Oriented Programming Languages
• Persistent objects are stored in the database and accessed from the
programming language.
• Classes declared in ODL mapped to the programming language
type system (ODL binding).
• Single programming language for applications as well as data
management.
– Avoid having to translate data to and from application programming
language and DBMS
• efficient implementation
• less code
– Programmer does not need to write explicit code to fetch data to and
from database
• persistent objects to programmer looks exactly the same as transient
objects.
• System automatically brings the objects to and from memory to storage
device. (pointer swizzling).
Disadvantages of ODBMS Approach
• Low protection
– since persistent objects manipulated from applications directly, more
changes that errors in applications can violate data integrity.
• Non-declarative interface:
– difficult to optimize queries
– difficult to express queries
• But …..
– Most ODBMSs offer a declarative query language OQL to overcome
the problem.
– OQL is very similar to SQL and can be optimized effectively.
– OQL can be invoked from inside ODBMS programming language.
– Objects can be manipulated both within OQL and programming
language without explicitly transferring values between the two
languages.
– OQL embedding maintains simplicity of ODBMS programming
language interface and yet provides declarative access.
OQL Example
interface Employee {
attribute string name;
relationship
setof(Projects) works-for
inverse Projects::team;
Interface Projects{
attribute string name;
relationship setof(Employee) team
inverse Emplolyee works-for;
int number-of-employees();
}
}
Select number-of-employees()
From Employee e, e.works-for
where name = “sharad”
Find number of employees working on each project “sharad” works on
Migration of RDBMSs towards OO Technologies
• SQL3 standard incorporates OO concepts in the relational model.
• A row in a table considered as an object
• SQL3 allows a type to be declared for tuples (similar to class in
ODBMSs)
• Relations are collection of tuples of a row type (similar to extent in
ODBMSs)
• Rows in a relation can refer to each other using a reference type
(similar to object identity in ODBMSs)
• A reference can be dereferenced to navigate among tables
• Attributes in a relation can belong to abstract data types
• Methods and functions (expressed in SQL as well as host
programming language) can be associated with abstract data
types
SQL-3 Example
CREATE ROW TYPE Employee-type {
name CHAR(30)
works-for REF(Projects-type)
}
CREATE ROW TYPE Projects-type {
name CHAR(30)
team setof(REF(Employee-type))
}
CREATE TABLE Emp OF TYPE Employee-type
CREATE TABLE Project of TYPE Project-type
Select works-for --> name
From Emp
Where name = ‘sharad’
Return name of the project
sharad works for
OQL
CMSC-461
Database Management Systems
OQL -- Motivation
• Relational languages suffer from impedance mismatch when we
try to connect them to conventional languages like C or C++.
–
The data models of C and SQL are radically different, e.g. C does not
have relations, sets, or bags as primitive types; C is tuple-at-a-time,
SQL is relation-at-a-time.
OQL -- Motivation (II)
• OQL is an attempt by the OO community to extend languages like
C++ with SQL-like, relation-at-a-time dictions.
• OQL is query language paired with schema-definition language
ODL.
OQL Types
• Basic types: strings, ints, reals, etc., plus class names.
• Type constructors:
–
–
Struct for structures.
Collection types: set, bag, list, array.
• Like ODL, but no limit on the number of times we can apply a
type constructor.
• Set(Struct()) and Bag(Struct()) play special roles akin to relations.
OQL Uses ODL as its Schema-Definition Portion
• For every class we can declare an extent = name for the current set
of objects of the class.
–
Remember to refer to the extent, not the class name, in queries.
Example
• interface Bar
(extent Bars)
{
attribute string name;
attribute string addr;
relationship Set<Sell> beersSold
inverse Sell::bar;
}
Example (II)
• interface Beer
(extent Beers)
{
attribute string name;
attribute string manf;
relationship Set<Sell> soldBy
inverse Sell::beer;
}
Example (III)
• interface Sell
(extent Sells)
{
attribute float price;
relationship Bar bar
inverse Bar::beersSold;
relationship Beer beer
inverse Beer::soldBy;
}
Path Expressions
• Let x be an object of class C.
• If a is an attribute of C, then x.a = the value of a in the x object.
• If r is a relationship of C, then x.r = the value to which x is
connected by r.
–
Could be an object or a collection of objects, depending on the type of
r.
• If m is a method of C , then x.m (...) is the result of applying m to x.
Examples
• Let s be a variable whose type is Sell.
• s.price = the price in the object s.
• s.bar.addr = the address of the bar mentioned in s .
–
Note: cascade of dots OK because s.bar is an object, not a collection.
Example of Illegal Use of Dot
• b.beersSold.price, where b is a Bar object.
• Why illegal? Because b.beersSold is a set of objects, not a single
object.
OQL Select-From-Where
• SELECT < list of values >
FROM < list of collections and typical members >
WHERE < condition >
OQL Select-From-Where (II)
• Collections in FROM can be:
1. Extents.
2. Expressions that evaluate to a collection.
• Following a collection is a name for a typical member, optionally
preceded by AS.
Example
• Get the menu at Joe's.
SELECT s.beer.name, s.price
FROM Sells s
WHERE s.bar.name = "Joe's Bar"
• Notice double-quoted strings in OQL.
• Result is of type
Bag(Struct(name: string, price: float))
Example
•
•
Another way to get Joe's menu, this time focusing on the Bar objects.
SELECT s.beer.name, s.price
FROM Bars b, b.beersSold s
WHERE b.name = "Joe's Bar"
Notice that the typical object b in the first collection of FROM is used to
help define the second collection.
–
Typical usage: if x.a is an object, you can extend the path expression; if x.a is a
collection, you use it in the FROM list.
Tailoring the Type of the Result
• Default: bag of structs, field names taken from the ends of path
names in SELECT clause.
• Example
SELECT s.beer.name, s.price
FROM Bars b, b.beersSold s
WHERE b.name = "Joe's Bar"
has result type:
Bag(Struct( name: string, price: real))
Rename Fields
• Prefix the path with the desired name and a colon.
• Example
SELECT beer: s.beer.name, s.price
FROM Bars b, b.beersSold s
WHERE b.name = "Joe's Bar"
Change the Collection Type
• Use SELECT DISTINCT to get a set of structs.
Example
• SELECT DISTINCT s.beer.name, s.price
FROM Bars b, b.beersSold s
WHERE b.name = "Joe's Bar"
• Use ORDER BY clause to get a list of structs.
Example
•
•
•
joeMenu =
SELECT s.beer.name, s.price
FROM Bars b, b.beersSold s
WHERE b.name = "Joe's Bar"
ORDER BY s.price ASC
ASC = ascending (default); DESC = descending.
We can extract from a list as if it were an array, e.g. cheapest =
joeMenu[1].name;
Subqueries
• Used mainly in FROM clauses and with quantifiers EXISTS and
FORALL.
Example: Subquery in FROM
• Find the manufacturers of the beers served at Joe's.
SELECT b.manf
FROM (
SELECT s.beer
FROM Sells s
WHERE s.bar.name = "Joe's Bar"
)b
Quantifiers
• Boolean-valued expressions for use in WHERE-clauses.
FOR ALL x IN < collection > :
< condition >
EXISTS x IN < collection > :
< condition >
• The expression has value TRUE if the condition is true for all
(resp. at least one) elements of the collection.
Example
• Find all bars that sell some beer for more than $5.
SELECT b.name
FROM Bars b
WHERE EXISTS s IN b.beersSold :
s.price > 5.00
• Problem
How would you find the bars that only sold beers for more than
$5?
Example
•
Find the bars such that the only beers they sell for more than $5 are
manufactured by Pete's.
SELECT b.name
FROM Bars b
WHERE FOR ALL be IN (
SELECT s.beer
FROM b.beersSold s
WHERE s.price > 5.00
):
be.manf = "Pete's"
Extraction of Collection Elements
• a) A collection with a single member: Extract
the member with ELEMENT.
Example
• Find the price Joe charges for Bud and put the result in a variable
p.
• p = ELEMENT(
SELECT s.price
FROM Sells s
WHERE s.bar.name = "Joe's Bar"
AND s.beer.name = "Bud"
)
Extraction of Collection Elements (II)
• b) Extracting all elements of a collection, one at a time:
–
–
1. Turn the collection into a list.
2. Extract elements of a list with <list name>[i].
Example
• Print Joe's menu, in order of price, with beers of the same price
listed alphabetically.
Example (II)
• L=
SELECT s.beer.name, s.price
FROM Sells s
WHERE s.bar.name = "Joe's Bar"
ORDER BY s.price, s.beer.name;
printf("Beer\tPrice\n\n");
for(I = 1; I <= COUNT(L); i++)
printf("%s\t%f\n", L[i].name, L[i].price );
Aggregation
• The five operators avg, min, max, sum, count apply to any
collection, as long as the operators make sense for the element
type.
Example
•
•
•
Find the average price of beer at Joe's.
x = AVG(
SELECT s.price
FROM Sells s
WHERE s.bar.name = "Joe's Bar"
);
Note coercion: result of SELECT is technically a bag of 1-field structs,
which is identified with the bag of the values of that field.
Grouping
• Recall SQL grouping, for example:
SELECT bar, AVG(price)
FROM Sells
GROUP BY bar;
• Is the bar value the "name" of the group, or the common value for
the bar component of all tuples in the group?
Grouping (II)
• In SQL it doesn't matter, but in OQL, you can create groups from
the values of any function(s), not just attributes.
–
–
Thus, groups are identified by common values, not \name."
Example: group by first letter of bar names (method needed).
Outline of OQL Group-By
Collection Defined
by FROM, WHERE
Group by values
of function(s)
Collection with
function values and
partition
Terms from
SELECT clause
Output collection
Example

Find the average price of beer at each
bar.
SELECT barName, avgPrice: AVG(
SELECT p.s.price
FROM partition p
)
FROM Sells s
GROUP BY barName: s.bar.name
Example (II)

1. Initial collection = Sells.
» But technically, it is a bag of structs of the
form
Struct(s: s1)
Where s1 is a Sells object. Note, the lone
field is named s; in general, there are fields
for all of the tuple variables in the FROM
clause.
Example (II)

2. Intermediate collection:
» One function: s.bar.name maps Sells
objects s to the value of the name of the
bar referred to by s.
» Collection is a set of structs of type:
Struct{barName: string, partition: Set<
Struct{s: Sell} >
}
Example (III)
» For example:
Struct(barName = "Joe's Bar", partition = {s1,…, sn})
where s1,…, sn are all the structs with one field,
named s, whose value is one of the Sells objects
that represent Joe's Bar selling some beer.
Example (IV)

3. Output collection: consists of beeraverage price pairs, one for each struct
in the intermediate collection.
» Type of structures in the output:
Struct{barName: string, avgPrice: real}
Example (V)
» Note that in the subquery of the SELECT clause:
SELECT barName, avgPrice: AVG(
SELECT p.s.price
FROM partition p
)
We let p range over all structs in partition. Each of these
structs contains a single field named s and has a Sells
object as its value. Thus, p.s.price extracts the price from
one of the Sells tuples.
» Typical output struct:
Struct(barName = "Joe's Bar", avgPrice = 2.83)
Another, Less Typical Example
Find, for each beer, the number of bars
that charge a "low" price ( 2.00) and a
"high" price ( 4.00) for that beer.
 Strategy: group by three things:

» 1. The beer name,
» 2. A boolean function that is true iff the
price is low.
» 3. A boolean function that is true iff the
price is high.
The Query

SELECT beerName, low, high, count:
COUNT(partition)
FROM Beers b, b.soldBy s
GROUP BY beerName: b.name, low:
s.price <= 2.00, high: s.price >= 4.00
The Query (II)

1. Initial collection: Pairs (b; s), where b
is a beer, and s is a Sells object
representing the sale of that beer at
some bar.
» Type of collection members: Struct{b: Beer,
s: Sell}
2. Intermediate collection
Quadruples consisting of a beer name,
booleans telling whether this group is
for high, low, or neither prices for that
beer, and the partition for that group.
 The partition is a set of structs of the
type:

Struct{b: Beer, s: Sell}
A typical value:
Struct(b: "Bud" object, s: a Sells object
involving Bud)
2. Intermediate collection (II)
» Type of quadruples in the intermediate
collection:
Struct{
beerName: string,
low: boolean,
high: boolean,
partition: Set<Struct{
b: Beer,
s: Sell
}>
}
2. Intermediate collection (III)
» BeerName low
» Bud
Bud
Bud
high
partition
TRUE FALSE Slow
FALSE TRUE
Shigh
FALSE FALSE Smid
» where Slow Shigh, and Smid are the sets of beer-sells pairs (b;
s) where the beer is Bud and s has, respectively, a low (
2:00), high ( 4:00) and medium (between 2.00 and 4.00)
price.
» Note the partition with low = high = TRUE must be empty
and will not appear.
3. Output collection:

The first three components of each
group's struct are copied to the output,
and the last (partition) is counted.
The result:
beerName low
high
count
Bud
Bud
Bud
TRUE FALSE
FALSE TRUE
FALSE FALSE
27
14
36
SQL3 Objects
Objects in SQL3
• OQL extends C++ with database concepts, while SQL3 extends
SQL with OO concepts.
Objects in SQL3 (II)
• Ullman's personal opinion: the relation is so fundamental to data
manipulation that retaining it as the core, as SQL3 does, is
"right."
• Systems using the SQL3 philosophy are called object-relational.
Objects in SQL3 (III)
–
All the major relational vendors have something of this kind, allowing
any class to become the type of a column.
•
•
•
•
Informix
Data Blades
Oracle Cartridges
Sybase Plug-Ins
IBM/DB2
Extenders
Two Levels of SQL3 Objects
• 1. For tuples of relations = "row types."
• 2. For columns of relations = "types."
–
But row types can also be used as column types.
References
• Row types can have references.
• If T is a row type, then REF(T) is the type of a reference to a T
object.
• Unlike OO systems, refs are values that can be seen by queries.
Example of Row Types
• CREATE ROW TYPE BarType (
name CHAR(20) UNIQUE,
addr CHAR(20)
);
• CREATE ROW TYPE BeerType (
name CHAR(20) UNIQUE,
manf CHAR(20)
);
Example of Row Types (II)
• CREATE ROW TYPE MenuType (
bar REF(BarType),
beer REF(BeerType),
price FLOAT
);
Creating Tables
• Row-type declarations do not create tables.
–
They are used in place of element lists in CREATE TABLE
statements.
• Example
–
–
–
CREATE TABLE Bars OF TYPE BarType
CREATE TABLE Beers OF TYPE BeerType
CREATE TABLE Sells OF TYPE MenuType
Dereferencing
• A  B = the B attribute of the object referred to by reference A.
• Example
–
Find the beers served by Joe.
SELECT beer -> name
FROM Sells
WHERE bar -> name = 'Joe''s Bar';
OID's as Values
• A row type can have a reference to itself.
–
Serves as the OID for tuples of that type.
• Example
CREATE ROW TYPE BarType (
name CHAR(20),
addr CHAR(20),
barID REF(BarType)
);
CREATE TABLE Bars OF TYPE BarType
VALUES FOR barID ARE SYSTEM GENERATED
OID's as Values (II)
• VALUES... clause forces the barID of each tuple to refer to the
tuple itself.
Name
addr
Joe's Maple St.
barID
Example: Using References as Values
• Find the menu at Joe's.
SELECT Sells.beer->name, Sells.price
FROM Bars, Sells
WHERE Bars.name = 'Joe''s Bar' AND
Bars.barID = Sells.bar;
ADT's in SQL3
• Allows a column of a relation to have a type that is a "class,"
including methods.
• Intended application: data that doesn't fit relational model well,
e.g., locations, signals, images, etc.
• The type itself is usually a multi-attribute tuple.
ADT's in SQL3 (II)
• Type declaration:
CREATE TYPE <name> (
attributes
method declarations or definitions
);
• Methods defined in a PL/SQL-like language.
Example
CREATE TYPE BeerADT ( name CHAR(20), manf CHAR(20),
FUNCTION newBeer( :n CHAR(20), :m CHAR(20))
RETURNS BeerADT;
:b BeerADT; /* local decl. */
BEGIN
:b := BeerADT(); /* built-in constructor */
:b.name := :n;
:b.manf := :m;
RETURN :b;
END;
FUNCTION getMinPrice(:b BeerADT) RETURNS FLOAT; );
Example (II)
• getMinPrice is declaration only; newBeer is definition.
• getMinPrice must be defined somewhere where relation Sells is
available.
Example (III)
• FUNCTION getMinPrice(:b BeerADT)
RETURNS FLOAT;
:p FLOAT;
BEGIN
SELECT MIN(price) INTO :p
FROM Sells
WHERE beer->name = :b.name;
RETURN :p;
END;
Built-In Comparison Functions
• We can define for each ADT two functions EQUAL and
LESSTHAN that allow values of this ADT to participate in
WHERE clauses involving =, <=, etc.
Example: A "Point" ADT
•
CREATE TYPE Point ( x FLOAT, y FLOAT,
FUNCTION EQUALS( :p Point, :q Point )
RETURNS BOOLEAN;
BEGIN
IF :p.x = :q.x AND :p.y = :q.y THEN
RETURN TRUE
ELSE
RETURN FALSE;
END;
Example: A "Point" ADT (II)
•
FUNCTION LESSTHAN( :p Point, :q Point )
RETURNS BOOLEAN;
BEGIN
IF :p.x > :q.x THEN
RETURN FALSE
ELSIF :p.x < :q.x THEN
IF :p.y <= :q.y THEN
RETURN TRUE
ELSE RETURN FALSE
ELSE /* :p.x = :q.x
IF :p.y < :q.y THEN
RETURN TRUE
ELSE RETURN FALSE
END;
);
Using the Comparison Functions
• Here is a query that computes the lower convex hull of a set of
points.
• Assumes MyPoints(p) is a relation with a single column p of type
Point.
–
SELECT p
FROM MyPoints
WHERE NOT p > ANY MyPoints;
Using the Comparison Functions (II)
Descargar

Object Oriented Database Management