```Relational Algebra
Relational Query Languages
 Query = “retrieval program”
 Language examples:
 Theoretical:
1. Relational Algebra
2. Relational Calculus
a. tuple relational calculus (TRC)
b. domain relational calculus (DRC)
 Practical
1. SQL (SEQUEL from System R)
2. QUEL (Ingres)
3. Datalog (Prolog-like)
 Theoretical QL’s:
 give semantics to practical QL’s
 key to understand query optimization in relational DBMSs
Relational Algebra
 Basic operators
 select ( )
 project (p )
 union ( )
 set difference ( - )
 cartesian product ( x )
 rename ( )
 The operators take one or two relations as inputs and give a
new relation as a result.
relation
relation
relational
operator
relation
R1 sid
Example Instances
bid
101
102
103
104
bname
Interlake
Interlake
Clipper
Marine
color
blue
red
green
red
Boats
Schema:
Boats(bid, bname, color)
Sailors(sid, sname, rating, age)
Reserves( sid, bid, day)
22
58
sid
bid
day
101
103
10/10/96
11/12/96
sname
rating
age
dustin
7
45.0
31
58
lubber
rusty
8
10
55.5
35.0
sid
S2 28
31
44
58
sname
yuppy
lubber
guppy
rusty
rating
9
8
5
10
age
35.0
55.5
35.0
35.0
S1 22
Projection
 Examples:
p age ( S 2 )
;
p
sname , rating
( S 2)
 Retains only attributes that are in the “projection list”.
 Schema of result:
 exactly the columns in the projection list, with the same names that they
had in the input relation.
 Projection operator has to eliminate duplicates
(How do they
arise? Why remove them?)
 Note: real systems typically don’t do duplicate elimination unless the
user explicitly asks for it. (Why not?)
Projection
sid
28
31
44
58
sname
yuppy
lubber
guppy
rusty
S2
rating
9
8
5
10
age
35.0
55.5
35.0
35.0
sname
rating
yuppy
9
lubber
8
guppy
rusty
5
10
p sname , rating
age
35.0
55.5
p age ( S 2 )
( S 2)
Selection ()
 Selects rows that satisfy selection condition.
 Result is a relation.
Schema of result is same as that of the input relation.
 Do we need to do duplicate elimination?
sid
28
31
44
58
sname
yuppy
lubber
guppy
rusty

rating
9
8
5
10
rating  8
(S 2)
age
35.0
55.5
35.0
35.0
p
sname
rating
yuppy
9
rusty
10
snam e, rating
(
rating  8
( S 2 ))
Selection
 Notation:
 p(r)
 p is called the selection predicate , r can be the name of a table,
or another query
 Predicate:
1. Simple
 attr1 = attr2
 Attr = constant value
 (also, <, > , etc)
2. Complex
 predicate1 AND predicate2
 predicate1 OR predicate2
 NOT (predicate)
Union and Set-Difference
 All of these operations take two input relations, which
must be union-compatible:
 Same number of columns (attributes).
 `Corresponding’ columns have the same type.
 For which, if any, is duplicate elimination required?
Union
sid
sname
rating
age
22
dustin
7
45.0
31
58
lubber
rusty
8
10
55.5
35.0
rating
9
8
5
10
age
35.0
55.5
35.0
35.0
S1
sid
28
31
44
58
sname
yuppy
lubber
guppy
rusty
S2
sid
sname
rating
age
22
31
58
44
28
dustin
lubber
rusty
guppy
yuppy
7
8
10
5
9
45.0
55.5
35.0
35.0
35.0
S1  S 2
Set Difference
sid
sname
rating
age
sid
sname
rating
age
22
dustin
7
45.0
22
dustin
7
45.0
31
58
lubber
rusty
8
10
55.5
35.0
rating
9
8
5
10
age
35.0
55.5
35.0
35.0
S1  S 2
S1
sid
28
31
44
58
sname
yuppy
lubber
guppy
rusty
S2
sid
28
44
sname
yuppy
guppy
rating
9
5
S2 – S1
age
35.0
35.0
Cartesian-Product
 S1  R1: Each row of S1 paired with each row of R1.
Like the c.p for mathematical relations: every tuple of S1 “appended” to every
tuple of R1
 Q: How many rows in the result?
 Result schema has one field per field of S1 and R1, with field
names `inherited’ if possible.
 May have a naming conflict: Both S1 and R1 have a field with the
same name.
 In this case, can use the renaming operator…
Cartesian Product Example
sid
sname
rating
age
22
dustin
7
45.0
31
lubber
8
55.5
58
rusty
10
35.0
sid
bid
day
22
58
101
103
10/10/96
11/12/96
R1
S1
(sid) sname rating age
R1 X S1 =
(sid) bid
day
22
dustin
7
45.0
22
101
10/10/96
22
dustin
7
45.0
58
103
11/12/96
31
lubber
8
55.5
22
101
10/10/96
31
lubber
8
55.5
58
103
11/12/96
58
rusty
10
35.0
22
101
10/10/96
58
rusty
10
35.0
58
103
11/12/96
Rename ( )
 Allows us to refer to a relation by more than one name
and to rename conflicting names
Example:
 x (E)
returns the expression E under the name X
 If a relational-algebra expression E has arity n, then
x (A1, A2, …, An) (E)
returns the result of expression E under the name X, and with the
attributes renamed to A1, A2, …., An.
Ex.
temp1(sid1,sname,rating, age, sid2, bid, day) (R1 x S1)
Compound Operator: Intersection
 In addition to the 6 basic operators, there are
several additional “Compound Operators”
 These add no computational power to the language, but are
useful shorthands.
 Can be expressed solely with the basic ops.
 Intersection takes two input relations, which must be
union-compatible.
 Q: How to express it using basic operators?
R  S = R  (R  S)
Intersection
sid
sname
rating
age
22
dustin
7
45.0
31
58
lubber
rusty
8
10
55.5
35.0
S1
sid
28
31
44
58
sname
yuppy
lubber
guppy
rusty
S2
rating
9
8
5
10
age
35.0
55.5
35.0
35.0
sid
sname
rating
age
31
58
lubber
rusty
8
10
55.5
35.0
S1  S 2
Compound Operator: Join
 Joins are compound operators involving cross product,
selection, and (sometimes) projection.
 Most common type of join is a “natural join” (often
just called “join”). R
S conceptually is:
 Compute R  S
 Select rows where attributes that appear in both relations have equal values
 Project all unique atttributes and one copy of each of the common ones.
 Note: Usually done much more efficiently than this.
 Useful for putting “normalized” relations back
together.
Natural Join Example
sid
bid
day
sid
sname
rating
age
22
58
101
103
10/10/96
11/12/96
22
dustin
7
45.0
31
58
lubber
rusty
8
10
55.5
35.0
R1
R1
S1
S1 =
sid
sname
rating
age
bid
day
22
58
dustin
rusty
7
10
45.0
35.0
101
103
10/10/96
11/12/96
Other Types of Joins
 Condition Join (or “theta-join”):
R  c S   c ( R  S )
(sid)
sname
rating
age
(sid)
bid
day
22
31
dustin
lubber
7
8
45.0
55.5
58
58
103
103
11/12/96
11/12/96
S 1
S 1 .sid  R1 .sid
R1
 Result schema same as that of cross-product.
 May have fewer tuples than cross-product.
 Equi-join: special case: condition c contains only conjunction of
equalities.
Compound Operator: Division
 Useful for expressing “for all” queries like:
Find sids of sailors who have reserved all boats.
 For A/B attributes of B are subset of attrs of A.
 May need to “project” to make this happen.
 E.g., let A have 2 fields, x and y; B have only field y:
A B 
x
 y
 B ( x, y
 A)

A/B contains all tuples (x) such that for every y tuple in B, there is an
xy tuple in A.
Examples of Division A/B
sno
pno
pno
s1
s1
s1
s1
s2
s2
s3
p1
p2
p3
p4
p1
p2
p2
p2
s4
p2
s4
p4
A
B1
pno
p2
p4
B2
sno
s1
s2
s3
s4
A/B1
pno
p1
p2
p4
B3
sno
s1
s4
sno
A/B2
A/B3
s1
Expressing A/B Using Basic Operators
 Division is not essential op; just a useful shorthand.
 (Also true of joins, but joins are so common that systems implement joins
specially.)
 Idea: For A/B, compute all x values that are not `disqualified’
by some y value in B.
 x value is disqualified if by attaching y value from B, we obtain an xy tuple
that is not in A.
Disqualified x values = p x (( p x ( A) B ) A)
A/B = p x ( A )  Disqualified x values
Banking Example
branch (branch-name, branch-city, assets)
customer (customer-name, customer-street, customer-only)
account (account-number, branch-name, balance)
loan (loan-number, branch-name, amount)
depositor (customer-name, account-number)
borrower (customer-name, loan-number)
Example Queries
loan (loan-number, branch-name, amount)
 Find all loans of over \$1200
amount >1200 (loan)
 Find the loan number for each loan of an amount greater than
\$1200
ploan-number (amount > 1200 (loan))
Example Queries
depositor (customer-name, account-number)
borrower (customer-name, loan-number)
 Find the names of all customers who have a loan, an
depositor account, or both, from the bank
pcustomer-name (borrower)  pcustomer-name (depositor)
 Find the names of all customers who have a loan and
a depositor account at bank.
pcustomer-name (borrower)  pcustomer-name (depositor)
Example Queries
loan (loan-number, branch-name, amount)
depositor (customer-name, account-number)
borrower (customer-name, loan-number)
 Find the names of all customers who have a loan at the Perryridge
branch but do not have an depositor account at any branch of the bank.
pcustomer-name (branch-name = “Perryridge” (borrower
– pcustomer-name (depositor)
loan))
Example Queries
account (account-number, branch-name, balance)
Find the largest account balance
 Rename account relation as d
 The query is:
p balance(account) -
paccount.balance( account.balance < d.balance (account x d (account)))
Example Queries
account (account-number, branch-name, balance)
depositor (customer-name, account-number)
 Find all customers who have an account from at least
the “Downtown” and the Uptown” branches.
 Query 1
pCN(BN=“Downtown”(depositor
account)) 
pCN(BN=“Uptown”(depositor
account))
where CN denotes customer-name and BN denotes
branch-name.
 Query 2
customer-name, branch-name (depositor
account)
 temp(branch-name) ({(“Downtown”), (“Uptown”)})
Example Queries
 Find all customers who have an account at all
branches located in Boston.
pcustomer-name, branch-name (depositor account)
 pbranch-name (branch-city = “Boston” (branch))
Extended Relational Operations
 Additional Operators that extend the power of the
language
 Based on SQL… make the language less clean
 Generalized projections
 Outer Joins
 Update
General Projection
Notation:
p e1, e2, …, en (Relation)
ei: can include any arithmetic operation – not only attributes
Example:
credit =
cname
limit
balance
Jones
Turner
5000 2000
3000 2500
Smith
4000 3000
Then:
p cname, limit – balance =
cname
limit -balance
Jones
Turner
3000
500
Smith
1000
Outer Joins
Motivation:
loan
lno
amt
Jones
L-170
Turner L-230
Downtown L-170
Redwood L-230
3000
4000
Smith
Perry
1700
bname
loan
cname
borrower =
lno
L-260
bname
lno
Downtown L-170
Redwood L-230
Join result loses:
•any record of Perry
•any record of Smith
L-155
borrower
amt
cname
3000
4000
Jones
Turner
Outer Join (
 Left outer Join (
)
)
preserves all tuples in left relation
loan
borrower =
 Right outer Join (
bname
lno
amt
cname
Downtown L-170
Redwood L-230
3000
4000
Jones
Turner
Perry
1700
NULL
L-260
bname
)
lno
Downtown L-170
preserves all tuples in right relation Redwood L-230
NULL
L-155
amt
cname
3000
4000
Jones
Turner
NULL
Smith
Outer Join (cont)
 Full Outer Join
(
)
 preserves all tuples in both relations
bname
lno
amt
cname
Downtown L-170
Redwood
L-230
3000
4000
Jones
Turner
Perry
L-260
1700
NULL
NULL
L-155
NULL
Smith
Update ()
1) Deletion: r  r – s
 account  account –  bname = Perry (account)
2) Insertion: r  r  s
 branch  branch  {( BU, Boston, 9M)}
3) Update: r 
 account 
pe1, e2, …, en (r)
p bname, acct_no, bal * 1.05 (account)
```