Part II
Instruction-Set Architecture
Jan. 2007
Computer Architecture, Instruction-Set Architecture
Slide 1
About This Presentation
This presentation is intended to support the use of the textbook
Computer Architecture: From Microprocessors to Supercomputers,
Oxford University Press, 2005, ISBN 0-19-515455-X. It is updated
regularly by the author as part of his teaching of the upper-division
course ECE 154, Introduction to Computer Architecture, at the
University of California, Santa Barbara. Instructors can use these
slides freely in classroom teaching and for other educational
purposes. Any other use is strictly prohibited. © Behrooz Parhami
Edition    Released     Revised      Revised      Revised      Revised
First      June 2003    July 2004    June 2005    Mar. 2006    Jan. 2007
A Few Words About Where We Are Headed
Performance = 1 / Execution time
   simplified to 1 / CPU execution time
CPU execution time = Instructions × CPI / (Clock rate)
Performance = Clock rate / (Instructions × CPI)
• Define an instruction set; make it simple enough to require a small number of cycles and allow high clock rate, but not so simple that we need many instructions, even for very simple tasks (Chap 5-8)
• Design hardware for CPI = 1; seek improvements with CPI > 1 (Chap 13-14)
• Design ALU for arithmetic & logic ops (Chap 9-12)
• Try to achieve CPI = 1 with clock that is as high as that for CPI > 1 designs; is CPI < 1 feasible? (Chap 15-16)
• Design memory & I/O structures to support ultrahigh-speed CPUs
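The performance relations on this slide can be checked numerically. The following is a minimal Python sketch; the function names and the sample workload (2 million instructions, CPI of 1.5, 1 GHz clock) are illustrative assumptions, not values from the textbook.

```python
def cpu_time(instructions, cpi, clock_rate_hz):
    """CPU execution time = Instructions x CPI / Clock rate (seconds)."""
    return instructions * cpi / clock_rate_hz


def performance(instructions, cpi, clock_rate_hz):
    """Performance = Clock rate / (Instructions x CPI), i.e., 1 / CPU time."""
    return clock_rate_hz / (instructions * cpi)


# Hypothetical workload: 2 million instructions, CPI = 1.5, 1 GHz clock.
t = cpu_time(2_000_000, 1.5, 1_000_000_000)
p = performance(2_000_000, 1.5, 1_000_000_000)
print(t, p)
```

Halving CPI or doubling the clock rate halves the CPU time in this model, which is why the later chapters pursue both.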
II Instruction Set Architecture
Introduce the machine's “words” and its “vocabulary,” learning:
• A simple, yet realistic and useful instruction set
• Machine language programs; how they are executed
• RISC vs CISC instruction-set design philosophy
Topics in This Part
Chapter 5 Instructions and Addressing
Chapter 6 Procedures and Data
Chapter 7 Assembly Language Programs
Chapter 8 Instruction Set Variations
5 Instructions and Addressing
First of two chapters on the instruction set of MiniMIPS:
• Required for hardware concepts in later chapters
• Not aiming for proficiency in assembler programming
Topics in This Chapter
5.1 Abstract View of Hardware
5.2 Instruction Formats
5.3 Simple Arithmetic / Logic Instructions
5.4 Load and Store Instructions
5.5 Jump and Branch Instructions
5.6 Addressing Modes
5.1 Abstract View of Hardware
[Figure 5.1: Memory and processing subsystems for MiniMIPS.
 Memory: up to 2^30 words, 4 B/location, at locations 0, 4, 8, ..., m - 8, m - 4, where m ≤ 2^32.
 EIU (execution & integer unit, main processor): registers $0-$31, ALU, integer mul/div unit with Hi and Lo registers.
 FPU (floating-point unit, coprocessor 1): registers $0-$31, FP arithmetic.
 TMU (trap & memory unit, coprocessor 0): BadVaddr, Status, Cause, EPC.
 Labels in the figure point to Chapters 10, 11, and 12.]
Data Types
Byte = 8 bits
Halfword = 2 bytes
Word = 4 bytes
Doubleword = 8 bytes (used only for floating-point data, so safe to ignore in this course)
Quadword (16 bytes) also used occasionally
MiniMIPS registers hold 32-bit (4-byte) words. Other common
data sizes include byte, halfword, and doubleword.
Register Conventions

Number     Name       Use
$0         $zero      Constant 0
$1         $at        Reserved for assembler use
$2-$3      $v0-$v1    Procedure results
$4-$7      $a0-$a3    Procedure arguments
$8-$15     $t0-$t7    Temporary values
$16-$23    $s0-$s7    Operands (saved across procedure calls)
$24-$25    $t8-$t9    More temporaries
$26-$27    $k0-$k1    Reserved for OS (kernel)
$28        $gp        Global pointer (saved)
$29        $sp        Stack pointer (saved)
$30        $fp        Frame pointer (saved)
$31        $ra        Return address (saved)

Byte numbering within a word: 3 2 1 0
A 4-byte word sits in consecutive memory addresses according to the big-endian order (most significant byte has the lowest address).
When loading a byte into a register, it goes in the low end.
A doubleword sits in consecutive registers or memory locations according to the big-endian order (most significant word comes first).

Figure 5.2 Registers and data sizes in MiniMIPS.
Registers Used in This Chapter
[Figure 5.2 (partial): registers $4-$29 ($a0-$a3, $t0-$t9, $s0-$s7, $k0-$k1, $gp, $sp) with the same usage conventions as above.]
10 temporary registers ($t0-$t9)
8 operand registers ($s0-$s7)
Analogy for register usage conventions: change, wallet, keys
5.2 Instruction Formats
High-level language statement:
   a = b + c
Assembly language instruction:
   add $t8,$s2,$s1
Machine language instruction:
   000000 10010 10001 11000 00000 100000
   Fields, left to right: ALU-type instruction, register 18, register 17, register 24, unused, addition opcode
[Figure 5.3 stages: instruction fetch (PC, instruction cache), register readout (register file: $17, $18), operation (ALU), data read/store (data cache, not used), register writeback (register file: $24).]
Figure 5.3 A typical instruction for MiniMIPS and steps in its execution.
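The translation from the assembly instruction above to its 32-bit machine word can be sketched in Python. The helper `encode_r` and the small `REG` table are hypothetical names introduced here for illustration; the field widths and register numbers are those shown on the slides.

```python
# Register numbers taken from Figure 5.2; only the three used here are listed.
REG = {"$s1": 17, "$s2": 18, "$t8": 24}


def encode_r(rs, rt, rd, sh, fn, op=0):
    """Pack R-format fields (6, 5, 5, 5, 5, 6 bits) into one 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (sh << 6) | fn


# add $t8,$s2,$s1: rs = $s2, rt = $s1, rd = $t8, fn = 32 (add)
word = encode_r(rs=REG["$s2"], rt=REG["$s1"], rd=REG["$t8"], sh=0, fn=32)
bits = format(word, "032b")
print(bits[:6], bits[6:11], bits[11:16], bits[16:21], bits[21:26], bits[26:])
```

Note that the destination register comes first in the assembly text but sits in the rd field, after both source fields, in the machine word.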
Add, Subtract, and Specification of Constants
MiniMIPS add & subtract instructions; e.g., compute:
g = (b + c) - (e + f)
   add  $t8,$s2,$s3    # put the sum b + c in $t8
   add  $t9,$s5,$s6    # put the sum e + f in $t9
   sub  $s7,$t8,$t9    # set g to ($t8) - ($t9)
Decimal and hex constants
   Decimal        25, 123456, 2873
   Hexadecimal    0x59, 0x12b4c6, 0xffff0000
Machine instruction typically contains
• an opcode
• one or more source operands
• possibly a destination operand
MiniMIPS Instruction Formats
R format (register)
   Field   Bits     Width     Meaning
   op      31-26    6 bits    Opcode
   rs      25-21    5 bits    Source register 1
   rt      20-16    5 bits    Source register 2
   rd      15-11    5 bits    Destination register
   sh      10-6     5 bits    Shift amount
   fn      5-0      6 bits    Opcode extension

I format (immediate)
   Field            Bits     Width     Meaning
   op               31-26    6 bits    Opcode
   rs               25-21    5 bits    Source or base
   rt               20-16    5 bits    Destination or data
   operand/offset   15-0     16 bits   Immediate operand or address offset

J format (jump)
   Field                 Bits     Width     Meaning
   op                    31-26    6 bits    Opcode
   jump target address   25-0     26 bits   Memory word address (byte address divided by 4)

Figure 5.4 MiniMIPS instructions come in only three formats:
register (R), immediate (I), and jump (J).
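Splitting a machine word back into the fields of Figure 5.4 is the inverse operation. The Python sketch below uses a hypothetical `decode` helper; it assumes that op = 0 marks the R format and op = 2 or 3 (j, jal) the J format, which holds for the instructions in these slides but is not a general rule stated in them.

```python
def decode(word):
    """Split a 32-bit MiniMIPS word into the fields of its format."""
    op = (word >> 26) & 0x3F
    if op == 0:  # assumption: op 0 means R format
        return {"format": "R", "op": op,
                "rs": (word >> 21) & 0x1F, "rt": (word >> 16) & 0x1F,
                "rd": (word >> 11) & 0x1F, "sh": (word >> 6) & 0x1F,
                "fn": word & 0x3F}
    if op in (2, 3):  # assumption: j and jal are the only J-format opcodes
        return {"format": "J", "op": op, "target": word & 0x03FFFFFF}
    return {"format": "I", "op": op,
            "rs": (word >> 21) & 0x1F, "rt": (word >> 16) & 0x1F,
            "imm": word & 0xFFFF}


print(decode(0x0251C020))  # add  $t8,$s2,$s1
print(decode(0x2208003D))  # addi $t0,$s0,61
```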
5.3 Simple Arithmetic/Logic Instructions
Add and subtract already discussed; logical instructions are similar
   add  $t0,$s0,$s1    # set $t0 to ($s0)+($s1)
   sub  $t0,$s0,$s1    # set $t0 to ($s0)-($s1)
   and  $t0,$s0,$s1    # set $t0 to ($s0)∧($s1)
   or   $t0,$s0,$s1    # set $t0 to ($s0)∨($s1)
   xor  $t0,$s0,$s1    # set $t0 to ($s0)⊕($s1)
   nor  $t0,$s0,$s1    # set $t0 to (($s0)∨($s1))′

R format:
   Field   Bits     Value     Meaning
   op      31-26    000000    ALU instruction
   rs      25-21    10000     Source register 1
   rt      20-16    10001     Source register 2
   rd      15-11    01000     Destination register
   sh      10-6     00000     Unused
   fn      5-0      1000x0    add = 32, sub = 34

Figure 5.5 The arithmetic instructions add and sub have a format that
is common to all two-operand ALU instructions. For these, the fn field
specifies the arithmetic/logic operation to be performed.
Arithmetic/Logic with One Immediate Operand
An operand in the range [-32 768, 32 767], or [0x0000, 0xffff],
can be specified in the immediate field.
   addi  $t0,$s0,61        # set $t0 to ($s0)+61
   andi  $t0,$s0,61        # set $t0 to ($s0)∧61
   ori   $t0,$s0,61        # set $t0 to ($s0)∨61
   xori  $t0,$s0,0x00ff    # set $t0 to ($s0)⊕0x00ff
For arithmetic instructions, the immediate operand is sign-extended

I format:
   Field            Bits     Value               Meaning
   op               31-26    001000              addi = 8
   rs               25-21    10000               Source
   rt               20-16    01000               Destination
   operand/offset   15-0     0000000000111101    Immediate operand

Figure 5.6 Instructions such as addi allow us to perform an
arithmetic or logic operation for which one operand is a small constant.
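The two ranges quoted for the immediate field correspond to two ways of widening 16 bits to 32. The Python sketch below uses hypothetical helper names; sign extension for addi is stated on the slide, while zero extension for the logical immediates is an assumption based on standard MIPS behavior and the [0x0000, 0xffff] range.

```python
def sign_extend16(imm):
    """Interpret a 16-bit field as a signed value in [-32768, 32767]."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


def zero_extend16(imm):
    """Interpret a 16-bit field as an unsigned value in [0x0000, 0xffff]."""
    return imm & 0xFFFF


def addi(rs_value, imm):
    """32-bit result of addi: the immediate is sign-extended (overflow ignored)."""
    return (rs_value + sign_extend16(imm)) & 0xFFFFFFFF


def andi(rs_value, imm):
    """32-bit result of andi: the immediate is zero-extended (assumed)."""
    return rs_value & zero_extend16(imm)


print(addi(100, 0xFFFF))             # the field 0xffff acts as -1
print(hex(andi(0x12345678, 0x00FF))) # keeps only the low byte
```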
5.4 Load and Store Instructions
   lw  $t0,40($s3)
   lw  $t0,A($s3)

I format:
   Field            Bits     Value               Meaning
   op               31-26    10x011              lw = 35, sw = 43
   rs               25-21    10011               Base register
   rt               20-16    01000               Data register
   operand/offset   15-0     0000000000101000    Offset relative to base

[Memory diagram: array elements A[0], A[1], A[2], ..., A[i]; the base register holds the address of A[0], and offset = 4i selects element i of array A.]

Note on base and offset:
The memory address is the sum of (rs) and an immediate value. Calling one of these the base and the other the offset is quite arbitrary. It would make perfect sense to interpret the address A($s3) as having the base A and the offset ($s3). However, a 16-bit base confines us to a small portion of memory space.

Figure 5.7 MiniMIPS lw and sw instructions and their memory
addressing convention that allows for simple access to array elements
via a base address and an offset (offset = 4i leads us to the i th word).
lw, sw, and lui Instructions
   lw   $t0,40($s3)    # load mem[40+($s3)] in $t0
   sw   $t0,A($s3)     # store ($t0) in mem[A+($s3)]
                       # "($s3)" means "content of $s3"
   lui  $s0,61         # The immediate value 61 is
                       # loaded in upper half of $s0
                       # with lower 16b set to 0s

I format:
   Field            Bits     Value               Meaning
   op               31-26    001111              lui = 15
   rs               25-21    00000               Unused
   rt               20-16    10000               Destination
   operand/offset   15-0     0000000000111101    Immediate operand

Content of $s0 after the instruction is executed:
   0000 0000 0011 1101 0000 0000 0000 0000

Figure 5.8 The lui instruction allows us to load an arbitrary 16-bit
value into the upper half of a register while setting its lower half to 0s.
Initializing a Register
Example 5.2
Show how each of these bit patterns can be loaded into $s0:
0010 0001 0001 0000 0000 0000 0011 1101
1111 1111 1111 1111 1111 1111 1111 1111
Solution
The first bit pattern has the hex representation: 0x2110003d
   lui  $s0,0x2110         # put the upper half in $s0
   ori  $s0,$s0,0x003d     # put the lower half in $s0
The same can be done, with both immediate values changed to 0xffff,
for the second bit pattern. But the following is simpler and faster:
   nor  $s0,$zero,$zero    # because (0 ∨ 0)′ = 1
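The effect of the lui/ori pair in Example 5.2 can be traced with a small Python model. The functions below are hypothetical stand-ins for the three instructions, operating on 32-bit values.

```python
MASK32 = 0xFFFFFFFF


def lui(imm):
    """Place a 16-bit immediate in the upper half; lower half becomes 0s."""
    return (imm & 0xFFFF) << 16


def ori(rs_value, imm):
    """OR a register value with a zero-extended 16-bit immediate."""
    return rs_value | (imm & 0xFFFF)


def nor(a, b):
    """Bitwise NOR of two 32-bit values."""
    return ~(a | b) & MASK32


s0 = ori(lui(0x2110), 0x003D)  # first bit pattern of Example 5.2
all_ones = nor(0, 0)           # second bit pattern, using $zero twice
print(hex(s0), hex(all_ones))
```

The nor form needs one instruction instead of two, which is why the slide calls it simpler and faster.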
5.5 Jump and Branch Instructions
Unconditional jump and jump through register instructions
   j   verify    # go to mem loc named "verify"
   jr  $ra       # go to address that is in $ra;
                 # $ra may hold a return address
$ra is the symbolic name for reg. $31 (return address)

J format:
   Field                 Bits     Value                         Meaning
   op                    31-26    000010                        j = 2
   jump target address   25-0     00000000000000000000111101

Effective target address (32 bits):
   x x x x (from PC), followed by the 26-bit jump target address, followed by 0 0

R format:
   Field   Bits     Value     Meaning
   op      31-26    000000    ALU instruction
   rs      25-21    11111     Source register
   rt      20-16    00000     Unused
   rd      15-11    00000     Unused
   sh      10-6     00000     Unused
   fn      5-0      001000    jr = 8

Figure 5.9 The jump instruction j of MiniMIPS is a J-type instruction which
is shown along with how its effective target address is obtained. The jump
register (jr) instruction is R-type, with its specified register often being $ra.
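The way Figure 5.9 forms the 32-bit effective target address can be written out in Python. The function name `jump_target` and the sample PC value are assumptions made for illustration.

```python
def jump_target(pc, target26):
    """Effective address: top 4 bits of the PC, the 26-bit field, then 00."""
    return (pc & 0xF0000000) | ((target26 & 0x03FFFFFF) << 2)


# Jump field value 61 (as in the figure) with a hypothetical PC of 0x10000040.
addr = jump_target(0x10000040, 61)
print(hex(addr))
```

Appending two 0 bits multiplies the word address by 4, which matches the J-format note that the field holds the byte address divided by 4.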
Conditional Branch Instructions
Conditional branches use PC-relative addressing
   bltz  $s1,L        # branch on ($s1)< 0
   beq   $s1,$s2,L    # branch on ($s1)=($s2)
   bne   $s1,$s2,L    # branch on ($s1)≠($s2)

I format (bltz):
   Field            Bits     Value               Meaning
   op               31-26    000001              bltz = 1
   rs               25-21    10001               Source
   rt               20-16    00000               Zero
   operand/offset   15-0     0000000000111101    Relative branch distance in words

I format (beq, bne):
   Field            Bits     Value               Meaning
   op               31-26    00010x              beq = 4, bne = 5
   rs               25-21    10001               Source 1
   rt               20-16    10010               Source 2
   operand/offset   15-0     0000000000111101    Relative branch distance in words

Figure 5.10 (part 1) Conditional branch instructions of MiniMIPS.
Comparison Instructions for Conditional Branching
   slt   $s1,$s2,$s3    # if ($s2)<($s3), set $s1 to 1
                        # else set $s1 to 0;
                        # often followed by beq/bne
   slti  $s1,$s2,61     # if ($s2)<61, set $s1 to 1
                        # else set $s1 to 0

R format (slt):
   Field   Bits     Value     Meaning
   op      31-26    000000    ALU instruction
   rs      25-21    10010     Source 1 register
   rt      20-16    10011     Source 2 register
   rd      15-11    10001     Destination
   sh      10-6     00000     Unused
   fn      5-0      101010    slt = 42

I format (slti):
   Field            Bits     Value               Meaning
   op               31-26    001010              slti = 10
   rs               25-21    10010               Source
   rt               20-16    10001               Destination
   operand/offset   15-0     0000000000111101    Immediate operand

Figure 5.10 (part 2) Comparison instructions of MiniMIPS.
Examples for Conditional Branching
If the branch target is too far to be reachable with a 16-bit offset
(rare occurrence), the assembler automatically replaces the branch
instruction beq $s1,$s2,L1 with:
        bne  $s1,$s2,L2       # skip jump if (s1)≠(s2)
        j    L1               # goto L1 if (s1)=(s2)
   L2:  ...
Forming if-then constructs; e.g., if (i == j) x = x + y
        bne  $s1,$s2,endif    # branch on i≠j
        add  $t1,$t1,$t2      # execute the "then" part
   endif: ...
If the condition were (i < j), we would change the first line to:
        slt  $t0,$s1,$s2      # set $t0 to 1 if i<j
        beq  $t0,$0,endif     # branch if ($t0)=0;
                              # i.e., i not< j or i≥j
Compiling if-then-else Statements
Example 5.3
Show a sequence of MiniMIPS instructions corresponding to:
if (i<=j) {x = x+1; z = 1;} else {y = y-1; z = 2*z}
Solution
Similar to the “if-then” statement, but we need instructions for the
“else” part and a way of skipping the “else” part after the “then” part.
         slt   $t0,$s2,$s1       # j<i? (inverse condition)
         bne   $t0,$zero,else    # if j<i goto else part
         addi  $t1,$t1,1         # begin then part: x = x+1
         addi  $t3,$zero,1       # z = 1
         j     endif             # skip the else part
   else: addi  $t2,$t2,-1        # begin else part: y = y-1
         add   $t3,$t3,$t3       # z = z+z
   endif: ...
5.6 Addressing Modes
[Figure 5.11 layout: columns are Addressing mode, Instruction, Other elements involved, Operand.
 Implied: operand is in some place in the machine.
 Immediate: operand is in the instruction; extend, if required.
 Register: reg spec selects an entry in the reg file; operand is the reg data.
 Base: constant offset is added to the reg base (from the reg file) to form the mem addr; operand is the mem data.
 PC-relative: constant offset is added to the PC to form the mem addr; operand is the mem data.
 Pseudodirect: part of the PC is combined with the address field to form the mem addr; operand is the mem data.]
Figure 5.11 Schematic representation of addressing modes in MiniMIPS.
Finding the Maximum Value in a List of Integers
Example 5.5
List A is stored in memory beginning at the address given in $s1.
List length is given in $s2.
Find the largest integer in the list and copy it into $t0.
Solution
Scan the list, holding the largest element identified thus far in $t0.
         lw    $t0,0($s1)        # initialize maximum to A[0]
         addi  $t1,$zero,0       # initialize index i to 0
   loop: addi  $t1,$t1,1         # increment index i by 1
         beq   $t1,$s2,done      # if all elements examined, quit
         add   $t2,$t1,$t1       # compute 2i in $t2
         add   $t2,$t2,$t2       # compute 4i in $t2
         add   $t2,$t2,$s1       # form address of A[i] in $t2
         lw    $t3,0($t2)        # load value of A[i] into $t3
         slt   $t4,$t0,$t3       # maximum < A[i]?
         beq   $t4,$zero,loop    # if not, repeat with no change
         addi  $t0,$t3,0         # if so, A[i] is the new maximum
         j     loop              # change completed; now repeat
   done: ...                     # continuation of the program
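The control flow of Example 5.5 may be easier to follow in a high-level form. The Python sketch below mirrors the assembly loop step by step; the function name `find_max` is hypothetical, and like the assembly version it assumes a list with at least one element.

```python
def find_max(a):
    """Mirror of the Example 5.5 loop; assumes len(a) >= 1."""
    maximum = a[0]        # lw   $t0,0($s1)
    i = 0                 # addi $t1,$zero,0
    while True:
        i += 1            # loop: increment index i by 1
        if i == len(a):   # beq  $t1,$s2,done
            break
        if maximum < a[i]:    # slt $t4,$t0,$t3 followed by beq
            maximum = a[i]    # A[i] is the new maximum
    return maximum


print(find_max([3, -7, 12, 5]))
```

The assembly version spends three add instructions forming the address of A[i] (2i, 4i, base + 4i), work that the Python indexing expression hides.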
The 20 MiniMIPS Instructions Covered So Far

Table 5.1
Class              Instruction                Usage              op    fn
Copy               Load upper immediate       lui  rt,imm        15
Arithmetic         Add                        add  rd,rs,rt      0     32
                   Subtract                   sub  rd,rs,rt      0     34
                   Set less than              slt  rd,rs,rt      0     42
                   Add immediate              addi rt,rs,imm     8
                   Set less than immediate    slti rt,rs,imm     10
Logic              AND                        and  rd,rs,rt      0     36
                   OR                         or   rd,rs,rt      0     37
                   XOR                        xor  rd,rs,rt      0     38
                   NOR                        nor  rd,rs,rt      0     39
                   AND immediate              andi rt,rs,imm     12
                   OR immediate               ori  rt,rs,imm     13
                   XOR immediate              xori rt,rs,imm     14
Memory access      Load word                  lw   rt,imm(rs)    35
                   Store word                 sw   rt,imm(rs)    43
Control transfer   Jump                       j    L             2
                   Jump register              jr   rs            0     8
                   Branch less than 0         bltz rs,L          1
                   Branch equal               beq  rs,rt,L       4
                   Branch not equal           bne  rs,rt,L       5
6 Procedures and Data
Finish our study of MiniMIPS instructions and its data types:
• Instructions for procedure call/return, misc. instructions
• Procedure parameters and results, utility of stack
Topics in This Chapter
6.1 Simple Procedure Calls
6.2 Using the Stack for Data Storage
6.3 Parameters and Results
6.4 Data Types
6.5 Arrays and Pointers
6.6 Additional Instructions
6.1 Simple Procedure Calls
Using a procedure involves the following sequence of actions:
1. Put arguments in places known to procedure (reg’s $a0-$a3)
2. Transfer control to procedure, saving the return address (jal)
3. Acquire storage space, if required, for use by the procedure
4. Perform the desired task
5. Put results in places known to calling program (reg’s $v0-$v1)
6. Return control to calling point (jr)
MiniMIPS instructions for procedure call and return from procedure:
   jal  proc    # jump to loc "proc" and link;
                # "link" means "save the return
                # address" (PC)+4 in $ra ($31)
   jr   rs      # go to loc addressed by rs
Illustrating a Procedure Call
[Figure 6.1 layout: main prepares to call, executes jal proc (the PC value is saved), and later prepares to continue; proc saves registers, etc., does its work, restores, and returns with jr $ra.]
Figure 6.1 Relationship between the main program and a procedure.
Recalling Register Conventions
[Figure 5.2, repeated: Registers and data sizes in MiniMIPS.]
A Simple MiniMIPS Procedure
Example 6.1
Procedure to find the absolute value of an integer.
$v0 ← |($a0)|
Solution
The absolute value of x is -x if x < 0 and x otherwise.
   abs:  sub   $v0,$zero,$a0    # put -($a0) in $v0;
                                # in case ($a0) < 0
         bltz  $a0,done         # if ($a0)<0 then done
         add   $v0,$a0,$zero    # else put ($a0) in $v0
   done: jr    $ra              # return to calling program
In practice, we seldom use such short procedures because of the
overhead that they entail. In this example, we have 3-4
instructions of overhead for 3 instructions of useful computation.
Nested Procedure Calls
[Figure 6.2 layout: main prepares to call, executes jal abc, and later prepares to continue; procedure abc saves, executes jal xyz, restores, and returns with jr $ra; procedure xyz returns with jr $ra. Slide note: the text version of this figure is incorrect.]
Figure 6.2 Example of nested procedure calls.
6.2 Using the Stack for Data Storage
Analogy: Cafeteria stack of plates/trays
[Figure 6.4 layout: a stack holding b, a with sp pointing to b; after Push c it holds c, b, a with sp pointing to c; after Pop x it again holds b, a.]
   Push c:   sp = sp - 4
             mem[sp] = c
   Pop x:    x = mem[sp]
             sp = sp + 4
Figure 6.4 Effects of push and pop operations on a stack.
   push: addi  $sp,$sp,-4
         sw    $t4,0($sp)
   pop:  lw    $t5,0($sp)
         addi  $sp,$sp,4
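The push and pop sequences above can be modeled with a dictionary standing in for memory. This is a minimal Python sketch; the class name `Stack` and the initial stack-pointer value are assumptions chosen for illustration.

```python
class Stack:
    """Word-addressed stack that grows toward lower addresses."""

    def __init__(self, top=0x7FFFFFFC):  # hypothetical initial $sp
        self.mem = {}
        self.sp = top

    def push(self, value):
        self.sp -= 4               # addi $sp,$sp,-4
        self.mem[self.sp] = value  # sw   $t4,0($sp)

    def pop(self):
        value = self.mem[self.sp]  # lw   $t5,0($sp)
        self.sp += 4               # addi $sp,$sp,4
        return value


s = Stack()
s.push(10)
s.push(20)
print(s.pop(), s.pop(), hex(s.sp))
```

Push decrements before storing and pop loads before incrementing, so $sp always points at the top element rather than at the first free word.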
Memory Map in MiniMIPS

Hex address   Contents
00000000      Reserved (1 M words)
00400000      Program: text segment (63 M words)
10000000      Static data: start of the data segment
10008000      Location pointed to by $gp ($28); addresses 10000000 through
1000ffff      1000ffff are addressable with a 16-bit signed offset
              Dynamic data follows the static data in the data segment
              Stack: stack segment, with $sp ($29) and $fp ($30) pointing into it
7ffffffc      End of the stack segment (data and stack segments: 448 M words)
80000000      Second half of address space reserved for memory-mapped I/O

Figure 6.3 Overview of the memory address space in MiniMIPS.
6.3 Parameters and Results
Stack allows us to pass/return an arbitrary number of values
[Figure 6.5 layout. Before calling: the stack holds the frame for the current procedure (..., a, b, c), with $fp at its base and $sp at its top. After calling: that frame becomes the frame for the previous procedure; above it sits the new frame for the current procedure, holding old ($fp), saved registers, and local variables (..., y, z), with $fp at its base and $sp at its top.]
Figure 6.5 Use of the stack by a procedure.
Example of Using the Stack
Saving $fp, $ra, and $s0 onto the stack and restoring
them at the end of the procedure
[Stack after the entry sequence, from the top: ($s0), ($ra), ($fp); $sp points to ($s0) and $fp to the word just below ($fp).]
   proc: sw    $fp,-4($sp)     # save the old frame pointer
         addi  $fp,$sp,0       # save ($sp) into $fp
         addi  $sp,$sp,-12     # create 3 spaces on top of stack
         sw    $ra,-8($fp)     # save ($ra) in 2nd stack element
         sw    $s0,-12($fp)    # save ($s0) in top stack element
         .
         .
         .
         lw    $s0,-12($fp)    # put top stack element in $s0
         lw    $ra,-8($fp)     # put 2nd stack element in $ra
         addi  $sp,$fp,0       # restore $sp to original state
         lw    $fp,-4($sp)     # restore $fp to original state
         jr    $ra             # return from procedure
6.4 Data Types
Data size (number of bits), data type (meaning assigned to bits)
   Signed integer:          byte   word
   Unsigned integer:        byte   word
   Floating-point number:          word   doubleword
   Bit string:              byte   word   doubleword
Converting from one size to another
   Type       8-bit number   Value   32-bit version of the number
   Unsigned   0010 1011      43      0000 0000 0000 0000 0000 0000 0010 1011
   Unsigned   1010 1011      171     0000 0000 0000 0000 0000 0000 1010 1011
   Signed     0010 1011      +43     0000 0000 0000 0000 0000 0000 0010 1011
   Signed     1010 1011      -85     1111 1111 1111 1111 1111 1111 1010 1011
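The four rows of the conversion table follow one rule: unsigned values are padded with 0s, signed values with copies of the sign bit. A minimal Python sketch, with the hypothetical helper `extend8`, reproduces them.

```python
def extend8(byte, signed):
    """Widen an 8-bit pattern to 32 bits by zero or sign extension."""
    byte &= 0xFF
    if signed and byte & 0x80:
        return byte | 0xFFFFFF00  # replicate the sign bit into bits 31-8
    return byte


for pattern in (0b00101011, 0b10101011):
    print(format(extend8(pattern, signed=False), "032b"),
          format(extend8(pattern, signed=True), "032b"))
```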
ASCII Characters
Table 6.1 ASCII (American standard code for information interchange)

Row \ Col   0     1     2     3     4     5     6     7
0           NUL   DLE   SP    0     @     P     `     p
1           SOH   DC1   !     1     A     Q     a     q
2           STX   DC2   "     2     B     R     b     r
3           ETX   DC3   #     3     C     S     c     s
4           EOT   DC4   $     4     D     T     d     t
5           ENQ   NAK   %     5     E     U     e     u
6           ACK   SYN   &     6     F     V     f     v
7           BEL   ETB   '     7     G     W     g     w
8           BS    CAN   (     8     H     X     h     x
9           HT    EM    )     9     I     Y     i     y
a           LF    SUB   *     :     J     Z     j     z
b           VT    ESC   +     ;     K     [     k     {
c           FF    FS    ,     <     L     \     l     |
d           CR    GS    -     =     M     ]     m     }
e           SO    RS    .     >     N     ^     n     ~
f           SI    US    /     ?     O     _     o     DEL

Columns 8-9: more controls. Columns a-f: more symbols.
8-bit ASCII code: (col #, row #)hex; e.g., the code for + is (2b)hex or (0010 1011)two
Loading and Storing Bytes
Bytes can be used to store ASCII characters or small integers.
MiniMIPS addresses refer to bytes, but registers hold words.
   lb   $t0,8($s3)    # load rt with mem[8+($s3)]
                      # sign-extend to fill reg
   lbu  $t0,8($s3)    # load rt with mem[8+($s3)]
                      # zero-extend to fill reg
   sb   $t0,A($s3)    # LSB of rt to mem[A+($s3)]

I format:
   Field              Bits     Value               Meaning
   op                 31-26    10xx00              lb = 32, lbu = 36, sb = 40
   rs                 25-21    10011               Base register
   rt                 20-16    01000               Data register
   immediate/offset   15-0     0000000000001000    Address offset

Figure 6.6 Load and store instructions for byte-size data elements.
Meaning of a Word in Memory
Bit pattern (02114020)hex:
   0000 0010 0001 0001 0100 0000 0010 0000
The same 32 bits, 00000010000100010100000000100000, can be read as:
• An add instruction
• A positive integer
• A four-character string
Figure 6.7 A 32-bit word has no inherent meaning and can be
interpreted in a number of equally valid ways in the absence of
other cues (e.g., context) for the intended meaning.
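The three readings of the word in Figure 6.7 can be computed directly. This Python sketch uses only the field layout and big-endian byte order given earlier in the slides; the variable names are illustrative.

```python
word = 0x02114020

# Reading 1: R-format instruction fields (op, rs, rt, rd, sh, fn).
fields = ((word >> 26) & 0x3F, (word >> 21) & 0x1F, (word >> 16) & 0x1F,
          (word >> 11) & 0x1F, (word >> 6) & 0x1F, word & 0x3F)

# Reading 2: positive integer.
as_integer = word

# Reading 3: four 8-bit character codes, most significant byte first.
as_bytes = word.to_bytes(4, "big")

print(fields)       # op 0, rs 16, rt 17, rd 8, fn 32: add $t0,$s0,$s1
print(as_integer)
print(list(as_bytes))
```

With the ASCII table above, the four byte values 0x02, 0x11, 0x40, 0x20 stand for STX, DC1, @, and SP.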
6.5 Arrays and Pointers
Index: Use a register that holds the index i and increment the register in
each step to effect moving from element i of the list to element i + 1
Pointer: Use a register that points to (holds the address of) the list element
being examined and update it in each step to point to the next element
[Figure 6.8 layout. Indexing method: a base register points to array A and a second register holds the array index i; to step from A[i] to A[i + 1], add 1 to i, compute 4i, and add 4i to the base. Pointer updating method: a register holds a pointer to A[i]; add 4 to get the address of A[i + 1].]
Figure 6.8 Stepping through the elements of an array using the
indexing method and the pointer updating method.
Selection Sort
Example 6.4
To sort a list of numbers, repeatedly perform the following:
Find the max element, swap it with the last item, move up the “last” pointer
[Figure 6.9 layout: three views of list A between the pointers first and last. Start of iteration: unsorted portion from first to last. Maximum identified: max element x located, last element is y. End of iteration: x and y swapped, and last moved up by one element.]
Figure 6.9 One iteration of selection sort.
Selection Sort Using the Procedure max
Example 6.4 (continued)
Inputs to proc max: pointer to first element in $a0, pointer to last element in $a1
Outputs from proc max: address of max element in $v0, max value in $v1
[Figure 6.9, repeated with register labels: start of iteration, maximum identified, end of iteration.]
   sort: beq   $a0,$a1,done    # single-element list is sorted
         jal   max             # call the max procedure
         lw    $t0,0($a1)      # load last element into $t0
         sw    $t0,0($v0)      # copy the last element to max loc
         sw    $v1,0($a1)      # copy max value to last element
         addi  $a1,$a1,-4      # decrement pointer to last element
         j     sort            # repeat sort for smaller list
   done: ...                   # continue with rest of program
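The same selection sort can be sketched in Python, with list indices standing in for the pointers in $a0 and $a1. The names `max_proc` and `selection_sort` are hypothetical; `max_proc` plays the role of the procedure max, returning the location and value that the assembly version returns in $v0 and $v1.

```python
def max_proc(a, first, last):
    """Return (location, value) of the largest element in a[first..last]."""
    loc = first
    for k in range(first + 1, last + 1):
        if a[k] > a[loc]:
            loc = k
    return loc, a[loc]


def selection_sort(a):
    """Sort the list in place, in ascending order, and return it."""
    first, last = 0, len(a) - 1
    while first < last:                  # stop when one element remains
        loc, value = max_proc(a, first, last)
        a[loc] = a[last]                 # copy the last element to max loc
        a[last] = value                  # copy max value to last element
        last -= 1                        # shrink the unsorted portion
    return a


print(selection_sort([5, 2, 9, 1, 5]))
```

The loop test uses `first < last` rather than equality so that an empty list also terminates, a case the assembly version does not handle.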
6.6 Additional Instructions
MiniMIPS instructions for multiplication and division:
   mult  $s0, $s1    # set Hi,Lo to ($s0)×($s1)
   div   $s0, $s1    # set Hi to ($s0)mod($s1)
                     # and Lo to ($s0)/($s1)
   mfhi  $t0         # set $t0 to (Hi)
   mflo  $t0         # set $t0 to (Lo)
[Datapath sketch: the register file feeds the mul/div unit, which writes the Hi and Lo registers.]

R format (mult, div):
   Field   Bits     Value     Meaning
   op      31-26    000000    ALU instruction
   rs      25-21    10000     Source register 1
   rt      20-16    10001     Source register 2
   rd      15-11    00000     Unused
   sh      10-6     00000     Unused
   fn      5-0      0110x0    mult = 24, div = 26

Figure 6.10 The multiply (mult) and divide (div) instructions of MiniMIPS.

R format (mfhi, mflo):
   Field   Bits     Value     Meaning
   op      31-26    000000    ALU instruction
   rs      25-21    00000     Unused
   rt      20-16    00000     Unused
   rd      15-11    01000     Destination register
   sh      10-6     00000     Unused
   fn      5-0      0100x0    mfhi = 16, mflo = 18

Figure 6.11 MiniMIPS instructions for copying the contents of Hi and Lo
registers into general registers.
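How a product or a quotient/remainder pair is split between Hi and Lo can be illustrated in Python. The functions `mult` and `div` below are hypothetical models taking signed Python integers; the treatment of negative operands in `div` (quotient truncated toward zero, remainder taking the sign of the dividend) is an assumption based on common MIPS behavior and is not stated on the slides.

```python
MASK32 = 0xFFFFFFFF


def mult(a, b):
    """Return (Hi, Lo): upper and lower halves of the 64-bit product."""
    product = (a * b) & 0xFFFFFFFFFFFFFFFF
    return (product >> 32) & MASK32, product & MASK32


def div(a, b):
    """Return (Hi, Lo) = (remainder, quotient) as 32-bit patterns; b != 0."""
    quotient = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        quotient = -quotient          # assumed: truncate toward zero
    remainder = a - quotient * b
    return remainder & MASK32, quotient & MASK32


print(mult(0x10000, 0x10000))  # product 2^32 does not fit in Lo alone
print(div(17, 5))              # remainder in Hi, quotient in Lo
```

A following mfhi or mflo would then copy the chosen half into a general register.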
Logical Shifts
MiniMIPS instructions for left and right shifting:
   sll   $t0,$s1,2      # $t0=($s1) left-shifted by 2
   srl   $t0,$s1,2      # $t0=($s1) right-shifted by 2
   sllv  $t0,$s1,$s0    # $t0=($s1) left-shifted by ($s0)
   srlv  $t0,$s1,$s0    # $t0=($s1) right-shifted by ($s0)

R format (sll, srl):
   Field   Bits     Value     Meaning
   op      31-26    000000    ALU instruction
   rs      25-21    00000     Unused
   rt      20-16    10001     Source register
   rd      15-11    01000     Destination register
   sh      10-6     00010     Shift amount
   fn      5-0      0000x0    sll = 0, srl = 2

R format (sllv, srlv):
   Field   Bits     Value     Meaning
   op      31-26    000000    ALU instruction
   rs      25-21    10000     Amount register
   rt      20-16    10001     Source register
   rd      15-11    01000     Destination register
   sh      10-6     00000     Unused
   fn      5-0      0001x0    sllv = 4, srlv = 6

Figure 6.12 The four logical shift instructions of MiniMIPS.
Unsigned Arithmetic and Miscellaneous Instructions
MiniMIPS instructions for unsigned arithmetic (no overflow exception):
   addu   $t0,$s0,$s1    # set $t0 to ($s0)+($s1)
   subu   $t0,$s0,$s1    # set $t0 to ($s0)-($s1)
   multu  $s0,$s1        # set Hi,Lo to ($s0)×($s1)
   divu   $s0,$s1        # set Hi to ($s0)mod($s1)
                         # and Lo to ($s0)/($s1)
   addiu  $t0,$s0,61     # set $t0 to ($s0)+61;
                         # the immediate operand is
                         # sign extended
To make MiniMIPS more powerful and complete, we introduce later:
   sra    $t0,$s1,2      # sh. right arith (Sec. 10.5)
   srav   $t0,$s1,$s0    # shift right arith variable
   syscall               # system call (Sec. 7.6)
The 20 MiniMIPS Instructions from Chapter 6 (40 in all so far)

Table 6.2 (partial)
Class              Instruction                    Usage               op    fn
Copy               Move from Hi                   mfhi  rd            0     16
                   Move from Lo                   mflo  rd            0     18
Arithmetic         Add unsigned                   addu  rd,rs,rt      0     33
                   Subtract unsigned              subu  rd,rs,rt      0     35
                   Multiply                       mult  rs,rt         0     24
                   Multiply unsigned              multu rs,rt         0     25
                   Divide                         div   rs,rt         0     26
                   Divide unsigned                divu  rs,rt         0     27
                   Add immediate unsigned         addiu rt,rs,imm     9
Shift              Shift left logical             sll   rd,rt,sh      0     0
                   Shift right logical            srl   rd,rt,sh      0     2
                   Shift right arithmetic         sra   rd,rt,sh      0     3
                   Shift left logical variable    sllv  rd,rt,rs      0     4
                   Shift right logical variable   srlv  rd,rt,rs      0     6
                   Shift right arith variable     srav  rd,rt,rs      0     7
Memory access      Load byte                      lb    rt,imm(rs)    32
                   Load byte unsigned             lbu   rt,imm(rs)    36
                   Store byte                     sb    rt,imm(rs)    40
Control transfer   Jump and link                  jal   L             3
                   System call                    syscall             0     12
Table 6.2 The 37 + 3 MiniMIPS Instructions Covered So Far

Instruction                Usage               Instruction                    Usage
Load upper immediate       lui  rt,imm         Move from Hi                   mfhi  rd
Add                        add  rd,rs,rt       Move from Lo                   mflo  rd
Subtract                   sub  rd,rs,rt       Add unsigned                   addu  rd,rs,rt
Set less than              slt  rd,rs,rt       Subtract unsigned              subu  rd,rs,rt
Add immediate              addi rt,rs,imm      Multiply                       mult  rs,rt
Set less than immediate    slti rt,rs,imm      Multiply unsigned              multu rs,rt
AND                        and  rd,rs,rt       Divide                         div   rs,rt
OR                         or   rd,rs,rt       Divide unsigned                divu  rs,rt
XOR                        xor  rd,rs,rt       Add immediate unsigned         addiu rt,rs,imm
NOR                        nor  rd,rs,rt       Shift left logical             sll   rd,rt,sh
AND immediate              andi rt,rs,imm      Shift right logical            srl   rd,rt,sh
OR immediate               ori  rt,rs,imm      Shift right arithmetic         sra   rd,rt,sh
XOR immediate              xori rt,rs,imm      Shift left logical variable    sllv  rd,rt,rs
Load word                  lw   rt,imm(rs)     Shift right logical variable   srlv  rd,rt,rs
Store word                 sw   rt,imm(rs)     Shift right arith variable     srav  rd,rt,rs
Jump                       j    L              Load byte                      lb    rt,imm(rs)
Jump register              jr   rs             Load byte unsigned             lbu   rt,imm(rs)
Branch less than 0         bltz rs,L           Store byte                     sb    rt,imm(rs)
Branch equal               beq  rs,rt,L        Jump and link                  jal   L
Branch not equal           bne  rs,rt,L        System call                    syscall
7 Assembly Language Programs
Everything else needed to build and run assembly programs:
• Supply info to assembler about program and its data
• Non-hardware-supported instructions for convenience
Topics in This Chapter
7.1 Machine and Assembly Languages
7.2 Assembler Directives
7.3 Pseudoinstructions
7.4 Macroinstructions
7.5 Linking and Loading
7.6 Running Assembler Programs
Slide 48
7.1 Machine and Assembly Languages
Assembly language program    Machine language program (hex)
add  $2,$5,$5                00a51020
add  $2,$2,$2                00421020
add  $2,$4,$2                00821020
lw   $15,0($2)               8c4f0000
lw   $16,4($2)               8c500004
sw   $16,0($2)               ac500000
sw   $15,4($2)               ac4f0004
jr   $31                     03e00008
Assembly language program -> Assembler (MIPS, 80x86, PowerPC, etc.) -> Machine language program -> Linker (adds library routines in machine language) -> Executable machine language program -> Loader -> Memory content
Figure 7.1 Steps in transforming an assembly language program to
an executable program residing in memory.
Slide 49
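The assembler step of Figure 7.1 can be sketched in a few lines of Python. This is only an illustration, not the actual assembler: the helper names encode_r and encode_i are hypothetical, and the field layout assumed is the MiniMIPS R format (op, rs, rt, rd, sh, fn) and I format (op, rs, rt, 16-bit immediate), with op/fn values taken from the instruction tables of Chapters 5 and 6.

```python
def encode_r(rs, rt, rd, sh, fn, op=0):
    """Pack an R-format word: op(6) rs(5) rt(5) rd(5) sh(5) fn(6)."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (sh << 6) | fn


def encode_i(op, rs, rt, imm):
    """Pack an I-format word: op(6) rs(5) rt(5) imm(16)."""
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


ADD_FN, JR_FN = 32, 8   # fn codes for add and jr (both have op = 0)
LW_OP, SW_OP = 35, 43   # opcodes for lw and sw

# The eight instructions of the swap procedure in Figure 7.1.
program = [
    encode_r(rs=5, rt=5, rd=2, sh=0, fn=ADD_FN),   # add $2,$5,$5
    encode_r(rs=2, rt=2, rd=2, sh=0, fn=ADD_FN),   # add $2,$2,$2
    encode_r(rs=4, rt=2, rd=2, sh=0, fn=ADD_FN),   # add $2,$4,$2
    encode_i(LW_OP, rs=2, rt=15, imm=0),           # lw  $15,0($2)
    encode_i(LW_OP, rs=2, rt=16, imm=4),           # lw  $16,4($2)
    encode_i(SW_OP, rs=2, rt=16, imm=0),           # sw  $16,0($2)
    encode_i(SW_OP, rs=2, rt=15, imm=4),           # sw  $15,4($2)
    encode_r(rs=31, rt=0, rd=0, sh=0, fn=JR_FN),   # jr  $31
]

for word in program:
    print(f"{word:08x}")
```

Each printed word is the 32-bit machine instruction in hexadecimal; the first one, for example, is 00a51020.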
Symbol Table
Location   Assembly language program     Machine language program
   0             addi $s0,$zero,9        001000 00000 10000 0000000000001001
   4             sub  $t0,$s0,$s0        000000 10000 10000 01000 00000 100010
   8             add  $t1,$zero,$zero    000000 00000 00000 01001 00000 100000
  12       test: bne  $t0,$s0,done       000101 01000 10000 0000000000001100
  16             addi $t0,$t0,1          001000 01000 01000 0000000000000001
  20             add  $t1,$s0,$zero      000000 10000 00000 01001 00000 100000
  24             j    test               000010 00000000000000000000000011
  28       done: sw   $t1,result($gp)    101011 11100 01001 0000000011111000
                                         op     rs    rt    rd    sh    fn
Field boundaries shown to facilitate understanding
Symbol table
  done      28
  result   248   (determined from assembler directives not shown here)
  test      12
Figure 7.2 An assembly-language program, its machine-language
version, and the symbol table created during the assembly process.
Slide 50
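How the symbol table of Figure 7.2 comes about can be sketched as the first pass of a two-pass assembler. The Python below is a minimal illustration under stated assumptions: every source line is one 4-byte instruction, the program starts at location 0, and the function name build_symbol_table is hypothetical. The data label "result" is not produced here because it comes from assembler directives that the figure does not show.

```python
# The assembly program of Figure 7.2, one instruction per line.
source = [
    "      addi $s0,$zero,9",
    "      sub  $t0,$s0,$s0",
    "      add  $t1,$zero,$zero",
    "test: bne  $t0,$s0,done",
    "      addi $t0,$t0,1",
    "      add  $t1,$s0,$zero",
    "      j    test",
    "done: sw   $t1,result($gp)",
]


def build_symbol_table(lines, start=0, word_bytes=4):
    """Pass 1: record the location of every label that ends in ':'."""
    table = {}
    location = start
    for line in lines:
        if ":" in line:
            label = line.split(":", 1)[0].strip()
            table[label] = location
        location += word_bytes   # each instruction occupies one word
    return table


symbols = build_symbol_table(source)
print(symbols)

# Pass 2 would consult the table when encoding references; for example,
# the 26-bit field of "j test" holds the word address of test.
j_field = symbols["test"] // 4
print(j_field)
```

The table printed for this program maps test to 12 and done to 28, and the jump field for "j test" is 3.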
7.2 Assembler Directives
Assembler directives provide the assembler with info on how to translate
the program but do not lead to the generation of machine instructions
        .macro                  # start macro (see Section 7.4)
        .end_macro              # end macro (see Section 7.4)
        .text                   # start program's text segment
        ...                     # program text goes here
        .data                   # start program's data segment
tiny:   .byte   156,0x7a        # name & initialize data byte(s)
max:    .word   35000           # name & initialize data word(s)
small:  .float  2E-3            # name short float (see Chapter 12)
big:    .double 2E-3            # name long float (see Chapter 12)
        .align  2               # align next item on word boundary
array:  .space  600             # reserve 600 bytes = 150 words
str1:   .ascii  "a*b"           # name & initialize ASCII string
str2:   .asciiz "xyz"           # null-terminated ASCII string
        .global main            # consider "main" a global name
Slide 51
Composing Simple Assembler Directives
Example 7.1
Write assembler directives to achieve each of the following objectives:
a. Put the error message “Warning: The printer is out of paper!” in memory.
b. Set up a constant called “size” with the value 4.
c. Set up an integer variable called “width” and initialize it to 4.
d. Set up a constant called “mill” with the value 1,000,000 (one million).
e. Reserve space for an integer vector “vect” of length 250.
Solution:
a. noppr: .asciiz "Warning: The printer is out of paper!"
b. size:  .byte   4          # small constant fits in one byte
c. width: .word   4          # byte could be enough, but ...
d. mill:  .word   1000000    # constant too large for byte
e. vect:  .space  1000       # 250 words = 1000 bytes
Slide 52
7.3 Pseudoinstructions
Example of one-to-one pseudoinstruction: The following
    not  $s0                # complement ($s0)
is converted to the real instruction:
    nor  $s0,$s0,$zero      # complement ($s0)
Example of one-to-several pseudoinstruction: The following
    abs  $t0,$s0            # put |($s0)| into $t0
is converted to the sequence of real instructions:
    add  $t0,$s0,$zero      # copy x into $t0
    slt  $at,$t0,$zero      # is x negative?
    beq  $at,$zero,+4       # if not, skip next instr
    sub  $t0,$zero,$s0      # the result is 0 – x
Slide 53
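The four-instruction expansion of abs can be checked by executing it step by step. The Python below is a rough sketch, not part of any assembler: registers are modeled as 32-bit words, the function names are hypothetical, and the overflow exception that add and sub can raise is deliberately not modeled.

```python
MASK = 0xFFFFFFFF   # registers hold 32-bit words


def to_signed(word):
    """Interpret a 32-bit word as a two's-complement integer."""
    return word - (1 << 32) if word & 0x80000000 else word


def abs_expansion(s0):
    """Execute the real instructions that replace abs $t0,$s0."""
    zero = 0
    t0 = (s0 + zero) & MASK                          # add $t0,$s0,$zero
    at = 1 if to_signed(t0) < to_signed(zero) else 0  # slt $at,$t0,$zero
    if at != zero:                                   # beq $at,$zero,+4 skips the sub when $at = 0
        t0 = (zero - s0) & MASK                      # sub $t0,$zero,$s0
    return t0


print(abs_expansion(5))            # positive input is copied unchanged
print(abs_expansion(-7 & MASK))    # negative input is negated
```

Note that the temporary register $at is reserved for the assembler precisely so that expansions like this one do not clobber a user register.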
MiniMIPS Pseudoinstructions
Table 7.1
Class             Pseudoinstruction         Usage
Copy              Move                      move regd,regs
                  Load address              la   regd,address
                  Load immediate            li   regd,anyimm
Arithmetic        Absolute value            abs  regd,regs
                  Negate                    neg  regd,regs
                  Multiply (into register)  mul  regd,reg1,reg2
                  Divide (into register)    div  regd,reg1,reg2
                  Remainder                 rem  regd,reg1,reg2
                  Set greater than          sgt  regd,reg1,reg2
                  Set less or equal         sle  regd,reg1,reg2
                  Set greater or equal      sge  regd,reg1,reg2
Shift             Rotate left               rol  regd,reg1,reg2
                  Rotate right              ror  regd,reg1,reg2
Logic             NOT                       not  reg
Memory access     Load doubleword           ld   regd,address
                  Store doubleword          sd   regd,address
Control transfer  Branch less than          blt  reg1,reg2,L
                  Branch greater than       bgt  reg1,reg2,L
                  Branch less or equal      ble  reg1,reg2,L
                  Branch greater or equal   bge  reg1,reg2,L
Slide 54
7.4 Macroinstructions
A macro is a mechanism to give a name to an oft-used
sequence of instructions (shorthand notation)
.macro name(args)      # macro and arguments named
    ...                # instr's defining the macro
.end_macro             # macro terminator
How is a macro different from a pseudoinstruction?
Pseudos are predefined, fixed, and look like machine instructions
Macros are user-defined and resemble procedures (have arguments)
How is a macro different from a procedure?
Control is transferred to and returns from a procedure
After a macro has been replaced, no trace of it remains
Slide 55
Macro to Find the Largest of Three Values
Example 7.4
Write a macro to determine the largest of three values in registers and to
put the result in a fourth register.
Solution:
.macro mx3r(m,a1,a2,a3)    # macro and arguments named
    move m,a1              # assume (a1) is largest; m = (a1)
    bge  m,a2,+4           # if (a2) is not larger, ignore it
    move m,a2              # else set m = (a2)
    bge  m,a3,+4           # if (a3) is not larger, ignore it
    move m,a3              # else set m = (a3)
.end_macro                 # macro terminator
If the macro is used as mx3r($t0,$s0,$s4,$s3), the assembler replaces
the arguments m, a1, a2, a3 with $t0, $s0, $s4, $s3, respectively.
Slide 56
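Macro expansion is purely textual substitution, which the following Python sketch tries to make concrete for mx3r. It is an illustration only: the function expand_macro and the list-based macro representation are assumptions, not how any particular assembler stores macros.

```python
# Formal parameters and body of the mx3r macro from Example 7.4.
MX3R_PARAMS = ["m", "a1", "a2", "a3"]
MX3R_BODY = [
    "move m,a1",
    "bge  m,a2,+4",
    "move m,a2",
    "bge  m,a3,+4",
    "move m,a3",
]


def expand_macro(params, body, args):
    """Replace each formal parameter in the body with the actual argument."""
    binding = dict(zip(params, args))
    expanded = []
    for line in body:
        opcode, operands = line.split(None, 1)
        # Operands that are not parameters (such as +4) are left untouched.
        actual = [binding.get(token, token) for token in operands.split(",")]
        expanded.append(f"{opcode} {','.join(actual)}")
    return expanded


for line in expand_macro(MX3R_PARAMS, MX3R_BODY, ["$t0", "$s0", "$s4", "$s3"]):
    print(line)
```

The output is the five-instruction sequence with $t0, $s0, $s4, $s3 in place of m, a1, a2, a3; no call or return is generated, which is the key difference from a procedure.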
7.5 Linking and Loading
The linker has the following responsibilities:
Ensuring correct interpretation (resolution) of labels in all modules
Determining the placement of text and data segments in memory
Evaluating all data addresses and instruction labels
Forming an executable program with no unresolved references
The loader is in charge of the following:
Determining the memory needs of the program from its header
Copying text and data from the executable program file into memory
Modifying (shifting) addresses, where needed, during copying
Placing program parameters onto the stack (as in a procedure call)
Initializing all machine registers, including the stack pointer
Jumping to a start-up routine that calls the program’s main routine
Slide 57
7.6 Running Assembler Programs
Spim is a simulator that can run MiniMIPS programs
The name Spim comes from reversing MIPS
Three versions of Spim are available for free downloading:
PCSpim   for Windows machines
xspim    for X-windows
spim     for Unix systems
You can download SPIM by visiting:
http://www.cs.wisc.edu/~larus/spim.html
Slide 58
Input/Output Conventions for MiniMIPS
Table 7.2 Input/output and control functions of syscall in PCSpim.
         ($v0)  Function              Arguments                       Result
Output     1    Print integer         Integer in $a0                  Integer displayed
           2    Print floating-point  Float in $f12                   Float displayed
           3    Print double-float    Double-float in $f12,$f13       Double-float displayed
           4    Print string          Pointer in $a0                  Null-terminated string displayed
Input      5    Read integer                                          Integer returned in $v0
           6    Read floating-point                                   Float returned in $f0
           7    Read double-float                                     Double-float returned in $f0,$f1
           8    Read string           Pointer in $a0, length in $a1   String returned in buffer at pointer
Cntl       9    Allocate memory       Number of bytes in $a0          Pointer to memory block in $v0
          10    Exit from program                                     Program execution terminated
Slide 59
PCSpim User Interface
Figure 7.3 (screenshot of the PCSpim window: menu bar, tools bar, and status bar, with Registers, Text Segment, Data Segment, and Messages panes)
Menu bar contents:
  File: Open, Save Log File, Exit
  Simulator: Clear Registers, Reinitialize, Reload, Go, Break, Continue, Single Step, Multiple Step ..., Breakpoints ..., Set Value ..., Disp Symbol Table, Settings ...
  Window: Tile, 1 Messages, 2 Text Segment, 3 Data Segment, 4 Registers, 5 Console, Clear Console, Toolbar, Status bar
  Help
Slide 60
8 Instruction Set Variations
The MiniMIPS instruction set is only one example
• How instruction sets may differ from that of MiniMIPS
• RISC and CISC instruction set design philosophies
Topics in This Chapter
8.1 Complex Instructions
8.2 Alternative Addressing Modes
8.3 Variations in Instruction Formats
8.4 Instruction Set Design and Evolution
8.5 The RISC/CISC Dichotomy
8.6 Where to Draw the Line
Slide 61
8.1 Complex Instructions
Table 8.1 (partial) Examples of complex instructions in two popular modern
microprocessors and two computer families of historical significance
Machine        Instruction   Effect
Pentium        MOVS          Move one element in a string of bytes, words, or
                             doublewords using addresses specified in two pointer
                             registers; after the operation, increment or decrement
                             the registers to point to the next element of the string
PowerPC        cntlzd        Count the number of consecutive 0s in a specified
                             source register beginning with bit position 0 and place
                             the count in a destination register
IBM 360-370    CS            Compare and swap: Compare the content of a register
                             to that of a memory location; if unequal, load the
                             memory word into the register, else store the content
                             of a different register into the same memory location
Digital VAX    POLYD         Polynomial evaluation with double flp arithmetic:
                             Evaluate a polynomial in x, with very high precision in
                             intermediate results, using a coefficient table whose
                             location in memory is given within the instruction
Slide 62
Benefits and Drawbacks of Complex Instructions
Benefits:
• Fewer instructions in program (less memory)
• Fewer memory accesses for instructions
• Programs may become easier to write/read/understand
• Potentially faster execution (complex steps are still done sequentially in multiple cycles, but hardware control can be faster than software loops)
Drawbacks:
• More complex format (slower decoding)
• Less flexible (one algorithm for polynomial evaluation or sorting may not be the best in all cases)
• If interrupts are processed at the end of instruction cycle, machine may become less responsive to time-critical events (interrupt handling)
Slide 63
8.2 Alternative Addressing Modes
Let's refresh our memory (from Chap. 5)
Figure 5.11 Schematic representation of addressing modes in MiniMIPS.
Modes shown: implied, immediate, register, base, PC-relative, pseudodirect
Slide 64
More Elaborate Addressing Modes
Figure 8.1 Schematic representation of more elaborate
addressing modes not supported in MiniMIPS.
Modes shown: indexed, update (with base), update (with indexed), indirect
For indirect addressing, the first address specification may be replaced with any other form of address specification; the word fetched is used as the address for a second memory access.
Slide 65
Usefulness of Some Elaborate Addressing Modes
Update mode: XORing a string of bytes
loop: lb   $t0,A($s0)
      xor  $s1,$s1,$t0
      addi $s0,$s0,-1
      bne  $s0,$zero,loop
(The lb and addi could be one instruction with update addressing)
Indirect mode: Case statement
case: lw   $t0,0($s0)      # get s
      add  $t0,$t0,$t0     # form 2s
      add  $t0,$t0,$t0     # form 4s
      la   $t1,T           # base T
      add  $t1,$t0,$t1
      lw   $t2,0($t1)      # entry
      jr   $t2
Branch to location Li if s = i (switch var.)
Jump table:
  T        L0
  T + 4    L1
  T + 8    L2
  T + 12   L3
  T + 16   L4
  T + 20   L5
Slide 66
8.3 Variations in Instruction Formats
0-, 1-, 2-, and 3-address instructions in MiniMIPS
Category    Format                      Opcode    Description of operand(s)
0-address   op = 0, fn = 12             syscall   One implied operand in register $v0
1-address   op = 2, Address             j         Jump target addressed (in pseudodirect form)
2-address   op = 0, rs rt, fn = 24      mult      Two source registers addressed, destination implied
3-address   op = 0, rs rt rd, fn = 32   add       Destination and two source registers addressed
Figure 8.2 Examples of MiniMIPS instructions with 0 to 3
addresses; shaded fields are unused.
Slide 67
Zero-Address Architecture: Stack Machine
Stack holds all the operands (replaces our register file)
Load/Store operations become push/pop
Arithmetic/logic operations need only an opcode: they pop operand(s)
from the top of the stack and push the result onto the stack
Example: Evaluating the expression (a + b) × (c – d)
Instruction   Stack after execution (top first)
Push a        a
Push b        b, a
Add           a+b
Push d        d, a+b
Push c        c, d, a+b
Subtract      c–d, a+b
Multiply      Result
Polish string: a b + d c – ×
If a variable is used again, you may have to push it multiple times
Special instructions such as “Duplicate” and “Swap” are helpful
Slide 68
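The stack-machine evaluation above can be traced with a small interpreter. This Python sketch is an illustration under assumptions: the function name run_stack_program and the dictionary standing in for memory are hypothetical, and Subtract follows the convention of the example, where the top of stack (c) has the element beneath it (d) subtracted from it.

```python
def run_stack_program(program, memory):
    """Execute zero-address instructions; the list end is the top of stack."""
    stack = []
    for instruction in program:
        op, *operand = instruction.split()
        if op == "push":
            stack.append(memory[operand[0]])
            continue
        top = stack.pop()        # operand pushed last
        second = stack.pop()     # operand pushed before it
        if op == "add":
            stack.append(second + top)
        elif op == "subtract":
            stack.append(top - second)   # convention of the example: top minus next
        elif op == "multiply":
            stack.append(second * top)
        else:
            raise ValueError(f"unknown instruction: {op}")
    return stack


program = ["push a", "push b", "add", "push d", "push c", "subtract", "multiply"]
memory = {"a": 2, "b": 3, "c": 10, "d": 4}
print(run_stack_program(program, memory))   # one entry: (2 + 3) * (10 - 4)
```

Only the push instructions carry an address; the arithmetic instructions consist of an opcode alone.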
One-Address Architecture: Accumulator Machine
The accumulator, a special register attached to the ALU, always holds
operand 1 and the operation result
Only one operand needs to be specified by the instruction
Example: Evaluating the expression (a + b) × (c – d)
load      a
add       b
store     t
load      c
subtract  d
multiply  t
Within branch instructions, the condition or
target address must be implied
Branch to L if acc negative
If register x is negative skip the next instruction
May have to store accumulator contents in memory (example above)
No store needed for a + b + c + d + . . . (“accumulator”)
Slide 69
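The need to store the partial result a + b in a temporary location can be seen by simulating the accumulator program. The Python below is a minimal sketch; the function name and the dictionary used as memory are assumptions made for illustration.

```python
def run_accumulator_program(program, memory):
    """Execute one-address instructions; the accumulator is the implied operand."""
    acc = 0
    for op, name in program:
        if op == "load":
            acc = memory[name]
        elif op == "store":
            memory[name] = acc
        elif op == "add":
            acc += memory[name]
        elif op == "subtract":
            acc -= memory[name]
        elif op == "multiply":
            acc *= memory[name]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return acc


program = [
    ("load", "a"), ("add", "b"), ("store", "t"),        # t = a + b
    ("load", "c"), ("subtract", "d"), ("multiply", "t"),  # acc = (c - d) * t
]
memory = {"a": 2, "b": 3, "c": 10, "d": 4, "t": 0}
result = run_accumulator_program(program, memory)
print(result, memory["t"])
```

With these sample values the accumulator ends at 30 and location t holds the saved sum 5.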
Two-Address Architectures
Two addresses may be used in different ways:
Operand1/result and operand 2
Condition to be checked and branch target address
Example: Evaluating the expression (a + b) × (c – d)
Instructions of a hypothetical two-address machine:
load      $1,a
add       $1,b
load      $2,c
subtract  $2,d
multiply  $1,$2
A variation is to use one of the addresses as in a one-address
machine and the second one to specify a branch in every instruction
Slide 70
Example of a Complex Instruction Format
Fields, in order:
Instruction prefixes (zero to four, 1 B each): operand/address size overrides and other modifiers
Opcode (1-2 B)
ModR/M (1 B): Mod, Reg/Op, R/M
SIB (1 B): Scale, Index, Base
Offset or displacement (0, 1, 2, or 4 B)
Immediate (0, 1, 2, or 4 B)
Most memory operands need the ModR/M and SIB bytes
Instructions can contain up to 15 bytes
Components that form a variable-length IA-32 (80x86) instruction.
Slide 71
Some of IA-32’s Variable-Width Instructions
Type     Format (field widths shown)   Opcode   Description of operand(s)
1-byte   5, 3                          PUSH     3-bit register specification
2-byte   4, 4, 8                       JE       4-bit condition, 8-bit jump offset
3-byte   6, 8, 8                       MOV      8-bit register/mode, 8-bit offset
4-byte   8, 8, 8, 8                    XOR      8-bit register/mode, 8-bit base/index, 8-bit offset
5-byte   4, 3, 32                      ADD      3-bit register spec, 32-bit immediate
6-byte   7, 8, 32                      TEST     8-bit register/mode, 32-bit immediate
Figure 8.3 Example 80x86 instructions ranging in width from 1 to 6
bytes; much wider instructions (up to 15 bytes) also exist
Slide 72
8.4 Instruction Set Design and Evolution
Desirable attributes of an instruction set:
Consistent, with uniform and generally applicable rules
Orthogonal, with independent features noninterfering
Transparent, with no visible side effect due to implementation details
Easy to learn/use (often a byproduct of the three attributes above)
Extensible, so as to allow the addition of future capabilities
Efficient, in terms of both memory needs and hardware realization
Figure 8.4 Processor design and implementation process.
Elements shown: new machine project, processor design team, performance objectives, instruction-set definition, implementation, fabrication & testing, sales & use, feedback, tuning & bug fixes
Slide 73
8.5 The RISC/CISC Dichotomy
The RISC (reduced instruction set computer) philosophy:
Complex instruction sets are undesirable because inclusion of
mechanisms to interpret all the possible combinations of opcodes
and operands might slow down even very simple operations.
Ad hoc extension of instruction sets, while maintaining backward
compatibility, leads to CISC; imagine modern English containing
every English word that has been used through the ages
Features of RISC architecture
1. Small set of instructions, each executable in roughly the same time
2. Load/store architecture (leading to more registers)
3. Limited addressing modes to simplify address calculations
4. Simple, uniform instruction formats (ease of decoding)
Slide 74
RISC/CISC Comparison via Generalized Amdahl’s Law
Example 8.1
An ISA has two classes of simple (S) and complex (C) instructions.
On a reference implementation of the ISA, class-S instructions
account for 95% of the running time for programs of interest. A RISC
version of the machine is being considered that executes only class-S
instructions directly in hardware, with class-C instructions treated as
pseudoinstructions. It is estimated that in the RISC version, class-S
instructions will run 20% faster while class-C instructions will be
slowed down by a factor of 3. Does the RISC approach offer better or
worse performance compared to the reference implementation?
Solution
Per assumptions, 0.95 of the work is speeded up by a factor of 1.0 /
0.8 = 1.25, while the remaining 5% is slowed down by a factor of 3.
The RISC speedup is 1 / [0.95 / 1.25 + 0.05 × 3] = 1.1. Thus, a 10%
improvement in performance can be expected in the RISC version.
Slide 75
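The arithmetic of Example 8.1 can be reproduced with a short Python sketch of the generalized form of Amdahl's law. The function name generalized_speedup is hypothetical; each part of the work is given as a pair (fraction of original running time, speedup factor), and a slowdown by a factor of 3 is expressed as a speedup factor of 1/3.

```python
def generalized_speedup(parts):
    """Overall speedup when each fraction of the original time has its own factor."""
    new_time = sum(fraction / factor for fraction, factor in parts)
    return 1.0 / new_time


# Class S: 95% of the time, sped up by 1.25; class C: 5%, slowed down by 3.
risc_speedup = generalized_speedup([(0.95, 1.25), (0.05, 1 / 3)])
print(round(risc_speedup, 2))
```

The new running time is 0.76 + 0.15 = 0.91 of the original, so the speedup is 1 / 0.91, or about 1.1.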
Some Hidden Benefits of RISC
In Example 8.1, we established that a speedup factor of 1.1 can be
expected from the RISC version of a hypothetical machine
This is not the entire story, however!
If the speedup of 1.1 came with some additional cost, then one might
legitimately wonder whether it is worth the expense and/or time
The RISC version of the architecture also:
• Reduces the effort and team size for design
• Shortens the testing and debugging phase
• Simplifies documentation and maintenance
Result: cheaper product and shorter time-to-market
Slide 76
8.6 Where to Draw the Line
The ultimate reduced instruction set computer (URISC):
How many instructions are absolutely needed for useful computation?
Only one!
subtract source1 from source2, replace source2 with the
result, and jump to target address if result is negative
Assembly language form:
label: urisc dest,src1,target
Pseudoinstructions can be synthesized using the single instruction.
This is the move pseudoinstruction (corrected version):
stop:  .word 0
start: urisc dest,dest,+1     # dest = 0
       urisc temp,temp,+1     # temp = 0
       urisc temp,src,+1      # temp = -(src)
       urisc dest,temp,+1     # dest = -(temp); i.e. (src)
       ...                    # rest of program
Slide 77
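The move sequence can be checked with a tiny URISC interpreter. The Python below is a sketch under assumptions that are not in the slides: memory locations are named by strings, a target of None stands for "+1" (continue with the next instruction), and execution stops when the program counter runs past the last instruction.

```python
def run_urisc(program, memory):
    """Each instruction (dest, src, target) does: dest = dest - src;
    if the result is negative, jump to target."""
    pc = 0
    while pc < len(program):
        dest, src, target = program[pc]
        memory[dest] -= memory[src]
        if memory[dest] < 0 and target is not None:
            pc = target          # jump to an absolute instruction index
        else:
            pc += 1              # "+1": fall through to the next instruction
    return memory


# The move pseudoinstruction: copy (src) into dest using temp.
move = [
    ("dest", "dest", None),   # dest = 0
    ("temp", "temp", None),   # temp = 0
    ("temp", "src", None),    # temp = -(src)
    ("dest", "temp", None),   # dest = -(temp), i.e. (src)
]

memory = {"dest": 99, "temp": -5, "src": 42}
run_urisc(move, memory)
print(memory["dest"], memory["src"])
```

After the four instructions dest equals src, whatever dest and temp held initially, and src itself is unchanged.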
Some Useful Pseudo Instructions for URISC
Example 8.2 (2 parts of 5)
Write the sequence of instructions that are produced by the URISC
assembler for each of the following pseudoinstructions.
parta: uadd  dest,src1,src2     # dest=(src1)+(src2)
partc: uj    label              # goto label
Solution
at1 and at2 are temporary memory locations for assembler’s use
parta: urisc at1,at1,+1         # at1 = 0
       urisc at1,src1,+1        # at1 = -(src1)
       urisc at1,src2,+1        # at1 = -(src1)–(src2)
       urisc dest,dest,+1       # dest = 0
       urisc dest,at1,+1        # dest = -(at1)
partc: urisc at1,at1,+1         # at1 = 0
       urisc at1,one,label      # at1 = -1 to force jump
Slide 78
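The two expansions of Example 8.2 can be exercised together in a short Python sketch. Everything beyond the urisc sequences themselves is an assumption made for illustration: the expand and run helpers are hypothetical, jump labels are given as absolute instruction indices, and the location "one" is assumed to hold the constant 1.

```python
def expand(pseudo):
    """Translate one uadd or uj pseudoinstruction into urisc triples."""
    op, *args = pseudo
    if op == "uadd":
        dest, src1, src2 = args
        return [
            ("at1", "at1", "+1"),    # at1 = 0
            ("at1", src1, "+1"),     # at1 = -(src1)
            ("at1", src2, "+1"),     # at1 = -(src1) - (src2)
            (dest, dest, "+1"),      # dest = 0
            (dest, "at1", "+1"),     # dest = -(at1)
        ]
    if op == "uj":
        (label,) = args
        return [
            ("at1", "at1", "+1"),    # at1 = 0
            ("at1", "one", label),   # at1 = -1, which forces the jump
        ]
    raise ValueError(f"unknown pseudoinstruction: {op}")


def run(program, memory):
    """urisc: dest = dest - src; jump to target if the result is negative."""
    pc = 0
    while pc < len(program):
        dest, src, target = program[pc]
        memory[dest] -= memory[src]
        if memory[dest] < 0 and target != "+1":
            pc = target
        else:
            pc += 1
    return memory


# total = x + y; then jump over a second uadd that would have added x again.
program = expand(("uadd", "total", "x", "y"))            # indices 0-4
program += expand(("uj", 12))                            # indices 5-6
program += expand(("uadd", "total", "total", "x"))       # indices 7-11, skipped
memory = {"x": 4, "y": 6, "total": 0, "at1": 0, "one": 1}
run(program, memory)
print(memory["total"])
```

Because the unconditional jump lands past the second uadd, total keeps the value x + y.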
URISC Hardware
URISC instruction: Word 1 = Source 1, Word 2 = Source 2 / Dest, Word 3 = Jump target
Figure 8.5 Instruction format and hardware structure for URISC.
Elements shown: PC, MAR, MDR, R, adder, Comp, N and Z flags, multiplexer, memory unit (Read and Write controls)
Slide 79