Lecture 11 Circuits for floating-point operations addition multiplication division (only sketchy) Oct 12 Floating-Point Addition Consider a 4-digit decimal example 1. Align decimal points 9.999 × 101 + 0.016 × 101 = 10.015 × 101 3. Normalize result & check for over/underflow Shift number with smaller exponent 9.999 × 101 + 0.016 × 101 2. Add significands 9.999 × 101 + 1.610 × 10–1 1.0015 × 102 4. Round and renormalize if necessary 1.002 × 102 Floating-Point Addition Now consider a 4-digit binary example 1. Align binary points 1.0002 × 2–1 + –0.1112 × 2–1 = 0.0012 × 2–1 3. Normalize result & check for over/underflow Shift number with smaller exponent 1.0002 × 2–1 + –0.1112 × 2–1 2. Add significands 1.0002 × 2–1 + –1.1102 × 2–2 (0.5 + – 0.4375) 1.0002 × 2–4, with no over/underflow 4. Round and renormalize if necessary 1.0002 × 2–4 (no change) = 0.0625 FP Adder Hardware Much more complex than integer adder Doing it in one clock cycle would take too long Much longer than integer operations Slower clock would penalize all instructions FP adder usually takes several cycles Can be pipelined Floating point addition Start 1. Compare the exponents of the two numbers. Shift the smaller number to the right until its exponent would match the larger exponent 2. Add the significands 3. Normalize the sum, either shifting right and incrementing the exponent or shifting left and decrementing the exponent Overflow or underflow? Yes No 4. Round the significand to the appropriate number of bits No Still normalized? Yes Done Exception FP Adder Hardware Step 1 Step 2 Step 3 Step 4 Addition and Subtraction Exponents of both operands must be equal before adding or subtracting significands Significands aligned by shifting the significand of the smaller operand |E1-E2| base- positions to the right, increasing its exponent, until exponents are equal E1E2 - Exponent of larger number not decreased - this will result in a significand larger than 1 - a larger significand adder required Addition/Subtraction – post-normalization Addition - resultant significand M (sum of two aligned significands) is in range 1/ M < 2 If M >1 - a postnormalization step - shifting significand to the right to yield M3 and increasing exponent by one - is required (an exponent overflow may occur) Subtraction - Resultant significand M is in range 0 |M|< 1 - postnormalization step - shifting significand to left and decreasing exponent - is required if M<1/ (an exponent underflow may occur) In extreme cases, the postnormalization step may require a shift left operation over all bits in significand, yielding a zero result Example F1=(0.100000)16 163 ; F2=(0.FFFFFF)16 162 Short IBM format ; calculate F1 – F2 Significand of smaller number (F2) is shifted to the right - least-significant digit lost Shift is time consuming - result is wrong Example - Continued Correct result (with “unlimited" number of significand digits) Error (also called loss of significance) is 0.1 16-2 - 0.1 16-3 = 0.F 16-3 Solution to problem - guard digits - additional digits to the right of the significand to hold shiftedout digits In example - a single (hexadecimal) guard digit is sufficient Steps in Addition/Subtraction of Floating-Point Numbers Step 1: Calculate difference d of the two exponents - d=|E1 - E2| Step 2: Shift significand of smaller number by d base positions to the right Step 3: Add aligned significands and set exponent of result to exponent of larger operand Step 4: Normalize resultant significand and adjust exponent if necessary Step 5: Round resultant significand and adjust exponent if necessary Floating Point Complexities Operations are somewhat more complicated In addition to overflow we can have “underflow” Accuracy can be a big problem IEEE 754 keeps two extra bits, guard and round four rounding modes positive divided by zero yields “infinity” zero divide by zero yields “not a number” other complexities Implementing the standard can be tricky Not using the standard can be even worse see text for description of 80x86 and Pentium bug! FP Arithmetic Hardware FP multiplier is of similar complexity to FP adder FP arithmetic hardware usually does But uses a multiplier for significands instead of an adder Addition, subtraction, multiplication, division, reciprocal, square-root FP integer conversion Operations usually takes several cycles Can be pipelined Floating-Point Dividers ( s1 b e1) / ( s2 b e2) = ( s1 / s2 ) b e1-e2 Floating-point operands s1 / s2 (0.5, 2): may need postshifting Unpack Overflow or underflow can occur during division or normalization Rounding considerations XOR Subt ract Exponents Divide Signifi cands Adj ust Exponent Quotient must be produced with two extra bits (G and R), in case of the need for a normalizing left shift Normalize Round Adj ust Exponent Normalize P ack Quotient Block diagram of a floating-point divider round and sticky bits Original source: CMU 18-447 S’09 L4-21© 2009 J. C. Hoe Original source: CMU 18-447 S’09 L4-21© 2009 J. C. Hoe FP Instructions in MIPS FP hardware is coprocessor 1 Adjunct processor that extends the ISA Separate FP registers 32 single-precision: $f0, $f1, … $f31 Paired for double-precision: $f0/$f1, $f2/$f3, … FP instructions operate only on FP registers Programs generally don’t do integer ops on FP data, or vice versa More registers with minimal code-size impact FP load and store instructions lwc1, ldc1, swc1, sdc1 e.g., ldc1 $f8, 32($sp) FP Instructions in MIPS Single-precision arithmetic add.s, sub.s, mul.s, div.s Double-precision arithmetic add.d, sub.d, mul.d, div.d e.g., mul.d $f4, $f4, $f6 Single- and double-precision comparison c.xx.s, c.xx.d (xx is eq, lt, le, …) Sets or clears FP condition-code bit e.g., add.s $f0, $f1, $f6 e.g. c.lt.s $f3, $f4 Branch on FP condition code true or false bc1t, bc1f e.g., bc1t TargetLabel Floating-Point Instructions Floating-point arithmetic instructions for MIPS: add.s sub.d mul.d div.s neg.s 31 F $f0,$f8,$f10 $f0,$f8,$f10 $f0,$f8,$f10 $f0,$f8,$f10 $f0,$f8 op 25 ex # # # # # 20 set set set set set ft $f0 $f0 $f0 $f0 $f0 15 to to to to to fs ($f8) +fp ($f8) –fp ($f8) fp ($f8) /fp –($f8) 10 fd ($f10) ($f10) ($f10) ($f10) fn 5 0 0 1 0 0 0 1 0 0 0 0 x 0 1 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 x x x Floating-point instruction s=0 d=1 Source register 2 Source register 1 Destination register add.* = 0 sub.* = 1 mul.* = 2 div.* = 3 neg.* = 7 The common floating-point instruction format for MIPS and components for arithmetic instructions. The extension (ex) field distinguishes single (* = s) from double (* = d) operands. Floating-Point Format Conversions Floating-Point Data Transfers MIPS instructions for floating-point load, store, and move: lwc1 swc1 mov.s mov.d mfc1 mtc1 31 F $f8,40($s3) $f8,A($s3) $f0,$f8 $f0,$f8 $t0,$f12 $f8,$t4 op 25 20 ft load mem[40+($s3)] into $f8 store ($f8) into mem[A+($s3)] load $f0 with ($f8) load $f0,$f1 with ($f8,$f9) load $t0 with ($f12) load $f8 with ($t4) 15 fs 10 fd 5 fn 0 0 1 0 0 0 1 0 0 0 0 x 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 1 0 Floating-point instruction 31 R ex # # # # # # op s=0 d=1 25 rs Unused 20 rt Source register 15 rd Destination register 10 sh mov.* = 6 5 fn 0 0 1 0 0 0 1 0 0 x 0 0 0 1 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Floating-point instruction mfc1 = 0 mtc1 = 4 Source register Destination register Unused Unused Instructions for floating-point data movement in MIPS. Floating-Point Branches and Comparisons MIPS instructions for floating-point load, store, and move: bc1t bc1f c.eq.* c.lt.* c.le.* 31 I L L $f0,$f8 $f0,$f8 $f0,$f8 op 25 20 branch on fp flag true branch on fp flag false if ($f0)=($f8), set flag to “true” if ($f0)<($f8), set flag to “true” if ($f0)($f8), set flag to “true” rt operand / offset 15 0 0 1 0 0 0 1 0 1 0 0 0 0 0 0 0 x 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 1 Floating-point instruction 31 F rs # # # # # op bc1? = 8 25 ex true = 1 false = 0 20 ft Offset 15 fs 10 fd 5 fn 0 0 1 0 0 0 1 0 0 0 0 x 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 Floating-point instruction s=0 d=1 Source register 2 Source register 1 Unused c.eq.* = 50 c.lt.* = 60 c.le.* = 62 Floating-point branch and comparison instructions in MIPS. Floating-Point Instructions MIPS Copy Arithmetic * s/d for single/double # 0/1 for single/double Conversions Memory access Control transfer Instruction Usage Move s/d registers Move fm coprocessor 1 Move to coprocessor 1 Add single/double Subtract single/double Multiply single/double Divide single/double Negate single/double Compare equal s/d Compare less s/d Compare less or eq s/d Convert integer to single Convert integer to double Convert single to double Convert double to single Convert single to integer Convert double to integer Load word coprocessor 1 Store word coprocessor 1 Branch coproc 1 true Branch coproc 1 false mov.* fd,fs mfc1 rt,rd mtc1 rd,rt add.* fd,fs,ft sub.* fd,fs,ft mul.* fd,fs,ft div.* fd,fs,ft neg.* fd,fs c.eq.* fs,ft c.lt.* fs,ft c.le.* fs,ft cvt.s.w fd,fs cvt.d.w fd,fs cvt.d.s fd,fs cvt.s.d fd,fs cvt.w.s fd,fs cvt.w.d fd,fs lwc1 ft,imm(rs) swc1 ft,imm(rs) bc1t L bc1f L ex fn # 0 4 # # # # # # # # 0 0 1 1 0 1 rs rs 8 8 6 0 1 2 3 7 50 60 62 32 33 33 32 36 36 FP Example: °F to °C C code: float f2c (float fahr) { return ((5.0/9.0)*(fahr - 32.0)); } fahr in $f12, result in $f0, constants in addresses specified. Compiled MIPS code: f2c: lwc1 lwc2 div.s lwc1 sub.s mul.s jr $f16, $f18, $f16, $f18, $f18, $f0, $ra const5 const9 $f16, $f18 const32 $f12, $f18 $f16, $f18 FP Example: Array Multiplication X=X+Y×Z All 32 × 32 matrices, 64-bit double-precision elements C code: void mm (double x[][], double y[][], double z[][]) { int i, j, k; for (i = 0; i! = 32; i= i + 1) for (j = 0; j! = 32; j = j + 1) for (k = 0; k! = 32; k = k + 1) x[i][j] = x[i][j] + y[i][k] * z[k][j]; } Addresses of x, y, z in $a0, $a1, $a2, and i, j, k in $s0, $s1, $s2 FP Example: Array Multiplication MIPS code: li li L1: li L2: li sll addu sll addu l.d L3: sll addu sll addu l.d … $t1, 32 $s0, 0 $s1, 0 $s2, 0 $t2, $s0, 5 $t2, $t2, $s1 $t2, $t2, 3 $t2, $a0, $t2 $f4, 0($t2) $t0, $s2, 5 $t0, $t0, $s1 $t0, $t0, 3 $t0, $a2, $t0 $f16, 0($t0) # # # # # # # # # # # # # # $t1 = 32 (row size/loop end) i = 0; initialize 1st for loop j = 0; restart 2nd for loop k = 0; restart 3rd for loop $t2 = i * 32 (size of row of x) $t2 = i * size(row) + j $t2 = byte offset of [i][j] $t2 = byte address of x[i][j] $f4 = 8 bytes of x[i][j] $t0 = k * 32 (size of row of z) $t0 = k * size(row) + j $t0 = byte offset of [k][j] $t0 = byte address of z[k][j] $f16 = 8 bytes of z[k][j] FP Example: Array Multiplication … sll $t0, $s0, 5 addu $t0, $t0, $s2 sll $t0, $t0, 3 addu $t0, $a1, $t0 l.d $f18, 0($t0) mul.d $f16, $f18, $f16 add.d $f4, $f4, $f16 addiu $s2, $s2, 1 bne $s2, $t1, L3 s.d $f4, 0($t2) addiu $s1, $s1, 1 bne $s1, $t1, L2 addiu $s0, $s0, 1 bne $s0, $t1, L1 # # # # # # # # # # # # # # $t0 = i*32 (size of row of y) $t0 = i*size(row) + k $t0 = byte offset of [i][k] $t0 = byte address of y[i][k] $f18 = 8 bytes of y[i][k] $f16 = y[i][k] * z[k][j] f4=x[i][j] + y[i][k]*z[k][j] $k k + 1 if (k != 32) go to L3 x[i][j] = $f4 $j = j + 1 if (j != 32) go to L2 $i = i + 1 if (i != 32) go to L1 A simple example – with MIPS implementation Example: Write a MIPS program to load the smallest positive real number into the coprocessor, multiply it by 10 and display the result. Loading a given number into a FP register involves using the instruction l.s $fxx num # num is a memory location Code in MIPS .text .globl main main: l.s $f6, small l.s $f8, ten mul.s $f12, $f6, $f8 exit: li $v0, 4 la $a0, msg2 syscall # print message li $v0,2 syscall # print sum li $v0,4 la $a0, cr syscall li $v0,10 syscall # print an end of line .data small: .word 0x00000001 ten: .float 10.0 msg2: cr: # exit # this is number 2^(-140) .asciiz "Ten times 2^(-140) = .asciiz "\n" " MARS output Example of FP underflow In the previous example, suppose we multiply the result by 0.5 what will be the result? We can try this in MARS. This is the correct answer, since the operation has caused FP underflow. Parallel programs may interleave operations in unexpected orders Assumptions of associativity may fail (x+y)+z x+(y+z) -1.50E+38 x -1.50E+38 y 1.50E+38 0.00E+00 z 1.0 1.0 1.50E+38 1.00E+00 0.00E+00 Need to validate parallel programs under varying degrees of parallelism §3.6 Parallelism and Computer Arithmetic: Associativity Associativity Concluding Remarks ISAs support arithmetic Bounded range and precision Signed and unsigned integers Floating-point approximation to reals Operations can overflow and underflow MIPS ISA Core instructions: 54 most frequently used 100% of SPECINT, 97% of SPECFP Other instructions: less frequent

Descargar
# Chapter