Information Technology:
At least according to Webster
• Information (Latin: idea, conception)
– Knowledge communicated or received concerning a
particular fact or circumstance
– Any knowledge gained through communication,
research, instruction, etc.
– Any data that can be coded for processing by a
computer or similar device
• Technology (Greek: systematic treatment)
– The branch of knowledge that deals with industrial art,
applied science, engineering
Information Technology:
According to WhatIs.com
• IT (information technology) is a term that
encompasses all forms of technology used to
create, store, exchange, and use information in its
various forms (business data, voice conversations,
still images, motion pictures, multimedia
presentations, and other forms, including those not
yet conceived). It is the technology that is driving
what has often been called "the information
revolution."
Components of IT
• Computers
• Telecommunications
• Software
• Creativity
• First IT breakthrough?
– Cave paintings? The written word?
• Greatest IT development?
– Telephone, telegraph, computer, yet to happen?
Information Technology Timeline
[Timeline graphic: rock carvings (75,000 B.C.); hieroglyphics (before 4000 B.C.); papyrus (2200 B.C.); alphabetic writing and the Egyptian Book of the Dead (circa 1500 B.C.); Gutenberg's printing press (1450 A.D.); photography (1835); the telegraph and telegraph key (circa 1840); Bell's telephone (1876); the phonograph; the flat-disk gramophone (1887); the wireless telegraph (1894); silent movies (1895)]
Information Technology Timeline (cont.)
[Timeline graphic: radio broadcasts (1922); black-and-white TV (1940); transistor radio (1954); Sputnik (1957); local cable TV (1965); VCRs (1970s); fax machines (1973); Apple II home computers (1977); fiber optics (1977); cell phones (1980s); IBM PC (1981); CDs (1983); Apple Mac (1984); digital photography (1990); AOL reaches 200K subscribers (1992); World Wide Web (1993); MP3 compressed sound files (1998)]
The IT Building Block or
“Digital” Native Language
• Binary representation
– 1’s and 0’s
– What is it physically?
• Presence of (or level of) voltage
• Let’s count:
– Binary is counting in a “base 2” scheme
– Decimal is base 10
Binary Conventions
• Most Significant Bit (MSB) and Least Significant Bit
(LSB)
– Decimal Example: 64
• 6 is the Most Significant Digit;
• 4 is the Least Significant Digit
– Binary Example: 1000000
• 1 is the MSB;
• 0 on the right is the LSB
• What does 1000000₂ represent in decimal? (one method of binary-to-decimal conversion; both conversions are sketched below)
• How do I represent 13₁₀ in binary?
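To make the two conversion methods concrete, here is a minimal Python sketch (the function names are mine, not the text's):

    # Binary to decimal: each 1 bit contributes 2^position (LSB = position 0)
    def bin_to_dec(bits):
        return sum(2**i for i, b in enumerate(reversed(bits)) if b == "1")

    # Decimal to binary: repeatedly divide by 2, collecting the remainders
    def dec_to_bin(n):
        digits = ""
        while n > 0:
            digits = str(n % 2) + digits
            n //= 2
        return digits or "0"

    print(bin_to_dec("1000000"))  # 64
    print(dec_to_bin(13))         # 1101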
Convert Decimal to Binary
• Consider a byte, consisting of 7 bits and a parity bit (not
shown):
– Each bit can take on a value of 1 or 0
– If a bit is 1, its "contribution" to the byte's value depends on its place in the byte:
64   32   16   8   4   2   1
• The total value the byte can represent (the sum of all the contributions) is 1+2+4+8+16+32+64 = 127
• Thus, any such byte can represent a value between 0 and 127
How to represent numerals 0-9?
• Numeral → BCD
– 0 → 0000
– 1 → 0001
– 2 → 0010
– 3 → 0011
– …
– 8 → 1000
– 9 → 1001
More Examples
(Excludes parity bit)
• Convert 12₁₀ to binary
– 0001100
• Convert 1010101₂ to decimal
– 1+4+16+64 = 85
• Convert 255₁₀ to binary
– 11111111
• Convert 1110001110₂ to decimal
– 2+4+8+128+256+512 = 910
Thinking Binary
• When dealing with binary, think in powers of 2
• Binary Computer conventions (need to know!)
– 1 bit can represent two (2¹) symbols: either a 0 or a 1
– 8 bits is a byte: one byte can represent 256 (2⁸) symbols
– 1 kilobyte = 2¹⁰ bytes = 1024 bytes = 8192 bits
– 1 megabyte = 2²⁰ bytes = 1,048,576 bytes
• How many bits are necessary to encode the
uppercase alphabet? Upper and lower case?
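One way to answer such questions: the number of bits needed for N symbols is the smallest n with 2ⁿ ≥ N. A quick Python check (assuming we only need distinct codes, not any particular standard):

    import math

    def bits_needed(n_symbols):
        return math.ceil(math.log2(n_symbols))

    print(bits_needed(26))  # 5 bits cover the 26 uppercase letters (2^5 = 32)
    print(bits_needed(52))  # 6 bits cover upper and lower case (2^6 = 64)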
Binary Multipliers
• Kilo (K): 2¹⁰ = 1,024
• Mega (M): 2²⁰ = 1,048,576
• Giga (G): 2³⁰ = 1,073,741,824
• Tera (T): 2⁴⁰ = 1,099,511,627,776
• Note: You use these factors with base 2 numbers,
not base 10 numbers!
Something to Remember
• “Bits” are often used in terms of a data rate, or
speed of information flow:
– 56 Kilobit per second modem (56 Kbps)
– A T-1 line is 1.544 Megabits per second (1.544 Mbps or
1544 Kbps)
• “Bytes” are often used to describe storage capacity
or file size--computer memories are organized in
terms of 8 bits
– 256 Megabyte (MB) RAM
– 40 Gigabyte (GB) Hard disk
Let’s Be Clear! Bits v. Bytes
• 56 Kbps represents 56,000 bits per second (56,000 is a base-10 number)
• 56 KB represents 56 × 2¹⁰ bytes (bytes: 8 bits; a base-2 quantity)
– or 56 × 1024 bytes = 57,344 bytes
– or 57,344 bytes × 8 bits = 458,752 bits
Practical Use
• Everyday stuff that uses powers of 2:
– 32-bit sound card
– 64-bit video accelerator card
– 128-bit encryption in your browser
• Is 64-bit twice as good as 32-bit?
– 32-bit: 2³² = 4,294,967,296 ≈ 4.29 × 10⁹ values
– 64-bit: 2⁶⁴ ≈ 1.8 × 10¹⁹ values
– 128-bit: 2¹²⁸ ≈ 3.4 × 10³⁸ values
Other Ways to Count
• Octal--base 8
– 0, 1, 2, 3, 4, 5, 6, 7
– 8₁₀ = 10₈
– Represents a 3-bit code: see page 55 of text
• Hexadecimal--base 16
– 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F
– Represents a 4-bit code: see page 55 of text
• Compresses notation: 15₁₀ = 1111₂ = 17₈ = F₁₆
– Note on page 56 of text that 7 bits are converted into hex; the authors "inserted" a leading 0 into each group of 7 bits to make a byte
– Notice that if you break each byte into 4-bit groups you can easily convert binary into hex (see the sketch below)
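A minimal Python sketch of that nibble trick (the function name is mine):

    # Convert binary to hex by reading 4-bit groups (nibbles) from the right
    def bin_to_hex(bits):
        bits = bits.zfill((len(bits) + 3) // 4 * 4)   # pad to a multiple of 4
        nibbles = [bits[i:i+4] for i in range(0, len(bits), 4)]
        return "".join("0123456789ABCDEF"[int(n, 2)] for n in nibbles)

    print(bin_to_hex("1111"))     # F   (15 decimal, 17 octal)
    print(bin_to_hex("1001101"))  # 4D  (the 7 bits gain a leading 0 to make a byte)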
Decimal – Hex – Binary Conversion Table
Dec   Hex   Bin
0     0     00000000
1     1     00000001
2     2     00000010
3     3     00000011
4     4     00000100
5     5     00000101
6     6     00000110
7     7     00000111
8     8     00001000
9     9     00001001
10    a     00001010
11    b     00001011
12    c     00001100
13    d     00001101
14    e     00001110
15    f     00001111
Fixed Length Codes
• Fixed length codes represent information in a format
someone has agreed to
• American Standard Code for Information Interchange (ASCII)--see Appendix A in text
– Structure is a 7-bit code (plus a parity bit or an "extended" bit in some implementations)
– ASCII can represent 128 (2⁷) symbols
– INFT 101 is:
73 78 70 84 32 49 48 49 (decimal) or
49 4E 46 54 20 31 30 31 (hex)
– 1001001 1001110 1000110 1010100 0100000 0110001 0110000
0110001 (using Appendix A)
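Python's built-in ord() reproduces the Appendix A lookup, so you can verify all three rows above (a quick check, not the text's method):

    for ch in "INFT 101":
        code = ord(ch)     # the character's ASCII code point
        print(ch, code, format(code, "02X"), format(code, "07b"))
    # e.g. the first line printed is: I 73 49 1001001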
Extended ASCII Chart
• You can easily convert
hex to binary
• Upper case D is 44₁₆
• 4₁₆ is 0100₂
• Upper case D is then 0100 0100 in binary
• Note 8 bits used
Extended ASCII Continued
• Another example: You
want to represent the Yen
sign (¥)
• You see it is 9D
• 9₁₆ = 9₁₀ = 1001₂
• D₁₆ = 13₁₀ = 1101₂
• The ¥ sign in binary is:
1001 1101
Other Text Codes
• Extended Binary Coded Decimal Interchange Code (EBCDIC), used by IBM: an 8-bit code (2⁸ = 256 symbols)
• Unicode is 16-bit (2¹⁶ = 65,536 symbols)
– World Wide Web supports
many languages
– Unicode supports Latin,
Russian, Cherokee, other
alphabet representation
– www.unicode.org
Error Detection and Correction
• Errors in transmission can be caused by many
factors
• Clever code construction or additional information
added to the transmission can increase the odds of
the sent information being received correctly
Parity Bit
• An extra bit is appended to the code
• Even parity is when the bit is set so that the total number of 1's in the "word" is even
– 11 → 11 0
– 10 → 10 1
• Odd parity is when the bit is set so that the total number of 1's in the "word" is odd
– 11 → 11 1
– 10 → 10 0
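A minimal Python sketch of both conventions (the function name is mine):

    # Append a parity bit so the total count of 1's is even (or odd)
    def add_parity(word, odd=False):
        ones = word.count("1")
        parity_bit = (ones + (1 if odd else 0)) % 2
        return word + str(parity_bit)

    print(add_parity("11"))            # 110 (even parity)
    print(add_parity("10"))            # 101
    print(add_parity("11", odd=True))  # 111 (odd parity)
    print(add_parity("10", odd=True))  # 100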
Parity Errors
• Even parity is set: 10 0 is received.
– An error is present
– However, we don't know which bit is in error
⇒ Parity can detect errors, but cannot correct them
• Examples:
– Odd parity is set: 1111011 0 is received. Error? Yes
– Even parity is set: 1111011 0 is received. Error? No
Repetition
• Repetition is the provision of extra bits to ensure
the information is received correctly
• Original data: 1 0 0 1
• Transmitted:
111 000 000 111
• Received:
011 000 001 111
– Errors in the groups for the first and third data bits are detected
– Majority voting corrects them: 011 → 1, 001 → 0
– Note: Not 100% reliable
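A sketch of three-times repetition with majority-vote decoding, using the example above:

    # Send each data bit three times; decode each group by majority vote
    def encode(bits):
        return "".join(b * 3 for b in bits)

    def decode(stream):
        groups = [stream[i:i+3] for i in range(0, len(stream), 3)]
        return "".join("1" if g.count("1") >= 2 else "0" for g in groups)

    print(encode("1001"))          # 111000000111
    print(decode("011000001111"))  # 1001 -- both single-bit errors corrected
    # Two errors inside one group would still fool the vote: not 100% reliable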
Redundancy Check
• Symbols are given a parity bit
• Total “word” is given a redundancy check
• The first bit of each symbol in the word is given
parity, then the second bit of each symbol in the
word is given parity
• The additional parity “symbol” is given its own
parity and then appended to the transmitted
information
Redundancy Check
• Information to be sent: 00 01 10 11
• With even parity this becomes: 00 0  01 1  10 1  11 0
• First bits are: 0 0 1 1 → odd parity bit: 1
• Second bits are: 0 1 0 1 → odd parity bit: 1
• Odd parity bits: 1 1 → even parity bit: 0
• Transmitted: 000 011 101 110 110
Redundancy Check Error
• Transmitted:
000 011 101 110 110
• Received:
000 111 101 110 110
• Even parity tells us that the second symbol has an
error
• Comparing Odd parity with the first bit in each
symbol shows us that the first bit in the second
symbol should be a 0
• Comparing Odd parity with the second bit in each
symbol shows us that everything is OK
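A sketch of that detect-and-correct step in Python, hardcoded to the 2-bit-symbol scheme above and assuming exactly one bit error in a data symbol (variable names are mine):

    # Rows: each symbol plus its even-parity bit. Columns: each bit position
    # plus an odd-parity bit stored in the final check symbol.
    received = ["000", "111", "101", "110", "110"]
    data, check = received[:-1], received[-1]

    # Row check: a data symbol is bad if its 1-count is not even
    bad_row = next(i for i, s in enumerate(data) if s.count("1") % 2 != 0)
    # Column check: a column is bad if its 1-count (with the check bit) is not odd
    bad_col = next(c for c in range(2)
                   if ("".join(s[c] for s in data) + check[c]).count("1") % 2 == 0)

    fixed = list(data[bad_row])
    fixed[bad_col] = "0" if fixed[bad_col] == "1" else "1"
    print(bad_row, bad_col, "".join(fixed))  # 1 0 011 -- the second symbol restored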
Functions
• In mathematics, a function is an association
between two sets of values in which each element
of one set has one assigned element in the other
set.
• Any element selected becomes the independent
variable (x) and its associated element is the
dependent variable (y).
• Thus, in:
y = f(x)
y is said to be a function of x.
Functions
• Common functions we might deal with:
– Functions of Time
– Functions of Space
– Functions of Frequency
– Functions of Power
• Usually, functions are represented as a relationship
on an x-y axis graph
The Sine Wave is a Function
Continuous vs. Discrete Functions
• Continuous functions are just that: a continuous function is made up of an infinite number of points; between any two points you can always pick another one
• Discrete Functions: A discrete function has a finite
number of elements (or values)
• Let’s draw the difference!
Continuous v. Discrete
• In the “real world” many phenomena can be
characterized as continuous (analog)
– Mercury thermometer
– Dial gauges
– Sound
– Perceived sight
• In the “digital world” phenomena are coded in
terms of 0’s and 1’s…explicit values
– They are “____________” (fill in the blank)
Temperature
• A mercury thermometer reflects a temperature that can continuously vary over its range of measurement
• A digital thermometer would require an infinite number of bits to accomplish the same thing
– So: if we are building a digital thermometer, we must
make some choices
– Precision (number of bits we will use) v. cost
– Accuracy (how true is our measurement against a
standard)
Temperature Waveforms
Thermometer Engineers
• What is the range we wish to measure?
• How many bits are we willing to implement?
• We want to measure -40 °F to 140 °F
– Total measurement range = 180 °F
• If our thermometer measures in 1° increments, we need to represent 180 steps
– We can do this with an 8-bit code (256 values)
• Accuracy: at what temperature does the thermometer
increment or decrement? And is it the right temperature?
(Not dependent on the coding scheme we choose!)
Design Review
• CEO wants the thermometer to read to the hundredth of a degree
– How many bits are required?
– 180 × 100 = 18,000 steps
– Can implement with a 15-bit code (2¹⁵ = 32,768)
– Are the unused bits "wasted"?
• Precision now: to the hundredth degree
– Is a precise instrument the same as an accurate one?
• How do we build the code now?
Thermometer Coding (One Answer)
140.00 °F → 1 000 110 010 100 00
139.99 °F → 1 000 110 010 011 11
139.98 °F → 1 000 110 010 011 10
139.97 °F → 1 000 110 010 011 01
0.00 °F → 0 001 111 101 000 00
-39.98 °F → 0 000 000 000 000 10
-39.99 °F → 0 000 000 000 000 01
-40.00 °F → 0 000 000 000 000 00
Thermometer Coding (Another Way)
0.04 °F → 0 000 000 000 001 00
0.03 °F → 0 000 000 000 000 11
0.02 °F → 0 000 000 000 000 10
0.01 °F → 0 000 000 000 000 01
0.00 °F → 0 000 000 000 000 00
-0.01 °F → 1 111 111 111 111 11
-0.02 °F → 1 111 111 111 111 10
-0.03 °F → 1 111 111 111 111 01
What Was That?
• 2’s complement notation
– is a method to represent negative numbers
– Take the positive binary representation with the MSB as
a sign bit (a 0 makes the representation positive)
• 01001 (+9₁₀)
– Take the binary complement (flip the bit values)
• 10110
– Add a binary 1
• 10110 + 1 = 10111 (-9₁₀)
⇒ I'll specifically tell you to do something in 2's complement on a homework or test
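The same steps in Python, assuming a 5-bit representation:

    width = 5
    mask = (1 << width) - 1         # 11111
    plus9 = 0b01001
    flipped = plus9 ^ mask          # 10110 -- the binary complement
    minus9 = (flipped + 1) & mask   # add a binary 1
    print(format(minus9, "05b"))    # 10111
    # Sanity check: -9 + 9 should wrap around to 0 in 5 bits
    print((minus9 + plus9) & mask)  # 0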
Speaking Digital
• Digital can’t absolutely capture analog
(continuous) functions
• Trades are made in length of code-word
– Longer code-word: more precision
– Shorter code-word: less precision
• So why is digital “better”?
– Accurate reproduction
– Error detection and correction
– 0’s and 1’s are easier for electronic systems to deal with
(encryption, multiplexing, regenerating, amplifying)
Audio Waveform
Storage Needed
• Digital audio signal: 16 bits per sample
• Sampling: one sample every 0.0000227 sec
– 44,100 samples per second
– 2 channels (stereo signal)
• In total, we need 1,411,200 bits for every second of music
– 16 bits/sample × 44,100 samples/sec × 2 channels
• Adding error correction will multiply this storage requirement
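The arithmetic, spelled out in Python:

    bits_per_sample = 16
    samples_per_second = 44_100        # one sample every ~0.0000227 s
    channels = 2                       # stereo
    bits_per_second = bits_per_sample * samples_per_second * channels
    print(bits_per_second)             # 1,411,200 bits per second of music
    print(bits_per_second * 60 // 8)   # ~10.6 million bytes per minute, before error correction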
Protocols
• Believe it or not, you use protocols every
day
– When you see a red, octagonal sign, you stop
– When you pick up the telephone when it rings, you say
“hello?”
– The procedure in checking out a library book
– Using the ATM
• Protocols give structure and provide rules,
but they aren’t based on anything but
human convention, agreement and
understanding
Protocols
• IT is also built upon protocols:
– Hypertext Transfer Protocol (HTTP)
– Simple Mail Transfer Protocol (SMTP)
– Transmission Control Protocol/Internet Protocol
(TCP/IP)
– And just about everything else we will talk about
• We’ve already discussed the convention of the
byte
– Four (or eight) bytes is a word
– Four bits (half a byte) is called a nibble
Who Makes Protocols?
• International Organizations
– International Telecommunication Union (ITU)
– International Organization for Standardization (ISO)
• National Organizations (US)
– American National Standards Institute (ANSI)
• Professional Organizations
– Institute of Electrical and Electronic Engineers (IEEE)
• Trade Groups/Consortiums: Sony/Philips and the CD
• Companies: Microsoft, Cisco, Bell (back in the day),
3Com, others…..
Why Protocols?
• Protocols provide the rules under which a desired
function can occur
• Greatly simplify production and development of
new products; compatibility with other vendors
and older equipment
• Bring order to chaos
– The text has an example of 140 different formats of floppy disk storage on the same 5 1/4 inch media
• Provide new and better functionality
Protocol Problems
• New protocols drive obsolescence
– MSWindows XP won’t use the 95 kernel (allegedly)
– And perhaps obsolescence of the preceding/competing
protocol
• The “best” protocol doesn’t always win
– Market forces and inertia sometimes win out
• DVD v. DiVX
• What’s the standard?
– Physical devices: a 3.5" floppy disk contains 160 tracks × 18 sectors × 512 bytes = 1.44 MB
• In a word: Interoperability
Some Examples (Data Transmission)
• How do we know where the bit stream begins?
– Start bits; stop bits (Guess what? These are protocols)
– Flags--used in Ethernet, High-Level Data Link Control
(HDLC)
Flag = 01111110
HDLC Protocol Challenges
• The flag pattern 01111110 (= ASCII ~) can also appear inside ordinary data, where a receiver would mistake it for a frame boundary
• To fix this problem, a rule had to be incorporated into the protocol:
– Transmitter: Whenever you have five 1’s in a row,
insert an unneeded 0, and add the flags after this
procedure
– Receiver: Whenever you receive five 1’s in a row,
followed by a 0, discard the 0
• This procedure is known as Bit Stuffing, or Zero
Bit Insertion
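A sketch of both sides of the rule in Python (illustrative only, not a full HDLC implementation; the data string is made up):

    # Transmitter: after five 1's in a row, insert a 0
    def stuff(bits):
        out, run = "", 0
        for b in bits:
            out += b
            run = run + 1 if b == "1" else 0
            if run == 5:
                out += "0"
                run = 0
        return out

    # Receiver: drop the 0 that follows any five 1's in a row
    def unstuff(bits):
        out, run, i = "", 0, 0
        while i < len(bits):
            out += bits[i]
            run = run + 1 if bits[i] == "1" else 0
            if run == 5:
                i += 1          # skip the stuffed 0
                run = 0
            i += 1
        return out

    data = "0111111001111101"             # contains runs that could mimic a flag
    print(stuff(data))                    # 011111010011111001
    print(unstuff(stuff(data)) == data)   # True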
Protocol Takeaways
• Check the protocols you are using
• Check them again
• Make sure the other guy is using the same protocols
• Check again
• New protocols build on older ones to support interoperability and are approved by standards bodies to ensure availability and compliance
– But specific vendor implementation may cause
interoperability problems (that’s why you check)
• Some protocols are introduced as bridges between
incompatible protocols -- Rich Text Format (RTF) protocol
Visual Representation and Display
• “A picture is worth ten thousand words”
– But it takes a whole lot more than 10,000 bits usually!
• A visual image is a powerful way to convey
information
• The need to process a huge amount of image/video
data requires a lot of computing power
– Specialized processors
– New type of computers
• A very dynamic growth in consumer electronics
Image Issues
• The world we live in is “analog”
– continuously variable
• Problem for digitizing, or making discrete
• But:
– We are producing information for human use
– Human vision has limitations
– Take advantage of this fact
– Produce displays that are "good enough"
Digital Info for Humans
• Many digital systems take advantage of human limitations
(visual, aural, etc)
• Human gray scale acuity is 2% of full brightness
– Or: 50 gray levels can capture what most people can detect
• Human eye can resolve about 60 lines per degree of visual arc
• 8.5 x 11 sheet of paper at 1 foot (landscape) contains 49.25 degrees
horizontal and 39 degrees vertical
– 49.25 degrees x 60 = 2955 horizontal lines we can just
distinguish
– 39 degrees x 60 = 2340 vertical lines we can just distinguish
• These numbers give us a clue about how long our code will need to
be to capture images
Pixels and Lines
• A pixel is the smallest unit of representation for
visual information
• In order to form a black line, you need two rows
of pixels (one black and one white) to give a
visual clue of transition
• So: two rows of pixels are needed per line
• Go back to our paper example:
– Number of pixels needed to represent the total image:
– (2 × 2,955 horizontal) × (2 × 2,340 vertical) = 5,910 × 4,680 = 27,658,800 pixels
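The arithmetic in Python:

    h_lines = 49.25 * 60          # 2955 lines across 49.25 degrees of arc
    v_lines = 39 * 60             # 2340 lines across 39 degrees
    pixels = (2 * h_lines) * (2 * v_lines)   # two pixel rows per line
    print(int(pixels))            # 27,658,800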
Check for Yourself
• On the CD that accompanies the text, play with
the visual representation applets
• Some changes (# pixels or # colors) have little
impact upon your ability to discern the image
• Past a certain point, though, degradation becomes very apparent
• On pages 84 and 85 you can see the effects of
going from a 6-bit to 1-bit grayscale code
Picking the Codes
• The assignment of a code to a particular analog
value is called Quantization
• Quantization results in the loss of some information (and therefore precision)
– some values are not exactly represented
• More detail on Quantization in a later lecture
Spatial Resolution
• Tradeoff
– When would we want more pixels?
– When would we want fewer pixels?
Shades of Gray
• Example: the same image at 6-bit (64 gray levels), 3-bit (8 gray levels), and 1-bit (2 gray levels)
Color Representation
• Instead of grayscale, assign coding to the levels of each of
three colors--Red, Green, Blue (RGB)
• Another approach is Hue (light spectrum), Luminance
(brightness) and Saturation (dilution by white) (HLS)
• Both systems use 3 bits per component (RGB or HLS), for
a total of 9 bits--512 different levels of representation
• Limited color scheme, so additional palettes are available,
each with a 512 level representation (Color Palette
Representation)
• Windows supports 32-bit color: a much larger choice of colors than 9-bit RGB!
Video
• Human perception of movement is slow
– Humans can only take in 20 different images per
second
• Movies show 24 frames per second
– We can detect higher rates of flicker, but only to about
80 cycles per second
• TV works similarly
– instead of a frame, TV refreshes in lines across the tube
• What's going on? We make a continuous function discrete: it is possible to capture the image with bits!
Your Computer Display
• Described in terms of pixels:
– 640 x 480
– 800 x 600
– 1024 x 768
– At 60 Hz, how many pixels are being refreshed at 1024 x 768 resolution per second? 1024 × 768 × 60 = 47,185,920
• Encoding supports some number of colors
– 256 (one byte)
– 65,536 (two bytes)
– 16,777,216 (three bytes or indexed): see p. 99 of text
– Using the number of pixels computed above, how many bytes are being processed per second for 65,536 colors? 47,185,920 × 2 bytes = 94,371,840 bytes per second
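The same computation in Python:

    width, height, refresh_hz = 1024, 768, 60
    pixels_per_second = width * height * refresh_hz
    print(pixels_per_second)                    # 47,185,920
    bytes_per_pixel = 2                         # 65,536 colors = 16 bits
    print(pixels_per_second * bytes_per_pixel)  # 94,371,840 bytes per second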
Variable Length Coding (Compression)
• ASCII is a fixed length code of 7 bits
– 8 bits for Extended ASCII
• One example of variable length coding is Morse Code:
– Certain letters are used more frequently than others in the English language, so the most frequent letters get the shortest codes:
“.” = E    “-” = T    “--..” = Z    “--.-” = Q
• Why bother?
– Reduce transmission time
– Reduce storage requirement
Definitions
• Message: Values or meanings to be conveyed
• Data: A stream of symbols (voltages, 1’s/0’s)
• Information: Amount of “surprise” in the
message
– Take pi: it can be symbolized by π or by a series of non-repeating digits, 3.141592654…
– Easier to send π than the whole string of digits
– We already know what π’s value is
– We can take advantage of this fact and compress the
data we send
Basic Information Theory
• Claude E. Shannon developed modern
Information Theory at Bell Labs in 1948
• He saw the relationship between the probability of
appearance of a transmitted signal and its
information content
• This was a key concept, and allowed for
compression techniques
Probability
• Shannon found that information can be related to
probability (Others had done work as well, but Shannon
gets credit for Info Theory)
• Consider a Fair coin toss:
– Heads or Tails each has a probability of .50
– In two tosses, both heads probability is: (.5 x .5) = .25
– In three tosses, all tails: (.5 x .5 x .5) = .125
• We compute probability this way because the result of each
toss is independent of the results of other tosses
Entropy
• If the probability of a binary event is
.5 (like a fair coin toss) then, on
average you need one bit to represent
the result of this event.
• As the probability of a binary event
increases or decreases, the average
number of bits you need to represent
the result decreases
• The figure is expressing that unless an
event is totally random, you can
convey the information of the event in
fewer bits, on average, than it might
first appear
• Let’s step through the store example in
the text
How do you measure code effectiveness?
• The authors describe the store example
– 80% of the customers are male, 20% female
– For now, just accept the codes assigned
• M-M: 0 M-F: 10 F-M: 110 F-F: 111
– (1)(.64) + (2)(.16) + (3)(.16) + (3)(.04) = 1.56 bits/event
– The leading number is the number of bits contained in the
code
– The fractional number is the probability of that code’s
occurrence
– We achieved: 1.56 < 2 bits/event
– 2 bits/event is what a fixed-length code would need: the upper bound
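The expected-length computation in Python:

    # Expected bits/event = sum of (code length x probability of that code)
    p_m, p_f = 0.8, 0.2
    codes = {"MM": "0", "MF": "10", "FM": "110", "FF": "111"}
    probs = {"MM": p_m * p_m, "MF": p_m * p_f, "FM": p_f * p_m, "FF": p_f * p_f}
    avg = sum(len(codes[e]) * probs[e] for e in codes)
    print(round(avg, 2))   # 1.56 bits per pair, versus 2 for a fixed-length code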
Huffman Coding
• Must know the probability of event occurrence
• Order the events by probability of occurrence
• Group the two events with the lowest probabilities into a pair
• Compute the probability of occurrence for the pair (i.e., add the probabilities)
• Keep pairing, even if you pair previous pairings (keep pairing
lowest unpaired probabilities!)
• Add your codes to the events--watch that your codes are
shortest for the most common occurrences and that each code
can be uniquely identified
• Use a Convention: Assign “0” to the lowest probability in any
pair (Could do the opposite also)
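One way to implement the pairing procedure is with Python's standard heapq priority queue; this sketch follows the slide's convention (bit 0 to the lower probability of each pair) and reproduces the codes built in the examples that follow:

    import heapq
    from itertools import count

    def huffman(probs):
        tie = count()   # tie-breaker so the heap never has to compare dicts
        heap = [(p, next(tie), {s: ""}) for s, p in probs.items()]
        heapq.heapify(heap)
        while len(heap) > 1:
            p0, _, lo = heapq.heappop(heap)   # lowest probability: branch bit 0
            p1, _, hi = heapq.heappop(heap)   # next lowest: branch bit 1
            merged = {s: "0" + c for s, c in lo.items()}
            merged.update({s: "1" + c for s, c in hi.items()})
            heapq.heappush(heap, (p0 + p1, next(tie), merged))
        return heap[0][2]

    codes = huffman({"A": .33, "B": .27, "C": .13, "D": .12,
                     "E": .1, "F": .04, "G": .01})
    for s in "ABCDEFG":
        print(s, codes[s])   # A 11, B 10, C 011, D 010, E 001, F 0001, G 0000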
Example (1)
[Tree diagram: the two lowest probabilities, F (.04) and G (.01), are paired into FG (.05); F takes branch bit 1, G takes branch bit 0. Still unpaired: A (.33), B (.27), C (.13), D (.12), E (.1)]
Example (2)
[Tree diagram: the lowest remaining, E (.1) and FG (.05), are paired into EFG (.15); E takes 1, FG takes 0]
Example (3)
[Tree diagram: the lowest remaining, C (.13) and D (.12), are paired into CD (.25); C takes 1, D takes 0]
Example (4)
[Tree diagram: the lowest remaining, CD (.25) and EFG (.15), are paired into CDEFG (.4); CD takes 1, EFG takes 0]
Example (5)
[Tree diagram: A (.33) and B (.27) are paired into AB (.6); A takes 1, B takes 0]
Example (6)
[Tree diagram: the final pair, AB (.6) and CDEFG (.4), joins at the root; AB takes 1, CDEFG takes 0. The tree is complete]
Example (7)
Did we achieve good coding?
A → 11
B → 10
C → 011
D → 010
E → 001
F → 0001
G → 0000
• Most common: shortest code
• Least common: longest code
Example (8)
• Using our just-constructed code, how should you decode this string?
11011000011110001001000100
11 011 0000 11 11 0001 001 0001 00
A  C   G    A  A  F    E   F
• How do we know this is the code? Every codeword is unique in form; note the "wasted" bit patterns
– Not all combinations are utilized
– That is: there is no such thing as "0" or "00", so the trailing 00 is not a complete codeword
Universal Coding
• Huffman Coding has limited utility
– You must know a priori the probability of the
characters or symbols you are encoding
• Universal coding allows you to “compress as you
go”
– Lempel-Ziv coding is one form
– Used by WinZip to compress info
Lempel-Ziv
• Similar idea to Huffman Coding
– find the unique identifiers and use them
• First, you parse (break into pieces) the code string
• Then you make a table
– (table must be known by both ends communicating)
• List your parsed information, including a “Null” as the first
parse
• Assign codes to each parsed segment
– this code becomes the prefix
• Lastly, encode your information: for each parsed segment, send the code of its prefix (the segment minus its last bit) followed by that last bit
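A sketch of the parse-and-encode procedure in Python (LZ78-style, as described above; names are mine, and it reproduces the two examples on the following slides):

    def lz_parse(bits):
        phrases, seen, cur = [], set(), ""
        for b in bits:
            cur += b
            if cur not in seen:       # a phrase we haven't stored yet
                seen.add(cur)
                phrases.append(cur)
                cur = ""
        return phrases                # any leftover partial phrase is ignored here

    def lz_encode(bits):
        phrases = lz_parse(bits)
        width = len(phrases).bit_length()   # enough bits for null + all phrases
        index = {"": 0}                     # the "null" prefix
        out = []
        for i, ph in enumerate(phrases, start=1):
            index[ph] = i
            out.append(format(index[ph[:-1]], f"0{width}b") + ph[-1])
        return out

    print(lz_parse("10101101101010"))   # ['1', '0', '10', '11', '01', '101', '010']
    print(lz_encode("10101101101010"))  # ['0001', '0000', '0010', '0011', '0101', '0111', '1010']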
For Example
First, we Parse:
• 10101101101010
– 1,0,10,11,01,101,010
• 000011111110001110
– 0,00,01,1,11,111,000,1110
Then we code
– We must set up codes that uniquely identify each prefix,
to include the null
First Problem
• Codes:
Prefix   Code
null     000
1        001
0        010
10       011
11       100
01       101
101      110
010      111
• Parse: 1, 0, 10, 11, 01, 101, 010
• The first segment is 1, i.e. null + 1
– Coded as 000 1
• Why 3 bits? 8 things (null plus 7 parsed segments) need to be encoded: 3 bits
• The entire bit string 1, 0, 10, 11, 01, 101, 010 is encoded as:
0001 0000 0010 0011 0101 0111 1010
• NOTE: No compression in this example, but over longer strings Lempel-Ziv can approach maximum compression
Second Problem
• Codes:
Prefix   Code
null     0000
0        0001
00       0010
01       0011
1        0100
11       0101
111      0110
000      0111
1110     1000
• Parse: 0, 00, 01, 1, 11, 111, 000, 1110
• The first segment is 0, i.e. null + 0
– Coded as 0000 0
• The entire bit string 0, 00, 01, 1, 11, 111, 000, 1110 is encoded as:
– 00000 00010 00011 00001 01001 01011 00100 01100
• Note: Different code, different number of bits (9 things to encode: 4 bits needed)
• This information must be shared at the other end to decode!
Variable Codes - Summary
• Take advantage of probabilistic nature of information
• Variable codes achieve their efficiency over longer
information strings
– Lempel-Ziv achieves large efficiencies over long strings of info
– Over short strings, Lempel-Ziv is very inefficient compared to
Fixed Length or Huffman Coding
• If we had total uncertainty about the information we are
conveying, fixed length codes are better
Introduction: Computer Architectures and Software
• The IT world is based on Computers--you should
have a basic understanding of how the computer
works and about the software that drives it
The Desktop PC
[Block diagram: Input/Output, Storage, and Memory connected to the Central Processing Unit]
Input/Output
• Two ways to think of I/O:
– #1 At the board/chip level (Physical Connections)
– #2 What is connected to the board (User I/O’s)
– Which view you take depends on whether you are a computer architect or a peripheral device designer/user
• The board:
– Parallel, serial, network, game, digital ports
• What is connected to the board
– Printer, scanner, keyboard, joystick, modem, speaker,
monitor, network card
Storage
• Used by the computer to store programs, information, data (Write)
• Used by the computer to access information and programs when
necessary (Read)
• Some storage can do both read and write, some only one
• Some storage can randomly access information, some can only
sequentially access
• Magnetic
– Hard disk, floppy disk, ZIP disk, tape
• Optical
– CD, DVD
• Solid State
– Flash cards, PCMCIA cards
Memory
• Memory helps computers operate faster and more
efficiently
• Random Access Memory (RAM)
– Memory chips in a PC
– Virtual Memory
– Cache Memory (on the CPU)
• Read-Only Memory (ROM)
– CD-ROM
– DVD-ROM
The Central Processing Unit (CPU)
• The “brains” of the computer
• Performs calculations and completes instructions
• Performance is based on clock speed and on how many instructions the chip can accomplish per cycle
– Pentium 4: a 2.0 GHz chip operates at 2 billion cycles per second and can do one instruction per cycle per pipeline ("20 stage depth pipeline" says the information Intel provides on their website: www.intel.com)
The CPU
[Block diagram: the CPU contains a Control Unit, an Arithmetic Logic Unit (ALU), Registers, Flags, and Cache Memory, and connects to Storage, Memory, and Input/Output]
Simple Instruction
• You want your computer to add 1 + 2 + 4:
MOV 1, R0    (Move the number 1 into Register 0)
MOV 2, R1    (Move the number 2 into Register 1)
ADD R0, R1   (Add R0 to R1 and put the result in R1)
MOV 4, R0    (Move the number 4 into Register 0)
ADD R0, R1   (Add R0 to R1 and put the result in R1)
• MOV and ADD are examples from the chip's "instruction set"
• "Real" computers have many more registers
Chip Types
• Complex Instruction Set Computer (CISC)
– Many forms of instructions (some special purpose, but chip must
support them all)
– x86, Pentium
• Reduced Instruction Set Computer (RISC)
– Specific list of supported instructions (Want to do something else?
Find a combination of instructions to accomplish the task)
– Power PC (Motorola), Alpha, IBM RISC System/6000, Sun SPARC,
MIPS
• Difference is speed of execution: RISC is faster, but may require longer
instruction combinations to accomplish the same task as a CISC
Hierarchy of Software
(Above the Abstraction Level [i.e. 0’s and 1’s])
Application
Computer Language
(High Level Language)
Operating System
Assembly Code
Assembly Code
• Assembly code: the most basic level of abstraction
– ADD R1, R2
– MOV ADDR, R1
– Different computer systems (RISC v. CISC) use
different languages
– Breaks down instructions to their most basic--at the
level of implementation of the chip (or what
instructions a particular chip will understand)
Operating System
• OS provides for common system tasks
– Display information on a monitor
– Print services
– User interface
• Text based
• Graphical User Interface (GUI)
• Key note: an OS can run on different chips; the difference is in what language the OS speaks to the chip (assembly language)
• Examples:
– MS-DOS, Windows 3.0, Windows 95/98/2000/Me/XP
– Microsoft NT/2000 Professional/Server/XP Professional
– Mac OS X (and other, earlier versions)
– Linux
– HP Unix
– Sun Solaris
– IBM OS/2
Computer Language
(High Level Language)
• Used to program instructions for computer use
– Eases programming, since the “Compiler” translates almost human
syntax into lower level code
• Examples:
– Java
– Fortran
– C++
– Visual Basic
– Pascal
– Ada
– HTML
Application Software
• This is how we usually interface with the
computer
– MS Word
– MS PowerPoint
– Lotus Notes
– Quicken, TurboTax
– AOL Instant Messenger
– Etc., etc., etc.