CS311 Lecture: Representing Information in Binary;
               Octal and Hexadecimal Shorthands
Last revised 8/27/03
Objectives:
1. To review binary representation for unsigned integers
2. To introduce binary representations for signed integers
3. To introduce the IEEE 754 floating point representation
4. To discuss the basic process of doing floating point arithmetic
5. To overview binary representations for characters, sounds, graphics
6. To introduce octal and hexadecimal shorthands
Materials:
1. Sample Editor program
2. Transparency summarizing IEEE floating point formats
3. RGB Demo Applet
I. Introduction
- ------------
A. One of the key suggestions in Von Neumann's landmark paper was that
computer systems should be based on the binary system, using just
the two bit values 0 and 1 to encode information.
1. This is consistent with the nature of the physical devices used
to construct computer systems, and has contributed to the high
reliability we associate with digital systems.
2. Although Von Neumann was thinking just in terms of representing
integers in binary, we have since learned to represent many other
kinds of information this way, including text, sounds, and graphics.
3. To this day - over 5 decades later, and through several changes
in fundamental technology - this idea still dominates the field of
computing.
B. At the outset, we should note that there is a crucial distinction between
a NUMBER (an abstraction) and a REPRESENTATION of a number (a symbol).
1. For example, the number we call four can be represented by the
following symbols:
FOUR   QUATUOR   4   IV   ||||   11 (base 3)   100 (base 2)   etc.
2. On the other hand, the symbol 100 could stand for 4 (as above) or for
one hundred (decimal system) or for 9 (trinary system) or for any one
of an infinite variety of possible values.
C. Most often, we symbolize numbers by a symbol based on a positional
notation system. The development of positional notation was a crucial
advance in the development of modern mathematics. In a positional
system, we have a base or radix (such as 2 for binary, 3 for trinary, or
ten for decimal). We make use of a limited set of symbols for
representing values 0 .. radix - 1.
1. For example, in binary, we use two symbols (0,1); in octal (base 8)
we use 8 symbols (0,1,2,3,4,5,6,7); in hexadecimal (base 16) we use
16 symbols (0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F) etc.
2. When there is any possibility of confusion, we denote the radix of a
numerical symbol by a subscript (written using decimal notation!) -
eg: 90_10 (decimal) vs 90_16 (hexadecimal)
D. Further, in a positional system the value of a symbol depends on where it
is written. The rightmost symbol (digit) has the value (symbol_value)*1;
the one next to it has value (symbol_value) * radix etc. Thus, the
radix-2 (binary) number 01000001 is equivalent to the decimal number:
0*128 + 1*64 + 0*32 + 0*16 + 0*8 + 0*4 + 0*2 + 1*1 = 65_10
E. In this and subsequent lectures, we will consider how various kinds
of information can be represented in binary. Today, we limit
ourselves to unsigned integers.
II. Review of Internal Encoding for Unsigned Binary Integers
-- ------ -- -------- -------- --- -------- ------ --------
A. The normal way of encoding unsigned binary integers is to use a
straightforward place-value system, in which the bits are assigned
weights 2^0, 2^1, 2^2 etc. Thus, for example, the binary integer 1011
would be interpreted as
1 * 2^0 = 1
+ 1 * 2^1 = 2
+ 0 * 2^2 = 0
+ 1 * 2^3 = 8
---
11
B. Note that with n bits we can store values in the range 0 .. 2^n - 1
C. Conversion between decimal and binary
1. To go from binary to decimal, we use the basic approach outlined above:
multiply the rightmost bit by 2^0, the next bit by 2^1, the next bit
by 2^2 etc. Add the products. (It helps if you memorize the powers
of 2)
Example: convert 10011101 to decimal (157)
Exercise: 10101010 (170)
2. To go from decimal to binary, we can use successive division:
Divide the decimal number by 2. The remainder is the rightmost bit of
the binary equivalent. Divide the quotient by 2. The new remainder
is the second rightmost bit. Divide the new quotient by 2. The new
remainder is the third rightmost bit ... Continue until the quotient is 0.
Example: convert 238 to binary
238 / 2 = 119 rem 0 <- least significant bit
119 / 2 = 59 rem 1
59 / 2 = 29 rem 1
29 / 2 = 14 rem 1
14 / 2 = 7 rem 0
7 / 2 = 3 rem 1
3 / 2 = 1 rem 1
1 / 2 = 0 rem 1 <- most significant 238 => 11101110
Exercise: 252 (11111100)
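   The successive-division method is easy to express in code. Below is a
   minimal C sketch (the function name print_binary and the 32-character
   buffer are our own choices, not part of any standard library):

      #include <stdio.h>

      /* Convert a machine integer to printed binary using successive
         division: each remainder on division by 2 is the next bit,
         least significant first. */
      void print_binary(unsigned n)
      {
          char bits[32];              /* enough for a 32-bit unsigned */
          int i = 0;
          do {
              bits[i++] = '0' + n % 2;   /* remainder = next bit */
              n /= 2;                    /* quotient feeds the next step */
          } while (n > 0);
          while (i > 0)               /* remainders came out LSB first */
              putchar(bits[--i]);
          putchar('\n');
      }

      int main(void)
      {
          print_binary(238);          /* prints 11101110 */
          print_binary(252);          /* prints 11111100 */
          return 0;
      }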
D. Adding binary numbers: As a child, you learned how to do addition based
on an addition table like the following:
      +    0    1    2    3    4    5    6    7    8    9
      0    0    1    2    3    4    5    6    7    8    9
      1    1    2    3    4    5    6    7    8    9    0+c
      2    2    3    4    5    6    7    8    9    0+c  1+c
      3    3    4    5    6    7    8    9    0+c  1+c  2+c
      etc.    (where +c denotes a carry into the next column)
For binary, the table is much simpler:
      +    0    1
      0    0    1
      1    1    0+c
Example: 01011010 Check: 90
01101100 108
-------- ---
11000110 198
Exercise: 00111001
01011010
--------
10010011
E. One issue to be sensitive to in addition and other operations is
OVERFLOW. Computer hardware typically uses a fixed number of bits
to represent integers, so an erroneous result will be produced if the
correct sum is too big for the representation.
Example: assume a certain machine uses 8 bits for integers. Consider
the following problem:
11001000 200
11001000 200
-------- ---
10010000 144 !!
The error arises because 8 bits cannot represent the correct sum.
(Here, we detect overflow by the fact that there was carry out of the
most significant bit position.)
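   The same phenomenon can be seen in a short C sketch, using the uint8_t
   type to force 8-bit arithmetic (the variable names are ours):

      #include <stdint.h>
      #include <stdio.h>

      int main(void)
      {
          uint8_t a = 200, b = 200;
          uint8_t sum = a + b;    /* truncated to 8 bits: 400 becomes 144 */
          int carry_out = ((unsigned)a + (unsigned)b) > 0xFF;
          printf("%u + %u = %u, carry out = %d\n", a, b, sum, carry_out);
          /* prints: 200 + 200 = 144, carry out = 1 */
          return 0;
      }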
F. Other arithmetic operations can be done on binary numbers in a manner
analogous to the decimal operations, but using the binary tables.
III. Representations for Signed Integers
---- --------------- --- ------ --------
A. The method we have developed thus far for representing integers in binary
only allows us to represent integers >= 0. Obviously, we also need
a way to represent negative numbers.
B. In decimal, we represent negative numbers by representing their absolute
value and preceding it by a negative sign. This, in effect, means that
we have added one more symbol to our set of numerical symbols (a 10%
increase in the size of our set of symbols). On a computer, this is
highly undesirable, since we would then have to have three different
symbols to represent numbers with, instead of two (a 50% increase in the
size of our set of symbols.)
C. Instead, we take advantage of the fact that the number of bits in
a binary number is fixed by the hardware design (the machine's word
length).
1. If we wish to represent signed numbers, then, we can reserve the
leftmost bit as a sign bit, and interpret it as follows:
0 --> the number is >= 0 1 --> the number is negative
2. A consequence of reserving one bit for the sign is that the largest
positive number we can represent is about half as big as what we
could have represented with the same number of bits using unsigned
notation (we've given up one bit for the sign.) (This is why some
languages (e.g. C++) allow integers to explicitly be declared as
unsigned - to allow all bits to be used for representing magnitude.)
3. An important ramification of a choice like this is that we have
to be careful about OVERFLOW.
a. With a fixed number of bits allocated for representing an integer,
an addition can produce a result that is too big to be represented
in the specified number of bits.
Example: represent 100 (decimal) as an 8 bit signed binary number:
01100100 (leftmost bit is sign - 0)
Now take the sum 100 + 100 = 200. This results in overflow,
because we only have seven bits available for representing the
magnitude of the number (since one bit is reserved for the sign)
b. With the representation schemes we will be using, undetected
overflow will typically produce a result of the wrong sign -
e.g. the binary representation for 200 in 8 bits is
11001000
but this looks like a negative number since the leftmost bit
is 1!
D. With this basic encoding, there are actually three possible schemes for
representing signed numbers. All agree on using the leftmost bit
in the number to encode the sign - with 0 encoding + and 1 encoding -.
They differ in what they do with the remaining bits.
1. Sign magnitude
2. One's complement
3. Two's complement
We will only discuss the first and last schemes, since one's complement
is rarely used.
E. For the following examples, we will assume the use of 8 bits to
represent an integer - 1 for the sign and 7 for the value. This is
just so we don't have to write lots of 0's or 1's on the board -
most typically we use 32 bits (or even 64 bits) to represent an integer.
IV. Sign-Magnitude Representation
-- -------------- --------------
A. The simplest way to represent signed numbers is called sign-magnitude.
It is based on the method we use with decimal numbers:
1. To represent a positive number in (say) 8 bits, represent its
magnitude as seven bits and prefix a 0.
2. To represent a negative number in (say) 8 bits, represent its magnitude
as seven bits and prefix a 1.
Example: +65 --> 1000001 (7 bits) --> 01000001
-65 --> 1000001 (7 bits) --> 11000001
Exercise: + 72, -72 (01001000, 11001000)
3. To change the sign of a number, invert the sign bit.
B. Some Ramifications of this Representation
1. There are two representations for zero: 00000000, 10000000
(The latter can be used as an "undefined value" to be stored in
uninitialized variables. If the hardware ever reads it from memory,
a trap can be generated to indicate that the programmer has used
an undefined variable.)
2. Range of values representable with n bits: -(2^(n-1) - 1) .. 2^(n-1) - 1
3. Unfortunately, while simple for us, sign-magnitude is awkward in
hardware. For example, the algorithm to add two sign magnitude numbers
looks like this (see also the C sketch at the end of this section):
If signs agree: add magnitudes, retaining signs. Overflow
occurs if the sum magnitude exceeds 2^(n-1) - 1
- i.e. if there is carry out of the most
significant bit of the magnitude sum.
If signs differ: find number with greater magnitude and retain
its sign. Form magnitude of result by
subtracting smaller magnitude from larger.
(Overflow cannot occur.)
Examples:    0 0000001      1
           + 0 0000001    + 1
             ---------    ---
             0 0000010      2

             1 1000000    -64
           + 0 0000001      1
           becomes (signs differ - subtract smaller magnitude from
           larger, keep sign of larger):
             1 1000000
           -   0000001
             ---------    ---
             1 0111111    -63

             1 0000001     -1
           + 1 1111111   -127
             ---------   ----
             1 0000000    OVERFLOW - CARRY OUT OF MOST
                          SIGNIFICANT BIT OF MAGNITUDE SUM
Exercise: 00100000
+ 10000010
--------
00011110
4. Actually, multiplication and division are more straightforward:
multiply/divide magnitudes and set sign = xor of original signs.
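   Here is a C sketch of the addition algorithm from point 3, assuming an
   8-bit format with bit 7 as the sign and bits 0..6 as the magnitude; the
   function sm_add and its overflow flag are our own illustrative names,
   not something any particular hardware exposes:

      #include <stdint.h>

      /* Add two 8-bit sign-magnitude numbers.  Bit 7 is the sign,
         bits 0..6 the magnitude. */
      uint8_t sm_add(uint8_t a, uint8_t b, int *overflow)
      {
          uint8_t sign_a = a & 0x80, mag_a = a & 0x7F;
          uint8_t sign_b = b & 0x80, mag_b = b & 0x7F;
          *overflow = 0;
          if (sign_a == sign_b) {
              /* Signs agree: add magnitudes, retain the sign.  Overflow
                 iff there is carry out of the 7-bit magnitude sum. */
              unsigned sum = (unsigned)mag_a + mag_b;
              if (sum > 0x7F) *overflow = 1;
              return sign_a | (sum & 0x7F);
          } else {
              /* Signs differ: keep the sign of the larger magnitude and
                 subtract the smaller magnitude from the larger. */
              if (mag_a >= mag_b)
                  return sign_a | (mag_a - mag_b);
              else
                  return sign_b | (mag_b - mag_a);
          }
      }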
C. Sign-magnitude is rarely used for integers, but is used as part of
the representation for reals as we shall see later.
V. Two's Complement Representation
- ----- ---------- --------------
A. The most commonly used scheme for representing signed numbers is
called two's complement.
1. The basic idea is this: when we write a positive number, we
are really writing an abbreviation for the representation, in the
sense that the number could be thought of as having infinitely many
0's to the left - e.g.
The representation for 42 (using 8 bits) is 00101010 - but we could
think of this as standing for
... 00000000000000000101010
infinitely many 0's
2. To form the representation for a negative number, think of what would
happen if we subtracted its absolute value from 0 (-X = 0 - X). E.g.
if we subtracted the representation for 42 from 0, we would get
... 00000000000000000000000
- ... 00000000000000000101010
---------------------------
... 11111111111111111010110
infinitely many 1's
a. If we abbreviate down to 8 bits, we get 11010110
b. Actually, we don't have to work with infinitely many bits - to
negate an n bit number, it suffices to subtract it from the
(n+1)-bit number 2^n and then truncate to n bits - e.g.
100000000 Representation for 2^8
- 00101010 42
---------
011010110 = 11010110 in 8 bits
^
+---- Discard this bit
3. This is called the TWO'S-COMPLEMENT representation - i.e. we say
that 11010110 is the two's complement of 00101010 - and vice versa.
Observe that if we add the n-bit representation of a number and its
two's complement, without treating the sign bit specially, the result
is 2^n. However, if we truncate the result to n bits, it becomes
0 - which is surely what we want since X + (-X) = 0
00101010
+ 11010110
--------
100000000 = 00000000 in 8 bits
B. An Easy Way to Represent Numbers in Two's complement
1. To represent a positive number in n bits, represent it as an
unsigned number using n-1 bits and prefix a 0. (If this cannot be
done, then the number cannot be represented as an n-bit two's
complement number.)
2. To represent a negative number in n bits, represent its absolute
value as above, then invert all the bits (including the sign bit)
and add 1.
Example: +65 --> 01000001
(8 bits) -65 --> 10111111
Exercise: + 72, -72 (01001000, 10111000)
3. To change the sign of a number, invert all the bits, add 1
4. Observe: if we are using n bits, then X, if negative, looks like the
unsigned number 2^n + X
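   In C, the negation recipe from points 2-3 ("invert all the bits and
   add 1") looks like this (a small demo of our own):

      #include <stdint.h>
      #include <stdio.h>

      int main(void)
      {
          uint8_t x = 72;                    /* 01001000 */
          uint8_t neg = (uint8_t)(~x + 1);   /* invert all bits, add 1 */
          printf("-72 is %02X\n", neg);      /* prints B8 = 10111000 */
          /* Viewed as an unsigned number, -72 looks like 2^8 + (-72): */
          printf("as unsigned: %u\n", neg);  /* prints 184 = 256 - 72 */
          return 0;
      }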
C. Some Ramifications of this Representation
1. There is one representation for 0: 00000000, so X + (-X) = 00000000
2. Range of values representable with n bits: -2^(n-1) .. 2^(n-1) - 1
(Note that this is asymmetric)
3. To add two 2's complement numbers: add them as if they were unsigned
(ie treat the sign as if it were the most-significant bit of an
unsigned number.) Discard any carry out of the sign position.
To see why this works (in the absence of overflow), observe the
following:
a. If we add two non-negative numbers X and Y, the operation is the
same as unsigned addition in any case.
b. If we add a negative number (X) to a non-negative number (Y),
X looks like the unsigned number X + 2^n and the sum looks like
the unsigned number X + Y + 2^n.
i. If the final sum is negative, this is the correct
representation.
ii. If the final sum is positive, there is (of necessity) a
carry out of the most significant position which cancels
the 2^n term, so the result looks like X + Y.
c. If we add two negative numbers X and Y, X looks like the
unsigned number X + 2^n and Y looks like the unsigned number
Y + 2^n. The sum therefore looks like X + Y + 2^n + 2^n.
However, since there is (of necessity) a carry out of the
most significant position, one of the 2^n's is cancelled and
the sum looks like X + Y + 2^n.
4. There is overflow if carry out of sign != carry in to sign
(a software version of this check is sketched at the end of this section)
Examples: 00001000 8
11111111 -1
-------- --
00000111 7
11111110 -2
11111110 -2
-------- --
1 11111100 -4
\
carry out of sign position discarded
01100100 100
01100100 100
--------
11001000 (-56) OVERFLOW - CARRY-IN TO SIGN BUT
NONE OUT OF SIGN
Exercise: 11000001 -63
01000000 64
-------- --
00000001 1
Proof of overflow rule:
Carry in to sign != carry out of sign <=> overflow
a. If we add two positive numbers, there will never be carry out
of the sign (since both signs are zero). The sign of the result
will be 1 (which must be an overflow) iff there is carry in to
the sign - i.e. the two carries differ.
b. If we add two negative numbers, there will always be carry out
of the sign (since both signs are one). The sign of the result
will be 0 (which must be an overflow) iff there is no carry in to
the sign - i.e. the two carries differ.
c. If we add two numbers of unlike sign, there can never be
overflow. Further, there will be carry out of the sign iff
there is carry in to the sign (since just one sign is one) -
i.e. the two carries will always be the same.
5. To subtract: Negate (flip all the bits, add 1) the subtrahend and
add
6. Multiplication and division are more complex. One approach is
to complement negative values so as to do the whole operation
with positive values, then if the two original operands were of
opposite sign then complement the result. Or, the whole
operation can be done in 2's comp - we'll discuss this later.
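   Since carries are not visible in C, a software check for the rule in
   point 4 usually uses the equivalent sign-based form: overflow occurred
   iff both operands have the same sign and the sum's sign differs. A
   minimal sketch (add_overflows is our own name):

      #include <stdint.h>
      #include <stdio.h>

      /* Does a + b overflow 8-bit two's complement? */
      int add_overflows(int8_t a, int8_t b)
      {
          /* Add as unsigned, discarding carry, as the hardware does */
          int8_t sum = (int8_t)((uint8_t)a + (uint8_t)b);
          /* Sign bit of this expression is set iff a and b agree in
             sign but the sum's sign differs - i.e. iff overflow. */
          return ((a ^ sum) & (b ^ sum)) < 0;
      }

      int main(void)
      {
          printf("%d\n", add_overflows(100, 100));  /* 1: 200 won't fit */
          printf("%d\n", add_overflows(-63, 64));   /* 0: result is 1 */
          printf("%d\n", add_overflows(-2, -2));    /* 0: result is -4 */
          return 0;
      }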
D. Two's complement is the preferred scheme for integer arithmetic on
most machines, though many (including MIPS) use Sign-Magnitude for
floating point. (Note that, since the arithmetic algorithms are
wired into the hardware, the choice is made by the hardware designer
and cannot easily be altered in software.)
VI. Internal encoding for reals (floating point numbers)
-- -------- -------- --- ----- ------------------------
A. Thus far we have confined our discussion to integers. How can we
represent real numbers? It turns out that this is an area where
architectures diverge widely. We will discuss general principles, and
then a particular approach that has become widely accepted.
B. A real number is stored internally as a mantissa times a power of some
radix - i.e.
                              m * r^e
1. CPU architects have several basic options regarding the format of the
mantissa, m:
a. It may be a pure fraction. That is, there is an assumed binary
point just to the LEFT of the first bit of the mantissa.
b. It may be a number between 1 and 2, with an assumed binary point
between the first and second bits.
c. It may be an integer, with an assumed binary point just to the
RIGHT of its last bit.
2. The exponent e specifies a power of some radix, r, by which the
mantissa is to be multiplied.
a. The radix is often 2, but in some machines is 4, 8, or 16.
Example: If a particular machine uses 2 as the radix of its
exponent, then a number with mantissa .1000 and exponent
2 is interpreted as
                  0.5 * 2^2 = 2_10
However, if the exponent radix is 16 (as on the IBM
360/370), then the interpretation is
                  0.5 * 16^2 = 128_10
b. Of course, the choice of what radix to use for the exponent is an
architectural choice, involving tradeoffs between RANGE and
PRECISION.
C. In an attempt to deal with data exchange problems arising from a
diversity of floating point representations used by different
manufacturers, the IEEE has developed a floating point standard (Standard
754) that is used by most systems.
1. Older architectures developed prior to this standard may use a
different format (in fact, the proliferation of such formats was
what led to the development of IEEE 754).
a. Two important architectures that are still in use that utilize
a pre-IEEE 754 floating point format are the IBM mainframe
architecture and the VAX architecture.
b. All current microprocessor architectures in wide use utilize the
IEEE 754 format, including MIPS, IA32, and PowerPC.
c. The Java Virtual Machine utilizes the IEEE 754 format.
2. The standard provides two formats - single and double precision.
We will consider the single precision representation in some detail
(see also the C sketch that decodes it, at the end of this section).
31 30 23 22 0
-------------------------------------
|s|exp |fraction |
-------------------------------------
a. s = sign of mantissa: 0 = +, 1 = -
b. exp = exponent as power of 2, stored in excess 127 form - i.e.
value in this field stored is 127 + true value of exponent.
ex: true exponent = 0; stored exponent = 127
true exponent = 127; stored exponent = 254 (largest possible value)
true exponent = -126; stored exponent = 1 (smallest possible value)
The extreme values of the stored exponent (0 and 255) are reserved
for special purposes to be described below.
c. The significand (magnitude of mantissa) is normalized to lie in the
range 1.0 <= |m| < 2.0. This implies that the leftmost bit is a 1.
It is not actually stored - the 23 bits allocated are used to store
the bits to the right of the binary point, and a 1 is inserted to
the left of the binary point by the hardware when doing arithmetic.
(This is called hidden-bit normalization, and is why this field is
labelled "fraction".) In effect, we get 24 bits of precision by
storing 23 bits.
Example: stored significand = 00000000000000000000000
true significand = 1.0
stored significand = 11111111111111111111111
true significand = 1.11111111111111111111111
d. As noted above, certain exponent values are reserved for special
purposes, and when they appear the interpretation of the significand
changes:
i. Stored exponent = 0
- If the significand is zero, then the number represented is 0.0
(Since an all-zero stored significand represents 1.0, a special
case is needed to represent zero.)
- If the significand is non-zero, then we have a denormalized
number; no hidden bit is inserted and the true exponent is
taken as -126
ii. Stored exponent = 255
- If the significand is zero, then the number represented is
+/- infinity (depending on the sign bit.) The hardware
correctly propagates infinity in arithmetic - e.g. infinity +
anything is infinity.
- If the significand is non-zero, then the representation is
not a number (NAN). Any use of NAN in an arithmetic operation
always produces NAN as the result.
3. Examples: 0.0: 0 00000000 00000000000000000000000
1.0: 0 01111111 00000000000000000000000
(1.0 x 2^0)
0.5 (0.1 binary): 0 01111110 00000000000000000000000
(1.0 x 2^-1)
0.75 (0.11 binary): 0 01111110 10000000000000000000000
(1.1 x 2^-1)
3.0 (11 binary): 0 10000000 10000000000000000000000
(1.1*2^1)
-0.375 (-0.011 binary): 1 01111101 10000000000000000000000
(-1.1*2^-2)
1 10000011 01000000000000000000000 = - 1.01 * 2^4 = -20.0
4. Range of values:
- largest finite positive: 0 11111110 11111111111111111111111 =
  1.11111111111111111111111 * 2^127 or ~2^128 (about 3.4 * 10^38)
- smallest normalized pos.: 0 00000001 00000000000000000000000 =
  1.00000000000000000000000 * 2^-126 (about 1.2 * 10^-38)
  (The precision of both of the above is 24 bits = ~ 7 decimal places)
- smallest positive: 0 00000000 00000000000000000000001 =
  0.00000000000000000000001 * 2^-126 or 2^-149 (about 1.4 * 10^-45)
  (But precision is only one bit!)
5. IEEE 754 also defines a double precision floating point standard,
which represents a number using 64 bits: 1 for the sign, 11 for the
exponent (excess 1023), and 52 for the fraction.
6. Summary: TRANSPARENCIES - MOTOROLA MC68881 MANUAL PP 2-16, 3-24
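   To make the single-precision layout concrete, here is a small C sketch
   that pulls a float apart into its three fields (dump_float is our own
   name; we assume, as on essentially all current machines, that float is
   the 32-bit IEEE 754 single):

      #include <stdint.h>
      #include <string.h>
      #include <stdio.h>

      /* Print the sign, stored (excess-127) exponent, and fraction
         field of an IEEE 754 single-precision number. */
      void dump_float(float f)
      {
          uint32_t bits;
          memcpy(&bits, &f, sizeof bits);      /* reinterpret, don't convert */
          uint32_t sign = bits >> 31;
          uint32_t exp  = (bits >> 23) & 0xFF; /* stored exponent */
          uint32_t frac = bits & 0x007FFFFF;   /* 23 fraction bits */
          printf("%10g: sign=%u exp=%3u (true %4d) frac=0x%06X\n",
                 f, (unsigned)sign, (unsigned)exp, (int)exp - 127,
                 (unsigned)frac);
      }

      int main(void)
      {
          dump_float(1.0f);    /* sign=0 exp=127 (true 0)  frac=0x000000 */
          dump_float(0.75f);   /* sign=0 exp=126 (true -1) frac=0x400000 */
          dump_float(-20.0f);  /* sign=1 exp=131 (true 4)  frac=0x200000 */
          return 0;
      }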
VII. Floating point Arithmetic
--- -------- ----- ----------
A. Arithmetic on floating point numbers is, of course, much more complex
than integer (or fixed-point) arithmetic.
1. It is not necessary to have hardware provisions for doing floating
point arithmetic - it is possible to code subroutines to perform
basic floating point operations using a combination of integer
arithmetic and shift operations.
a. Historically, when integrated circuit techology was more limited,
that was often the case.
b. It still is the case for low-end microprocessors used in embedded
systems.
2. Historically, many CPU's relegated floating point arithmetic to
a separate processor, often called a "floating point coprocessor".
On older systems, this was often a separate chip, which may or may
not be installed in a given computer. (If not, floating point
arithmetic would be done in software.) Today, the "coprocessor"
is actually often part of the main CPU chip. The term coprocessor
remains in use for historical reasons, and because floating point
operations often use their own register set.
3. What we have to say in this section is applicable regardless of how
floating point operations are physically implemented.
B. We will briefly consider the basic task facing floating point processors,
but will not look at the algorithms in detail.
C. Recall that a floating point number is actually represented internally
by two fixed-point numbers: a mantissa and an exponent. That is, it is
of the form:
                              m * r^e
We will assume use of the IEEE standard - i.e. 1 <= m < 2, with r = 2.
D. Floating point addition or subtraction entails the following steps:
1. Reinsertion of the hidden bit. Though normalized floating point
numbers are STORED without the 1 to the left of the binary point,
the arithmetic unit can work on an internal form of the number
with the hidden bit explicitly present.
(Of course, if an operand is zero or a denormalized number, a 0 is
inserted in the hidden bit position.)
2. Denormalization: if the exponents of the two operands differ, then
the operand with the smaller exponent must be shifted right to line
up the implied binary points. The larger exponent will then be the
exponent of the result.
Example: 1.00 * 2^0 + 1.00 * 2^-1 must be converted to:
         1.00 * 2^0 + 0.10 * 2^0  before adding
3. The addition/subtraction proper.
4. Renormalization: There are three possibilities for the result of
the addition/subtraction step:
a. There could be carry out from the leftmost mantissa bit:
Example: 1.10 * 2^0 + 1.00 * 2^0 yields 0.10 * 2^0 plus a carry out
In this case, the mantissa is shifted right (bringing the
carry bit in), and the exponent is increased.
Example: The final result of the above is 1.01 * 2^1
b. There could be no carry out, and the leftmost bit of the mantissa
could be zero - i.e. the result could be unnormalized. (This only
occurs when adding numbers of unlike signs or subtracting numbers
of like sign.)
Example: 1.10 * 2^0 - 1.00 * 2^0 yields 0.10 * 2^0 (with no carry out)
In this case, the mantissa must be shifted left one or more places
to renormalize it, and the exponent must be decreased for each
shift.
Example: The final result of the above is 1.00 * 2^-1
Note: To reduce the loss of precision in cases like this, the
floating point unit often includes one or two GUARD BITS to
the right of the mantissa which "catch" bits shifted out
during denormalization and make them available for use in
renormalization
Note: If the exponent would be reduced below the smallest
permissible value, the result is left in denormalized form.
c. The result could be correct and normalized as it stands.
5. Preparation for storage.
a. If the number has been shifted right during renormalization, then
a bit will have been shifted out, and will be caught by the
guard bits. Moreover, the guard bits may contain bits that were
shifted out during initial denormalization which are properly part
of the infinite-precision result.
b. IEEE 754 defines various ROUNDING MODES that control how to handle
the guard bits (demonstrated in the code sketch after this section):
i. Round toward zero: the guard bits are discarded. (Also called
truncation.)
ii. Round to nearest: round the result to the nearest representable
value - e.g. if the guard bits are 11, then add 1 to the least
significant bit of the result. Ties are broken by rounding to
the value whose least significant bit is 0 ("round to even").
iii. Round toward plus infinity: if the result is positive, and
the guard bits are non-zero, add one to the least significant
bit of the result - else discard the guard bits.
iv. Round toward minus infinity: same, but round if result is
negative.
c. In any case, the hidden bit is removed prior to storing the
result.
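   The C99 <fenv.h> interface exposes these IEEE 754 rounding modes, so
   they can be demonstrated directly. A minimal sketch; 1.0e8f is chosen
   because single-precision values are spaced 8 apart at that magnitude,
   and volatile keeps the compiler from folding the sums at compile time:

      #include <fenv.h>
      #include <stdio.h>
      #pragma STDC FENV_ACCESS ON  /* we change rounding at runtime */

      int main(void)
      {
          volatile float big = 1.0e8f;  /* floats are 8 apart here */
          volatile float one = 1.0f;

          fesetround(FE_TONEAREST);
          printf("to nearest:  %.1f\n", big + one);  /* 100000000.0 */
          fesetround(FE_UPWARD);
          printf("toward +inf: %.1f\n", big + one);  /* 100000008.0 */
          fesetround(FE_TOWARDZERO);
          printf("toward zero: %.1f\n", big + one);  /* 100000000.0 */
          return 0;
      }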
E. Floating point division and multiplication are - relatively speaking -
simpler than addition and subtraction.
1. The basic rule for multiplication (sketched in C after the division
rule below) is
a. Reinsert the hidden bit.
b. Multiply the mantissas
c. Add the exponents
d. If necessary, normalize the product by shifting right and increase
the exponent by 1. (Note that if the mantissas are normalized,
they will lie in the range: 1 <= m < 2
Therefore, the product of the mantissas will lie in the range:
1 <= m < 4
So at most one right shift is needed.)
e. Store the result less the hidden bit after appropriate rounding.
2. The basic rule for division is
a. Reinsert the hidden bit.
b. Divide the mantissas
c. Subtract the exponents
d. If necessary, normalize the quotient by shifting left and decrease
the exponent by 1. (Note that if the mantissas are normalized,
they will lie in the range: 1 <= m < 2
Therefore, the quotient of the mantissas will lie in the range:
0.1_2 (= 0.5) < m < 2.0
So at most one left shift is needed.)
e. Store the result less the hidden bit after appropriate rounding.
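   A minimal sketch of the multiplication rule (steps a-e) carried out
   with integer operations; fmul_sketch is our own name, and for brevity
   it assumes normalized, finite inputs, truncates rather than rounds,
   and ignores exponent overflow/underflow:

      #include <stdint.h>
      #include <string.h>

      /* Multiply two IEEE 754 singles using only integer arithmetic. */
      float fmul_sketch(float x, float y)
      {
          uint32_t bx, by;
          memcpy(&bx, &x, 4);
          memcpy(&by, &y, 4);

          uint32_t sign = (bx ^ by) & 0x80000000u;     /* xor of signs */
          int exp = (int)((bx >> 23) & 0xFF) - 127
                  + (int)((by >> 23) & 0xFF) - 127;    /* add exponents */

          /* Reinsert hidden bits: 24-bit significands in [2^23, 2^24) */
          uint64_t mx = (bx & 0x007FFFFF) | 0x00800000;
          uint64_t my = (by & 0x007FFFFF) | 0x00800000;

          uint64_t prod = mx * my;       /* in [2^46, 2^48): 1 <= m < 4 */
          if (prod & (1ULL << 47)) {     /* m >= 2: shift right once... */
              prod >>= 1;
              exp += 1;                  /* ...and bump the exponent */
          }
          /* Truncate to 24 bits and drop the hidden bit before storing */
          uint32_t frac = (uint32_t)(prod >> 23) & 0x007FFFFF;

          uint32_t bits = sign | ((uint32_t)(exp + 127) << 23) | frac;
          float r;
          memcpy(&r, &bits, 4);
          return r;    /* e.g. fmul_sketch(1.5f, 2.0f) == 3.0f */
      }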
F. As can be seen, a floating point arithmetic unit needs to be able to
add and subtract exponents, and to shift, add, and subtract mantissas.
The latter can be done by using the same hardware as for the integer
multiply/divide operations, or special, dedicated hardware.
VIII. Representing Characters, Sounds, and Graphics
---- ------------ ---------- ------ --- --------
A. In principle, any information that can be represented as integers
(unsigned or signed) can be represented in binary by converting the
integer representation into binary.
B. Real numbers can be represented as a pair of integers - a mantissa
(perhaps with an implied binary point) and an exponent.
C. We have previously seen how textual information can be represented
by assigning integer codes to individual characters - either:
1. ASCII: each character is assigned a 7 bit code in the range 0 .. 127
(normally stored in an 8 bit byte)
2. Unicode: each character is assigned a 16 bit code in the range
0 .. 65535.
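   In C, characters simply are small integers, which makes the encoding
   easy to see (a small demo of our own):

      #include <stdio.h>

      int main(void)
      {
          char c = 'A';
          /* 'A' is code 65 - the bit pattern 01000001 from part I */
          printf("'%c' = %d\n", c, c);   /* prints 'A' = 65 */
          printf("'%c'\n", c + 1);       /* prints 'B' - codes are ordered */
          return 0;
      }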
D. Sounds
1. Computers can store and reproduce sounds by storing digitized
SAMPLES of the sound signal intensity.
DEMO: Sample Editor
2. For high quality, these samples must be taken tens of thousands of
times per second.
a. There is an important theorem, called the sampling theorem,
that says that any sound can be accurately reproduced given
samples taken at a rate at least twice the highest frequency
present in it.
- The human ear can hear pitches up to about 20 kHz. CD
quality audio is based on 44,100 samples per second.
- To conserve storage, computer systems often use 22,000 or
11,000 samples per second. This loses the upper-end of the
frequency spectrum, but is adequate for many purposes
b. The precision with which the samples are stored is also important.
- Music CD's use 16 bit samples, which gives a precision of one
part in 65536, or about .0015%
- Many computer systems use 8 bit samples, which gives a
precision of one part in 256, or about 0.4%
E. Graphics
1. Pictorial information is displayed by breaking the screen into
individual dots, known as PIXELS. The quality of the image is
in part determined by the number of pixels per inch (often
abbreviated dpi = dots per inch.) This is called the RESOLUTION
of the image.
a. Computer monitors typically use a resolution of around 72 dpi.
b. Typical laser printers use 300-600 dpi; some publication-quality
printers go to 600 or 1200 dpi or more.
2. For black and white graphics, each pixel can be represented by a
single bit in memory.
3. For gray scale graphics, each pixel can be represented by a
small integer (often a single byte) representing a degree of
lightness or darkness. For example, using one byte per pixel:
0 = black 255 = white 128 = medium gray
4. For color graphics, each pixel is represented by three small
integers, representing the intensity of each of the three primary
colors (red, green, and blue) at that point.
a. The most sophisticated systems currently available store 24 bits
for each pixel - 8 for each color. This allows for over
16 million colors, ranging from black (all 0's - all colors
totally off) to white (all 1's - all colors totally on.)
Examples
R G B
11111111 00000000 00000000 Pure red
11111111 11111111 00000000 Pure yellow
10000000 10000000 11111111 Light blue
DEMO: RGB Applet on course page
b. To conserve storage, some systems store only 8 bits per pixel,
where each possible value selects one of 256 predefined colors.
c. Other systems store 16 bits per pixel, allowing a choice of one
of 65536 predefined colors.
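   A sketch of 24-bit color packing in C; the layout (red in the high
   byte) and the function name pack_rgb are illustrative assumptions,
   since real frame buffers differ:

      #include <stdint.h>
      #include <stdio.h>

      /* Pack 8-bit red, green, blue intensities into one 24-bit pixel. */
      uint32_t pack_rgb(uint8_t r, uint8_t g, uint8_t b)
      {
          return ((uint32_t)r << 16) | ((uint32_t)g << 8) | b;
      }

      int main(void)
      {
          printf("pure red:    %06X\n", (unsigned)pack_rgb(255,   0,   0));
          printf("pure yellow: %06X\n", (unsigned)pack_rgb(255, 255,   0));
          printf("light blue:  %06X\n", (unsigned)pack_rgb(128, 128, 255));
          /* prints FF0000, FFFF00, 8080FF */
          return 0;
      }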
F. Movies
1. Current television technology is based on redrawing the screen
30 times per second. Each such image is called a FRAME.
2. Thus, video can be represented by a series of individual frames
(using graphics representation) - 30 per second - plus
an associated sound track.
3. As you can imagine, the storage requirements for video information
can be huge. The storage (and transmission time) requirements can
be significantly reduced by various compression techniques we will
not discuss here.
IX. Octal Numbers
-- ----- -------
A. By now you are probably tired of writing 1's and 0's all the time.
Writing numbers in binary is tedious, and it is very easy to make
mistakes. On the other hand, converting numbers between decimal and
binary is itself a painful process, so at the hardware level we like
to work with the binary form.
B. Consider, for a moment, the radix-8 (octal) number system. Since there
are 8 different symbols in this system, octal numbers are about as easy
to write as decimal numbers. Further, because 8 is a power of 2, it is
very easy to convert between binary and octal notations.
1. Binary to octal: group binary number into groups of three bits,
starting from the right. Each group will now represent a value in
the range 0 .. 7 - i.e. an octal digit.
Example: 11 000 111 --> 307_8
Exercise: 10 101 100 (254)
2. Octal to binary: convert each digit to three bits:
Example: 146 --> 001 100 110
Exercise: 321 (011 010 001)
C. In a sense, then, octal becomes a form of shorthand for binary. Any
given bit can be quickly recovered from the octal representation.
Example: What is bit 4 of the number represented by 246_8 ?
Observe: octal digit 0 is bits 0..2; digit 1 is bits 3..5.
So bit 4 is the middle bit of the middle digit - i.e. the middle bit
of 4 - i.e. the middle bit of 100 - i.e. 0.
Exercise: Bit 5 of 172? (1)
X. Hexadecimal numbers
- ----------- -------
A. Another system often used as a shorthand for binary is hexadecimal - base
16. The hex digits are written 0 .. 9, A, B, C, D, E, F
Example: A13E --> 1010 0001 0011 1110
Exercise: 4FB7 (0100 1111 1011 0111)
Exercise: 1100 0101 1111 0100 (C5F4)
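   C inherits both shorthands directly: integer literals can be written
   in octal (leading 0) or hexadecimal (leading 0x), and printf can
   display either form. A small demo:

      #include <stdio.h>

      int main(void)
      {
          unsigned n = 0307;    /* octal literal: 11 000 111 binary */
          unsigned m = 0xC5F4;  /* hex literal: 1100 0101 1111 0100 */
          printf("%u = %o octal = %X hex\n", n, n, n);
          /* prints: 199 = 307 octal = C7 hex */
          printf("%u = %o octal = %X hex\n", m, m, m);
          /* prints: 50676 = 142764 octal = C5F4 hex */
          return 0;
      }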
B. Historically, the instruction set architecture of the DEC PDP-11
minicomputer was such that using the octal shorthand was the more
natural notation. (Octal digit boundaries fell at field boundaries
within the instruction.) As a result, octal notation found its way
into the C programming language and its descendants. However, for
most modern architectures (including MIPS) the documentation convention
is to use hexadecimal, rather than octal, as a shorthand for binary.
Therefore, hexadecimal is the main shorthand we will use in this course.
C. Always remember, though, that octal and hexadecimal are simply
shorthands for making the writing and reading of binary numbers
easier. The internal representation is always binary.
Copyright ©2003 - Russell C. Bjork