CS122 Lecture: Recursion - Last revised 3/9/98
Materials:
1. Towers of Hanoi toy and demo program (in [.DEMOS.PASCAL])
2. Pascal syntax graphs transparency
3. Quicksort and permute transparencies
Objectives:
1. To review the definition of "recursive"
2. To give examples of uses of recursion
3. To review how to implement recursion in Pascal
4. To note some dangers/limitations of recursion
I. Introduction
A. Today we review a programming technique that is very powerful.
The technique is known as recursion.
B. Some definitions:
1. A recursive DEFINITION is one that defines some entity partially in
terms of itself.
2. A recursive PROCEDURE/FUNCTION is one that can call itself.
3. Recursive procedures/functions often arise in programs when one is
dealing with entities that are defined recursively.
C. An example:
1. The factorial operation can be defined mathematically as follows:
For all integers N >=0: N! = IF N <= 1 THEN 1 ELSE N*(N-1)!
2. In Pascal, this can be coded as follows:
function fact(n: NonNegInt): NonNegInt;
begin
if n <= 1 then
fact := 1
else
fact := n * fact(n-1)
end;
3. Trace operation for fact(3), showing how a stack of n values and
return points is maintained. (Note: we will discuss the use of
stacks in recursion shortly)
"n" Stack Current "fact" Return point stack
3 ? main
3 2 ? main fact
3 2 1 ? main fact fact
1
3 2 1 main fact
2
3 2 main
6
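For illustration, the same recursion can be written in Python (the lecture's code is in Pascal; this is just a sketch of the same idea):

```python
def fact(n):
    """Recursive factorial, mirroring the Pascal version above."""
    if n <= 1:
        return 1                 # base case
    return n * fact(n - 1)       # recursive case: n! = n * (n-1)!
```

Calling fact(3) unwinds exactly as in the trace: 3 * (2 * 1) = 6.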
D. Note well that a recursive definition/procedure differs from a circular
definition/procedure. A recursive definition/procedure must have at
least one non-recursive component; and it must be guaranteed that
the definition/procedure will terminate after a finite number of
recursions for any (finite) input.
1. The non-recursive case(s) is/are called BASE CASE(S). There must
be at least one; often there are more.
Example: For factorial the base case is n <= 1.
2. Thm: The recursive definition/function for fact will terminate after
n-1 recursions for all n >= 1.
Proof: By induction:
a. Basis: When n=1, the algorithm terminates after 0 recursions.
b. Hypothesis: Suppose that for some k >= 1 the algorithm terminates
in n-1 recursions for all n with 1 <= n <= k.
c. Induction step: When n = k+1 the algorithm terminates after
n-1 = k recursions.
Proof: We calculate (k+1)! as (k+1) * k!. By our hypothesis,
calculating k! involves k-1 recursions, and the call for k+1 adds
one more. Therefore calculating (k+1)! involves k recursions. QED.
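The theorem can also be checked empirically by counting the recursive calls (a Python sketch; the counting is my addition, not part of the lecture's code):

```python
def fact_counted(n):
    """Return (n!, number of recursive calls made in computing it)."""
    if n <= 1:
        return 1, 0                      # base case: no recursion
    value, calls = fact_counted(n - 1)
    return n * value, calls + 1          # one more recursive call here
```

For every n >= 1 the call count comes out to n-1, as the theorem claims.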
II. Why use recursion?
A. Some entities are most naturally defined using recursion:
1. Factorial
2. Fibonacci numbers: Fib(N) = IF N <= 2 THEN 1 ELSE Fib(N-1) + Fib(N-2)
- leading to the following code:
function fib(n: integer): integer;
begin
if n <= 2 then
fib := 1
else
fib := fib(n-2) + fib(n-1)
end;
3. Syntax of languages - both natural and programming
a. Ex: English:
<noun phrase> ::= <adjective phrase> <noun> |
<noun phrase> <noun phrase> <verb>
e.g. the - adj phrase
the ball - noun phrase
the dog - noun phrase
the ball the dog bit - noun phrase
the ball the man the dog bit hit - noun phrase
b. Ex: Pascal: Show syntax graph for "block"; "expression"
4. Many Artificial Intelligence applications use recursive lists
5. Trees: A tree consists of a root and zero or more subtrees, each of
which is a tree. (Draw)
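The recursive definition of a tree translates directly into recursive code. A minimal Python sketch (the Node class and count_nodes function are illustrative, not from the lecture):

```python
class Node:
    """A tree: a root value plus zero or more subtrees, each itself a tree."""
    def __init__(self, value, children=None):
        self.value = value
        self.children = children or []

def count_nodes(tree):
    # One for the root, plus (recursively) the nodes of each subtree.
    return 1 + sum(count_nodes(child) for child in tree.children)

t = Node('root', [Node('a', [Node('c')]), Node('b')])
```

Because the data structure is defined recursively, the recursive count needs no explicit loop over levels.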
B. Some problems are most easily solved using recursion.
1. Ex: Towers of Hanoi
a. Illustrate using model.
b. Observe:
i. For the case N=1 the problem is trivial.
ii. For N>1 the problem can be solved in terms of two subproblems
of size N-1: To move N disks from peg A to peg B:
(1) Move top N-1 from peg A to peg C
(2) Move 1 disk from peg A to peg B
(3) Move N-1 disks from peg C to peg B
c. Demonstrate for case N=3
d. A portion of a program:
procedure towers(n: integer; startPeg, finishPeg, helpPeg: char);
(* Prints directions for moving n disks from start peg to finish peg *)
begin
if n <= 1 then
writeln('Move disk from peg ', startPeg,' to peg ',
finishPeg)
else
begin
towers(n-1, startPeg, helpPeg, finishPeg);
writeln('Move disk from peg ', startPeg,' to peg ',
finishPeg);
towers(n-1, helpPeg, finishPeg, startPeg)
end
end;
DEMO: [DEMOS.PASCAL]HANOI_GRAPHIC
DEMO: [DEMOS.PASCAL]HANOI_STACK
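A Python transliteration of the towers procedure (illustrative; recording the moves in a list lets one verify that n disks take 2^n - 1 moves):

```python
def towers(n, start, finish, helper, moves):
    """Append to moves the steps needed to shift n disks from start to finish."""
    if n <= 1:
        moves.append((start, finish))
    else:
        towers(n - 1, start, helper, finish, moves)   # (1) N-1 disks to helper
        moves.append((start, finish))                 # (2) bottom disk across
        towers(n - 1, helper, finish, start, moves)   # (3) N-1 disks on top

moves = []
towers(3, 'A', 'B', 'C', moves)
```

For n = 3 this produces the 7 moves demonstrated with the model.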
2. In general, recursion is an appropriate way to tackle a problem if:
a. It has a "size".
b. It is trivial for a certain size.
c. A "big" problem can always be reduced to one or more subproblems of
lesser size.
d. Examples:
i. Find the Greatest Common Divisor of A,B (A >= B)
(1) Let B be the "size"
(2) The problem is trivial when B is 0.
(3) For B > 0, the problem can be reduced to GCD(B, A MOD B)
(Euclid's Algorithm - for proof see Dromey p 97 f)
Now clearly A MOD B < B; therefore we have reduced the size
of the problem
This leads to:
function gcd(a,b: integer): integer;
begin
if b = 0 then
gcd := a
else
gcd := gcd(b, a mod b)
end;
ii. Sorting a list of non-equal items into ascending order:
(1) Let the number of items in the list be the size (N).
(2) The problem is trivial when N=1
(3) For N>1 we can reduce the problem to 2 subproblems of
smaller size, as follows:
(a) Choose an arbitrary item from the list - say the
first. Call it M.
(b) Divide the list into two sublists - one consisting of
all items smaller than M and one consisting of all items
greater than M.
(c) Clearly each sublist contains fewer than N items, since M
is not a member of either.
(d) Sort the original list by sorting each sublist and then
gluing back together: sorted list of items smaller than
M; M; sorted list of items larger than M.
(4) This is the basis of a standard sorting algorithm known as
quicksort: (TRANSPARENCY)
type
ListToSort = array[1..max] of sometype;
...
procedure QuickSort(var L: ListToSort; lo, hi: integer);
(* Sorts a sublist of L: L[lo] .. L[hi] recursively *)
var
i, j: integer; (* Pointers for partitioning the list. i
starts at lo, moves up; j starts at hi+1,
moves down. *)
begin
if hi <= lo then
(* the trivial case-a zero or one element list-do nothing *)
else
begin
(* Partition the original list *)
i := lo;
j := hi + 1;
repeat
repeat
i := i + 1
until (L[i] >= L[lo]) or (i > hi);
repeat
j := j - 1
until (L[j] <= L[lo]);
if i < j then
exchange(L[i], L[j])
until i >= j;
exchange(L[lo],L[j]);
(* L[lo..j-1] now contains items <= L[j] and L[j+1..hi]
contains items >= L[j]. Now we sort the sublists. *)
QuickSort(L,lo,j-1);
QuickSort(L,j+1,hi)
end
end;
(* To call the procedure from main, we use the call:
QuickSort(array, 1, max) *)
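For comparison, the same partition-and-recurse scheme in Python (a sketch only; Python lists are 0-based, so the index bookkeeping differs slightly from the Pascal transparency):

```python
def quicksort(lst, lo, hi):
    """Sort lst[lo..hi] in place, using the first element of the range as pivot."""
    if hi <= lo:
        return                          # trivial case: zero or one element
    pivot = lst[lo]
    i, j = lo, hi + 1
    while True:
        i += 1                          # scan up for an item >= pivot
        while i <= hi and lst[i] < pivot:
            i += 1
        j -= 1                          # scan down for an item <= pivot
        while lst[j] > pivot:
            j -= 1
        if i >= j:
            break                       # pointers crossed: partition done
        lst[i], lst[j] = lst[j], lst[i]
    lst[lo], lst[j] = lst[j], lst[lo]   # pivot into its final position j
    quicksort(lst, lo, j - 1)
    quicksort(lst, j + 1, hi)

data = list('CADEB')
quicksort(data, 0, len(data) - 1)
```

Running this on the lecture's example list C A D E B yields A B C D E.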
(5) Trace when applied to the list C A D E B:
- First call: lo = 1, hi = 5
L[lo] i j Action
C 1 6 Enter first inner repeat
2 6
3 6 Exit first inner repeat; enter
second
3 5 Exit second inner repeat;
exchange L[3], L[5]
List is now C A B E D
3 5 Enter first inner repeat
4 5 Exit first inner repeat; enter
second
4 4
4 3 Exit second inner repeat
No inner exchange
Exit outer repeat
4 3 Exchange L[1], L[3]
List is now B A C E D
- Second call: lo = 1, hi = 2, sublist = B A
B 1 3 Enter first inner repeat
2 3
3 3 Exit first inner repeat; enter
second
3 2 Exit second inner repeat
No inner exchange
Exit outer repeat
3 2 Exchange L[1], L[2]
List is now A B C E D
- Third call: lo = 3, hi = 2 - trivial
- Fourth call: lo = 1, hi = 1, sublist = A - trivial
- Fifth call (from first) lo = 4, hi = 5, sublist = E D
E 4 6 Enter first inner repeat
5 6
6 6 Exit first inner repeat; enter
second
6 5 Exit second inner repeat
No inner exchange
Exit outer repeat
Exchange L[4], L[5]
List is now A B C D E
- Sixth call: lo = 6, hi = 5 - trivial
- Seventh call: lo = 4, hi = 4, sublist = D - trivial
Algorithm terminates: list is A B C D E
(6) What is the time complexity of this algorithm? ASK
- This is an example of an algorithm for which we must
consider two cases: the average case and the worst
case.
- The average case occurs when each partitioning divides
the list approximately in half. In this case, we have
the following situation:
1 list of size N
|
2 lists of size N/2
|
4 lists of size N/4
|
...
N lists of size 1
- At each level, partitioning the list(s) involves O(N)
total work (size of each list * number of lists = N)
- There are O(log2 N) levels.
Proof: At level k, # of lists = 2^(k-1).
At the last level, # of lists = N, so we get:
N = 2^(#levels - 1); #levels = log2 N + 1
- Total time is O(N log N), which compares very favorably
with methods you have learned previously that are O(N^2).
Example: For N = 1000, N log2 N = 10,000, but N^2 = 1 million.
- Alas, we must also consider the worst case, which occurs
if the list is either already sorted or already sorted in
reverse order:
1 list of size N
|
1 list of size 1 plus 1 of size N-1
|
1 list of size 1 plus 1 of size N-2
|
...
2 lists of size 1
Now the work is N + (N-1) + (N-2) + ... + 1 = O(N^2)!
iii. Generating all permutations of a list of distinct items.
(1) Let the number of items in the list be the size (N).
(2) The problem is trivial when N=1
(3) For N>1 we can reduce the size of the problem by 1 as
follows: (TRANSPARENCY)
type
itemlist = array[1..max] of itemtype;
procedure permute(var item: itemlist; n: 1..max);
var
i: 1..max;
begin
if n = 1 then
begin
for i := max downto 1 do
write(item[i]);
writeln
end
else
for i := 1 to n do
begin
exchange(item[i],item[n]);
permute(item, n-1);
exchange(item[i],item[n])
end
end;
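The same swap-recurse-unswap scheme in Python, collecting the permutations in a list instead of printing them (an illustrative sketch, not the lecture's code):

```python
def permute(items, n, out):
    """Collect all orderings of items[0..n-1]; items beyond n-1 stay fixed."""
    if n == 1:
        out.append(tuple(items))        # one complete permutation
    else:
        for i in range(n):
            items[i], items[n-1] = items[n-1], items[i]   # put item i last
            permute(items, n - 1, out)                    # permute the rest
            items[i], items[n-1] = items[n-1], items[i]   # undo the swap
    return out

perms = permute(list('ABC'), 3, [])
```

For 3 distinct items this yields all 3! = 6 permutations, each exactly once.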
III. Classes of recursive algorithms:
A. A linear recursive algorithm is one in which each non-trivial call to
the recursive procedure gives rise to one new call to that procedure -
i.e. the procedure calls itself only at one point:
Ex: factorial, gcd
- An important subcategory of linear recursive is tail-recursive. In a
tail-recursive algorithm, the recursive call is the very last step in
the algorithm. Tail-recursive algorithms can be easily converted to
non-recursive form, replacing the recursion with a loop. (Show for
gcd).
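The conversion for gcd can be sketched as follows (Python for illustration): since the recursive call is the last step, it can be replaced by rebinding the parameters and looping.

```python
def gcd_recursive(a, b):
    if b == 0:
        return a
    return gcd_recursive(b, a % b)   # tail call: nothing left to do after it

def gcd_loop(a, b):
    # The tail call becomes: rebind the parameters and repeat.
    while b != 0:
        a, b = b, a % b
    return a
```

Both forms compute the same values; the loop form needs no stack.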
B. A binary recursive algorithm is one in which each non-trivial call to
the recursive procedure gives rise to two new calls to that procedure-
ie the procedure calls itself at two points. Many of the most important
recursive algorithms fall into this category - including those based
on a "divide and conquer" approach.
Ex: Fibonacci, towers of Hanoi, QuickSort
C. A non-linear recursive algorithm is one in which each non-trivial call
to the recursive procedure gives rise to a varying number of new calls
to that procedure (often more than two). Frequently, such a procedure
calls itself from within a loop.
Ex: permute
D. A mutually recursive algorithm is one which contains several recursive
procedures which call one another indirectly - e.g. A calls B, B calls
C, and C calls A. Parsers for languages (artificial and natural) are
often mutually recursive.
Ex: recursive descent parser for Pascal-like arithmetic expressions:
expr = term | expr addop term
term = factor | term mulop factor
factor = variable | constant | (expr)
A procedure that recognizes an expression would call one that
recognizes a term, which in turn would call one that recognizes a
factor. The latter would call expression if it sees a '(' in the
input stream.
We will develop this in more detail later.
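A minimal Python sketch of such a mutually recursive parser, evaluating the expression as it recognizes it (the tokenizer and all names here are my illustration; the lecture develops its own version later):

```python
def evaluate(text):
    """Recursive-descent evaluator for the grammar expr/term/factor."""
    tokens = text.replace('(', ' ( ').replace(')', ' ) ').split()
    pos = [0]                          # mutable index shared by the parsers

    def peek():
        return tokens[pos[0]] if pos[0] < len(tokens) else None

    def advance():
        pos[0] += 1

    def expr():                        # expr = term { addop term }
        value = term()
        while peek() in ('+', '-'):
            op = peek(); advance()
            value = value + term() if op == '+' else value - term()
        return value

    def term():                        # term = factor { mulop factor }
        value = factor()
        while peek() == '*':
            advance()
            value = value * factor()
        return value

    def factor():                      # factor = constant | ( expr )
        if peek() == '(':
            advance()                  # consume '('
            value = expr()             # factor calls expr: mutual recursion
            advance()                  # consume ')'
            return value
        value = int(peek()); advance()
        return value

    return expr()
```

Note how factor calling expr closes the cycle expr -> term -> factor -> expr.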
IV. Limitations of recursion
A. Saving of variables: Consider the factorial example. If we execute
fact(3), we must at one point have three different "versions" of
N: 3, 2, 1. Thus, when we execute a recursive call we must tuck away
our current set of local variables until the call is complete.
1. This is done by using a stack, with each activation of the procedure
having its own stack frame containing its parameters, return
address, and local variables - usually in that order.
2. If we invoke fact(3), then when we finally get down to the trivial
case our stack will look like this:
--------------
n = 1
return to fact
--------------
n = 2
return to fact
--------------
n = 3
return to main
--------------
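This stack can be simulated by hand; the following Python sketch replaces the recursion in fact with an explicitly managed stack of saved n values (illustrative only):

```python
def fact_with_explicit_stack(n):
    """Compute n! using a hand-managed stack instead of recursion."""
    stack = []
    while n > 1:                 # "calling": push each frame's n
        stack.append(n)
        n = n - 1
    result = 1                   # base case reached
    while stack:                 # "returning": pop frames, multiplying
        result *= stack.pop()
    return result
```

The two loops mirror the descent to the base case and the unwinding of returns.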
3. In Pascal, management of a stack to support recursion is done
automatically by the language implementation. Not so with some other
languages, where you must manage the stack yourself if you wish to
recurse. (Cf later discussion on stacks) Indeed, FORTRAN and
COBOL explicitly forbid recursion, though you can work around this.
B. Efficiency:
1. Overhead of procedure calls, saving of variables etc.
2. For some problems, recursion is highly inefficient.
Ex: Recursive computation of Fib(6). One computes:
Fib(6)
/ \
Fib(5) Fib(4)
/ \ / \
Fib(4) Fib(3) Fib(3) Fib(2)
/ \ / \ / \
Fib(3) Fib(2) Fib(2) Fib(1) Fib(2) Fib(1)
/ \
Fib(2) Fib(1)
Fib(6) once
Fib(5) once
Fib(4) twice
Fib(3) thrice
Fib(2) 5 times
Fib(1) 3 times
In fact, the tree of computations has Fib(N) "leaves", and almost
as many internal nodes. In general, then, computing Fib(N)
requires more than Fib(N) recursive calls.
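The duplicated work can be counted directly (a Python sketch; the counting dictionary is my addition):

```python
from collections import Counter

calls = Counter()

def fib(n):
    calls[n] += 1                   # record every invocation of fib(n)
    if n <= 2:
        return 1
    return fib(n - 2) + fib(n - 1)

result = fib(6)
```

The counts reproduce the table above, and the number of base-case calls (the leaves of the tree) equals Fib(6) itself.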
C. Recursion may sometimes be more obscure than other approaches to
the problem - though often it is much more clear.
D. We will see shortly that ANY recursive program can be converted to one
using only loops - though sometimes at the expense of semi-spaghetti
code. For now, we illustrate:
1. Fact: F := 1;
FOR I := 2 TO N DO
F := I*F;
Fact := F
2. Fib: FibNMinus1 := 0;
FibN := 1;
FOR I := 2 TO N DO
BEGIN
NewFib := FibNMinus1 + FibN;
FibNMinus1 := FibN;
FibN := NewFib
END;
Fib := FibN
3. In general, a linear-recursive algorithm can almost always be
converted to a simple loop, and generally should be so converted.
(For a tail-recursive algorithm, the conversion is trivial.)
A binary recursive algorithm can sometimes be converted to a simple
loop (as with fib) - but not always. Converting some binary
recursive and most non-linear and mutually recursive algorithms to
non-recursive loop form can be very messy, however.
Copyright ©1999 - Russell C. Bjork