CS122 Lecture: Recursion                        - Last revised 3/9/98

Materials: 

1. Towers of Hanoi toy and demo program (in [.DEMOS.PASCAL])
2. Pascal syntax graphs transparency
3. Quicksort and permute transparencies

Objectives:

1. To review the definition of "recursive"
2. To give examples of uses of recursion
3. To review how to implement recursion in Pascal
4. To note some dangers/limitations of recursion

I. Introduction

   A. Today we review a programming technique that is very powerful.
      The technique is known as recursion.

   B. Some definitions:

      1. A recursive DEFINITION is one that defines some entity partially in
         terms of itself.

      2. A recursive PROCEDURE/FUNCTION is one that can call itself.

      3. Recursive procedures/functions often arise in programs when one is
         dealing with entities that are defined recursively.

   C. An example:

      1. The factorial operation can be defined mathematically as follows:
         For all integers N >=0: N! = IF N <= 1 THEN 1 ELSE N*(N-1)!

      2. In Pascal, this can be coded as follows:
        
        function fact(n: NonNegInt): NonNegInt;
            begin
                if n <= 1 then
                    fact := 1
                else
                    fact := n * fact(n-1)
            end;

      3. Trace operation for fact(3), showing how a stack of n values and
         return points is maintained.  (Note: we will discuss the use of
         stacks in recursion shortly)

        "n" Stack       Current "fact"          Return point stack

        3               ?                       main
        3 2             ?                       main fact
        3 2 1           ?                       main fact fact
                        1
        3 2             1                       main fact
                        2
        3               2                       main
                        6
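
         The same trace can be produced by the program itself.  The following
         is a minimal sketch, assuming a Turbo Pascal or Free Pascal style
         compiler; the extra depth parameter and the output format are
         assumptions added for illustration and are not part of fact as
         given above.

```pascal
program FactTrace;
(* Sketch: prints each entry to and return from fact, indented by the
   depth of recursion.  The depth parameter is an illustrative addition. *)

function fact(n, depth: integer): integer;
    var
        value: integer;
    begin
        writeln(' ': 2 * depth + 1, 'enter fact(', n: 1, ')');
        if n <= 1 then
            value := 1                              (* base case *)
        else
            value := n * fact(n - 1, depth + 1);    (* recursive case *)
        writeln(' ': 2 * depth + 1, 'fact(', n: 1, ') returns ', value: 1);
        fact := value
    end;

begin
    writeln(fact(3, 0): 1)
end.
```

         All three "enter" lines are printed before any "returns" line,
         which mirrors the stack growing to 3 2 1 before it shrinks again.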
               
   D. Note well that a recursive definition/procedure differs from a circular
      definition/procedure.  A recursive definition/procedure must have at
      least one non-recursive component; and it must be guaranteed that 
      the definition/procedure will terminate after a finite number of
      recursions for any (finite) input.

      1. The non-recursive case(s) is/are called BASE CASE(S).  There must
         be at least one, and often there are more.

         Example: For factorial the base case is n <= 1.

      2. Thm: The recursive definition/function for fact will terminate after
         n-1 recursions for all n >= 1.

         Proof: By induction:

         a. Basis: When n=1, the algorithm terminates after 0 recursions.

         b. Hypothesis: Suppose that for some k >= 1 the algorithm terminates
            in n-1 recursions for all n with 1 <= n <= k.

         c. Induction step: When n = k+1 the algorithm terminates after 
            n-1 = k recursions.
            Proof: We calculate k+1 factorial as (k+1) * k factorial. But
            calculating k factorial involves k-1 recursions by our hypothesis.
            Therefore k+1 factorial involves 1 + (k-1) = k recursions. QED.

II. Why use recursion?

    A. Some entities are most naturally defined using recursion:

       1. Factorial

       2. Fibonacci numbers: Fib(N) = IF N <= 2 THEN 1 ELSE Fib(N-1) + Fib(N-2)

          - leading to the following code:

          function fib(n: integer): integer;
            begin
              if n <= 2 then
                fib := 1
              else
                fib := fib(n-2) + fib(n-1)
            end;

       3. Syntax of languages - both natural and programming

          a. Ex: English:

                <noun phrase> ::= <adjective phrase> <noun> | 
                                  <noun phrase> <noun phrase> <verb>

                e.g. the        - adj phrase
                     the ball   - noun phrase
                     the dog    - noun phrase
                     the ball the dog bit - noun phrase
                     the ball the man the dog bit hit - noun phrase

          b. Ex: Pascal: Show syntax graph for "block"; "expression"

       4. Many Artificial Intelligence applications use recursive lists

       5. Trees: A tree consists of a root and zero or more subtrees, each of
          which is a tree. (Draw)
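
          A recursively defined entity usually leads to a recursively defined
          type and to recursive procedures that process it.  The following is
          a minimal sketch, assuming each node has at most two subtrees and
          that an empty subtree is represented by nil; all of the names are
          hypothetical.

```pascal
program TreeSize;
(* Sketch: a binary tree as a recursive type, with a recursive function
   that counts its nodes.  Names and the two-subtree limit are assumptions. *)
type
    TreePtr = ^TreeNode;
    TreeNode = record
        left, right: TreePtr        (* each subtree is itself a tree *)
    end;

function makeNode(l, r: TreePtr): TreePtr;
    var
        p: TreePtr;
    begin
        new(p);
        p^.left := l;
        p^.right := r;
        makeNode := p
    end;

function countNodes(t: TreePtr): integer;
    begin
        if t = nil then
            countNodes := 0         (* base case: empty subtree *)
        else
            countNodes := 1 + countNodes(t^.left) + countNodes(t^.right)
    end;

var
    root: TreePtr;
begin
    (* a root with a leaf on the left and a two-node subtree on the right *)
    root := makeNode(makeNode(nil, nil), makeNode(makeNode(nil, nil), nil));
    writeln(countNodes(root): 1)    (* prints 4 *)
end.
```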

    B. Some problems are most easily solved using recursion.

       1. Ex: Towers of Hanoi

          a. Illustrate using model.

          b. Observe:

             i. For the case N=1 the problem is trivial.

            ii. For N>1 the problem can be solved in terms of two subproblems
                of size N-1: To move N disks from peg A to peg B: 

                (1) Move top N-1 from peg A to peg C
                (2) Move 1 disk from peg A to peg B
                (3) Move N-1 disks from peg C to peg B

          c. Demonstrate for case N=3

          d. A portion of a program:

        procedure towers(n: integer; startPeg, finishPeg, helpPeg: char);
        (* Prints directions for moving n disks from start peg to finish peg *)

            begin
                if n <= 1 then
                    writeln('Move disk from peg ', startPeg,' to peg ',
                             finishPeg)
                else
                  begin
                    towers(n-1, startPeg, helpPeg, finishPeg);
                    writeln('Move disk from peg ', startPeg,' to peg ',
                             finishPeg);
                    towers(n-1, helpPeg, finishPeg, startPeg)
                  end
            end;

         DEMO: [DEMOS.PASCAL]HANOI_GRAPHIC

         DEMO: [DEMOS.PASCAL]HANOI_STACK
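
         The recursive structure also tells us how many moves are needed.
         The following is a minimal sketch that counts moves instead of
         printing them; the name countMoves is an assumption, and the
         function simply mirrors the shape of towers (two subproblems of
         size n-1 plus one move).

```pascal
program HanoiCount;
(* Sketch: counts the moves made by the recursive solution.
   countMoves is a hypothetical name chosen for illustration. *)

function countMoves(n: integer): longint;
    begin
        if n <= 1 then
            countMoves := 1
        else
            countMoves := countMoves(n - 1) + 1 + countMoves(n - 1)
    end;

var
    n: integer;
begin
    for n := 1 to 10 do
        writeln(n: 2, ' disks: ', countMoves(n): 1, ' moves')
end.
```

         The counts are 1, 3, 7, ... , 1023 - i.e. 2^N - 1 moves for N disks,
         so the work doubles with each additional disk.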

      2. In general, recursion is an appropriate way to tackle a problem if:

         a. It has a "size".

         b. It is trivial for a certain size.

         c. A "big" problem can always be reduced to one or more subproblems of
            lesser size.

         d. Examples:

            i. Find the Greatest Common Divisor of A,B (A >= B)

               (1) Let B be the "size"
               (2) The problem is trivial when B is 0.
               (3) For B > 0, the problem can be reduced to GCD(B, A MOD B)
                   (Euclid's Algorithm - for proof see Dromey p 97 f)
                   Now clearly A MOD B < B; therefore we have reduced the size
                   of the problem

                   This leads to:

                function gcd(a,b: integer): integer;
                
                begin
                        if b = 0 then
                                gcd := a
                        else
                                gcd := gcd(b, a mod b)
                end;
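
                A worked trace, using A = 48 and B = 18 (values chosen here
                only for illustration), shows the size shrinking at each step:

                    gcd(48, 18) = gcd(18, 12)     since 48 mod 18 = 12
                                = gcd(12, 6)      since 18 mod 12 = 6
                                = gcd(6, 0)       since 12 mod 6 = 0
                                = 6               base case: B = 0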

           ii. Sorting a list of non-equal items into ascending order:

               (1) Let the number of items in the list be the size (N).
               (2) The problem is trivial when N=1
               (3) For N>1 we can reduce the problem to 2 subproblems of
                   smaller size, as follows:

                   (a) Choose an arbitrary item from the list - say the
                       first.  Call it M.
                   (b) Divide the list into two sublists - one consisting of
                       all items smaller than M and one consisting of all items
                       greater than M.
                   (c) Clearly each sublist contains fewer than N items, since
                       M is not a member of either.
                   (d) Sort the original list by sorting each sublist and then
                       gluing back together: sorted list of items smaller than
                       M; M; sorted list of items larger than M.

               (4) This is the basis of a standard sorting algorithm known as
                   quicksort: (TRANSPARENCY)

        type
            ListToSort = array[1..max] of sometype;
        ...
        procedure QuickSort(var L: ListToSort; lo, hi: integer);
        (* Sorts a sublist of L: L[lo] .. L[hi] recursively *)

            var
                i, j: integer;  (* Pointers for partitioning the list.  i
                                   starts at lo, moves up; j starts at hi+1,
                                   moves down. *)
            begin

                if hi <= lo then
                    (* the trivial case-a zero or one element list-do nothing *)
                else
                  begin

                    (* Partition the original list *)

                    i := lo;
                    j := hi + 1;
                    repeat
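                        repeat
                            i := i + 1
                        (* Test the bound first so that, on compilers with
                           short-circuit evaluation, L[hi+1] is never read *)
                        until (i > hi) or (L[i] >= L[lo]);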
                        repeat
                            j := j - 1
                        until (L[j] <= L[lo]);
                        if i < j then
                            exchange(L[i], L[j])
                    until i >= j;
                    exchange(L[lo],L[j]);

                    (* L[lo..j-1] now contains items <= L[j] and L[j+1..hi]
                       contains items >= L[j]. Now we sort the sublists.  *)

                    QuickSort(L,lo,j-1);
                    QuickSort(L,j+1,hi)
                  end
            end;

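        (* To call the procedure from main, we use the call:

        QuickSort(list, 1, max)     where list is a variable of type
                                    ListToSort ("array" itself is a reserved
                                    word and cannot be a variable name) *)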
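
        The procedure exchange is called above but not shown.  The following
        is a minimal sketch of what it might look like, assuming sometype is
        char; the body is an assumption, not the version on the transparency.
        The parameters must be var parameters so that the caller's array
        elements are actually swapped.

```pascal
program ExchangeDemo;
(* Sketch of the exchange helper used by QuickSort.
   Assumption: sometype is char; the real element type is not given. *)
type
    sometype = char;

procedure exchange(var a, b: sometype);
    var
        temp: sometype;
    begin
        temp := a;
        a := b;
        b := temp
    end;

var
    x, y: sometype;
begin
    x := 'C';
    y := 'B';
    exchange(x, y);
    writeln(x, ' ', y)      (* prints B C *)
end.
```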

               (5) Trace when applied to the list C A D E B:

                   - First call: lo = 1, hi = 5

                        L[lo]   i       j       Action

                        C       1       6       Enter first inner repeat
                                2       6
                                3       6       Exit first inner repeat; enter
                                                second
                                3       5       Exit second inner repeat;
                                                exchange L[3], L[5]
                        List is now C A B E D

                                3       5       Enter first inner repeat
                                4       5       Exit first inner repeat; enter
                                                second
                                4       4       
                                4       3       Exit second inner repeat
                                                No inner exchange
                                                Exit outer repeat
                                4       3       Exchange L[1], L[3]

                        List is now B A C E D

                   - Second call: lo = 1, hi = 2, sublist = B A

                        B       1       3       Enter first inner repeat
                                2       3
                                3       3       Exit first inner repeat; enter
                                                second
                                3       2       Exit second inner repeat
                                                No inner exchange
                                                Exit outer repeat
                                3       2       Exchange L[1], L[2]

                        List is now A B C E D

                   - Third call: lo = 1, hi = 1, sublist = A - trivial

                   - Fourth call: lo = 3, hi = 2 - trivial

                   - Fifth call (from first) lo = 4, hi = 5, sublist = E D

                        E       4       6       Enter first inner repeat
                                5       6
                                6       6       Exit first inner repeat; enter
                                                second
                                6       5       Exit second inner repeat
                                                No inner exchange
                                                Exit outer repeat
                                                Exchange L[4], L[5]

                        List is now A B C D E

                   - Sixth call: lo = 4, hi = 4, sublist = D - trivial

                   - Seventh call: lo = 6, hi = 5 - trivial

                Algorithm terminates: list is A B C D E

               (6) What is the time complexity of this algorithm? ASK

                   - This is an example of an algorithm for which we must
                     consider two cases: the average case and the worst
                     case.

                   - The average case behaves much like the case in which each
                     partitioning divides the list approximately in half.  In
                     that case, we have the following situation:

                                1 list of size N
                                        |
                                2 lists of size N/2
                                        |
                                4 lists of size N/4
                                        |
                                       ...
                                N lists of size 1

                     - At each level, partitioning the list(s) involves O(N) 
                       total work (size of each list * number of lists = N)

                     - There are O(log_2 N) levels.

                       Proof: At level k, # of lists = 2^(k-1)

                            At last level, # of lists = N, so we get:

                            N = 2^(#levels - 1); #levels = log_2 N + 1

                     - Total time is O(N log N), which compares very favorably
                       with methods you have learned previously that are
                       O(N^2)

                       Example: For N = 1000, N log_2 N is about 10,000, while
                       N^2 = 1 million

                   - Alas, we must also consider the worst case, which occurs
                     if the list is either already sorted or already sorted in
                     reverse order:

                                1 list of size N
                                        |
                                1 list of size 1 plus 1 of size N-1
                                        |
                                1 list of size 1 plus 1 of size N-2
                                        |
                                       ...
                                2 lists of size 1
                     Now the work is N + N-1 + N-2 + ... + 1 = O(N^2)!
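
                     As a worked check, the sum has the standard closed form

                         N + (N-1) + ... + 1 = N(N+1)/2

                     so for N = 1000 the worst case costs 1000 * 1001 / 2 =
                     500,500 units of partitioning work, against roughly
                     10,000 when each partition splits the list in half.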

          iii. Generating all permutations of a list of distinct items.

               (1) Let the number of items in the list be the size (N).
               (2) The problem is trivial when N=1
               (3) For N>1 we can reduce the size of the problem by 1 as
                   follows: (TRANSPARENCY)

        type
            itemlist = array[1..max] of itemtype;

        procedure permute(var item: itemlist; n: 1..max);

            var
                i: 1..max;

            begin

                if n = 1 then
                  begin
                    for i := max downto 1 do
                        write(item[i]);
                    writeln
                  end
                else
                    for i := 1 to n do
                      begin
                        exchange(item[i],item[n]);
                        permute(item, n-1);
                        exchange(item[i],item[n])
                      end

            end;
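
        The code above leaves max, itemtype, exchange and the initial call
        unspecified.  The following is a minimal complete sketch, assuming
        max = 3 and itemtype = char; those choices, the body of exchange and
        the variable name letters are assumptions made for illustration.

```pascal
program PermuteDemo;
(* Sketch: a driver for permute.  Assumptions: max = 3, itemtype = char,
   and a simple three-assignment exchange. *)
const
    max = 3;
type
    itemtype = char;
    itemlist = array[1..max] of itemtype;

procedure exchange(var a, b: itemtype);
    var
        temp: itemtype;
    begin
        temp := a;
        a := b;
        b := temp
    end;

procedure permute(var item: itemlist; n: integer);
    var
        i: integer;
    begin
        if n = 1 then
          begin
            for i := max downto 1 do
                write(item[i]);
            writeln
          end
        else
            for i := 1 to n do
              begin
                exchange(item[i], item[n]);     (* choose item for slot n *)
                permute(item, n - 1);           (* permute the rest *)
                exchange(item[i], item[n])      (* restore the list *)
              end
    end;

var
    letters: itemlist;
begin
    letters[1] := 'A';
    letters[2] := 'B';
    letters[3] := 'C';
    permute(letters, max)       (* initial call uses the full size *)
end.
```

        The program prints 3! = 6 lines, one for each ordering of A, B, C.
        The second exchange matters: it puts the list back as it was, so the
        next pass of the loop starts from the same arrangement.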

III. Classes of recursive algorithms:

     A. A linear recursive algorithm is one in which each non-trivial call to
        the recursive procedure gives rise to one new call to that procedure -
        i.e. the procedure calls itself only at one point:

        Ex: factorial, gcd

        - An important subcategory of linear recursive is tail-recursive.  In a
        tail-recursive algorithm, the recursive call is the very last step in
        the algorithm.  Tail-recursive algorithms can be easily converted to
        non-recursive form, replacing the recursion with a loop.  (Show for
        gcd).
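
        For gcd, the conversion might look like the following minimal
        sketch.  The name gcdLoop and the temporary variable are assumptions;
        the idea is that the recursive call gcd(b, a mod b) is replaced by
        overwriting the parameters and going around the loop again.

```pascal
program GcdLoop;
(* Sketch: the tail-recursive gcd rewritten as a loop.
   gcdLoop is a hypothetical name chosen for illustration. *)

function gcdLoop(a, b: integer): integer;
    var
        temp: integer;
    begin
        (* Each pass does the work of one recursive call gcd(b, a mod b) *)
        while b <> 0 do
          begin
            temp := a mod b;
            a := b;
            b := temp
          end;
        gcdLoop := a            (* base case: b = 0 *)
    end;

begin
    writeln(gcdLoop(48, 18): 1)     (* prints 6 *)
end.
```

        No stack is needed because nothing remains to be done after the
        recursive call returns, so the old values of a and b can be discarded.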

     B. A binary recursive algorithm is one in which each non-trivial call to
        the recursive procedure gives rise to two new calls to that procedure -
        i.e. the procedure calls itself at two points. Many of the most
        important recursive algorithms fall into this category - including
        those based on a "divide and conquer" approach.

        Ex: Fibonacci, towers of Hanoi, QuickSort

     C. A non-linear recursive algorithm is one in which each non-trivial call
        to the recursive procedure gives rise to a varying number of new calls
        to that procedure (often more than two).  Frequently, such a procedure
        calls itself from within a loop.

        Ex: permute

     D. A mutually recursive algorithm is one which contains several recursive
        procedures which call themselves indirectly - e.g. A calls B, B calls 
        C, and C calls A.  Parsers for languages (artificial and natural) are
        often mutually recursive.

        Ex: recursive descent parser for Pascal-like arithmetic expressions:

                expr = term | expr addop term
                term = factor | term mulop factor
                factor = variable | constant | (expr)

        A procedure that recognizes an expression would call one that
        recognizes a term, which in turn would call one that recognizes a
        factor.  The latter would call expression if it sees a '(' in the
        input stream.

        We will develop this in more detail later.
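
        In the meantime, the following minimal sketch shows the shape of the
        mutual recursion.  It evaluates rather than merely recognizes, and it
        makes several simplifying assumptions: constants are single digits,
        the only operators are + - and *, the input has no spaces, and there
        is no error checking.  The left-recursive rules are handled with
        while loops, and all names (src, idx, peek) are hypothetical.  The
        forward declaration is what allows factor to call expr before expr
        has been defined.

```pascal
program ExprDemo;
(* Sketch of mutual recursion: expr calls term, term calls factor, and
   factor calls expr for a parenthesized subexpression.  Assumes well-formed
   input of single digits, + - * and parentheses. *)
var
    src: string;
    idx: integer;

function peek: char;
    begin
        if idx <= length(src) then
            peek := src[idx]
        else
            peek := ' '                 (* past the end of the input *)
    end;

function expr: integer; forward;

function factor: integer;
    var
        value: integer;
    begin
        if peek = '(' then
          begin
            idx := idx + 1;             (* skip '(' *)
            value := expr;              (* indirect recursion *)
            idx := idx + 1              (* skip ')' *)
          end
        else
          begin
            value := ord(peek) - ord('0');
            idx := idx + 1
          end;
        factor := value
    end;

function term: integer;
    var
        value: integer;
    begin
        value := factor;
        while peek = '*' do
          begin
            idx := idx + 1;
            value := value * factor
          end;
        term := value
    end;

function expr: integer;
    var
        value: integer;
        op: char;
    begin
        value := term;
        while (peek = '+') or (peek = '-') do
          begin
            op := peek;
            idx := idx + 1;
            if op = '+' then
                value := value + term
            else
                value := value - term
          end;
        expr := value
    end;

begin
    src := '2*(3+4)-5';
    idx := 1;
    writeln(expr: 1)                    (* prints 9 *)
end.
```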

IV. Limitations of recursion

     A. Saving of variables: Consider the factorial example.  If we execute
        fact(3), we must at one point have three different "versions" of
        n: 3, 2, 1 (the base case n <= 1 stops the recursion at 1).  Thus,
        when we execute a recursive call we must tuck away our current set
        of local variables until the call is complete.

        1. This is done by using a stack, with each activation of the procedure
           having its own stack frame containing its parameters, return
           address, and local variables - usually in that order.

        2. If we invoke fact(3), then when we finally get down to the trivial
           case (n = 1) our stack will look like this:

        --------------
        n = 1
        return to fact
        --------------
        n = 2
        return to fact
        --------------
        n = 3
        return to main
        --------------

        3. In Pascal, management of a stack to support recursion is done
           automatically by the language implementation. Not so with some other
           languages, where you must manage the stack yourself if you wish to
           recurse.  (Cf later discussion on stacks)  Indeed, older standards
           such as FORTRAN 77 and COBOL 85 do not permit recursion, though you
           can work around this.

     B. Efficiency:

        1. Overhead of procedure calls, saving of variables etc.

        2. For some problems, recursion is highly inefficient.
           Ex: Recursive computation of Fib(6).  One computes:

                                Fib(6)
                         /              \
                Fib(5)                          Fib(4)
              /        \                     /          \
        Fib(4)          Fib(3)          Fib(3)        Fib(2)
       /      \        /      \        /     \
   Fib(3)    Fib(2)  Fib(2)  Fib(1)  Fib(2)  Fib(1)
  /    \
Fib(2)  Fib(1)
                                6
                Fib(5) once  
                Fib(4) twice 
                Fib(3) thrice
                Fib(2) 5 x 
                Fib(1) 3 x

           In fact, there are Fib(N) "leaves" to the tree of computations,
           and almost as many internal nodes.  In general, then, Fib(N) 
           requires > Fib(N) recursions.
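
           The growth can be measured directly.  The following is a minimal
           sketch that counts how often the recursive fib is entered; the
           global counter calls is an assumption added for illustration.

```pascal
program FibCalls;
(* Sketch: counts the calls made by the recursive fib.
   The counter "calls" is an illustrative addition. *)
var
    calls, f: longint;

function fib(n: integer): longint;
    begin
        calls := calls + 1;
        if n <= 2 then
            fib := 1
        else
            fib := fib(n - 2) + fib(n - 1)
    end;

begin
    calls := 0;
    f := fib(6);
    writeln('fib(6) = ', f: 1, ', calls = ', calls: 1);
    calls := 0;
    f := fib(20);
    writeln('fib(20) = ', f: 1, ', calls = ', calls: 1)
end.
```

           For Fib(6) = 8 the count is 15, matching the tree above (8 leaves
           plus 7 internal nodes); for Fib(20) = 6765 it is 13529.  In
           general the count is 2*Fib(N) - 1, since a binary tree with
           Fib(N) leaves has Fib(N) - 1 internal nodes.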

     C. Recursion may sometimes be more obscure than other approaches to 
        the problem - though often it is much clearer.

     D. We will see shortly that ANY recursive program can be converted to one
        using only loops - though sometimes at the expense of semi-spaghetti
        code.  For now, we illustrate:

        1. Fact:        F := 1;
                        FOR I := 2 TO N DO
                                F := I*F;
                        Fact := F

        2. Fib:         FibNMinus1 := 0;
                        FibN := 1;
                        FOR I := 2 TO N DO
                           BEGIN
                                NewFib := FibNMinus1 + FibN;
                                FibNMinus1 := FibN;
                                FibN := NewFib
                           END;
                        Fib := FibN

        3. In general, a linear-recursive algorithm can almost always be
           converted to a simple loop, and generally should be so converted.
           (For a tail-recursive algorithm, the conversion is trivial.)
           A binary recursive algorithm can sometimes be converted to a simple
           loop (as with fib) - but not always.  Converting some binary
           recursive and most non-linear and mutually recursive algorithms to 
           non-recursive loop form can be very messy, however.

Copyright ©1999 - Russell C. Bjork