Macros and Conditional Assembly

CS222 Lecture: Macros and Conditional Assembly                          10/30/92
                                                                revised 2/16/96

Materials: Handout showing "faking" of new VAX opcodes via Macros

Introduction
------------

   A. After writing a few MACRO programs, you realize they tend to be long and
      may involve a lot of repetitious code that is tedious to write.

      Example: In one of the labs, you were required to check for overflow after
               various arithmetic operations.   Suppose that, instead of
               terminating processing of the equation when an overflow 
               occurred, you were asked to simply write a warning message
               and keep going.  Suppose further that we would be happy using
               the same message in each case.  The code to do this might look 
               like this:

                BVC     1$
                MOVAB   O_MESSAGE, R0
                MOVL    #O_MESSAGE_LEN, R1
                JSB     PLINE
        1$:

      This same block of code would need to be incorporated about 4 times
      in your program.

   B. To reduce the repetitious work, many assemblers (including ours) contain
      a macro facility that allows a programmer to define a block of code to
      be inserted into a program whenever needed by a simple command.

      Example: The above could be handled as follows:

      1. Define a macro

                .MACRO  OCHECK, ?NONE
                BVC     NONE
                MOVAB   O_MESSAGE, R0
                MOVL    #O_MESSAGE_LEN, R1
                JSB     PLINE
        NONE:
                .ENDM   OCHECK

      2. Each time the code is needed, it could be included by:

                OCHECK

         This will cause the assembler to include all the code between
         the .MACRO and .ENDM directives.

      3. We will now look at the macro facility in VAX MACRO in detail, followed
         by a feature often used in conjunction with it called CONDITIONAL 
         ASSEMBLY.

I. The Macro Facility in VAX MACRO
-  --- ----- -------- -- --- -----

   A. The macro facility allows a program to contain MACRO DEFINITIONS and
      MACRO CALLS.

      1. A macro definition begins with a .MACRO directive, and ends with a
         .ENDM directive.  The code that lies between these directives 
         constitutes the BODY of the macro.

      2. The .MACRO directive gives the macro a name.  When this name is
         encountered in the op-code position on a subsequent line of the
         source program, it is considered a macro call, and the macro body is 
         substituted for it.  (This process is called MACRO EXPANSION)

   B. The macro facility, while included in the assembler, is conceptually a
      distinct facility from the assembler itself.  We can think of it as
      being done by a PRE-PROCESSOR that manipulates the source program
      text before the assembler proper sees it:

         _________       _____________        _____________
        / Source /       | Pre-      |        | Assembler |
       (  File  (------->| processor |------->| proper    |
        \________\       |___________|        |___________|
                               ^
                               |
                               v
                         ______________
                         | macro      |
                         | definition |
                         | table      |
                         |____________|

      1. Ordinarily, the pre-processor simply passes source text through to
         the assembler proper unchanged.

         On diagram:    --------------->

      2. However, when the pre-processor encounters a .MACRO directive, it
         stores the text that follows (up to the terminating .ENDM directive)
         in the macro definition table.  This text is NOT passed through to
         the assembler proper.

         On diagram:    ---------
                                |
                                v

      3. When the pre-processor encounters the name of a defined macro, it
         feeds the body of the macro to the assembler proper, just as if
         it had come from the source file.

         On diagram:            ------->
                                |
                                ^

      4. Note that the body of a macro is not processed by the assembler until
         the macro is called. 

         a. Thus, if the macro body contains syntax errors, they will not be 
            detected until the macro is called.

         b. We will see that macro bodies can contain directives, including
            other macro definitions.  However, these are not processed either
            until the macro is called.
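
         Example (a minimal sketch; BROKEN and MOVX are made-up names used
         only for illustration): the definition below is stored without
         complaint, even though MOVX is not a VAX opcode.

                .MACRO  BROKEN
                MOVX    R0, R1          ; Not a real opcode - not examined yet
                .ENDM   BROKEN

         Following the description above, the bad line should be reported only
         where the macro is called, and once for each call:

                BROKEN                  ; MOVX reaches the assembler proper here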

   C. Contrast macros with procedures:

      1. Procedure: one physical copy is assembled, which may be called from
         many points in the program, at run-time.

      2. Macro: a separate physical copy is assembled each time the macro is
         called, at assembly-time.

      3. Efficiency consequences:

         a. Procedures are generally more space efficient.  (Only one physical
            copy exists.)

         b. Macros are generally more time efficient.  (There is no run-time
            overhead for calling.)
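
         Example (a sketch; CLEAR3 and CLEAR3M are hypothetical names): the
         same three instructions packaged first as a procedure, then as a
         macro.

        CLEAR3: CLRL    R0              ; The only copy of the body
                CLRL    R1
                CLRL    R2
                RSB                     ; Return to the caller

                JSB     CLEAR3          ; Each call costs a JSB and an RSB
                JSB     CLEAR3

                .MACRO  CLEAR3M
                CLRL    R0
                CLRL    R1
                CLRL    R2
                .ENDM   CLEAR3M

                CLEAR3M                 ; Three CLRLs are assembled here
                CLEAR3M                 ; ... and three more here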

   D. To make macros more flexible, they can have ARGUMENTS: formal arguments
      in the definition, and actual arguments in the call.

      1. Example: A macro to perform the mod operation

                .MACRO  MODL A, B
                DIVL3   A, B, -(SP)     ; Top of stack := B div A
                MULL2   A, (SP)         ; Top of stack := A * (B div A)
                SUBL2   (SP)+, B        ; B := B - A * (B div A) = B mod A
                .ENDM   MODL

         This might be called by something like:

                MODL    #10, R0         ; Compute R0 mod 10

         a. In the definition, A and B are FORMAL ARGUMENTS.

         b. In the call, #10 and R0 are ACTUAL ARGUMENTS.

         c. The code that the assembler will assemble is the same as if the
            programmer had written:

                DIVL3   #10, R0, -(SP)
                MULL2   #10, (SP)
                SUBL2   (SP)+, R0

         d. Normally, a macro call will have as many actual arguments as the
            definition has formal arguments.

            i. It is permissible for the call to have fewer arguments - in
               which case the unmatched formal arguments are blank (unless a
               default is specified, as described below.)

           ii. It is an error for a macro call to specify more actual arguments
               than the definition has formal arguments.

      2. Note that the processing of macro arguments is a TEXTUAL operation.
         The TEXT of the actual argument is substituted for the text of the
         corresponding formal argument wherever it occurs. 
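
         Example (a sketch using the MODL macro above; the choice of (R1)+
         as an actual argument is ours): because substitution is textual,
         an actual argument with a side effect is repeated wherever the
         formal argument appears.  The call

                MODL    #10, (R1)+

         would expand to

                DIVL3   #10, (R1)+, -(SP)
                MULL2   #10, (SP)
                SUBL2   (SP)+, (R1)+    ; R1 is autoincremented a second time

         so the subtraction is applied to the longword AFTER the one that
         was divided - probably not what the caller intended.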

      3. Some additional facilities available for use with macro arguments

         a. The concatenation operator - '

            Example: Instead of creating a MACRO for MODL, we could create
                     a generic macro that works for any data type for which
                     multiplication and division are defined, as follows:

                .MACRO          MOD2 TYPE, A, B
                DIV'TYPE'3      A, B, -(SP) ; Top of stack := B div A
                MUL'TYPE'2      A, (SP)     ; Top of stack := A * (B div A)
                SUB'TYPE'2      (SP)+, B    ; B := B - A * (B div A) = B mod A
                .ENDM   MOD2

            If this were called by

                MOD2    L, #10, R0

            The generated code would be:

                DIVL3   #10, R0, -(SP)
                MULL2   #10, (SP)
                SUBL2   (SP)+, R0

            While if it were called by

                MOD2    B, #10, R0

            The generated code would be:

                DIVB3   #10, R0, -(SP)
                MULB2   #10, (SP)
                SUBB2   (SP)+, R0

            In fact, we could get really fancy and define further macros to
            give the appearance that the VAX has a MODx machine instruction
            even though it doesn't:

                .MACRO  MODB2   X,Y
                MOD2    B,X,Y
                .ENDM   MODB2

                -- MODW2, MODL2 done similarly

         b. The use of non-positional syntax for actual arguments

            Example: Given the above definition for MOD2, all of the following
                     calls are equivalent:

                MOD2    L, #10, R0
                MOD2    TYPE=L, A=#10, B=R0
                MOD2    TYPE=L, B=R0, A=#10
                MOD2    A=#10, B=R0, TYPE=L

                (plus several other possibilities)

         c. The specification of default values for arguments, allowing an
            actual argument to be omitted

            Example: The following macro swaps two longword arguments.
                     This requires a temporary location, which can be either
                     specified by the programmer, or R0 will be used by
                     default

                .MACRO  SWAPL A, B, TEMP=R0
                MOVL    A, TEMP
                MOVL    B, A
                MOVL    TEMP, B
                .ENDM   SWAPL

            If this were called by 

                SWAPL X, Y, R7

            The generated code would be

                MOVL    X, R7
                MOVL    Y, X
                MOVL    R7, Y

            However, if it were called by

                SWAPL   X, Y

            The generated code would be

                MOVL    X, R0
                MOVL    Y, X
                MOVL    R0, Y

            i. When a macro is called with positional syntax, only the last
               argument(s) can meaningfully have defaults

           ii. If a macro is going to be called with non-positional syntax,
               any argument can have a default and can be omitted.
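
            Example (a sketch; MOVE is a hypothetical macro): here the FIRST
            argument has a default, which is convenient with non-positional
            calls.

                .MACRO  MOVE    TYPE=L, SRC, DST
                MOV'TYPE        SRC, DST
                .ENDM   MOVE

                MOVE    SRC=R1, DST=R2  ; TYPE defaults to L: MOVL R1, R2
                MOVE    W, R1, R2       ; Positional call: MOVW R1, R2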

        d. If an actual argument contains spaces or punctuation marks,
           it may be necessary to enclose it in angle brackets.

           Example:

                .MACRO  XYZ     ARGS
                ADDL3   ARGS
                .ENDM   XYZ

           The programmer may wish to call this by:

                XYZ     R0, R1, R2

           Intending the generated code to be:

                ADDL3   R0, R1, R2

           However, R0 would be taken as the match for ARGS, and R1 and R2
           would be regarded as extra arguments - an error.

           The call has to be written this way:

                XYZ     <R0, R1, R2>

           This would generate the intended code.

   E. Generating unique labels

      1. A common problem in writing macros is creating appropriate labels
         within the macro body without duplication.

         Example: Suppose we want to define a macro that adds two numbers and
                  branches to an error handler if an overflow occurs.  We could
                  write something like this:

                .MACRO  ADD_CHECKED     A, B
                ADDL2   A, B
                BVC     NO_OVER
                JMP     OVERFLOW_ERROR
        NO_OVER:
                .ENDM   ADD_CHECKED

         If this macro were called only once, we would have no problem; but if
         it were used two or more times (presumably the reason for defining it
         in the first place), then the symbol NO_OVER would be multiply defined.

      2. To address this problem, VAX MACRO has a facility that automatically
         generates unique labels - a different one each time the macro is 
         called.  Our macro would be defined this way:

                .MACRO  ADD_CHECKED     A, B, ?NO_OVER
                ADDL2   A, B
                BVC     NO_OVER
                JMP     OVERFLOW_ERROR
        NO_OVER:
                .ENDM   ADD_CHECKED

         a. If the macro is called with no third argument, a unique label
            is automatically generated.  These will be of the form
            30001$, 30002$ ...

            Example: Suppose we have the following series of calls:

                ADD_CHECKED X, R0
                ADD_CHECKED Y, R0

            The generated code might be

                ADDL2   X, R0
                BVC     30001$
                JMP     OVERFLOW_ERROR
        30001$:
                ADDL2   Y, R0
                BVC     30002$
                JMP     OVERFLOW_ERROR
        30002$:

         b. If the user wants to, he can specify a label to be used by
            specifying an explicit third argument.

            Example: The following call

                ADD_CHECKED R1, R2, OK
        
            Expands to:

                ADDL2   R1, R2
                BVC     OK
                JMP     OVERFLOW_ERROR
        OK:

   F. In addition to creating his own macros, a programmer can also use
      macros from a library of predefined macros.

      1. There is a library of predefined macros for accessing various system
         routines.  The names of these macros all begin with $.

         Example: the macro $EXIT_S calls the exit program system service.  It
                  takes an optional argument specifying the final status code
                  for the program.

      2. Other macro libraries can be incorporated in an assembly process by
         using a /LIBRARY qualifier on the command line or a .LIBRARY directive
         in the program.

   G. Macro definitions and calls can be NESTED:

      1. A macro definition can contain a call to another macro

         Example - a macro that checks the status value returned by a system
                   routine and exits the program if it is a failure status:

                .MACRO  CHECK   WHAT=R0, ?OK
                BLBS    WHAT, OK
                $EXIT_S WHAT
        OK:
                .ENDM   CHECK

         As we just noted, $EXIT_S is, in fact, a system macro which will be
         called as part of the process of expanding CHECK.

      2. A macro definition can contain a definition of another macro.  In
         this case, the nested definition is not processed until the outer
         macro is called.

         Example: We saw earlier that we could use macros to create the
                  appearance of new VAX machine instructions - e.g. MODB2,
                  MODL2, MODW2.  Of course, we could also create MODB3 etc,
                  and we could also create other operations like AND (which the 
                  VAX does not have).  

         We could automate much of the work as follows:

         a. Define "generic" MOD2, MOD3, AND2, AND3 etc. macros that take a
            type as a parameter.

         b. Define two more "constructor" macros:

                .MACRO  FAKE_OP NAME, TYPE

                        .MACRO NAME'TYPE'2  X, Y
                                NAME'2  TYPE, X, Y
                        .ENDM  NAME'TYPE'2

                        .MACRO NAME'TYPE'3  X, Y, Z
                                NAME'3  TYPE, X, Y, Z
                        .ENDM NAME'TYPE'3

                .ENDM   FAKE_OP

                .MACRO  DO_FAKES NAME
                FAKE_OP NAME, B
                FAKE_OP NAME, W
                FAKE_OP NAME, L
                .ENDM   DO_FAKES

         c. Now we could create six versions of a macro like MOD (given
            definitions for MOD2 and MOD3) by

                DO_FAKES        MOD
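
            Assuming the generic MOD2 defined earlier and a three-operand
            MOD3 written along the same lines (not shown here), this single
            call defines MODB2, MODB3, MODW2, MODW3, MODL2 and MODL3.  A
            later use such as

                MODL2   #10, R0

            would then expand in two steps, first to

                MOD2    L, #10, R0

            and then to

                DIVL3   #10, R0, -(SP)
                MULL2   #10, (SP)
                SUBL2   (SP)+, R0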

         HANDOUT
               
II. Conditional and Repeat Assembly
--  ----------- --- ------ --------

   A. The conditional and repeat assembly facility of VAX MACRO allows the
      programmer to "program" the assembler to generate certain code
      conditionally, or repeatedly.

      1. Example: Suppose a certain program contains debugging code, which
                  should be included when the program is assembled during
                  the software development process, but not when the "production"
                  version is assembled.  This can be handled as follows:

         a. Include at the beginning of the program one of the following lines

                DEBUG=1         ; if debugging code should be included
                DEBUG=0         ; if not

         b. Enclose each block of debugging code by:

                .IF     NE DEBUG
                ...
                .ENDC

            The enclosed code will be assembled if DEBUG is 1, and will be 
            ignored if DEBUG is 0.
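
            Example (a sketch; D_MESSAGE and D_MESSAGE_LEN are hypothetical
            names, and PLINE is the line-printing routine used earlier):

                DEBUG=1                 ; Change to 0 for production

                .IF     NE DEBUG
                MOVAB   D_MESSAGE, R0   ; Trace output, assembled only
                MOVL    #D_MESSAGE_LEN, R1 ; when DEBUG is non-zero
                JSB     PLINE
                .ENDC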

      2. Example: Suppose one wishes to initialize to all -1's an array of 40 
         longwords.  In terms of execution time, the fastest way to do this
         is to use a series of 10 MOVO instructions, each of which handles 4
         of them.  (This is much faster than a loop.)  This could be handled
         by writing:

                MOVAL   MY_ARRAY, R0
                MOVO    #-1, (R0)+
                MOVO    #-1, (R0)+
                MOVO    #-1, (R0)+
                MOVO    #-1, (R0)+
                MOVO    #-1, (R0)+
                MOVO    #-1, (R0)+
                MOVO    #-1, (R0)+
                MOVO    #-1, (R0)+
                MOVO    #-1, (R0)+
                MOVO    #-1, (R0)+

         However, the programmer could save the effort of writing the MOVO
         instruction 10 times by instead writing:

                MOVAL   MY_ARRAY, R0
                .REPEAT 10
                MOVO    #-1, (R0)+
                .ENDR

      3. Note that these conditional and repeat operations are done by the
         assembler AT ASSEMBLY TIME.  In essence, they constitute a program
         EXECUTED by the assembler to GENERATE code to be assembled.

   B. This facility is logically distinct from the macro facility we just
      looked at, but is usually studied with it for two reasons:

      1. The same preprocessor handles both macro definition / expansion and
         conditional and repeat assembly.

      2. One of the major uses of conditional assembly is in writing flexible
         macros that can be expanded in different ways to suit different
         circumstances.

         Example: The SWAPL macro we defined earlier uses R0 as a temporary
                  if the programmer specifies none.  It would be safer to
                  save R0 on the stack before using it.  This cannot be
                  handled by simply using the default TEMP=R0.

                .MACRO  SWAPL A, B, TEMP
                .IF     BLANK TEMP
                PUSHL   R0
                MOVL    A, R0
                MOVL    B, A
                MOVL    R0, B
                MOVL    (SP)+, R0
                .IF_FALSE
                MOVL    A, TEMP
                MOVL    B, A
                MOVL    TEMP, B
                .ENDC
                .ENDM   SWAPL

   C. In general, a conditional assembly block consists of:

      1. An initial .IF directive to specify the condition to be tested

         a. One form of condition compares an ASSEMBLY-TIME expression to 0

                EQUAL (or EQ) expression        - true iff expression = 0
                NOT_EQUAL (or NE) expression    - true iff expression <> 0
                GREATER (or GT) expression      - true iff expression > 0
                LESS_THAN (or LT) expression    - true iff expression < 0
                GREATER_EQUAL (or GE) expression- true iff expression >= 0
                LESS_EQUAL (or LE) expression   - true iff expression <= 0

            Example: assemble code iff the symbol DEBUG has non-zero value

                .IF NE DEBUG

         b. Another form tests to see whether or not a given symbol has been
            defined

                DEFINED (or DF) symbol
                NOT_DEFINED (or NDF) symbol

            Example: Give the symbol BUFFLEN a default value of 80 if no
                     value has been assigned

                .IF NDF BUFFLEN
                BUFFLEN=80
                .ENDC

         c. Another form tests macro actual arguments to see if they are blank

                BLANK (or B) formal_argument    - true iff argument is blank
                                                  (no actual specified)
                NOT_BLANK (or NB) formal_argument- true iff argument is nonblank
                                                  (a non-blank actual was 
                                                   specified)

            Example: See above

         d. Another form tests to see if an actual macro argument is the same
            as some specified text

                IDENTICAL (or IDN) formal_argument text
                DIFFERENT (or DIF) formal_argument text

            Example: The SWAPL macro we defined would fail if either A or B
                     were R0.  To achieve more complete generality (at the cost
                     of incredible complexity), we could include tests like

                .IF     IDENTICAL A, R0
                ...

      2. Any number of lines of code, optionally interspersed with any number of
         subconditional directives

         a. The directive .IF_FALSE (or .IFF) causes the code that follows it
            to be assembled if the main condition was FALSE.  (Its effect is
            like that of ELSE, but takes place at assembly time.)

            Example: SWAPL above

         b. The directive .IF_TRUE (or .IFT) can be inserted after a .IF_FALSE
            to resume assembling only if the main condition was TRUE.  (There
            is no analogue in conventional programming.)

         c. The directive .IF_TRUE_FALSE (or .IFTF) can be inserted to
            cause code in the middle of a conditional block to always be
            assembled.

            Example:

                .IF NE DEBUG
                -- this code assembled only if DEBUG is non-zero
                .IFF
                -- this code assembled only if DEBUG is zero
                .IFT
                -- this code assembled only if DEBUG is non-zero
                .IFTF
                -- This code always assembled
                .IFT
                -- this code assembled only if DEBUG is non-zero
                .IFF
                -- this code assembled only if DEBUG is zero
                .ENDC

      3. A terminating .ENDC directive

      4. If the body of a conditional assembly block would consist of only a
         single line, a short form - .IIF - can be used instead

        Example:

                .IIF NE DEBUG, JSB PRINT_WHERE

        If DEBUG <> 0, the following line will be assembled

                JSB PRINT_WHERE
                  
   D. A repeat assembly block can be used to cause a block of code to be
      assembled repeatedly (in effect throwing the assembler into a loop).  It
      consists of:

      1. One of the following repeat directives

         a. .REPEAT expression - where expression is an ASSEMBLY-TIME
            expression that evaluates to a positive integer

            Example:

                .REPEAT 10
                MOVO    #-1, (R0)+
                .ENDR

             Is equivalent to

                MOVO    #-1, (R0)+
                MOVO    #-1, (R0)+
                MOVO    #-1, (R0)+
                MOVO    #-1, (R0)+
                MOVO    #-1, (R0)+
                MOVO    #-1, (R0)+
                MOVO    #-1, (R0)+
                MOVO    #-1, (R0)+
                MOVO    #-1, (R0)+
                MOVO    #-1, (R0)+

         b. .IRP symbol, <list> - the repeat block code is assembled once for
            each item on the list, with the symbol replaced by the
            appropriate element (in essence, an ASSEMBLY-TIME for loop)

            Example:

                .MACRO  PUSH_LIST LIST
                .IRP    ITEM, <LIST>
                PUSHL   ITEM
                .ENDR
                .ENDM   PUSH_LIST

            If called by

                PUSH_LIST <R0, R3, R5, R7>

            This would expand to

                PUSHL   R0
                PUSHL   R3
                PUSHL   R5
                PUSHL   R7

         c. .IRPC symbol, <string> is similar, but the repeat block is
            assembled once for each character in the string.
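
            Example (a sketch, assuming the usual symbol, <string> form):

                .IRPC   DIGIT, <123>
                PUSHL   #DIGIT
                .ENDR

            This would expand to

                PUSHL   #1
                PUSHL   #2
                PUSHL   #3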

      2. A .ENDR directive terminates the repeat block

   E. As is true in programming languages, conditional and repeat blocks can
      be nested within each other, up to a maximum depth of 31 - e.g. the
      following is legal

        .IF GT FOO
        .IF GT BAR
        .REPEAT FOO+BAR
        ...
        .ENDR
        .ENDC
        .ENDC

      If both FOO and BAR are greater than zero, then the code inside the
      repeat block is assembled FOO+BAR times.  If either is <= zero, the
      REPEAT block is totally ignored.
 
   F. MACRO also supplies some additional directives that can be useful
      with conditional and repeat assembly blocks that occur in macros.

      1. .NARG symbol   -- Sets symbol to the number of actual arguments to the 
                           current macro call

         Example:

         .MACRO FOO A, B, C
         .NARG  N
         ...
         .ENDM FOO

         If called by

         FOO    R0, R1

         The symbol N would have the value 2

      2. .NCHR symbol, string -- sets symbol equal to the number of characters
         in string.
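
         Example (a sketch; MSG_LEN is a hypothetical symbol):

                .NCHR   MSG_LEN, <HELLO> ; MSG_LEN is set to 5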

      3. .NTYPE symbol, operand -- sets symbol equal to the operand specifier
         byte corresponding to operand

         Example:  A macro to access the i'th element of some array of
                   longwords and copy it to some destination.  If I is a
                   register, we can use indexed mode addressing; otherwise,
                   we need to copy I to a register first

                .MACRO  ELT, ARRAY, I, DEST
                .NTYPE  ITYPE, I                ; Sets ITYPE to operand
                                                ; specifier corresponding to I
                .IF     NE <ITYPE & ^XF0>-^X50  ; Mode is NOT 5 (register)
                PUSHL   R0                      ; Borrow R0 as the index
                MOVL    I, R0
                MOVL    ARRAY[R0], DEST
                MOVL    (SP)+, R0
                .IFF                            ; I is already a register
                MOVL    ARRAY[I], DEST
                .ENDC
                .ENDM   ELT
Copyright ©1999 - Russell C. Bjork