Software Design Notes                                    Closures


                           SECTION 12

                            Closures


12.1  INTRODUCTION

A closure is a functional  object  with  some  extra  information
attached to it.  In simplest terms, it is a function coupled with
some  representation  of the environment in whihc it was created.
The Explorer system supports  two  kinds  of  closures:   Dynamic
                                                          _______
Closures  which capture some part of the dynamic variable binding
________
environment in which they are created, and Lexical Closures which
                                           ________________
capture the surrounding lexical (textual) variable environment in
which the closure function was written.

Each of these closures is a separate primitive functional  object
having  a dedicated data type and a unique storage format for its
environment structure.  Each also  implies  a  different  set  of
support   operations   which  must  occur  when  the  closure  is
funcalled.  This section covers these topics.


12.2  DYNAMIC CLOSURES

Conceptually, a dynamic  closure  object  is  a  function  and  a
remembered   binding   environment.    The   remembered   binding
environment is necessary to resolve  possible  binding  conflicts
caused  by  free  references  to variables in functions passed as
arguments (funargs).  The implementation of dynamic  closures  on
the  Explorer  system  is  closely tied with the variable binding
method chosen for the system.

Some implementations of Lisp use deep binding, where hte  binding
stack  can  be thought of as an a-list of symbol/value pairs.  In
this case accessing a variable requires an ASSOC, and takes  time
proportional  to the number of bindings on the a-list.  In a deep
binding implementation, there is one global binding stack (or one
per process, in multi-processing systems).  Dynamic  closures  in
such  a  system can "close over" the entire environment simply by
saving a pointer to the current global binding  a-list  when  the
dynamic closure is created.

Explorer  system  Lisp,  like MacLisp, uses shallow binding.  The
goal of shallow binding is to keep the time needed  to  access  a
symbols  value  to  a  small  constant:  typically only that of a
single memory reference.  Conceptually, shallow  binding  can  be


                             12-1

Closures                                    Software Design Notes


thought  of  as  a  scheme where there is a stack associated with
each symbol which contains a history of  the  symbol's  bindings.
The  top  of  the  symbol's  binding stack (which can be accessed
quickly) always contains its current binding.

Shallow binding on the Explorer is a  slight  variation  on  this
conceptual  model.  Each symbol's value cell (called the symbol's
internal value cell since it is the  internal  physical  location
___________________
within  the  symbol  data  structure  itself) always contains the
current binding for the function being  run.   Previous  bindings
are  saved  on  the binding stack (the Special PDL or SPECPDL) by
the BIND primitive and are restored when the binding construct is
exited.  BIND does this by saving  the  actual  contents  of  the
symbol's  internal  value  cell  on  the Special PDL along with a
pointer (of type DTP-LOCATIVE) to the internal value cell.

Dynamic closure bunding is a special kind of binding which may be
shared; that is, other functions and dynamic closures in the same
binding environment as the CLOSURE function cna access and change
the dynamic closure's closed over  variables.   This  sharing  is
accomplished  by using a new data type called DTP-EXTERNAL-VALUE-
CELL-POINTER  (EVCP).   It  can  be  thought  of  as  a  kind  of
"environment  pointer,"  referring  to the symbol's value when it
was closure-bound.

Since bindings are associated with the symbol and not on a global
a-list, dynamic closure binding on the Explorer system  is  on  a
per-symbol basis.  The function CLOSURE takes two arguments:  the
first  argument  is a list of symbols (the symbols whose bindings
are to be saved), and the second is a function object (such as  a
LAMBDA  expression,  or  a compiled-code object).  First, CLOSURE
CDRs down its first argument, assuring that each of  the  symbols
has an external value cell.  Whenever it finds one which doesn't,
it allocates a word from free storage, places the contents of the
symbol's  internal  value  cell  into  the word, and replaces the
internal value cell with a DTP-EXTERNAL-VALUE-CELL-POINTER to the
word.  Then, CLOSURE allocates a block of 2*N+1 words of storage,
where N is the length of CLOSURE's first argument.  In the  first
word  of the block, CLOSURE stores its second argument.  Then for
each symbol in its first argument, it stores  a  pointer  to  the
internal  value  cell,  and a copy of the symbols EVCP.  Finally,
CLOSURE returns an object of datatype DTP-CLOSURE which points at
the block.  This is the dynamic closure itself.

A symbol's internal value cell contains an  EVCP  only  when  its
binding  is  being  remembered  by some dynamic closure; at other
times the internal value cell just contains the  symbol's  value.
The  presence of the EVCP when a variable is closure-bound allows
any  modifications  to  the  variable  by  functions  or  dynamic
closures  in  CLOSURE's  binding  environment  to  be seen by the
function closed over.  An EVCP is treated  in  the  usual  manner
(described  above)  by  the  BIND  and  UNBIND operations, but is
treated as an invisible pointer by SET  (the  primitive  function


                             12-2

Software Design Notes                                    Closures


for  updating  a  symbol's  value)  and by SYMEVAL (the primitive
function for accessing a symbol's value).  The word pointed to by
the EVCP is called the symbol's external value cell.   Thus,  SET
                                ___________________
and  SYMEVAL  access and modify the symbol's external value cell,
while BIND and UNBIND refer to the internal value cell.

When a CLOSURE is invoked as a function,  the  first  thing  that
happens  is that the saved environment is restored:  the contents
of the internal value cells of the current environment are  saved
on the binding PDL.  Then the EVCPs stored in the dynamic closure
are  placed  in  the symbols' internal value cells, restoring the
saved environment.  Finally, the function  is  invoked  with  the
arguments passed to the dynamic closure.


12.3  LEXICAL CLOSURES

While  a  dynamic  closure is a function and an environment which
captures a set of  dynamic  bindings,  a  lexical  closure  is  a
function  and  a  lexical  environment  which  captures  a set of
vairables shared between the  function  and  its  lexical  parent
(and/or  any  higher  lexical  grandparents).   A dynamic closure
takes a snapshot of the global, fluid dynamic binding environment
at the point in time that it is created.  The  environment  of  a
lexical  closure,  on  the  other  hand,  is  neither  fluid  nor
temporally changing.  A lexical environment is fully defined when
the functions are written.  Hence all environment  is  statically
determined at compilation time.

When  discussing lexical closures it is convenient to distinguish
between a closure function, which is the  functional  part  of  a
          ________________
lexical  closure,  and  a closure parent function, or just parent
                          _______________________          ______
function, which is a function inside whose  textual  context  the
________
lexical  closure  is defined.  Any time a function defined within
an outer (parent) function freely refers to a local variable name
or argument name belonging to its parent function,  the  function
is  considered  a  lexical  closure.  In the SILLY-ADDER function
shown in Table 12-1, SILLY-ADDER is the parent function, and  the
internal anonymous LAMBDA function is a lexical closure.


                             12-3

Closures                                    Software Design Notes


            Table 12-1  Some Lexical Closure Examples


         (DEFUN silly-adder (num)
            (funcall #'(lambda () (+ num 1))))

         (DEFUN make-summer ()
            (LET ((sum 0))
               #'(lambda (n) (INCF sum n))))

         (SETF (SYMBOL-FUNCTION 'add-to-sum) (make-summer))
         (add-to-sum 5) ==> 5
         (add-to-sum 2) ==> 7
         (add-to-sum 6) ==> 13


However,  such  extremely simple creations of lexical closures as
that in SILLY-ADDER can be completely "compiled out" and  reduced
to in-line code.  Although the anonymous LAMBDA function accesses
its parent's argument, the compiler can statically determine that
neither it nor its shared binding of NUM needs to be protected or
preserved  outside  of  the  execution environment of SILLY-ADDER
itself.  Hence, it can just compile into a simple (+ NUM 1).

On the other hand, the  MAKE-SUMMER  parent  function  creates  a
lexical closure which is returned as the value of a call to MAKE-
SUMMER,  so  the internal LAMBDA function cannot be compiled out.
Furthermore, it shares the variable SUM with its  lexical  parent
MAKE-SUMMER,  hence  it  is  a  lexical  closure.  Let's call the
closure returned by a call to MAKE-SUMMER ADD-TO-SUM.   Calls  to
ADD-TO-SUM  now  side-effect  the  SUM  variable  in  its lexical
parent.  The lexical closure ADD-TO-SUM, then, must contain  both
a   functional   object   to  perform  the  computation,  and  an
environment object which captures the referenced variables in its
textual scope.  The lexical closure  environment  data  structure
must   provide   links  back  to  the  defining  lexical  parent.
Moreover, the defining lexical parent MAKE-SUMMER must have a way
of keeping track of the closures  it  creates  (since  subsequent
calls  will create more closures).  Each such closure must have a
separate, protected lexical environment.  Hence stack  frames  of
lexical parent functions must have reserved slots for maintaining
information  about  their  offspring.   The  following discussion
covers the data structures for maintaining these links.


                              NOTE

          The implementation  of  lexical  closures  is
          quite  different  in  Explorer Release 3 from
          earlier releases.  The  following  discussion
          applies to Release 3 and later only.


                             12-4

Software Design Notes                                    Closures


12.3.1   Lexical  Closure  Implementation.   The  lexical closure
object is represented by a word of type DTP-LEXICAL-CLOSURE which
points to a cons cell whose car is the Function and whose cdr  is
a  locative  to the environment.  The layout of a lexical closure
can be seen in Figure 12-1.


                             12-5

Closures                                    Software Design Notes


           LEXICAL CLOSURE OBJECT
     +----+----------------------+------------------------+
     |    | DTP-Lexical-Closure  |           |            |
     +----+----------------------+-----------|------------+
                                             |
  +------------------------------------------+
  |
  |        LEXICAL CLOSURE STORAGE (A CONS CELL)
  |  +----+----------------------+------------------------+
  +->|Norm| DTP-Function         | Pointer to FEF storage |
     +----+----------------------+------------------------+
     |Err | DTP-Locative         |           |            |
     +----+----------------------+-----------|------------+
                                             |
  +------------------------------------------+
  |
  |        LEXICAL CLOSURE ENVIRONMENT
  |  +----+-----------------------------------------------+
  |  |Next|      Environment-Vectors-All-Vectors-Link     | -2
  |  +----+-----------------------------------------------+
  |  |Next|      Environment-Vector-Frame-Pointer         | -1
  |  +----+-----------------------------------------------+
  +->|Next|      Link to Parent Environment               |  0
     +----+-----------------------------------------------+
     |Next|      0th Lexical Variable Slot                |  1
     +----+-----------------------------------------------+
                            .
                            .
     +----+-----------------------------------------------+
     |Nil |      Last Lexical Variable Slot               |  N
     +----+-----------------------------------------------+

 Environment-Vectors-All-Vectors-Link -- Link used to thread
                                           together LEX-ALL-
                                           VECTORS (not needed
                                           if next one is NIL).
 Environment-Vector-Frame-Pointer     -- Locative to LEX-ALL-
                                           VECTORS slot of frame
                                           or NIL if already
                                           unshared.
 Link to Parent Environment           -- Link to parent
                                           environment or NIL
                                           if there is no
                                           parent.
 Lexical Variable Slots               -- EVCPs to the lexical
                                           variables on the stack
                                           or actual values if
                                           unshared.


             Figure 12-1  Lexical Closure Structure


                             12-6

Software Design Notes                                    Closures


12.3.2  Dedicated Locals.

The implementation of lexical closures makes use of  up  to  four
dedicated  local  variable  slots  the  function's  runtime stack
frame.  In all cases it is determinable at compile  time  exactly
which  of  these  locations  need  be dedicated/reserved within a
function.  Whenever a particular one (or more) of  these  is  not
                                                              ___
required,  its  location  is  available for general local storage
within  the  function.   These  locals,  if  present,  have   the
following uses:


  Offset    Name                    Description
  ---------------------------  -----------------------------
     2  LEX-PARENT-ENV-REG     Parent lexical environment
                                 pointer and primary lexical
                                 "base register"
     3  LEX-ENV-B-REG          Secondary lexical "base
                                 register"
     4  LEX-CURRENT-VECTOR-REG Current environment pointer
     5  LEX-ALL-VECTORS-REG    Head of environment list


Note  that local slot 0 is reserved for &REST args and local 1 to
hold the mapping table in combined flavor methods.  The  symbolic
names  are  in  the  SYSTEM  package  and  are  defined  in  file
SYS:UCODE;LROY-QCOM.

Compile time conditions for dedicating locals:

LEX-PARENT-ENV-REG   (Initialized by CALL microcode)

     Dedicated if function is a closure or if function contains
     a MAKE-LEXICAL-CLOSURE macroinstruction.

LEX-ENV-B-REG   (Initialized to NIL)

     Dedicated if compiler determines that a second lexical base
     register is needed.

LEX-ALL-VECTORS-REG   (Initialized to NIL)

     Dedicated if any MAKE-LEXICAL-CLOSURE macroinstruction is
     emitted which indicates a push onto the environment list.

LEX-CURRENT-VECTOR-REG   (Initialized to NIL)

     Dedicated if function contains a MAKE-LEXICAL-CLOSURE
     macro instruction.


                             12-7

Closures                                    Software Design Notes


12.3.3  Calling a Lexical Closure.

When a lexical closure is called, the microcode  will  store  the
new  environment  pointer  in  LEX-PARENT-ENV-REG  of  the called
function.  This allows efficient access to the  environment  from
within  the  function.   This  approach requires that all lexical
closures dedicate LEX-PARENT-ENV-REG as an  environment  pointer.
Furthermore,  any  function creating an environment must use LEX-
                            ________
PARENT-ENV-REG  as  an  environment  "parent"  pointer.    Simple
functions  (non closures) are called with no special treatment of
LEX-PARENT-ENV-REG in the called function.


12.3.4  Free Lexical References.

References  to  non-local  lexical  variables  must  specify  the
variable  being  referenced  and which higher lexical environment
the variable belongs to.  Such references can  be  compiled  into
one  of  the  general form instructions LOAD-FROM-HIGHER-CONTEXT,
STORE-IN-HIGHER-CONTEXT, or LOCATE-IN-HIGHER-CONTEXT  (returns  a
locative  to the variable).  Each of these takes from the stack a
context descriptor, which is a FIXNUM whose lower 13 bits are the
offset within a particular lexical environment.  This defines the
variable itself.  The next 12 bits specify which  environment  An
environment  value  of  0  represents the immediate parent (taken
from LEX-PARENT-ENV-REG); 1 represents the  grandparent,  and  so
forth.

The majority of non-local lexical references, however, are either
to  variables  in  the immediate lexical parent or in the lexical
grandparent.   Hence,  most  non-local  lexical  references   are
handled  more  efficiently  by  using MAIN-OP instructions with a
special base  register  field  which  specifies  "higher  lexical
context"  addressing  (register value 3).  The 6-bit offset field
of the instruction is then used  divided  into  a  5-bit  lexical
environment  offset and one bit which specifies the base register
(0 = LEX-PARENT-ENV-REG, 1 = LEX-ENV-B-REG) These base  registers
(actually  in the function's local block) are assumed to point to
valid  environment  structures.   The  5-bit  displacement  field
allows   reference   to   the  first  32  slots  of  the  lexical
environments pointed to by LEX-PARENT-ENV-REG  or  LEX-ENV-B-REG.
The  compiler  will  revert  to  the <access>-HIGHER-CONTEXT form
instruction only when the short form addresses  cannot  be  usedl
that  is, the lexical environment displacement is greater than 31
or when access to a third environment frame not covered by a base
register is required.

It is left up to  the  compiler  to  determine  which  particular
environment  pointers  will reside in LEX-PARENT-ENV-REG and LEX-
ENV-B-REG at given points in the code.  While it is unlikely that
the compiler would choose to have other than the  parent  pointer


                             12-8

Software Design Notes                                    Closures


reside  in  LEX-PARENT-ENV-REG;  it  is quite possible that other
than the  grandparent  pointer  might  reside  in  LEX-ENV-B-REG.
(Remember  that  LEX-PARENT-ENV-REG  must  point  to  the lexical
parent when  MAKE-LEXICAL-CLOSURE  is  executed.)   The  compiler
makes  the  choice  based  on  the  number  of references to each
lexical level.

Environment pointers are  obtained  by  the  instruction  LOCATE-
LEXICAL-ENVIRONMENT,  which  accepts  an  integer  specifying the
number of generations to go back (0 = parent,  1  =  grandparent,
etc.)   and returns a pointer to the requested environment object
on the PDL.  This instruction, of course, begins  its  search  at
LEX-PARENT-ENV-REG.  The simplest approach for the compiler would
be to allocate ELX-ENV-B-REG on a global basis, although for some
functions  local allocation might win.  The anticipated "typical"
code sequence on entry to a closure would be something like:

     LOCATE-LEXICAL-ENVIRONMENT  1    ;Grandparent
     POP   LOCAL|3                    ;LEX-ENV-B-REG
     .
     .
after which we could have:
     .
     .
     PUSH  LEX-A|7            ;7th slot of PARENT
     .
     .
     POP   LEX-B|4            ;4th slot of GRANDPARENT
     .

Should the compiler determine that a closure has no need of  LEX-
ENV-B-REG  (as  a  base  register) then it is free to use it as a
normal local slot.  The short form lexical reference instructions
reduce the size of the code (1 instruction vs.   2)  as  well  as
reudcing  the  time  required  to  access grandparent (or higher)
environments.  The use of the LEX-ENV-B-REG base register  allows
us to incur the lookup cost only once; thereafter references cost
the same as references to the parent environment.


12.3.5  Constructing Lexical Closures.

Lexical  closure  objects are constructed using the MAKE-LEXICAL-
CLOSURE   or   MAKE-EPHEMERAL-LEXICAL-CLOSURE   macroinstruction.
These  instructions  take  two  arguments  from  the  stack:  the
environment descriptor list and the FEF  pointer  to  use.   They
___________________________
return  (on  the  stack)  a  pointer  to  the constructed lexical
closure object.   These  instructions  also  use  three  implicit
arguments,  found  in  the  invoking  function's  local slots, as
follows:

    1. LEX-CURRENT-VECTOR-REG  is  accessed  to  retrieve  the
       pointer  to  the  lexical environment to be used in the


                             12-9

Closures                                    Software Design Notes


       constructed lexical closure.  If LEX-CURRENT-VECTOR-REG
       contains NIL, then it is necessary to construct  a  new
       environment  object  according  to  the descriptor list
       mentioned above.  In this case the pointer to  the  new
       environment  object  is stored into LEX-CURRENT-VECTOR-
       REG as a side effect of the macro instruction.

    2. LEX-ALL-VECTORS-REG is  used  to  maintain  a  list  of
       environment   objects   constructed   by   the  current
       function.  See below for details.

    3. LEX-PARENT-ENV-REG  is  used  as  the  lexical   parent
       pointer  should  it  be  necessary  to  construct a new
       environment object.

The  two  instructions  differ  in  whether  the  (possibly  new)
environment  object  will  be pushed onto the environment list at
LEX-ALL-VECTORS-REG.  For MAKE-LEXICAL-CLOSURE,  the  environment
object  (whether  new or old) is pushed onto the list at LEX-ALL-
VECTORS-REG, provided that it is not already the first element of
the list.  For MAKE-EPHEMERAL-LEXICAL-CLOSURE, no attempt is made
to push the environment onto the LEX-ALL-VECTORS-REG list.

The decision as to whether an  environment  object  needs  to  be
added  to  the  environment  list  is made by the compiler and is
based on whether the closure being created can  possibly  outlive
the current function's stack frame.


12.3.6  Lexical Environments.

Lexical  environment  objects are constructed as a side effect of
the MAKE-LEXICAL-CLOSURE macroinstruction.   Environment  objects
are  always  consed in the heap.  Each object is a cdr-coded list
containing a pointer to its parent environment followed  by  some
number  of slots which may contain either the value of a variable
or an indirect pointer (EVCP) to the variable's value cell.

Value entries in a freshly constructed environment correspond  to
locals/args  of  the current function which are freely referenced
and which the compiler has determined are "read only", i.e., they
are never stored into.  In this case each constructed environment
gets its own copy of the value of the local/arg rather  than  the
                         ____
EVCP used in the general case.


12.3.7  Environment Descriptor List.

The  environment  descriptor  list  is  used  as  a template when
constructing an environment object.  The  first  element  of  the
list  is  the  number  of slots to be created, and each remaining
element is a FIXNUM which defines one lexical variable  slot,  in
the format shown in Figure 12-2.


                            12-10

Software Design Notes                                    Closures


           2 2 2
      bit  4 3 2                               6 5         0
          +-+-+---------------------------------+-----------+
          |A|V|0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0| offset    |
          +-+-+---------------------------------+-----------+
          /   \
       arg     value

          Figure 12-2  Lexical Variable Slot Descriptor


The  sign  bit  is  0 for a local variable, or 1 for an argument.
The next bit is the value/reference flag; when it is 0, the  slot
will hold a DTP-EXTERNAL-VALUE-CELL-POINTER to the variable; when
1,  the  value  of  the  variable  is  copied  directly  into the
environment slot.  If the number of slots  is  greater  than  the
number  of of slot descriptors provided, then the remaining slots
are initialized with a value of NIL and do not correspond to  any
local  variables; the compiler uses such slots as value cells for
lexical variables which are in excess of the maximum number of 64
that can be allocated in the stack.


12.3.8  Managing Environments.

We have already discussed the creation of new environments in our
discussion of the operation of MAKE-LEXICAL-CLOSURE with  respect
to  LEX-CURRENT-VECTOR-REG, althoguh we did not indicate how LEX-
                                                         ___
CURRENT-VECTOR-REG  might  ever  be  set  back  to  NIL  once  an
environment  had  been constructed.  The motivation for forcing a
new  environment  object  to   be   constructed   is   to   allow
___
representation  of environments containing different instances of
selected variables.  To accomplish this also requires the use  of
the LEXICAL-UNSHARE instruction.

The LEXICAL-UNSHARE instruction creates an instance of a variable
by  copying  its  value  (from  the stack) into the heap and then
relocating all references (from environments in  the  environment
list)  to  point to the heap allocated value.  References to this
variable in subsequent environments will point (via  EVCPs)  into
            __________
the stack, thus referencing a different instance of the variable.
There is also a LEXICAL-UNSHARE-ALL instruction which can be used
to unshare all of the local variables at once.

Thus  we  need  to  set LEX-CURRENT-VECTOR-REG to NIL whenever we
have reason to unshare any of the variables of  the  environment.
This  is  done  implicitly  by  the  LEXICAL-UNSHARE and LEXICAL-
UNSHARE-ALL microcode.

The UNSHARE instructions will access  the  environment  list  via
LEX-ALL-VECTORS-REG.   This list is also accessed implicitly upon
RETURNing from a frame when  the  environment pointer points here
                                  _______________________________


                            12-11

Closures                                    Software Design Notes


(EPPH)  flag  is set in the frame's Call-Info word.  This flag is
set  only  when  an  environment  object  is  pushed   onto   the
environment  list  at  LEX-ALL-VECTORS-REG.   The  microcode must
perform an UNSHARE-ALL operation when this flag is set.

COMPILER NOTE:  Locals/args which are determined to be read  only
need  not  be  unshared since copies of the values will have been
distributed at the time the environment was constructed.


                            12-12