Software Design Notes
Storage Subprimitives

SECTION 15

Storage Subprimitives

15.1 STORAGE SUBPRIMITIVES

NOTE: Unless otherwise noted, all symbols and functions named in this section are in the SYSTEM package. The byte specifiers can be found in SYS:UCODE;LROY-QCOM. Most of the subprimitives are microcoded MISC-OPs defined in SYS:UCODE;DEFOP. Some that are Lisp-coded are defined in SYS:KERNEL;STORAGE-MACROS and SYS:KERNEL;STORAGE-INTERNALS.

Subprimitives are functions that are intended for use only by system programs, not by the average program. They allow one to manipulate the environment at a level lower than normal Lisp. Many primitive Lisp functions are implemented on the Explorer system using subprimitives to accomplish their work. In this sense, subprimitives often take the place of what would be individual machine instructions or low-level subroutines on other systems. In fact, many subprimitive functions are simple Lisp-callable interfaces to miscellaneous operation macro instructions (miscops) written in Explorer microcode. Others are written in Lisp, but violate normal system storage conventions in order to achieve efficiency.

Subprimitives by their very nature cannot do full checking. Improper use of subprimitives can destroy the environment. They vary in how dangerous they are. Generally, those whose names begin with % can ruin the Lisp world just as readily as they can do something useful. Subprimitives without a % in their names are usually not directly dangerous.

This section describes subprimitives that manipulate storage; these are among the most dangerous ones. The next section describes other subprimitives and low-level system variables.

NOTE: Because they are used at the lowest level of the Explorer system implementation, subprimitives may change in the way they function or even be removed entirely from the system without notice.
Most programs should therefore not use subprimitives directly.

15.2 DANGERS OF SUBPRIMITIVES

The most common problem you can cause using subprimitives is the creation of illegal pointers: pointers that are not allowed to exist in the machine state according to storage conventions. If you create such an illegal pointer, one of several things may happen. It might cause a system crash or a bad data type trap immediately. Or it might not be detected until much later, when another part of the system (most likely the garbage collector) sees it, notices that it is illegal, and halts the machine.

Subprimitives can also be used to alter the contents of nearly any location in memory, causing unpredictable results if the memory location is part of a sensitive system data structure. In this context even CAR, CDR, RPLACA, and RPLACD can be considered subprimitive functions. If they are given a locative instead of a list, they access or modify the addressed cell without regard to any object that may contain the cell.

The pointer-manipulating subprimitives (in general, the ones beginning with %P-) are the most likely to cause problems. These primitives can be very powerful when used properly, but can be very dangerous. There are strict conventions that dictate how these primitives should be used. Most of these conventions have to do with the garbage collector, and some newly imposed restrictions stem from the temporal garbage collection algorithm (TGC). Some of these storage conventions are described here, along with the subprimitives themselves, and some are discussed in the sections on Internal Storage Formats and Garbage Collection.

NOTE: Extreme caution should be exercised when using any of the pointer-manipulating functions.
At a minimum, before using them you should fully understand the notion of boxed versus unboxed storage, the role of primitive data types (both of these are covered in the Internal Storage Formats section), and the TGC-imposed restrictions on subprimitive use. If you have any questions regarding the correct use of the functions, please contact a Texas Instruments systems analyst.

NOTE: Unless indicated otherwise, all the subprimitives listed below are in the SYSTEM package.

15.3 STORAGE LAYOUT DEFINITIONS

The following special variables have values that define the most important attributes of the way Lisp data structures are laid out in storage. In addition to the variables documented here, there are many others that are more specialized. Variables whose names start with %% are byte specifiers; those beginning with % are numeric constants.

%%q-cdr-code    Constant
The field of a boxed memory word that contains the cdr-code.

%%q-data-type    Constant
The field of a boxed memory word that contains the data type code.

%%q-pointer    Constant
The field of a boxed memory word that contains the object address or immediate data.

%%q-pointer-within-page    Constant
The field of a boxed memory word that contains the part of the address that lies within a single page.

%%q-typed-pointer    Constant
The concatenation of the %%Q-DATA-TYPE and %%Q-POINTER fields.

%%q-all-but-typed-pointer    Constant
This is now synonymous with %%Q-CDR-CODE, and therefore obsolete.

%%q-all-but-pointer    Constant
The concatenation of all fields of a memory word except for %%Q-POINTER.

%%q-all-but-cdr-code    Constant
The concatenation of all fields of a memory word except for %%Q-CDR-CODE.

%%q-high-half    Constant
%%q-low-half    Constant
The halves of a memory word. These fields are generally only useful for storing into unboxed memory locations.

cdr-normal    Constant
cdr-next    Constant
cdr-nil    Constant
cdr-error    Constant
The values of these four variables are the numeric values that go in the cdr-code field of a memory word.

15.4 DATA TYPES

q-data-types
The value of Q-DATA-TYPES is a list of all of the symbolic names for data types. These are the symbols whose print names begin with "DTP-". The values of these symbols are the internal numeric data type codes for the various internal data types. The section on Internal Storage Formats contains a list of all internal data types along with a description of each.

%data-type (x)
Returns the data type field of x, as a FIXNUM.

q-data-types (type-code)
Given the internal numeric data type code, returns the corresponding symbolic name. This "function" is actually an array.

data-type (arg)
DATA-TYPE returns a symbol that is the name for the internal data type of arg. The TYPE-OF function is a high-level primitive that is more useful in most cases; normal programs should always use TYPE-OF (or, when appropriate, TYPEP) rather than DATA-TYPE.

Note that some types as seen by the user are not distinguished from each other at this level, and some user types may be represented by more than one internal type. For example, DTP-EXTENDED-NUMBER is the symbol that DATA-TYPE would return for either a double-float or a BIGNUM, even though those two types are quite different.

Some of these type codes occur in memory words but cannot be the type of an actual Lisp object. These include header types such as DTP-SYMBOL-HEADER, which identify the first word of a structure, and forwarding or "invisible" pointer types such as DTP-ONE-Q-FORWARD.

15.5 POINTER MANIPULATION

%pointer (x)
Returns the pointer field of x, as a FIXNUM. For most types, this is dangerous since the garbage collector can copy the object and change its address.
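
As an illustration of what %DATA-TYPE and %POINTER extract, the following portable Common Lisp sketch models a 32-bit boxed word as an integer and pulls its fields apart with LDB. It is not Explorer code: the MODEL- names are hypothetical, the data type code used is arbitrary, and the bit positions (pointer in bits 0-24, data type in bits 25-29, cdr-code in bits 30-31) are an assumption consistent with the 25-bit pointer, 5-bit data type, and 2-bit cdr-code widths mentioned elsewhere in this section.

```lisp
;;; Portable Common Lisp model of a boxed word. NOT Explorer microcode.
;;; ASSUMPTION: pointer = bits 0-24, data type = bits 25-29, cdr-code = bits 30-31.
(defparameter *model-q-pointer*       (byte 25 0))   ; stands in for %%Q-POINTER
(defparameter *model-q-data-type*     (byte 5 25))   ; stands in for %%Q-DATA-TYPE
(defparameter *model-q-cdr-code*      (byte 2 30))   ; stands in for %%Q-CDR-CODE
(defparameter *model-q-typed-pointer* (byte 30 0))   ; data type + pointer together

(defun model-make-word (cdr-code data-type pointer)
  "Pack the three fields into one 32-bit integer."
  (dpb cdr-code *model-q-cdr-code*
       (dpb data-type *model-q-data-type*
            (ldb *model-q-pointer* pointer))))

(defun model-data-type (word)
  "Model of %DATA-TYPE: the data type field as a non-negative integer."
  (ldb *model-q-data-type* word))

(defun model-pointer (word)
  "Model of %POINTER: the raw 25-bit pointer field."
  (ldb *model-q-pointer* word))

(defun model-cdr-code (word)
  "The cdr-code field of WORD."
  (ldb *model-q-cdr-code* word))
```

For example, (MODEL-MAKE-WORD 2 5 #o1234) builds a word whose three fields read back unchanged through the accessors above. The model treats the pointer field as unsigned; how a real FIXNUM result is interpreted is covered under Pointer Arithmetic.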

%make-pointer (data-type pointer)
%make-pointer-offset (data-type pointer offset)
These functions construct an object (pointer) with the specified DATA-TYPE field and with pointer field POINTER (or POINTER plus OFFSET). DATA-TYPE should be an internal numeric data type code: the value of one of the DTP- symbols. POINTER may be any object; its pointer field is used. In the case of %MAKE-POINTER-OFFSET, OFFSET may also be any object (just its pointer field is used) but is usually a FIXNUM. The types of the arguments are not checked; their pointer fields are simply added together. This is useful for constructing locative pointers into the middle of an object. However, remember that it is illegal to have a pointer to untyped data, such as the inside of a FEF or a numeric array.

The resulting object is returned, which means it is subjected to the Write Barrier, so it must be a valid Lisp object. Hence DATA-TYPE must not be a data type that is illegal "in the machine" (such as DTP-NULL or most of the invisible forwarding types).

%MAKE-POINTER and %MAKE-POINTER-OFFSET are extremely dangerous functions. They can be used to create objects with illegal data types or pointers to nonexistent memory (in which case the machine will crash immediately). More subtly, they can create objects with valid data types and pointer fields, but which do not point at valid structures (an array object pointing to a symbol header, for example, or into the middle of the unboxed portion of a FEF). Any of these can cause severe or fatal problems for the low-level system and for the garbage collector.

Note that it is always safe to create a FIXNUM data object with either of these two functions. It is creating pointer-type objects that can be dangerous.

%pointerp (object)
T if object points to storage. For example, (%POINTERP "foo") is T because "foo" is an array, which points to extended storage. (%POINTERP 5) is NIL since 5 is a FIXNUM.

%pointer-type-p (data-type)
T if the specified DATA-TYPE is one which points to storage. For example, (%POINTER-TYPE-P DTP-Fix) returns NIL.

%stack-frame-pointer
Returns a locative pointer to its caller's stack frame. This function is not defined in the interpreted Lisp environment: it only works in compiled code. Since it turns into a miscop instruction, the "caller's stack frame" really means "the frame for the FEF that executed the %STACK-FRAME-POINTER instruction."

15.6 POINTER ARITHMETIC

Addition, subtraction, and comparison of object addresses should not be done with normal Lisp arithmetic functions, since the largest positive Lisp FIXNUM fits in 24 bits while high virtual addresses can use the 25th bit. The Lisp functions thus treat a FIXNUM with the 25th bit set as a negative number, whereas address arithmetic treats it as a large positive number.

Furthermore, with Lisp functions like +, a BIGNUM is created if two 24-bit FIXNUM values added together overflow the 24-bit field. The pointer field of the resulting BIGNUM object is, of course, an object reference (probably pointing to the EXTRA-PDL area or into WORKING-STORAGE somewhere where the actual BIGNUM storage exists). Address arithmetic, on the other hand, requires that a 25-bit FIXNUM with the high bit set be returned when two such 24-bit values are added.

The following primitives should always be used to do pointer arithmetic. Bad pointer arithmetic is a frequent cause of bugs in system code. However, it should be remembered that pointer arithmetic can itself be dangerous, depending on the context, even when the right functions are used. For example, if you use %POINTER-PLUS to generate an address inside of an object, you must make sure that the address thus generated cannot change (due to garbage collection) while you are using it.
This can be done by protecting small sections of critical code with a WITHOUT-INTERRUPTS form or larger sections of code with an INHIBIT-GC-FLIPS form. Either guarantees that the object's address will not change inside its body, by preventing a flip from occurring.

%pointer-difference (ptr1 ptr2)
Returns a FIXNUM which is PTR1 minus PTR2. No type checks are made. For the result to be meaningful, the two pointers must point into the same object, so that their difference cannot change as a result of garbage collection.

%pointer-plus (ptr1 ptr2)
Adds PTR1 and PTR2 together using address arithmetic; the result is always a FIXNUM suitable for use by the %P- routines (if, of course, it represents a valid address). The data type fields of PTR1 and PTR2 are ignored, but for this to be meaningful one is usually an object and the other is a FIXNUM meant as an offset.

%pointer= (ptr1 ptr2)
T if the pointer fields of PTR1 and PTR2 are the same. This is like EQ except that data types are ignored.

%pointer> (ptr1 ptr2)
T if PTR1 has a strictly higher virtual memory address than PTR2. Data type fields are ignored.

%pointer>= (ptr1 ptr2)
T if PTR1 has a higher virtual memory address than PTR2 or the same. Data type fields are ignored.

%pointer< (ptr1 ptr2)
T if PTR1 has a strictly lower virtual memory address than PTR2. Data type fields are ignored.

%pointer<= (ptr1 ptr2)
T if PTR1 has a lower virtual memory address than PTR2 or the same. Data type fields are ignored.

convert-to-unsigned (num)
Returns a number that corresponds to NUM (usually a FIXNUM) considered as a 25-bit unsigned quantity. Thus if NUM is a positive FIXNUM, it is just returned. If it is a negative FIXNUM, a BIGNUM is returned.

convert-to-signed (num)
Returns a number that corresponds to NUM (usually a BIGNUM) considered as a 25-bit signed quantity. This is the reverse of CONVERT-TO-UNSIGNED.
The identity (CONVERT-TO-SIGNED (CONVERT-TO-UNSIGNED num)) = num always holds if NUM is a FIXNUM.

%logdpb (value ppss word)
A FIXNUMs-only form of DPB. The low-order SS bits of VALUE replace the PPSS field of WORD. Always returns a FIXNUM. Does not complain about loading or clobbering the sign bit. Data type fields of VALUE and WORD are ignored.

%logldb (ppss word)
A FIXNUMs-only form of LDB. Returns a FIXNUM obtained from the uninterpreted 32 bits of WORD. The result may be negative if the field width is 25. Signals the ARGTYP error if PPSS is not a FIXNUM or if it specifies a field greater than 25 bits wide.

15.7 FORWARDING

An invisible pointer or forwarding pointer is a kind of pointer that does not represent a Lisp object, but just resides in memory. There are several kinds of invisible pointers, and there are various rules about where they may or may not appear. Those are detailed in the sections on Internal Storage Formats and Garbage Collection.

The basic property of an invisible pointer is that if the Explorer system reads a word of memory and finds an invisible pointer there, instead of seeing the invisible pointer as the result of the read, it does a second read at the location addressed by the invisible pointer, and returns that as the result instead. Writing behaves in a similar fashion. When the Explorer system writes a word of memory, it first checks to see if that word contains an invisible pointer; if so, it goes to the location pointed to by the invisible pointer and tries to write there instead.

Many subprimitives that read and write memory do not do this checking, and hence violate the normal rules about handling internal storage formats (they do this because they are used in the low-level system routines that implement the storage formatting).

The simplest kind of invisible pointer has the data type code DTP-ONE-Q-FORWARD. It is used to forward a single word of memory to some place else.
The invisible pointers with data types DTP-HEADER-FORWARD and DTP-BODY-FORWARD are used for moving whole Lisp objects (such as cons cells or arrays) somewhere else.

The DTP-EXTERNAL-VALUE-CELL-POINTER is similar to DTP-ONE-Q-FORWARD: the difference is that it is not "invisible" to the operation of binding. If the (internal) value cell of a symbol contains a DTP-EXTERNAL-VALUE-CELL-POINTER that points to some other word (the external value cell), then SYMEVAL or SET operations on the symbol consider the pointer to be invisible and use the external value cell. The operation of binding the symbol, however, saves away the DTP-EXTERNAL-VALUE-CELL-POINTER itself, and stores the new value into the internal value cell of the symbol. This is how closures are implemented.

The other forwarding pointer types, DTP-GC-FORWARD and DTP-GC-YOUNG-POINTER, are not the same kind of forwarding pointer. They can never be seen by any program other than the garbage collector.

structure-forward (old-object new-object)
This causes references to OLD-OBJECT to be forwarded to NEW-OBJECT. It does this by creating new storage, copying the contents of OLD-OBJECT into it, then storing invisible DTP-HEADER-FORWARDs and DTP-BODY-FORWARDs into OLD-OBJECT. It returns OLD-OBJECT.

An example of the use of STRUCTURE-FORWARD is ADJUST-ARRAY. If the array is being made bigger and cannot be expanded in place, a new array is allocated, the contents are copied, and the old array is structure-forwarded to the new one. This forwarding ensures that pointers to the old array, or to cells within it, continue to work. When the garbage collector goes to copy the old array, it notices the forwarding and uses the new array as the copy; thus the overhead of forwarding disappears eventually if garbage collection is in use.
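
The read and write behavior of invisible pointers described in this subsection can be sketched with a small portable Common Lisp model. This is only an illustration under stated assumptions, not the microcode's algorithm: memory is a simple vector, a hypothetical MODEL-FORWARD structure stands in for a forwarding word such as DTP-ONE-Q-FORWARD, and all MODEL- names are invented for the example.

```lisp
;;; Portable Common Lisp model of forwarding. NOT Explorer code.
;;; A cell holding a MODEL-FORWARD stands in for an invisible pointer.
(defstruct (model-forward (:constructor make-model-forward (target)))
  target)                               ; index of the cell forwarded to

(defun model-follow (memory address)
  "Follow the chain of forwardings from ADDRESS; return the final address.
No cycle detection is attempted in this sketch."
  (loop for cell = (aref memory address)
        while (model-forward-p cell)
        do (setf address (model-forward-target cell))
        finally (return address)))

(defun model-read (memory address)
  "A read that sees through forwarding: returns the contents of the final cell."
  (aref memory (model-follow memory address)))

(defun model-write (memory address value)
  "A write that sees through forwarding: stores into the final cell."
  (setf (aref memory (model-follow memory address)) value))
```

With memory set up as (VECTOR 10 (MAKE-MODEL-FORWARD 3) 20 30), reading cell 1 yields the contents of cell 3, and writing cell 1 changes cell 3 while leaving the forwarding marker in cell 1 intact. Subprimitives that skip this checking correspond to using AREF on the vector directly.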

follow-structure-forwarding (object)
Normally just returns OBJECT, but if OBJECT has been structure-forwarded, returns the object at the end of the chain of forwardings instead. If OBJECT is not exactly an object, but a locative to a cell in the middle of an object, a locative to the corresponding cell in the latest copy of the object is returned.

forward-value-cell (from-symbol to-symbol)
This alters FROM-SYMBOL so that it always has the same value as TO-SYMBOL, by sharing its value cell. A DTP-ONE-Q-FORWARD invisible pointer is stored into FROM-SYMBOL's value cell. Do not do this while FROM-SYMBOL's current dynamic binding is not global, as the microcode does not bother to check for that case and something bad will happen when FROM-SYMBOL's binding is unbound. The microcode check is omitted to speed up binding and unbinding. This is how synonymous variables (such as *TERMINAL-IO* and TERMINAL-IO) are created.

To forward one arbitrary cell to another (rather than specifically one value cell to another), given two locatives, do

    (%P-STORE-DATA-TYPE-AND-POINTER from-location DTP-One-Q-Forward to-location)

follow-cell-forwarding (loc evcp-p)
LOC is a locative to a cell. Normally LOC is returned, but if the cell has been forwarded, this follows the chain of forwardings and returns a locative to the final cell. If the cell is part of a structure which has been forwarded, the chain of structure forwardings is followed, too. If EVCP-P is true, EVCPs are followed; if it is NIL they are not.

15.8 ANALYZING STRUCTURES

%find-structure-header (pointer)
Finds the structure into which POINTER points and returns an object reference to it. For example, given a locative into an ART-Q array, %FIND-STRUCTURE-HEADER would return the array. It works by searching backward in memory for a header, so it is illegal to give it a pointer into unboxed storage. It is a basic low-level function used by such things as the garbage collector.
The data type of POINTER is ignored, but is generally a locative or a FIXNUM. Structure forwarding is not followed.

In structure space, the "containing structure" of a pointer is well-defined by the object's header. In list space, however, the containing structure is considered to be the contiguous, cdr-coded segment of list surrounding the location pointed to. Hence, it will return the length of an entire cdr-coded list segment if given a pointer anywhere into that segment, and will return 2 if given a pointer to either the CAR or the CDR of any simple CONS cell. If a CONS of a cdr-coded list has been copied out by RPLACD, the contiguous list measured includes that CONS cell pair (so actually may be one greater than the contiguous segment).

find-structure-header (pointer)
Identical to %FIND-STRUCTURE-HEADER but first follows any structure-forwarding markers between what POINTER points to and the eventual actual object storage.

%find-structure-leader (pointer)
Identical to %FIND-STRUCTURE-HEADER except that if the structure is an array with a leader, this returns a locative pointer to the leader-header, rather than returning the array-pointer itself. Thus the result of %FIND-STRUCTURE-LEADER is always the lowest address in the structure. This is the one used internally by the garbage collector. Structure forwarding is not followed.

find-structure-leader (pointer)
Identical to %FIND-STRUCTURE-LEADER but first follows any structure-forwarding markers between what POINTER points to and the eventual actual object storage.

%find-structure-header-safe (ptr)
%find-structure-leader-safe (ptr)
Just like %FIND-STRUCTURE-HEADER and %FIND-STRUCTURE-LEADER but safe (and slow). Each returns NIL if PTR is not valid virtual memory; otherwise it returns the object (or a locative to the array leader). These are safe because they parse memory forward from the region origin. They will FERROR if anything strange is found in the parsing.
These even do intelligent things when given Oldspace and Copyspace addresses. See the documentation string for more details. They are used for debugging only; they are too slow for system code.

%structure-boxed-size (object)
Returns the number of "boxed Q's" in OBJECT. This is the number of words at the front of the structure which contain Lisp objects. Some structures, for example FEFs and numeric arrays, contain additional "unboxed Q's" following their boxed Q's. Note that the boxed size of a PDL (either regular or special) does not include Q's above the current top of the PDL. Those locations are technically boxed, since they are part of a Q-type array, but their contents are considered unboxed garbage and are not looked at by the garbage collector. OBJECT's data type is not checked, but it had better point to valid, boxed storage. Structure forwarding is not followed.

%STRUCTURE-BOXED-SIZE is used internally by the garbage collector to get a count of the number of words in a structure that need to be scavenged. This number is sometimes different from the real number of legitimate boxed words in an object. For example, it never counts forwarded body words as boxed (although forwarded leader and header words are always counted as boxed). This is because after scavenging the leader and the header of a forwarded structure, all the work is done. Also, it returns 0 for any BIGNUM, because the BIGNUM header word technically does not have to be scavenged since its pointer field just contains flag bits.

%structure-total-size (object)
Returns the total number of words occupied by the representation of OBJECT, including all boxed and unboxed words (and garbage past the top-of-stack pointers of PDLs). OBJECT's data type is not checked, but it had better point to valid, boxed storage.

structure-boxed-size (object)
Same as %STRUCTURE-BOXED-SIZE but first follows any structure-forwarding markers between what OBJECT points to and the eventual actual object storage.

structure-total-size (object)
Same as %STRUCTURE-TOTAL-SIZE but first follows any structure-forwarding markers between what OBJECT points to and the eventual actual object storage.

%structure-size-safe (ptr &optional (include-leader t))
A safe way to find the real size of an object. Returns NIL if PTR is not valid virtual memory. PTR should point at a structure header in structure space (but it just FERRORs if not). Returns four values: the total size in words, the number of boxed words, the space type of PTR (:NEW, :COPY, or :OLD), and a flag which, if T, means it is a forwarded structure (RPLACD-forwarded in list space; structure-forwarded in structure space). Used only for debugging; too slow for system code.

dump-memory (ptr &key num-objects)
dump-objects-in-region (region &key (start-offset 0))
Dumps a brief representation of all objects starting either at address PTR (for DUMP-OBJECTS) or at the start of region REGION plus :START-OFFSET (for DUMP-OBJECTS-IN-REGION). Dumping continues until the end of a region is reached or until the :NUM-OBJECTS specified runs out. More keywords are available: check the documentation strings.

15.9 CREATING OBJECTS

%allocate-and-initialize (data-type header-type header second-word area size)
This is the subprimitive for creating most structure-type objects. AREA is the area in which it is to be created, as a FIXNUM or a symbol. SIZE is the number of words to be allocated. The value returned points to the first word allocated and has data type DATA-TYPE. The words allocated are initialized with interrupts disallowed so that storage conventions are preserved at all times. The first word, the header, is initialized to have HEADER-TYPE in its data-type field and HEADER in its pointer field.
The second word is initialized to SECOND-WORD. The remaining words are initialized to NIL. The cdr-codes of all words except the last are set to CDR-NEXT; the cdr-code of the last word is set to CDR-NIL. Note that programs should generally not rely on the cdr-code field of non-CONS cells being in a known state.

%allocate-and-initialize-array (header data-length leader-length area size)
This is the subprimitive for creating arrays, called only by MAKE-ARRAY. It differs from %ALLOCATE-AND-INITIALIZE because arrays have a more complicated header structure.

%allocate-and-initialize-instance (header area size)
Allocates storage for an instance, sets the header type to DTP-INSTANCE-HEADER, and sets the data type to DTP-INSTANCE. Fills the allocated space with NIL and places HEADER in word 0.

The basic functions for creating list-type objects are CONS and MAKE-LIST; no special subprimitive is needed.

15.10 RETURNING STORAGE

return-storage (object &optional force-p)
With the advent of Temporal Garbage Collection (TGC), the RETURN-STORAGE facility becomes less useful as well as more difficult to implement without breaking the garbage collector. Hence RETURN-STORAGE now does nothing (and returns NIL) in the normal case. The normal case is when TGC is enabled (the value of %TGC-ENABLED is true, which is the default) and the optional hidden FORCE-P argument to RETURN-STORAGE is NIL (the default).

When either the FORCE-P argument is true or TGC is disabled, this function attempts to return OBJECT in order to free storage. If it is a displaced array, it returns the displaced array itself, not the data that the array points to. RETURN-STORAGE does nothing if the object is not at the end of its region (i.e., if it was neither the most recently allocated non-list object in its region, nor the most recently allocated list in its region).

If you still have any references to OBJECT anywhere in the Lisp world after this function returns, the garbage collector can get a fatal error if it sees them. Since the form that calls this function must get the object from somewhere, it may not be clear how to call RETURN-STORAGE legally. One of the only ways to do it is as follows:

    (DEFUN func ()
      (LET ((object (MAKE-ARRAY 100)))
        ...
        (RETURN-STORAGE (PROG1 object (SETQ object nil)))))

so that the variable OBJECT does not refer to the object when RETURN-STORAGE is called. Alternatively, you can free the object and get rid of all pointers to it while interrupts are turned off with WITHOUT-INTERRUPTS.

If RETURN-STORAGE is forced when TGC is enabled, and if the object being returned ever contained pointers to younger objects, those younger objects cannot be collected by the garbage collector until a full, promoting garbage collection is performed. For this reason, it is recommended that code force NILs into all boxed slots of a data structure being returned before using RETURN-STORAGE.

15.11 COPYING DATA

%BLT and %BLT-TYPED are subprimitives for copying blocks of data, word aligned, from one place in memory to another with little or no type checking. The acronym BLT is short for BLock Transfer.

%blt (from to count increment)
%blt-typed (from to count increment)
Copies COUNT words, separated by INCREMENT. The word at address FROM is moved to address TO; the word at address FROM+INCREMENT is moved to address TO+INCREMENT, and so on until COUNT words have been moved.

Only the pointer fields of FROM and TO are significant; they may be locatives or even FIXNUMs. If one of them must point to the unboxed data in the middle of a structure, you must make it a FIXNUM, and you must do so with interrupts disabled, or else garbage collection could move the structure after you have already created the FIXNUM.

%BLT-TYPED assumes that each copied word contains a data type field and that every destination location is boxed memory. That is, %BLT-TYPED subjects its data to both the Read Barrier and the Write Barrier. Use %BLT-TYPED when copying FROM boxed storage TO boxed storage.

%BLT assumes that all source and destination locations are unboxed memory. That is, it subjects its data to neither the Read Barrier nor the Write Barrier. Use %BLT when copying FROM unboxed storage TO unboxed storage.

Copying FROM boxed storage TO unboxed storage must be handled specially. See the subsection on TGC Subprimitive Semantics.

It is actually safe to use either %BLT or %BLT-TYPED on data which is formatted with data types (boxed) but whose contents do not now point to storage and never have. This includes words whose contents are always FIXNUMs or short floats, and also words which contain array headers, array leader headers, or FEF headers. Whether or not such words go through the Read Barrier and Write Barrier makes no difference, since there is no special processing for these types in either one.

15.12 SPECIAL MEMORY REFERENCING

This section describes the most dangerous storage management subprimitives. They are dangerous for several different reasons. First, since they read a virtual memory address specified by a pointer field, they can cause a crash if given an invalid virtual address (for example, one which used to contain an object but is now free because of garbage collection). Others are also dangerous because they allow the manipulation of the various fields of memory words, and often circumvent the Read Barrier/Write Barrier processing that normally occurs.

In addition to these dangers, which have always existed, there are some new problems that use of these subprimitives can cause due to the TGC algorithm (described in the section on Garbage Collection). A later subsection explains these new restrictions.
They have to do with which primitives must be used on boxed data and which are safe to use on unboxed data, so pay special attention to that part of the descriptions below.

15.12.1 Subprimitives for Boxed Memory Referencing.

%p-pointerp (location)
T if the contents of the word at LOCATION point to storage. This is similar to (%POINTERP (CONTENTS location)), but the latter may get an error if LOCATION contains a forwarding pointer, a header type, or a void marker. In such cases, %P-POINTERP correctly tells you whether the header or forwarding pointer points to storage. Suitable for use only on boxed locations.

%p-pointerp-offset (location offset)
Similar to %P-POINTERP but operates on the word OFFSET words beyond LOCATION. Suitable for use only on boxed locations.

%p-contents-offset (base-pointer offset)
Returns the contents of the word OFFSET words beyond BASE-POINTER. This first checks the cell pointed to by BASE-POINTER for a forwarding pointer. Having followed forwarding pointers to the real structure pointed to, it adds OFFSET to the resulting forwarded BASE-POINTER and returns the contents of that location. Suitable for use only on boxed locations.

There is no %P-CONTENTS, since CAR and CONTENTS perform that operation.

%p-contents-safe-p (location)
T if the contents of word LOCATION are a valid Lisp object, at least as far as data type is concerned. It is NIL if the word contains a header type, a forwarding pointer, or a void marker. If the value of this function is T, you will not get an error from (CONTENTS location). Suitable for use only on boxed locations.

%p-contents-as-locative (pointer)
Given a POINTER to a memory location containing an EVCP or One-Q-Forward, returns the contents of the location as a DTP-LOCATIVE. It changes the disallowed data type to DTP-LOCATIVE so that you can safely look at it and see what it points to. Suitable for use only on boxed locations.
%p-contents-as-locative-offset (base-pointer offset)
    Extracts the contents of a word like %P-CONTENTS-OFFSET, but changes it into a locative like %P-CONTENTS-AS-LOCATIVE. This can be used, for example, to analyze the DTP-EXTERNAL-VALUE-CELL-POINTER pointers in a FEF, which are used by the compiled code to reference value cells and function cells of symbols. Suitable for use only on boxed locations.

%p-safe-contents-offset (location offset)
    Returns the contents of the word OFFSET words beyond LOCATION as accurately as possible without getting an error. Suitable for use only on boxed locations. Forwarding pointers are checked as in %P-CONTENTS-OFFSET. If the contents are a valid Lisp object, they are returned exactly. If the contents are not a valid Lisp object but do point to storage, the value returned is a locative which points to the same place in storage. If the contents are not a valid Lisp object and do not point to storage, the value returned is a FIXNUM with the same pointer field.

%p-store-contents (pointer value)
    Stores VALUE into the data type and pointer fields of the location addressed by POINTER, and returns VALUE. The cdr-code field of the location remains unchanged.

%p-store-contents-offset (value base-pointer offset)
    Stores VALUE in the location OFFSET words beyond BASE-POINTER, then returns VALUE. The cdr-code field remains unchanged. Forwarding pointers in the location at BASE-POINTER are handled as they are in %P-CONTENTS-OFFSET.

%p-pointer (pointer)
    Extracts the pointer field of the contents of the location addressed by POINTER and returns it as a FIXNUM. Use only on boxed locations.

%p-data-type (pointer)
    Extracts the data-type field of the contents of the location addressed by POINTER and returns it as a FIXNUM. Use only on boxed locations.

%p-cdr-code (pointer)
    Extracts the cdr-code field of the contents of the location addressed by POINTER and returns it as a FIXNUM. Use only on boxed locations.

%p-store-pointer (pointer value)
    Stores VALUE in the pointer field of the location addressed by POINTER, and returns VALUE. Use only on boxed locations.

%p-store-data-type (pointer value)
    Stores VALUE in the data-type field of the location addressed by POINTER, and returns VALUE. Use only on boxed locations.

%p-store-cdr-code (pointer value)
    Stores VALUE in the cdr-code field of the location addressed by POINTER, and returns VALUE. Use only on boxed locations.

%p-store-tag-and-pointer (pointer miscfields pointerfield)
    Stores MISCFIELDS and POINTERFIELD into the location addressed by POINTER. 25 bits are taken from POINTERFIELD to fill the pointer field of the location, and the low 7 bits of MISCFIELDS are used to fill both the data-type and cdr-code fields of the location. The low 5 bits of MISCFIELDS become the data type, and the top two bits become the cdr code. Use only on boxed locations.

    This is a suitable primitive to use when both the data type and the pointer field of a memory word are to be changed, since it does so atomically; that is, the resulting word is only subjected to the Write Barrier after both fields have been changed. Applying the Write Barrier when only one of the two has changed can cause problems.

%p-store-data-type-and-pointer (pointer data-type-to-store ptr-to-store)
    Similar to %P-STORE-TAG-AND-POINTER except that the cdr code of the location being stored into is preserved. Stores data type DATA-TYPE-TO-STORE and pointer field PTR-TO-STORE into the corresponding fields at location POINTER, preserving the cdr code at POINTER. Use only on boxed locations.

    This is a suitable primitive to use when both the data type and the pointer field of a memory word are to be changed, since it does so atomically; that is, the resulting word is only subjected to the Write Barrier after both fields have been changed. Applying the Write Barrier when only one of the two has changed can cause problems.

15.12.2 Subprimitives for Unboxed Memory Referencing.

%p-ldb (byte-spec pointer)
    Extracts a byte according to BYTE-SPEC from the contents of the location addressed by POINTER, in effect regarding the contents as a 32-bit number and using LDB. The result is always a FIXNUM. Use only on unboxed locations.

%p-ldb-offset (byte-spec base-pointer offset)
    Extracts a byte according to BYTE-SPEC from the contents of the location OFFSET words beyond BASE-POINTER, after handling forwarding pointers. Use only on unboxed locations.

%p-dpb (value byte-spec pointer)
    Stores VALUE, a FIXNUM, into the byte selected by BYTE-SPEC in the word addressed by POINTER. NIL is returned. Use only on unboxed locations.

%p-dpb-offset (value byte-spec base-pointer offset)
    Stores VALUE into the specified byte of the location OFFSET words beyond that addressed by BASE-POINTER, after first handling forwarding pointers. NIL is returned. Use only on unboxed locations.

%p-deposit-field (value byte-spec pointer)
    Like %P-DPB, except that the selected byte is stored from the corresponding bits of VALUE rather than the right-aligned bits. See the note above %P-DPB for restrictions. Use only on unboxed locations.

%p-deposit-field-offset (value byte-spec base-pointer offset)
    Like %P-DPB-OFFSET, except that the selected byte is stored from the corresponding bits of VALUE rather than the right-aligned bits. See the note above %P-DPB for restrictions. Use only on unboxed locations.

%p-mask-field (byte-spec pointer)
    Like %P-LDB, except that the selected byte is returned in its original position within the word instead of right-aligned. Use only on unboxed locations.

%p-mask-field-offset (byte-spec base-pointer offset)
    Like %P-LDB-OFFSET, except that the selected byte is returned in its original position within the word instead of right-aligned. Use only on unboxed locations.
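The field arithmetic behind these subprimitives can be pictured with the standard Common Lisp byte functions applied to an ordinary integer. The sketch below is only an illustration, not system code: it touches no memory, the names *Q-POINTER*, *Q-DATA-TYPE*, *Q-CDR-CODE*, and PACK-TAG-AND-POINTER are hypothetical, and the bit positions are an assumption (pointer in the low 25 bits, then the 5-bit data type, then the 2-bit cdr code) chosen to agree with the field widths given for %P-STORE-TAG-AND-POINTER.

```lisp
;; Hypothetical stand-ins for the real byte specifiers.
;; ASSUMPTION: pointer = low 25 bits, data type = next 5, cdr code = top 2.
(defparameter *q-pointer*   (byte 25 0))
(defparameter *q-data-type* (byte 5 25))
(defparameter *q-cdr-code*  (byte 2 30))

(defun pack-tag-and-pointer (miscfields pointerfield)
  "Build a 32-bit integer from a 7-bit MISCFIELDS and a 25-bit POINTERFIELD.
The low 5 bits of MISCFIELDS become the data type, the top 2 the cdr code."
  (dpb (ldb (byte 2 5) miscfields) *q-cdr-code*
       (dpb (ldb (byte 5 0) miscfields) *q-data-type*
            (ldb *q-pointer* pointerfield))))

;; MISCFIELDS #b1011010: cdr code #b10 (2), data type #b11010 (26).
(defparameter *word* (pack-tag-and-pointer #b1011010 #x123456))

;; LDB returns the field right-aligned (the %P-LDB view).
(ldb *q-data-type* *word*)            ; the data type as a small integer

;; MASK-FIELD returns the field in place (the %P-MASK-FIELD view).
(mask-field *q-data-type* *word*)     ; the same bits, still at position 25

;; DPB takes a right-aligned value; DEPOSIT-FIELD takes it already in position.
(dpb 3 *q-data-type* *word*)
(deposit-field (ash 3 25) *q-data-type* *word*)
```

The last two forms produce the same integer, which is the distinction drawn between %P-DPB and %P-DEPOSIT-FIELD. None of this models forwarding pointers or the Read and Write Barriers, which are the reason the choice of subprimitive matters on a real machine.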
15.13 TGC SUBPRIMITIVE SEMANTICS

With the advent of Temporal Garbage Collection, the semantics (and hence the proper use) of most of the special memory referencing subprimitives has changed. This change was required in order to handle properly the new "super-invisible" forwarding pointer used by TGC, the DTP-GC-Young-Pointer (GCYP). All uses of these primitives in Explorer kernel code had to be checked for compliance with these rules. Any user code which employs them must likewise be checked for compliance. This subsection gives a summary of the rules and the reasons behind them.

NOTE
Failure to follow the TGC-imposed rules for subprimitive usage can cause the system to crash even if the garbage collector is not active.

15.13.1 GC Young Pointer Usage.

With TGC, any time an attempt is made to store a (pointer to a) young object into an older object a trap is taken in the microcode, and instead a DTP-GC-Young-Pointer (GCYP) is stored in the old object. This young pointer points at an indirection cell in the new INDIRECTION-CELL-AREA area. The indirection cell actually contains the young object (or a pointer to it). Isolating these indirection cells allows scavenging of generations to be very fast, since we know that all pointers to a generation can be found using just a few indirection cell regions as the root of the scavenge space.

The GCYP is handled much like other forwarding pointer types, and probably most closely resembles the single-word-forwarding type DTP-ONE-Q-FORWARD. On any reference to a cell that can contain a Lisp object a check for a GCYP must be made. If one is found the reference is indirected to what the GCYP points to instead. This is the familiar transporting process. As with most other forwarding pointer types, the GCYP is not allowed "in the machine"; that is, it is always "snapped out" right when it is read from memory.
It is illegal to construct a pointer with type GCYP (with %MAKE-POINTER, for example), and it is further illegal to have any pointer or locative to an indirection cell except via GCYPs.

15.13.2 Dividing the Subprimitives.

In nearly all cases, the presence of GCYPs is completely invisible to the Lisp world. All Lisp objects are checked for the presence of forwarding pointers. The process of reading an object "into the machine" (pushing it on the PDL as a function argument, for example) performs this check automatically.

The introduction of GCYPs forces us to rethink our use of certain subprimitives, however. The %P- class of subprimitives requires special care since its members may be used to address arbitrary memory locations, not just ones known to be valid Lisp objects.

When using %P- subprimitives to reference memory locations, it has always been required that the user know whether the referenced location is boxed or unboxed. Obviously it does not make sense to pick up the pointer field of an unboxed cell and attempt to use it as an address. However, the user formerly had a choice of functions to use in referencing word fields. This is no longer the case. Consider the statements below.

Previously, if you wanted to read the pointer field of a memory word, you could use either of the following:

    (%P-POINTER ptr)
    (%P-LDB %%Q-Pointer ptr)

You might know you are looking at a boxed word, and expect to get a valid address back. Or you might be looking inside unboxed data, but for some reason just "happen" to want to look at the field that corresponds to the pointer field of a boxed word. In either case, it was formerly allowable to use either call.

With the advent of TGC, however, we make a hard distinction between subprimitives which must be used to reference Lisp objects only, and those that should be used to reference unboxed data.
If you are looking at or storing into Lisp object cells, there must be a check performed for the presence of a GCYP, since what you really want to reference is the corresponding field in the indirection cell the GCYP points to. If you know you are looking at unboxed data, however, you do not want to check for GCYP forwarding, since the cell referenced cannot be correctly interpreted as a type and pointer.

The tables below summarize the members of the two subprimitive groups. The bottom line is that the first group can only be meaningfully used on storage words that contain legitimate tag-and-pointer fields (boxed data). It is safe to use these functions if you know the word has a valid tag field (as long as you also follow the other storage rules of the particular function, of course). The second group should be used on unboxed memory words and may be used (for ease or efficiency) in certain very special cases on typed words (called "SAFE" boxed locations). The meaning of "SAFE" boxed locations is described later.

Table 15-1   GROUP 1. Use on BOXED cells ONLY

    %P-POINTER                    * %P-POINTER-OFFSET
    %P-STORE-POINTER              * %P-STORE-POINTER-OFFSET
    %P-DATA-TYPE                  * %P-DATA-TYPE-OFFSET
    %P-STORE-DATA-TYPE            * %P-STORE-DATA-TYPE-OFFSET
    %P-CDR-CODE                   * %P-CDR-CODE-OFFSET
    %P-STORE-CDR-CODE             * %P-STORE-CDR-CODE-OFFSET
    %P-STORE-TAG-AND-POINTER      * %P-STORE-DATA-TYPE-AND-POINTER
    %P-STORE-CONTENTS               %P-STORE-CONTENTS-OFFSET
    %BLT-TYPED
    ---------------------------------------
    * Indicates a function new with TGC.
    ---------------------------------------

Table 15-2   GROUP 2. Use on UNBOXED or "SAFE" BOXED locations

    %P-LDB                          %P-LDB-OFFSET
    %P-DPB                          %P-DPB-OFFSET
    %P-DEPOSIT-FIELD                %P-DEPOSIT-FIELD-OFFSET
    %P-MASK-FIELD                   %P-MASK-FIELD-OFFSET
    %BLT
    %BLT-TO-PHYSICAL
    %BLT-FROM-PHYSICAL

Note that the members of Group 1 all either explicitly indicate a typed-data field (pointer, cdr-code, data-type), or otherwise imply by their names that they handle typed data (%BLT-TYPED, %P-STORE-CONTENTS). The members of Group 2 have more generic load or store names.

To summarize the necessary changes:

1.  If %P-LDB is being used to reference boxed data, use %P-POINTER, %P-DATA-TYPE, or %P-CDR-CODE instead.

2.  If %P-LDB-OFFSET is being used to reference a boxed word inside a structure, use the new %P-POINTER-OFFSET, %P-DATA-TYPE-OFFSET, or %P-CDR-CODE-OFFSET instead.

3.  If %P-DPB or %P-DEPOSIT-FIELD is being used to store into boxed data, use %P-STORE-POINTER, %P-STORE-DATA-TYPE, or %P-STORE-CDR-CODE instead.

4.  If %P-DPB-OFFSET is being used to store into a boxed word inside a structure, use %P-STORE-POINTER-OFFSET, %P-STORE-DATA-TYPE-OFFSET, or %P-STORE-CDR-CODE-OFFSET instead. These are also new.

5.  Always use %BLT-TYPED to move data FROM completely boxed locations TO completely boxed locations. Use %BLT only to move data FROM unboxed storage TO unboxed storage. Special handling is required for the rare cases where the source and destination storage locations in a block transfer do not have the same typing characteristic (are not both boxed or both unboxed). This is discussed later.

15.13.3 Structure Forwarding Considerations.

Whenever an address offset into a structure is referenced, the -OFFSET version of a subprimitive should be used, since these versions follow structure forwarding. In the examples below, (1) and (2) are not equivalent, but (2) and (3) are equivalent as long as ADDRESS points to a header.

1.  (%P-LDB (BYTE n n) (%POINTER-PLUS address offset))

2.  (%P-LDB-OFFSET (BYTE n n) address offset)

3.  (%P-LDB (BYTE n n)
            (%POINTER-PLUS (FOLLOW-STRUCTURE-FORWARDING address) offset))

Whether you use a Group 1 or Group 2 -OFFSET function further depends on whether you are referencing the boxed or unboxed part of a structure. For example, you would use %P-LDB-OFFSET when looking at FEF instructions, but use the new %P-DATA-TYPE-OFFSET to check the data type of a FEF-relative argument which is stored in the boxed section of the FEF.

15.13.4 Use of Group 2 Subprimitives on "SAFE" Boxed Storage.

It is actually all right to use Group 2 (unboxed) subprimitives to access boxed data under certain conditions. Most of these conditions are listed below. There is actually quite a bit of system code that uses Group 2 subprimitives on boxed data. The array code, for example, uses %P-LDB to access array-header fields even though array headers are technically boxed. This is all right because an array header's pointer field does not contain a pointer and never can.

In short, "SAFE" boxed storage is storage that does not now contain a pointer-type Lisp object and never has contained one in the past. It can, however, contain immediate data types such as FIXNUMs now or in the past. Such storage locations are safe because they cannot contain GCYPs. In addition, certain areas will never contain GCYPs for simplicity and efficiency reasons.

Remember, if you cannot easily determine whether a boxed storage location is "SAFE", just change to a Group 1 subprimitive. This will always work, but may not be strictly necessary. SAFE, after all, is a relative term. It can be said, for example, that it is much SAFER to swim in shark-infested waters if you carry a harpoon.

Some examples of SAFE BOXED locations follow.

-   The FIXED areas that appear before WORKING-STORAGE will never contain GCYPs.

-   Regular PDLs, binding PDLs, and stack groups will never contain GCYPs.
    This is for efficiency because these structures are so volatile. Hence Group 2 subprimitives can generally be used on addresses in the areas shown below. The error handler takes advantage of this rule.

    Area                     Contents
    ------------------------------------------------
    LINEAR-PDL-AREA          Regular PDL of initial stack group
    LINEAR-BIND-PDL-AREA     Bind PDL of initial stack group
    PDL-AREA                 All other regular PDLs
    SG-AND-BIND-PDL-AREA     All other bind PDLs and all stack groups

    NOTE
    Locations beyond the current top-of-stack (push-pointer) in any PDL may contain unboxed data. These locations should not be referenced unless it is well known what will be found there (the error handler sometimes does this).

-   Header types that have immediate pointer fields are safe. These are DTP-HEADER, DTP-ARRAY-HEADER, and DTP-FEF-HEADER. The -OFFSET versions of Group 2 subprimitives may be used on these. Symbol headers and instance headers are not safe, however, because their pointer fields hold actual pointers.

-   Storage that has been freshly allocated by one of the following is safe because it has not yet been stored into at all:

        %ALLOCATE-AND-INITIALIZE
        %ALLOCATE-AND-INITIALIZE-ARRAY
        %ALLOCATE-AND-INITIALIZE-INSTANCE
        MAKE-LIST
        %MAKE-STACK-LIST

-   Boxed storage that has always (and you can guarantee this) contained only immediate types is safe. Immediate types include DTP-FIX, DTP-SHORT-FLOAT, and DTP-CHARACTER. By special dispensation the symbols NIL and T are included in this group. They are not immediate types, but a GCYP can never be created which points to them because by definition no object is older than T or NIL.

Note that it is not sufficient to know that a location currently contains an immediate type, since if it has ever contained a pointer-type Lisp object, a GCYP may be there. Such a GCYP will not get snapped out until a collection of the old object's generation occurs.
For example, just because a symbol value cell now holds a FIXNUM does not guarantee that no GCYP exists there. If a young BIGNUM were stored there previously, and it has since been overwritten by a FIXNUM, there will still be a GCYP there pointing to an indirection cell which contains the FIXNUM.

15.13.5 Special Cases of %BLT-TYPED and %BLT.

With TGC, %BLT-TYPED can only safely be used to transfer data FROM completely boxed storage TO storage that is completely boxed before the transfer. The requirement that the destination be boxed is new with TGC, since an attempt will be made to follow GCYPs at the destination prior to the write. To transfer typed data TO a location that has formerly contained untyped data, the destination must be "faked" to look typed before %BLT-TYPED is used. An example:

    (LET ((typed-array (MAKE-ARRAY 5.))
          (untyped-array (MAKE-ARRAY 5. :element-type '(unsigned-byte 32.))))
      ;; .... other code ....
      ;; Transfer contents of TYPED-ARRAY to UNTYPED-ARRAY
      ;; First make inside of UNTYPED-ARRAY look boxed.
      (%P-DPB-OFFSET dtp-fix %%Q-DATA-TYPE untyped-array 1)
      (%P-DPB-OFFSET 0 %%Q-POINTER untyped-array 1)
      (%BLT
        ;; FROM: The 1st data word where we've
        ;; just put a fixnum 0...
        (%POINTER-PLUS (FOLLOW-STRUCTURE-FORWARDING untyped-array) 1)
        ;; TO: The 2nd data word ...
        (%POINTER-PLUS (FOLLOW-STRUCTURE-FORWARDING untyped-array) 2)
        ;; Then from the 2nd to the 3rd, and so forth.
        (1- (ARRAY-LENGTH untyped-array))
        1)
      ;; Now can do %BLT-TYPED
      (%BLT-TYPED
        (%POINTER-PLUS (FOLLOW-STRUCTURE-FORWARDING typed-array) 1)
        (%POINTER-PLUS (FOLLOW-STRUCTURE-FORWARDING untyped-array) 1)
        (ARRAY-LENGTH typed-array)
        1))

%BLT should only be used to transfer FROM completely unboxed storage TO completely unboxed storage. If the destination is boxed and you want to replace it with unboxed data, the destination should first be "faked" to be SAFE BOXED before the %BLT.
This ensures that all objects pointed to by the current boxed storage will be made garbage. An example:

    (LET ((typed-array (MAKE-ARRAY 5.))
          (untyped-array (MAKE-ARRAY 5. :element-type '(unsigned-byte 32.))))
      ;; .... other code ....
      ;; Transfer contents of UNTYPED-ARRAY to TYPED-ARRAY
      ;; First zap inside of TYPED-ARRAY with fixnum zeros
      ;; so the garbage collector will know all the old
      ;; stuff is trashed.
      (%P-STORE-CONTENTS-OFFSET 0 typed-array 1)
      (%BLT-TYPED
        ;; FROM: The 1st data word where we've
        ;; just put a fixnum 0...
        (%POINTER-PLUS (FOLLOW-STRUCTURE-FORWARDING typed-array) 1)
        ;; TO: The 2nd data word ...
        (%POINTER-PLUS (FOLLOW-STRUCTURE-FORWARDING typed-array) 2)
        ;; Then from the 2nd to the 3rd, and so forth.
        (1- (ARRAY-LENGTH typed-array))
        1)
      ;; Now can do %BLT.
      (%BLT
        (%POINTER-PLUS (FOLLOW-STRUCTURE-FORWARDING untyped-array) 1)
        (%POINTER-PLUS (FOLLOW-STRUCTURE-FORWARDING typed-array) 1)
        (ARRAY-LENGTH untyped-array)
        1))
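Both examples above rely on the same idiom: seed the first data word, then run an overlapping block transfer whose destination is one word past its source, so the seed value ripples through the rest of the array. The following is a toy model of that idiom in portable Common Lisp, not system code. SIM-BLT is a hypothetical helper that copies within an ordinary vector, and it assumes what the comments in the examples imply, namely that the transfer proceeds one word at a time in ascending address order.

```lisp
(defun sim-blt (memory from to count)
  "Toy model of an ascending, word-at-a-time block transfer inside MEMORY.
ASSUMPTION: models only the copy order; no barriers, no virtual addresses."
  (dotimes (i count memory)
    (setf (aref memory (+ to i)) (aref memory (+ from i)))))

;; Index 0 plays the role of the array header; indices 1 to 5 are data words.
(defparameter *memory* (vector :header 'a 'b 'c 'd 'e))

;; Seed the first data word, as the %P-STORE-CONTENTS-OFFSET call does.
(setf (aref *memory* 1) 0)

;; Overlapping transfer: word 1 to word 2, then word 2 to word 3, and so on.
;; The count is one less than the number of data words.
(sim-blt *memory* 1 2 4)
```

After the call every data word of *MEMORY* holds the seed value 0 while the header slot is untouched. A copy that ran in descending order, or one that buffered the source first, would not produce this fill, which is why the idiom depends on the transfer order.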