[LispM-Hackers] Bug in SSDN2 Macroinstruction Notes

06 Jan 2002 21:31:35 -0900

I've found what I'm fairly sure are several bugs in the SSDN2
description of the macroinstruction formats (the first part of the
chapter on macroinsns).  Below I outline why I think they are bugs,
and I'd like people to comment.

Consider the format of a mainop:

  FEDC BA98 7654 3210
  cccc cccc bbbo oooo
  c -- opcode
  b -- base register
  o -- offset

Every mainop has to have at least an opcode.  The base register and
offset fields can be put to other purposes if a particular op needs
them used differently, but the opcode field stays where it is.  For
this reason I believe that the diagram of immedops given in the SSDN2
is incorrect, i.e.:

  FEDC BA98 7654 3210
  cccc cccv vvvv vvvv
  c -- opcode
  v -- value

should actually be:

  FEDC BA98 7654 3210
  cccc cccc vvvv vvvv

with the same bit-semantics as above.  The SSDN2 diagram seems to
contain an off-by-one error, putting the border between the two
fields at 9|8 instead of 8|7.
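
To make the difference between the two immedop diagrams concrete,
here is a minimal Python sketch that splits the same 16-bit word both
ways.  The function names and the example word are made up for
illustration, and the 8|3|5 mainop split is the layout assumed in this
message, not something checked against hardware.

```python
# Sketch only: field widths follow the layouts discussed in this message.
# All function names here are hypothetical.

def split_mainop(word):
    """Split a 16-bit word as cccc cccc bbbo oooo."""
    opcode = (word >> 8) & 0xFF   # bits F-8
    base = (word >> 5) & 0x7      # bits 7-5
    offset = word & 0x1F          # bits 4-0
    return opcode, base, offset

def split_immedop_ssdn2(word):
    """Split as cccc cccv vvvv vvvv, the diagram printed in the SSDN2."""
    return (word >> 9) & 0x7F, word & 0x1FF

def split_immedop_proposed(word):
    """Split as cccc cccc vvvv vvvv, the layout proposed above."""
    return (word >> 8) & 0xFF, word & 0xFF

word = 0b0100_1001_1010_0011      # arbitrary example word
print(split_mainop(word))             # (73, 5, 3)
print(split_immedop_ssdn2(word))      # (36, 419)
print(split_immedop_proposed(word))   # (73, 163)
```

Only the proposed split yields the same opcode value as the mainop
split, which is the point of the argument above.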

Now consider a callop.  The format of a callop is:

  FEDC BA98 7654 3210
  10nn nddb bboo oooo
  n -- number of args
  d -- destination
  b -- base register
  o -- offset

as is given in the SSDN2.  Again I think that this is an off-by-one
error, since the SSDN2 specifically says "The base and offset fields
are the same as for the MAIN-OPS and specify the function to be
called."  If they are the same, shouldn't they be in the same position
as well as having the same meaning?  That would make more sense than
shifting them left one bit.  Following this reasoning, if we look at
the opcode of CALL-2-DEST-PUSH, 111 (octal), as though it were a
mainop opcode we see:

  FEDC BA98 7654 3210
  0100 1001 bbbo oooo
     ^^^^~~
      |   \-- 01: D-PUSH
       \----- 010: 2 args

If this is right then the actual format of a callop would be:

  FEDC BA98 7654 3210
  010n nndd bbbo oooo

which makes much more sense to me and fits with the opcode numbers
correctly.
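
The arithmetic can be checked with a few lines of Python.  This is
only a sketch: the function name is hypothetical, and the 010 nnn dd
split of the opcode byte is the layout proposed here, not a confirmed
one.

```python
def decode_callop_opcode(opcode):
    """Split an 8-bit opcode byte as 010 nnn dd (assumed layout)."""
    prefix = (opcode >> 5) & 0b111   # expected to be 010 for a callop
    nargs = (opcode >> 2) & 0b111    # number of args
    dest = opcode & 0b11             # destination
    return prefix, nargs, dest

call_2_dest_push = 0o111             # opcode number, read as octal
print(format(call_2_dest_push, "08b"))         # 01001001
print(decode_callop_opcode(call_2_dest_push))  # (2, 2, 1)
```

The result reads as prefix 010, 2 args, and destination 01 (D-PUSH),
matching the annotated diagram above.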

Auxops need little explanation.  They are given as:

  FEDC BA98 7654 3210
  0000 0000 nnnn nnnn
  n -- auxop number

And in defop.lisp we have:

  (DEFOP AUX-GROUP          0 D-NONE () :No-Reg AUX)    ;non-result ops

which gives the auxops as having 0 for an opcode.  This fits
perfectly.

What about miscops?  Miscops are separated into two groups, as are
module-ops.  Each is split according to whether the destination is
D-INDS or D-PUSH.  This is not exactly obvious in defop.lisp, due to
inconsistent naming:

  (DEFOP TEST-MISC-GROUP    1 D-INDS   () :No-Reg MISC)
  (DEFOP TEST-MODULE-GROUP  2 D-INDS   () :No-Reg Module)
  [...]
  (DEFOP PUSH-MISC-GROUP   41 D-PDL    () :No-Reg MISC)
  (DEFOP PUSH-MODULE-GROUP 42 D-PDL    () :No-Reg Module)

Those names make more sense if we consider the reason for the split of
both miscops and module-ops.  TEST indicates a test operation, i.e. one
that only returns T or NIL, and hence only sets the indicators.  It
doesn't indicate an association with the TEST and TEST-C*R insns.
PUSH indicates an operation that pushes a value onto the stack, i.e.
D-PUSH (aka D-PDL).  Likewise it doesn't indicate an association with
the PUSH and PUSH-C*R insns.  The split exists because the bit that
represents the destination is not contiguous with the rest of the
opcode, which can be seen from the insn formats.  The SSDN2 gives the
formats for miscops and module-ops, respectively, as:

  FEDC BA98 7654 3210
  0d00 001n nnnn nnnn
  0d00 0100 mmmm mooo

Now if we put 1 (TEST-MISC-GROUP) and 41 (PUSH-MISC-GROUP), both
octal, into the mainop opcode field we get:

  FEDC BA98 7654 3210
  0000 0001 xxxx xxxx
  0010 0001 xxxx xxxx

If we do the same for module-ops with 2 (TEST-MODULE-GROUP) and 42
(PUSH-MODULE-GROUP) we have:

  FEDC BA98 7654 3210
  0000 0010 xxxx xxxx
  0010 0010 xxxx xxxx
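
As a cross-check of the four bit patterns above, the following Python
sketch converts the group numbers from the DEFOP forms, read as octal,
into 8-bit opcode fields.  The dictionary and helper names are
hypothetical.

```python
# Group numbers copied from the DEFOP forms quoted above, read as octal.
GROUPS = {
    "TEST-MISC-GROUP": 0o1,
    "PUSH-MISC-GROUP": 0o41,
    "TEST-MODULE-GROUP": 0o2,
    "PUSH-MODULE-GROUP": 0o42,
}

def as_opcode_byte(number):
    """Render a group number as an 8-bit field, split into nibbles."""
    bits = format(number, "08b")
    return bits[:4] + " " + bits[4:]

for name, number in GROUPS.items():
    print(f"{name:18} {as_opcode_byte(number)}")

# The TEST and PUSH variants of a group differ in exactly one bit.
dest_bit = GROUPS["TEST-MISC-GROUP"] ^ GROUPS["PUSH-MISC-GROUP"]
print(format(dest_bit, "08b"))   # 00100000
```

The single differing bit is bit 5 of the opcode byte, which is bit D
of the full 16-bit word.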

If the above suppositions are correct then we would have:

  FEDC BA98 7654 3210
  00d0 0001 nnnn nnnn
  d -- destination
  n -- miscop number

as the format for miscops and:

  FEDC BA98 7654 3210
  00d0 0010 mmmm mnnn
  d -- destination
  m -- module number
  n -- module-op number

as the format for module-ops.  In the case of module-ops, the
module-op number must be at least 3 bits wide, since the TV module
has an op numbered 7.  Thus the module number is probably only 5
bits.  Since there are only two modules, 5 bits are plenty.  (My only
nagging question is whether the module-op number actually has 4 bits,
but I think probably not.)
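
To show what the 3-bit versus 4-bit question changes in practice,
here is a small Python sketch of a decoder for the proposed module-op
layout.  The function name, the op_bits parameter, and the example
word (module number 1, op 7) are all assumptions made for illustration.

```python
def decode_module_op(word, op_bits=3):
    """Decode 00d0 0010 followed by module and op bits (assumed layout)."""
    dest = (word >> 13) & 1            # bit D
    low = word & 0xFF                  # bits 7-0: module and op numbers
    op = low & ((1 << op_bits) - 1)    # low op_bits bits
    module = low >> op_bits            # whatever remains above them
    return dest, module, op

word = 0b0010_0010_0000_1111           # hypothetical module-op word
print(decode_module_op(word))              # (1, 1, 7)
print(decode_module_op(word, op_bits=4))   # (1, 0, 15)
```

The same word names a different module and op depending on the split,
so the field widths need to be settled before any decoder is written.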

One last insn group needs examining.  The arefi-ops are supposed to
have a format similar to those of the miscops and module-ops:

  FEDC BA98 7654 3210
  0d00 111r rrii iiii
  d -- destination
  r -- reference kind
  i -- index

The arefi-ops are also split into two groups.  In defop.lisp they are
inconsistently named, without a '-GROUP' suffix:

  (DefOp Test-AREFI         7 D-INDS    () :No-Reg AREFI)
  [...]
  (DefOp Push-ArefI        47 D-PDL     () :No-Reg AREFI)

The SSDN2 format doesn't line up with the mainop opcode field, but if
we put the numbers from defop.lisp, 7 and 47 (octal), into that field
we get:

  FEDC BA98 7654 3210
  0000 0111 xxxx xxxx
  0010 0111 xxxx xxxx

which looks like what we would expect based on the miscops and
module-ops.  The remaining question is how the refkind and index
fields fit.  It's known that the refkind field is exactly 3 bits;
seven different values are given, only one of them unused.  The SSDN2
doesn't explicitly state how large the index field is supposed to be,
but I think we can safely assume that it's 5 bits, which is what
remains of the low byte after the refkind.  This gives us the
following format:

  FEDC BA98 7654 3210
  00d0 0111 rrri iiii

which appears to fit well with the other changed insn classes above.
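
A minimal Python sketch of an encoder for this proposed arefi layout
follows.  The names are hypothetical, the refkind and index values in
the example are arbitrary, and the field positions are the ones
suggested above rather than confirmed ones.

```python
TEST_AREFI = 0o7   # group number from defop.lisp, read as octal

def encode_arefi(push, refkind, index):
    """Build a word laid out as 00d0 0111 rrri iiii (assumed layout)."""
    if not 0 <= refkind < 8:
        raise ValueError("refkind must fit in 3 bits")
    if not 0 <= index < 32:
        raise ValueError("index must fit in 5 bits")
    word = (TEST_AREFI << 8) | (refkind << 5) | index
    if push:
        word |= 1 << 13            # destination bit, bit D
    return word

word = encode_arefi(push=True, refkind=3, index=5)
print(format(word, "016b"))        # 0010011101100101
print(oct(word >> 8))              # 0o47
```

With the destination bit set, the opcode byte comes out as 47 octal,
the number defop.lisp gives for Push-ArefI.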

Now that I've bored everyone to tears with every last shred of
evidence I can find, is there anyone who sees anything wrong with
going ahead with changes to the code and docs to reflect this?  We'd
also need to start an errata list for the SSDN2 until we can replace
it entirely with a text version.

'james

-- 
James A. Crippen <james@unlambda.com> ,-./-.  Anchorage, Alaska,
Lambda Unlimited: Recursion 'R' Us   |  |/  | USA, 61.20939N, -149.767W
Y = \f.(\x.f(xx)) (\x.f(xx))         |  |\  | Earth, Sol System,
Y(F) = F(Y(F))                        \_,-_/  Milky Way.