A queue't little bug in Itanium's INSQTI!

Recently, while making some changes to the Attunity RMS CDC code, I experienced a system crash when testing the edits on Itanium. The Attunity RMS CDC code, as much as is conceivably possible, is common source code for all of the supported OpenVMS architectural variants; thus, I was rather puzzled that these new edits would cause a bugcheck and crash on Itanium, but not on Alpha. Perusing the crash dump, it turned out that the bugcheck was precipitated by a ROPRAND fault on the Itanium's equivalent of the VAX INSQTI instruction. In the system dump analyzer, I examined the address in Itanium register R32 which should be the queue header. Verified! It was properly quadword aligned. I then examined the address in Itanium register R33 which should be the entry address. It too was quadword aligned. Why then should it be evoking a ROPRAND fault?

My first thought was that, perhaps, the register conventions I'd assumed for the INSQTI primitive's inputs were incorrect. So, I added some debugging code to the Attunity RMS CDC source to stuff the header address and entry address into my favorite little debugging tool — a page of user mode readable and writeable system address space. I built the code and loaded it on the Itanium. I ran the exact same test as before and, once again, the system crashed. Using the system dump analyzer, I verified that it was the same crash scenario as before — a ROPRAND in kernel mode. It was. R32 and R33 also contained the same address values. This time, however, I had additional debugging information. I checked the data that I stashed in my special debug page and it too was exactly as anticipated. WTF? This just shouldn't be happening.

INSQTI Background

In the venerable VAX architecture, there are several instructions which atomically manipulate queues. One class of these queues is known as absolute queues, meaning that the pointers to the entries in the queue are absolute addresses. The VAX also provided another class of queues known as self-relative queues. In these queues, the elements are located, not by address, but by an offset to the next and previous entries. Operations on the latter class are also atomic, enforced through an interlock mechanism.

Entries on a self-relative queue can be inserted at or removed from the head or tail of the queue. Thus, FIFO, FILO and ordered message queues can be easily implemented. Because it is pertinent to the rest of this discussion, here is a diagram of a self-relative queue with one entry inserted into the queue.

+------------------------------------+
| (entry address) - (header address) | : (header address)
+------------------------------------+
| (entry address) - (header address) |
+------------------------------------+

+------------------------------------+
| (header address) - (entry address) | : (entry address)
+------------------------------------+
| (header address) - (entry address) |
+------------------------------------+

The forward and backward links are, thus, offsets from the current element's address. For example, in the diagram above, the address of the entry can be found from the header quite simply:

[(entry address)-(header address)] + (header address) = (entry address)

Using this relationship, it was possible to walk a self-relative queue forward or backward with a single VAX instruction and a single register. To walk the queue forward, assuming that the header's address or any entry's address is in Rn, use: MOVAB @(Rn)[Rn],Rn. To walk the queue backward, use: MOVAB @4(Rn)[Rn],Rn.
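The offset arithmetic above can be modeled in a few lines of Python. This is only a sketch: a bytearray stands in for memory, the helper names (read_long, write_long, insert_tail, walk_forward, walk_backward) are made up for illustration, and there is no interlock, so it shows the link arithmetic of a tail insert rather than the atomic behavior of INSQTI itself.

```python
import struct

def read_long(mem, addr):
    """Read a signed 32-bit longword from the simulated memory."""
    return struct.unpack_from("<i", mem, addr)[0]

def write_long(mem, addr, value):
    """Write a signed 32-bit longword to the simulated memory."""
    struct.pack_into("<i", mem, addr, value)

def insert_tail(mem, header, entry):
    """Non-interlocked model of a tail insert on a self-relative queue."""
    pred = header + read_long(mem, header + 4)   # current tail (the header if empty)
    write_long(mem, entry, header - entry)       # entry forward link  -> header
    write_long(mem, entry + 4, pred - entry)     # entry backward link -> old tail
    write_long(mem, pred, entry - pred)          # old tail forward link -> entry
    write_long(mem, header + 4, entry - header)  # header backward link  -> entry

def walk_forward(mem, header):
    """Follow forward links: next = current + forward_link(current)."""
    found, addr = [], header + read_long(mem, header)
    while addr != header:
        found.append(addr)
        addr += read_long(mem, addr)
    return found

def walk_backward(mem, header):
    """Follow backward links: previous = current + backward_link(current)."""
    found, addr = [], header + read_long(mem, header + 4)
    while addr != header:
        found.append(addr)
        addr += read_long(mem, addr + 4)
    return found

# A zero-filled header is an empty queue (both links are zero offsets).
mem = bytearray(256)
HEADER, FIRST, SECOND = 0x40, 0x80, 0xC0   # arbitrary quadword-aligned slots
insert_tail(mem, HEADER, FIRST)
insert_tail(mem, HEADER, SECOND)
print([hex(a) for a in walk_forward(mem, HEADER)])   # ['0x80', '0xc0']
```

After the first insert, the header holds (entry - header) in both links and the entry holds (header - entry) in both, matching the diagram.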

On both Alpha and Itanium, there are no queue instructions in the architecture. To provide an equivalent primitive to software for such operations, the Alpha and Itanium have adopted the use of special code routines. On Alpha, these are implemented in PALcode. On Itanium, OpenVMS itself has special routines to perform the equivalent incorporated in the executive. For the INSQTI, this routine is called SYS$PAL_INSQTIL which is essentially a jacketing routine entry point for an Itanium assembler sourced routine called EXE$PAL_INSQTIL.

The Proof Is In The Plodding

For the sake of those who may be reading this without the luxury of access to the OpenVMS source listings, the analysis of the INSQTI problem will be done using the poor man's microfiche listings: the OpenVMS System Dump Analyzer's (SDA) EXAMINE/INSTRUCTION command.

There are several initial Itanium instruction bundles which exist to do argument validation. While these are important checks, the discussion of these will be skipped here as they are not pertinent to the analysis of the ROPRAND problem. The first Itanium instruction bundles of interest with respect to this discussion are these:

{ .mmi
EXE$PAL_INSQTIL_C+00050: and r14 = 07, r32 ;;
nop.m 000000
cmp.eq p0, p6 = r14, r0
}
{ .mib
EXE$PAL_INSQTIL_C+00060: nop.m 000000
nop.i 000000
(p6) br.cond.spnt.few 1FFE960
}

The above instruction stream snippet is checking the header address to ensure quadword alignment. In the first Itanium instruction bundle, the header address (in R32) is ANDed with 7 and that result is stored in R14. R14 is subsequently tested for zero (R0), setting the predicate registers P0 and P6 accordingly. Then, in the second Itanium instruction bundle, if the address was not quadword aligned, P6 would be true and cause the branch to be taken. This branch would report the ROPRAND fault. This should occur only if the header is not quadword aligned. Let's have a look at the branch target:

SDA> EXAMINE/INSTRUCTION EXE$PAL_INSQTIL_C+00060+1FFE960
{ .mii
EXE$MUT_RUNDOWN_C+004B0: ssm 004000
mov r15 = 000454
mov r25 = 000001
}
SDA> EVALUATE/CONDITION 454
%SYSTEM-F-ROPRAND, reserved operand fault at PC=!XH, PS=!XL

Then, the next Itanium instruction bundles of interest contain:

{ .mmi
EXE$PAL_INSQTIL_C+00070: and r14 = 07, r33 ;;
nop.m 000000
cmp.eq p0, p6 = r14, r0
}
{ .mib
EXE$PAL_INSQTIL_C+00080: nop.m 000000
nop.i 000000
(p6) br.cond.spnt.few 1FFE940
}

These instructions are checking that the entry address is quadword aligned. In the first Itanium instruction bundle, the entry address in R33 is ANDed with 7, storing that result in R14. R14 is then tested for zero, setting the predicate registers P0 and P6 accordingly. Then, in the second Itanium instruction bundle, if the address was not quadword aligned, P6 will be true, causing the branch to be taken. This branch will report the ROPRAND fault, which should occur only if the entry is not quadword aligned. Let's verify that the branch target does just that:

SDA> EXAMINE/INSTRUCTION EXE$PAL_INSQTIL_C+00080+1FFE940
{ .mii
EXE$MUT_RUNDOWN_C+004B0: ssm 004000
mov r15 = 000454
mov r25 = 000001
}

The same Itanium instruction bundle as for the header alignment check.
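Put in higher-level terms, the two alignment checks amount to something like the following Python sketch. The function names are invented for illustration; the only values taken from the listings are the mask of 7 and the condition value 454 (hex), which SDA decodes as ROPRAND.

```python
ROPRAND = 0x454   # condition value loaded into r15 at the branch target

def is_quadword_aligned(addr):
    """Models `and r14 = 07, rN` followed by `cmp.eq p0, p6 = r14, r0`."""
    return (addr & 0x7) == 0

def check_alignment(header, entry):
    """Return ROPRAND if either address is misaligned, otherwise None."""
    if not is_quadword_aligned(header):   # p6 true: first branch taken
        return ROPRAND
    if not is_quadword_aligned(entry):    # p6 true: second branch taken
        return ROPRAND
    return None

# Quadword-aligned header and entry pass both checks.
print(check_alignment(0x7F000000, 0xFFFFFFFF88000000))   # None
```

Any address whose low three bits are clear passes, which is why two quadword-aligned addresses cannot be what triggers the fault here.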

Because prior checks of the header and entry addresses in the Attunity RMS CDC code proved that they were quadword aligned, neither of these should be the cause of the ROPRAND fault. There must be some other check in the EXE$PAL_INSQTIL that is branching to the code returning the ROPRAND fault.

Continuing with the analysis of the instruction stream will show checks for accessibility, returning an ACCVIO if the addresses passed as the header or the entry are not accessible at the present access mode. After all of the access mode checks, and some manipulation of the header and entry addresses to get the sign-extended FLINK and BLINK, there is a rather curious collection of instructions.

{ .mii
EXE$PAL_INSQTIL_C+00300: sub r14 = r30, r20
sub r15 = r10, r20 ;;
sxt4 r16 = r14 ;;
}
{ .mii
EXE$PAL_INSQTIL_C+00310: nop.m 000000
sxt4 r17 = r15
cmp.eq p0, p6 = r14, r16 ;;
}
{ .mib
EXE$PAL_INSQTIL_C+00320: cmp.eq p0, p7 = r15, r17
mov r8 = r30
(p6) br.cond.spnt.few 1FFE690
}
{ .mib
EXE$PAL_INSQTIL_C+00330: nop.m 000000
nop.i 000000
(p7) br.cond.spnt.few 1FFE680
}

Both of the branches lead to the instruction which will eventually cause the return of a ROPRAND fault. What is this curious bit of code doing? I couldn't very well step through this in the OpenVMS debugger, so I did the next best thing: head-check the code. To simplify it, I selected two values that represent the addresses which caused the ROPRAND in the Attunity RMS CDC code. For the header address, I chose 7F000000 because the header was in P1 address space. For the entry address, I chose 88000000 because the entry came from a special page of memory in S0 space. To further simplify analysis, I assumed the header/queue was empty (i.e., the first insert to the queue). So, R30 (the predecessor entry) contains 7F000000 and R10 (the header) contains 7F000000 also. R20 holds the address of the entry to be inserted in the queue which, for this head-check, contains 88000000.

The first instruction bundle of this section apparently computes the relative offsets by subtracting the entry's address from the predecessor's address and from the header's address. Let's see what these would be. Remember that this is a 64 bit machine and that the addresses chosen are sign-extended 64 bit addresses.

SDA> EVALUATE 7F000000 - FFFFFFFF88000000
Hex = 00000000.F7000000 Decimal = 4143972352

After both of the SUBtract instructions have completed, the registers R14 and R15 will contain 00000000.F7000000. Now comes the curious and specious code. The next two instructions perform a 32 bit sign extension of the values in R14 and R15. Why? These aren't addresses, they're now relative offsets. Following the two sign-extension operations are checks that compare the values before sign extension with the values after it. 00000000.F7000000 does not equal FFFFFFFF.F7000000! Therefore, the branch is taken and the ROPRAND fault is returned!
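The same head-check can be replayed in Python. This is a sketch of the four bundles at +00300 through +00330 only; sxt4, sub64 and offset_check_fails are illustrative names, and 64-bit registers are modeled by masking Python integers to 64 bits.

```python
MASK64 = (1 << 64) - 1

def sub64(a, b):
    """64-bit register subtract with wraparound (models `sub`)."""
    return (a - b) & MASK64

def sxt4(value):
    """Sign-extend the low 32 bits to 64 bits (models Itanium `sxt4`)."""
    low = value & 0xFFFFFFFF
    if low & 0x80000000:
        low |= 0xFFFFFFFF00000000
    return low

def offset_check_fails(pred, header, entry):
    """True when either branch to the ROPRAND path would be taken."""
    r14 = sub64(pred, entry)      # sub r14 = r30, r20
    r15 = sub64(header, entry)    # sub r15 = r10, r20
    p6 = r14 != sxt4(r14)         # cmp.eq p0, p6 = r14, r16
    p7 = r15 != sxt4(r15)         # cmp.eq p0, p7 = r15, r17
    return p6 or p7

HEADER = 0x000000007F000000       # P1 space
ENTRY = 0xFFFFFFFF88000000        # S0 space, sign-extended
print(hex(sub64(HEADER, ENTRY)))            # 0xf7000000
print(hex(sxt4(sub64(HEADER, ENTRY))))      # 0xfffffffff7000000
print(offset_check_fails(HEADER, HEADER, ENTRY))   # True
```

With both addresses in the same region (say 7F000000 and 7F000040), the difference is a small negative number whose upper 32 bits are already all ones, so the comparison succeeds and neither branch is taken.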

Could this be? Did I just stumble on a latent bug in the OpenVMS Itanium operating system? Incredible as it may seem, I'd unearthed a day one bug in OpenVMS Itanium and in, of all things, a well-exercised primitive function! I'd surely need some more ammunition before I reported this one to HP and the world, so I set out to create a reproducer/demonstrator program.

Seeing It Is Believing It

Understanding of the sign-extension faux pas was the key driving element to putting together the reproducer of the problem I had experienced in the Attunity RMS CDC code. There is no problem with the EXE$PAL_INSQTIL if the header and entries are in the same address space. The problem only rears its ugly head when, for example, the header is in P1 space and the entry is in S0 space. So, I was going to have to create a reproducer which did just that. In addition, I thought that it would have much more impact if it could be shown that INSQTIing an S0 space address into a P1 space listhead worked on VAX and Alpha. Therefore, I set out to code this reproducer to do just that. I also thought that it would be good to demonstrate that this wasn't just a problem in inner-mode code, so I made certain that the reproducer could be run from user mode.
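One way to see why a P1 header and an S0 entry are still representable with longword offsets is to do the arithmetic modulo 2**32, as a 32-bit machine would. The Python sketch below is a model built on that assumption only; it is not the VAX microcode, the Alpha PALcode, or the reproducer described here, and longword_offset and follow are hypothetical helper names.

```python
def sxt4(value):
    """Sign-extend the low 32 bits to a 64-bit address."""
    low = value & 0xFFFFFFFF
    if low & 0x80000000:
        low |= 0xFFFFFFFF00000000
    return low

def longword_offset(from_addr, to_addr):
    """Offset from one element to another, truncated to a 32-bit longword."""
    return (to_addr - from_addr) & 0xFFFFFFFF

def follow(addr, offset32):
    """Add a longword offset modulo 2**32, then sign-extend the result."""
    return sxt4((addr + offset32) & 0xFFFFFFFF)

HEADER = 0x000000007F000000   # P1 space
ENTRY = 0xFFFFFFFF88000000    # S0 space, sign-extended

flink = longword_offset(HEADER, ENTRY)   # what the header would store
blink = longword_offset(ENTRY, HEADER)   # what the entry would store
print(hex(flink), hex(blink))            # 0x9000000 0xf7000000
print(follow(HEADER, flink) == ENTRY)    # True
print(follow(ENTRY, blink) == HEADER)    # True
```

Under this model, both links fit in a longword and both walks land on the right element, even though the full 64-bit difference fails the sign-extension comparison.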

While it was a simple task to code a reproducer to run from user mode, the fact still remained that either the entry or the listhead needed to be in system address space to expose the problem. To meet this criterion, I created a program which would knit together a page of memory in system address space that was user mode readable and writeable. Of course, to do this requires CMKRNL privilege. To simplify things, I made this a standalone program. It creates the page of memory and stores its base address into a logical name. The reproducer code can then translate this logical name to get the base address of this special page of memory. This was the basis of the code that I provided to HP when I reported this issue.

OpenVMS Engineering

I opened a support call with HP OpenVMS engineering and gave them a link to my reproducer code. In a very brief period, the top guns were looking at this and had, indeed, confirmed my findings. It has now been assigned QUIX and GCSS case numbers and it has been elevated to OpenVMS engineering for resolution. I do hope that this will get all of the self-relative queue primitives re-examined. This sign-extension faux pas extends to all of the Itanium's self-relative longword queue primitives: EXE$PAL_INSQHIL, EXE$PAL_REMQHIL and EXE$PAL_REMQTIL.

