[Simh] Unaligned references to IO space

Tue Nov 12 09:51:07 EST 2013

A bug was reported that Ultrix conversational boot fails because the 
handling of unaligned references into IO space was incorrect. The issue 
is that the unaligned code in vax_mmu.c issues a pair of longword reads 
surrounding the unaligned reference, but the Qbus IO routine sees the 
full address (with <1:0> intact) and tries to fetch two sequential 16b 
words starting at the odd word address. The suggested fix was to force 
alignment of the PA by zeroing <1:0> for the unaligned reads. 
Unfortunately, the problem is much more complex, and that fix won't work 
as a general solution.

First of all, the VAX Architecture Spec (DEC STD 032) says of IO space:

"References using a length attribute other than the length of the 
register, or to unaligned addresses, may produce UNPREDICTABLE results."

The code in question is doing a BLBC (longword operand) on a 16b 
register at its odd byte address. This seems to be the result of an 
"optimization" in the C compiler going awry. The code should have 
generated a BITW #xyz,register.

Second, the handling of unaligned longword references, in CVAX anyway, 
is much more nuanced than fetching 8 bytes all the time. CVAX doesn't 
use the low order two bits of the physical address at all; it uses 
bits<29:2> and then a four bit byte mask to indicate which bytes are 
wanted. So an aligned longword read would generate:

pa<29:2> + 1111

while an aligned word read would generate:

pa<29:2> + 0011 or 1100, depending on which half is needed

while a byte read would generate:

pa<29:2> + 0001 or 0010 or 0100 or 1000

Unaligned longword reads generate a pair of longword reads:

pa<29:2>   + 1110/1100/1000
pa<29:2>+1 + 0001/0011/0111
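
As an illustration only, the mask pairs listed above can be derived from pa<1:0> with a few shifts. This is a minimal sketch: the type lw_masks and the function longword_read_masks are hypothetical names, not part of simh, and the convention that bit n of a mask selects byte n of the aligned longword is an assumption chosen to match the patterns above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not simh code. Bit n of a mask set means byte n
   of that aligned longword is wanted (assumed convention). */
typedef struct {
    uint8_t first;   /* byte mask for longword pa<29:2> */
    uint8_t second;  /* byte mask for longword pa<29:2>+1; 0 if unused */
} lw_masks;

/* Byte masks for a longword read whose address has pa<1:0> = offset. */
static lw_masks longword_read_masks(unsigned offset)
{
    lw_masks m;
    assert(offset <= 3);
    /* Bytes offset..3 come from the first longword. */
    m.first = (uint8_t)((0xFu << offset) & 0xFu);
    /* The remaining low bytes come from the next longword. */
    m.second = (uint8_t)((0xFu >> (4u - offset)) & 0xFu);
    return m;
}
```

With offset 0 this gives 1111 and an empty second mask (the aligned case); offsets 1, 2 and 3 give 1110/0001, 1100/0011 and 1000/0111 respectively.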

This is important because the Qbus interface generates the transaction 
based on the byte mask, NOT the length. Looking at the details of an 
unaligned longword read:

pa<1:0> = 01 - generates pa<29:2> + 1110, creating 2 Qbus reads
             - generates pa<29:2>+1 + 0001, creating 1 Qbus read - the
               upper word isn't read because the byte masks are 0

pa<1:0> = 10 - generates pa<29:2> + 1100, creating 1 Qbus read - the
               lower word isn't read
             - generates pa<29:2>+1 + 0011, creating 1 Qbus read - the
               upper word isn't read

pa<1:0> = 11 - generates pa<29:2> + 1000, creating 1 Qbus read - the
               lower word isn't read
             - generates pa<29:2>+1 + 0111, creating 2 Qbus reads
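
The read counts in the cases above follow mechanically from the byte masks: a 16b Qbus read is issued for each word of the longword that has at least one mask bit set. A rough C sketch of that rule follows. The function names are hypothetical and not part of simh, and the rule itself is my reading of the cases listed above, not something taken from the CVAX documentation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not simh code: number of 16b Qbus reads implied
   by a 4-bit byte mask (bit n set = byte n wanted, assumed convention). */
static unsigned qbus_reads_for_mask(uint8_t mask)
{
    unsigned n = 0;
    assert(mask <= 0xF);
    if (mask & 0x3) n++;   /* some byte in the lower word is wanted */
    if (mask & 0xC) n++;   /* some byte in the upper word is wanted */
    return n;
}

/* Total Qbus reads for a longword read with pa<1:0> = offset. */
static unsigned qbus_reads_for_longword(unsigned offset)
{
    uint8_t first, second;
    assert(offset <= 3);
    first  = (uint8_t)((0xFu << offset) & 0xFu);         /* pa<29:2>   */
    second = (uint8_t)((0xFu >> (4u - offset)) & 0xFu);  /* pa<29:2>+1 */
    return qbus_reads_for_mask(first) + qbus_reads_for_mask(second);
}
```

Under these assumptions the totals are 2 reads for offsets 0 and 2, and 3 reads for offsets 1 and 3.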

The reason this matters is that Qbus reads can have side effects. So if 
a read transaction occurs that wasn't expected, it can mess up state. 
Admittedly, this is not likely, but it can happen and apparently does in 
the QDSS code.

Thus, the unaligned flows are wrong, at least for CVAX. The simulator 
uses PA<1:0> plus length (forced to longword in the unaligned case) as a 
substitute for byte mask. This doesn't work when going to IO space; 
there's not enough detail preserved to parse the second reference 
properly. The simulator will generate 4 Qbus operations on an unaligned 
reference, when in fact either 2 or 3 occur. Ugh! I am not at all sure 
how to fix this. The standard Read and Write routines don't pass enough 
information to lower-level routines to figure out what's going on.

The current fix breaks the 780 as well. Nexus adapters only support 
aligned longword operations and detect unaligned operations by looking 
at address<1:0>. Unibus IO space only supports aligned word and byte 
operations.

One possibility is to make the unaligned flows more nuanced, and fetch 
exactly as many words/longwords as are needed for the extraction. 
However, I suspect that fixing this properly will require generalization 
of the vax_mmu.c Read and Write unaligned flows. I will probably need to 
add a model-dependent test to see whether unaligned non-memory 
references need special handling, and then add model-dependent routines 
for unaligned IO references. This will allow passing enough information 
to create the correct behavior.

This would be fine for my version of the simulator, which just has two 
models. However, I understand that the GIT pool now has many other 
models, and all of them would be affected by this change. Further, the 
authors would need to understand the behavior of unaligned IO space 
accesses, which is quite difficult without going through schematics.

/Bob Supnik