[Simh] New simulator - VAX-11/782

Thu May 25 18:51:47 EDT 2017

On 25/05/2017 04:03, Sergey Oboguev wrote:
> Superficially looking at (AS)MP VMS code, it appears that the
> following should (hopefully) suffice for correct operation:
>
> 1. BBSSI and BBCCI should acquire a lock when accessing the memory
> location. A simplistic implementation may use one lock for the whole
> memory (or the whole MA780 memory bank). A more sophisticated
> implementation may use a bucket of locks, with a particular physical
> address within an MA bank mapping to a corresponding lock in the
> bucket (with a lock being shared by a range of MA physical addresses)
> -- but that would probably be overkill for a 2-CPU config, which is
> not particularly heavy on synchronization.

My plan was to use just one global lock which would be set on the read
cycle and cleared on the write cycle.
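
A minimal sketch of that single-lock idea, assuming a pthread mutex on
the host. All names here (ma_interlock, ma_read_interlocked,
ma_write_unlock, bbssi, bbcci) are hypothetical and not taken from the
SIMH sources, and the bit helpers are simplified to a bit number within
one byte rather than the full VAX position/base operands:

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical: one global lock covering all of the shared memory. */
static pthread_mutex_t ma_interlock = PTHREAD_MUTEX_INITIALIZER;

/* Read-interlocked cycle: take the global lock, then read the byte. */
static uint8_t ma_read_interlocked(const uint8_t *mem, uint32_t pa)
{
    pthread_mutex_lock(&ma_interlock);
    return mem[pa];
}

/* Write-unlock cycle: store the byte, then release the global lock. */
static void ma_write_unlock(uint8_t *mem, uint32_t pa, uint8_t val)
{
    mem[pa] = val;
    pthread_mutex_unlock(&ma_interlock);
}

/* BBSSI-style helper: set the bit, return its previous state (0 or 1). */
static int bbssi(uint8_t *mem, uint32_t pa, unsigned bit)
{
    assert(bit < 8);
    uint8_t old = ma_read_interlocked(mem, pa);
    ma_write_unlock(mem, pa, (uint8_t)(old | (1u << bit)));
    return (old >> bit) & 1;
}

/* BBCCI-style helper: clear the bit, return its previous state (0 or 1). */
static int bbcci(uint8_t *mem, uint32_t pa, unsigned bit)
{
    assert(bit < 8);
    uint8_t old = ma_read_interlocked(mem, pa);
    ma_write_unlock(mem, pa, (uint8_t)(old & ~(1u << bit)));
    return (old >> bit) & 1;
}
```

The lock is held only between the paired read and write cycles, so
ordinary (non-interlocked) accesses are not serialised by it.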

>
> 1.2. VMS itself does not appear to use anything other than BBSSI and
> BBCCI in the ASMP code. However applications or libraries using the
> multiprocessing may, so for their sake the same applies to other
> interlocked instructions as well. Those applications or libraries
> might also conceivably use a higher rate of locking (justifying the
> bucketing of locks in this case) -- but do they even exist in the
> first place?

Chapter 4 of the VAX-11/782 User's Guide recommends the interlocked
instructions for user-written code, so they all need to be supported. We
really need the MA780 technical description of the field maintenance
print set to understand how it handles the read-interlocked SBI cycle.
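
If user code did turn out to lock heavily, the bucket-of-locks variant
from the quoted text could look roughly like the sketch below. The
bucket count, the size of the address range sharing one lock, and all
names are assumptions chosen for illustration only:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define LOCK_BUCKETS     64u  /* assumption: number of locks, power of two */
#define LOCK_RANGE_SHIFT 6u   /* assumption: 64 bytes share one lock */

static_assert((LOCK_BUCKETS & (LOCK_BUCKETS - 1u)) == 0,
              "LOCK_BUCKETS must be a power of two");

static pthread_mutex_t bucket[LOCK_BUCKETS];

/* Initialise every lock in the bucket; call once before use. */
static void bucket_init(void)
{
    for (unsigned i = 0; i < LOCK_BUCKETS; i++)
        pthread_mutex_init(&bucket[i], NULL);
}

/* Map a physical address within the MA bank to a bucket index. */
static unsigned bucket_index(uint32_t pa)
{
    return (pa >> LOCK_RANGE_SHIFT) & (LOCK_BUCKETS - 1u);
}

/* Return the lock guarding the range that contains pa. */
static pthread_mutex_t *bucket_for(uint32_t pa)
{
    return &bucket[bucket_index(pa)];
}
```

An interlocked access would lock bucket_for(pa) on the read cycle and
unlock the same mutex on the write cycle, in place of the single global
lock.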

>
> 2. When sending out an IPI, the sending VCPU thread should execute a
> write memory barrier right before writing to the interrupt register.
>
> 3. When receiving an IPI and before handling it, the receiving VCPU
> thread should execute a read memory barrier matching the barrier in
> (2). An obvious implementation would be for (2) and (3) to acquire a
> lock on the "interrupt pending" register of the CPU that is the target
> of the IPI.

I probably should have researched memory barriers a bit more. I knew a
little bit about them but wasn't sure if they were needed here. The
problem may also exist for the rest of the shared memory.
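
For reference, the barrier pairing described in (2) and (3) might be
sketched with C11 atomics as follows. The vcpu_t structure, the mailbox
variable and the function names are hypothetical stand-ins, not SIMH
code, and the sketch assumes a single outstanding IPI at a time:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-CPU state. */
typedef struct {
    atomic_uint ipi_pending;  /* stand-in for the "interrupt pending" register */
} vcpu_t;

/* Ordinary shared data that the sender fills in before raising the IPI. */
static uint32_t mailbox;

/* Sender side: publish the data, write barrier, then set the register. */
static void ipi_send(vcpu_t *target, uint32_t msg)
{
    mailbox = msg;
    atomic_thread_fence(memory_order_release);      /* write barrier (2) */
    atomic_store_explicit(&target->ipi_pending, 1u, memory_order_relaxed);
}

/* Receiver side: returns 1 and fills *msg if an IPI was pending, else 0. */
static int ipi_receive(vcpu_t *self, uint32_t *msg)
{
    assert(msg != NULL);
    if (atomic_exchange_explicit(&self->ipi_pending, 0u,
                                 memory_order_relaxed) == 0)
        return 0;
    atomic_thread_fence(memory_order_acquire);      /* read barrier (3) */
    *msg = mailbox;
    return 1;
}
```

The release fence before the store and the acquire fence after the
exchange are what make the earlier mailbox write visible to the
receiver; taking a lock around the register, as suggested in the quoted
text, would give the same ordering.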

>
> As always with legacy MP code, though, it is a bit of a gamble.
> Modern host processors have a different cache coherency model than that
> of the 780 CPUs. Thus it is possible for some sequences that worked on
> the 11/78x multiprocessor to start failing when emulated on x86 or
> other contemporary host CPU. Only a detailed review of the code with
> respect to the cache coherency assumed by the code can tell.
>
> But do we even know how the 780 cache operates?
> Is it write-through or lazy writeback?
> Do interlocked instructions (such as BBSSI/BBCCI) invalidate the 780
> read cache?
> Do they commit pending writebacks from the cache to MA780/main memory
> (MS780) before the instruction completion?

Here is an extract from the VAX-11/782 User's Guide that partly answers
the question:

"Each MA780 shared memory subsystem should have the cache invalidation
map option. This option reduces traffic on the Synchronous Backplane
Interconnect (SBI) by reducing the number of cache invalidate requests
sent to each processor. By keeping track of which locations in MA780
memory have been placed in the cache of each processor, the option
allows cache invalidate requests to be sent only to the processor(s)
whose cache contains the location that has been invalidated."

So it looks like the port invalidation control register contains a mask
of the SBI nodes that need to have cache invalidate requests sent to
them. The ASMP code sets the bit for nexus 0 (CPU) as part of the
initialisation.
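
If that reading is right, the simulator side might be modelled along
these lines. This is only a sketch of the guess above: the 16-bit width,
the one-bit-per-nexus layout and every name are assumptions, since the
actual register format is not confirmed here:

```c
#include <assert.h>
#include <stdint.h>

#define SBI_NEXUS_COUNT 16u  /* assumption: one mask bit per SBI nexus */

/* Hypothetical copy of the port invalidation control mask. */
static uint16_t inval_mask;

/* Mark a nexus as needing cache invalidate requests. */
static void inval_enable(unsigned nexus)
{
    assert(nexus < SBI_NEXUS_COUNT);
    inval_mask |= (uint16_t)(1u << nexus);
}

/* List the nexus numbers to notify on a write; returns how many. */
static unsigned inval_targets(unsigned out[SBI_NEXUS_COUNT])
{
    unsigned n = 0;
    for (unsigned nexus = 0; nexus < SBI_NEXUS_COUNT; nexus++)
        if (inval_mask & (1u << nexus))
            out[n++] = nexus;
    return n;
}
```

With this model, the ASMP initialisation would correspond to
inval_enable(0), after which a write to shared memory would produce an
invalidate request for nexus 0 only.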

Matt