[Simh] EXT :Re: DEC Alpha Emulation

Timothe Litt litt at ieee.org
Mon Feb 5 13:05:37 EST 2018


On 05-Feb-18 12:01, Clem Cole wrote:
>
>   But marketing never accepted because of the failover issue for
> clusters.
>
> I never understood that.  My argument was that nobody was going to
> *knowingly* put a $1M cluster at risk with a $100 PCI card.   We
> could have just stated in the SPD that the Adaptec chip set was
> supported on single (small) systems such as Workstations, DS10, DS20
> etc...  But I lost that war.
The *word* you left out was probably the issue.  It is trivially easy to
add a workstation to a cluster, and neither VMS nor Tru64 verifies that
hardware meets requirements when a node joins a cluster.  So it's not
easy to dismiss the scenario in which someone buys a workstation that is
not intended for cluster use; then circumstances change and it turns up
in your cluster.  And it "just works" for a long time, until you hit the
corner case.  In your $M enterprise, stuff gets passed around and
information gets lost as ownership changes at the periphery.  (The way
things moved about on the ZK engineering clusters is typical.  Despite
attempts at control, people needed to do their jobs & configuration
limits were ignored/fudged.)  *We just didn't make adding a node to a
cluster difficult and mysterious enough.*  Plus, profit is usually a
percentage of user cost.  More cost => more profit.  (Assuming you make
the sale.)

So product management's conservatism is understandable, given the risk
that the SPD won't be re-read when the function of a node changes, and
that the resulting data corruption will be laid at DEC's feet.  Engineers
aren't known for reading the instructions - and IT people who are
under-staffed and under pressure, even less so.  SPDs are even less
appealing - they tend to be read at initial purchase, and subsequently
only when the finger-pointing starts.  And that's after customer services
has spent a lot of time and money diagnosing the problem.

These days, we have gates with names like "network admission control";
they won't allow a VPN or wireless client to connect to a network unless
its software is up-to-date.  Something along those lines that also covered
hardware and firmware would be a useful addition to clusters - assuming
you could do everything quickly enough to keep cluster transition times
acceptable.  It's non-trivial; the nasty cluster cases have to do with
multi-ported hardware, so you need to check firmware revisions & bus
configurations on all ports for compatibility - with all the permutations
of controllers on stand-alone systems, cluster nodes not yet joined,
joined cluster nodes, and redundant controllers on the same node.  And
the interconnects: CI, NI, MC, DSSI, SCSI.   And hot swap, which can
upgrade or downgrade a controller on the fly.
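
To make that concrete, here's a minimal sketch of what such an admission
check might look like - in Python, purely for illustration.  The names
(ControllerInfo, CLUSTER_QUALIFIED, admit_node) and the controller models
and firmware revisions in the table are all invented; neither VMS nor
Tru64 exposes anything like this, and a real check would live in the
connection manager, not a script:

    # Hypothetical sketch of a cluster admission check for storage
    # controllers.  All names and revision numbers are invented for
    # illustration; no real VMS/Tru64 interface is implied.
    from dataclasses import dataclass

    @dataclass(frozen=True)
    class ControllerInfo:
        model: str          # controller model as reported by the node
        firmware: tuple     # firmware revision, e.g. (5, 57)
        multi_ported: bool  # shared-bus controllers are the risky case

    # Assumed policy table: minimum firmware for cluster-qualified models.
    CLUSTER_QUALIFIED = {
        "KZPBA": (5, 57),
        "HSZ80": (8, 3),
    }

    def admit_node(controllers, cluster_members):
        """Return (ok, reasons) for a node asking to join the cluster."""
        reasons = []
        for c in controllers:
            min_fw = CLUSTER_QUALIFIED.get(c.model)
            if min_fw is None:
                # Controller was never qualified for shared storage.
                reasons.append(f"{c.model}: not cluster qualified")
            elif c.firmware < min_fw:
                reasons.append(f"{c.model}: firmware {c.firmware} below {min_fw}")
            elif c.multi_ported:
                # Multi-ported devices must also match what existing members
                # run, since both ends see the same bus.
                for member, peer in cluster_members.items():
                    if peer.model == c.model and peer.firmware != c.firmware:
                        reasons.append(
                            f"{c.model}: firmware differs from node {member}")
        return (not reasons, reasons)

    if __name__ == "__main__":
        node = [ControllerInfo("KZPBA", (5, 57), True),
                ControllerInfo("AHA2940", (2, 0), False)]
        members = {"NODE1": ControllerInfo("KZPBA", (5, 57), True)}
        ok, why = admit_node(node, members)
        print("admit" if ok else "refuse", why)

Even this toy version has to compare a joining node's multi-ported
controllers against every current member, which hints at why the
transition-time concern is real.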

So, the counter-argument becomes "how much engineering should be
invested in allowing a customer to save $100 on the cost of a PCI
card?"  And the easy answer is either "none" or "it's not a priority".
Ship only cluster-capable hardware, and "problem solved".  Not all
engineering problems are best solved with engineering solutions.  But
I'll grant that the engineering would be a lot more fun :-)

An imperfect analogy would be selling cars without windshield wipers to
people who promise that they never drive in the rain.  It's in the
nature of things that someday the rain will come.  Or the car will be
passed on.  Of course, missing wipers are a lot more obvious than what
kind and revision of a PCI card is buried in a cardcage :-)

A better analogy is left as an exercise for the reader.
