<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">

    On 05-Feb-18 12:01, Clem Cole wrote:<br>

    <blockquote type="cite"

cite="mid:CAC20D2MZLDjnLyVUE=DsBxOeXWS67DopdxXsU_ebu-c=v56k0w@mail.gmail.com">

      <div dir="ltr">

        <div class="gmail_default"

          style="font-family:arial,helvetica,sans-serif"><br>

        </div>

        <div class="gmail_extra"><font color="#0000ff">  But marketing

            never accepted because of the failover issue for clusters.</font>

          <div class="gmail_quote">

            <div class="gmail_default"

              style="font-family:arial,helvetica,sans-serif"><font

                color="#0000ff"><br>

              </font></div>

            <div class="gmail_default"

              style="font-family:arial,helvetica,sans-serif"><font

                color="#0000ff">I never understood that.  My argument

                was that nobody was going to <b><font color="#cc33cc">knowingly</font></b><b>

                </b>put a $1M cluster at risk with a $100 PCI card.   We

                could have just stated in the SPD that the Adaptec chip

                set was supported on single (small) systems such as

                Workstations, DS10, DS20 etc...  But I lost that war.</font></div>

          </div>

        </div>

      </div>


      <br>

    </blockquote>

    The <b><font color="#cc33cc">word </font></b>you left out was

    probably the issue.  It is trivially easy to add a workstation to a

    cluster, and neither VMS nor Tru64 verifies that hardware meets

    requirements when a node joins a cluster.  So it's not easy to

    dismiss the scenario that someone buys a workstation that is not

    intended for cluster use; then circumstances change and it turns up

    in your cluster.  And it "just works" for a long time, until you hit

    the corner case.  In your $M enterprise, stuff gets passed around

    and information gets lost as ownership changes at the periphery. 

    (The way things moved about on the ZK engineering clusters is

    typical.  Despite attempts at control, people needed to do their

    jobs, and configuration limits were ignored or fudged.)  <b>We just

      didn't make adding a node to a cluster difficult and mysterious

      enough.</b>  Plus, profit is usually a percentage of user cost. 

    More cost => more profit.  (Assuming you make the sale.)  <br>

    <br>

    So product management's conservatism is understandable, given the

    risk that the SPD won't be re-read when the function of a node

    changes, and that the resulting data corruption will be laid at DEC's

    feet.  Engineers aren't known for reading the instructions - and IT

    people who are under-staffed and under pressure less so.  SPDs are

    even less appealing - they tend to be read at initial purchase - and

    subsequently only when the finger pointing starts.  And that's after

    customer services has spent a lot of time and money diagnosing the

    problem.<br>

    <br>

    These days, we have gates with names like "network admission

    control"; they won't allow a VPN or Wireless client to connect to a

    network unless its software is up to date.  Something along those lines

    that also included hardware and firmware would be a useful addition

    to clusters - assuming you could do everything quickly enough to

    prevent cluster transition times from becoming unacceptable.  It's

    non-trivial; the nasty cluster cases have to do with multi-ported

    hardware, so you need to check firmware revisions and bus

    configurations on all ports for compatibility, across all the

    permutations of the controllers being on stand-alone systems,

    cluster nodes not yet joined, joined cluster nodes, and redundant

    controllers on the same node.  Then there are the interconnects:

    CI, NI, MC, DSSI, SCSI.  And hot swap, which can upgrade or

    downgrade a controller on the fly.<br>

    <br>
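    To make the shape of such a check concrete, here is a minimal sketch
    in Python.  Everything in it is hypothetical: the Port record, the
    MIN_CLUSTER_FIRMWARE table, the controller names and revisions, and
    the rule that like controllers sharing a bus must run the same
    firmware are all assumptions for illustration, not anything VMS or
    Tru64 actually does.<br>
<pre>
```python
from dataclasses import dataclass

# Hypothetical table: minimum firmware revision qualified for cluster use,
# keyed by controller model. Models and revisions are invented.
MIN_CLUSTER_FIRMWARE = {"ctrl-a": (2, 1), "ctrl-b": (1, 0)}


@dataclass(frozen=True)
class Port:
    node: str          # node that owns this port
    controller: str    # controller model
    firmware: tuple    # (major, minor) firmware revision
    bus: str           # shared bus the port attaches to


def admission_problems(joining, existing):
    """Return reasons to refuse the joining ports; an empty list means admit."""
    problems = []
    for port in joining:
        minimum = MIN_CLUSTER_FIRMWARE.get(port.controller)
        if minimum is None:
            problems.append(f"{port.node}: {port.controller} is not cluster-qualified")
            continue
        if minimum > port.firmware:
            problems.append(f"{port.node}: {port.controller} firmware is too old")
        # Assumed rule: like controllers sharing a bus must run the same revision.
        for peer in existing:
            same_bus = peer.bus == port.bus
            same_model = peer.controller == port.controller
            if same_bus and same_model and peer.firmware != port.firmware:
                problems.append(f"{port.node}: firmware differs from {peer.node} on {port.bus}")
    return problems
```
</pre>
    Even this toy version has to walk every port already on the bus.  A
    real gate would also have to cover nodes not yet joined, redundant
    controllers and hot swap, and still finish quickly enough not to
    stretch cluster transitions.<br>
    <br>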

    So, the counter-argument becomes "how much engineering should be

    invested in allowing a customer to save $100 on the cost of a PCI

    card?"  And the easy answer is either "none" or "it's not a

    priority".  Ship only cluster-capable hardware, and "problem

    solved".  Not all engineering problems are best solved with

    engineering solutions.  But I'll grant that the engineering would be

    a lot more fun :-)<br>

    <br>

    An imperfect analogy would be selling cars without windshield wipers

    to people who promise that they never drive in the rain.  It's in

    the nature of things that someday the rain will come.  Or the car

    will be passed on.  Of course, missing wipers are a lot more obvious

    than what kind and revision of a PCI card is buried in a cardcage

    :-)<br>

    <br>

    A better analogy is an exercise left to the reader.<br>

    <br>

  </body>

</html>