• "Write Buffer" for the 370/168

    From James Dow Allen@3:633/10 to All on Thu Oct 16 11:51:11 2025

    The add-on memory company I worked for was in a hurry to develop
    memory for the 370/168. The memory chips they planned to use met
    ALMOST all the specs: Their cycle speed was fast enough; the critical Address-In to Data-Out delay was fast enough. The problem occurred
    when a write was immediately followed by a read to a different address.
    The write signal arrived early enough, but the data
    defining the write weren't available until some time AFTER the write
    was signaled. The chips' write was delayed and used up some of the
    relaxation time the chips needed before a subsequent read.

    To recap: the write signal arrived early enough to fulfill all timing
    requirements, but the data arrived too slowly. The workaround was a
    "Write Buffer" that delayed each write by one: on the N+1'th write
    to that memory the N'th {address,data} were provided to the chips,
    while the N+1'th information from the CPU was set into temporary
    registers, ready to be written on the N+2'nd write!
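
    The delayed-write scheme described above can be sketched as a toy model. This is only an illustration, not the actual card logic: the class name, the word-addressed list standing in for the chips, and especially the read path (a read of the still-buffered address is answered from the registers) are assumptions that the post does not spell out.

```python
class WriteBufferedMemory:
    """Toy model of a one-deep delayed-write buffer (illustrative only)."""

    def __init__(self, size):
        self.chips = [0] * size   # contents of the memory chips
        self.pending = None       # (address, data) held in the temporary registers

    def write(self, address, data):
        # On the N+1'th write, the N'th {address, data} finally reach the chips...
        if self.pending is not None:
            old_address, old_data = self.pending
            self.chips[old_address] = old_data
        # ...while the N+1'th {address, data} wait in the registers.
        self.pending = (address, data)

    def read(self, address):
        # ASSUMPTION: a read of the buffered address is served from the registers.
        if self.pending is not None and self.pending[0] == address:
            return self.pending[1]
        return self.chips[address]


mem = WriteBufferedMemory(8)
mem.write(3, 0xAA)                 # held in the registers, chips untouched
chips_after_first = list(mem.chips)
mem.write(5, 0xBB)                 # now the first write lands in the chips
chips_after_second = list(mem.chips)
```

    After the first write the chips are still all zero; only the second write pushes 0xAA into location 3, and 0xBB in turn waits for a third write.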

    Nifty? The fact that this whole approach was a stopgap -- cheaper
    16-kilobit chips would soon be available without the timing problem --
    gives some idea of how lucrative this "IBM compatible main memory"
    market was.

    Our factory floor had a 370/145 and a 370/158 rented from IBM for our experimentation -- I had great fun twirling the dials and setting up
    scope loops -- but the 370/168 was just too expensive to rent by the month. Instead the 168 memory project rented 168 time by the hour from a
    large insurance company in Marin County. Every weekend they made
    the long drive, paid more for the standalone 168 time than my monthly
    take-home pay, and saw the same failures over and over.

    They installed their memory cabinet as LSU 2 and set the machine to "serial mode" (swapping address bits 27,28 with 12,13 iirc) -- without that,
    low memory would be affected and little could be done. And they plugged in their HAND-WIRED "Write Buffer" card. The machines had been purchased
    from IBM, so the user had IBM diagnostics available.

    First our team ran '3CC' -- the memory diagnostics. All OK!! The memory diagnostics reported ZERO errors. Wow!! Then our team celebrated by
    running the diagnostic '3E7', which supposedly simulated complex activity
    or such. FAILURE! '3E7' crashed but had no way to articulate what was wrong.

    Next weekend -- same thing, with another $1000+ burned up to rent the CPU.

    At the time, I was the juniorest junior in the systems engineering
    department but one of my tasks was to write memory diagnostic software
    for the 370/168 attachment if/when it was ever working. Someone decided
    I should tag along one weekend just to see if my standalone program
    could even boot. This was the occasion I posted about earlier, my first
    view of a 168. When the cabinet doors came off I learned that the machine didn't even belong to the insurance company!
    " THIS MACHINE IS THE PROPERTY OF THE CHASE MANHATTAN BANK OF NEW YORK.
    ANY HYPOTHECATION OF THIS PROPERTY IS PROHIBITED BY LAW. "

    I IPL'ed from the card reader(?). (This might have been even before I switched to mag tape.) The engineers were breathing down my neck. "We're
    paying $650 per hour. Make it snappy." (I was autistically shy and
    didn't remind them they'd said that three times already.)

    In moments my program was calling out errors in its "Routine 1" -- the
    simplest memory test in the repertoire.

    "It must be your software is defective," three engineers said in unison.
    "Why are the errors confined to LSU 2?" I gently asked in reply.

    Not only this, but simple examination of the error reports, and simple experiments from the front panel showed what was wrong. Due to a simple
    wiring blunder on the hand-wired write-buffer card, it was the address
    for the N+1'th request that was used for the N+1'th write instead
    of the older N'th address as intended.
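
    A small sketch of what that blunder does to a stream of writes. The function name and the sample addresses are made up for illustration, and it assumes the data path was wired correctly so that only the address came from the wrong request. It says nothing about how reads behaved on the real card.

```python
def replay_writes(writes, miswired=False):
    """Replay (address, data) writes through a one-deep write buffer.

    Returns the resulting chip contents as a dict. Hypothetical toy model.
    """
    chips = {}
    pending = None  # the N'th {address, data}, waiting in the registers
    for address, data in writes:
        if pending is not None:
            old_address, old_data = pending
            # Intended wiring: the old data goes to the old address.
            # Miswired card (as described): the old data goes to the NEW address.
            target = address if miswired else old_address
            chips[target] = old_data
        pending = (address, data)
    return chips


writes = [(0x10, "A"), (0x20, "B"), (0x30, "C")]
good = replay_writes(writes)                 # A lands at 0x10, B at 0x20
bad = replay_writes(writes, miswired=True)   # A lands at 0x20, B at 0x30
```

    In the miswired case every datum ends up one request late AND at the next request's address, and the first address written is never stored at all.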

    And *how* could IBM's "Memory Diagnostic" '3CC' miss such a trivial error?

    This post is getting long; I may tell more in a follow-up.

    But wasting MANY $1000's on 168 rental on a miswired kluge board instead of TESTING the kluge board, at least with a $1 continuity prober,
    seems like some sort of "punchline."

    Cheers,
    James

    --- PyGate Linux v1.0
    * Origin: Dragon's Lair, PyGate NNTP<>Fido Gate (3:633/10)
  • From Al Kossow@3:633/10 to All on Thu Oct 16 07:17:01 2025
    On 10/16/25 4:51 AM, James Dow Allen wrote:

    The add-on memory company I worked for was in a hurry to develop
    memory for the 370/168.

    I have some microcode floppy images at http://bitsavers.org/bits/IBM/Microcode_Floppies
    but no one has ever told me very much about them or if
    they would be useful for 370 emulation at the microcode
    level. Apparently the later models had service processors,
    attached to the floppy drive, that you would IML.



  • From James Dow Allen@3:633/10 to All on Sat Oct 18 19:39:47 2025

    James Dow Allen <user4353@newsgrouper.org.invalid> posted:

    ... simple examination of the error reports, and simple
    experiments from the front panel showed what was wrong. Due to a simple wiring blunder on the hand-wired write-buffer card, it was the address
    for the N+1'th request that was used for the N+1'th write instead
    of the older N'th address as intended.

    And *how* could IBM's "Memory Diagnostic" '3CC' miss such a trivial error?

    The simple test that caught the gross error that '3CC' missed was "March":

    The MARCH Test for Memory:
    Step 1: Set memory under test to all zeroes.
    Step 2: for (A = Low; A <= High; A++) {
                Read A and verify it is 0;
                Write FFFFFFFF at A;   // "march" 1's through 0's
            }
    Step 3: Repeat step 2, substituting FFFFFFFF for 0, and 0 for FFFFFFFF.
    Step 4: Repeat with at least one more pattern to toggle parity and check bits.
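
    For anyone who wants to play with it, the steps above might look like this as a runnable sketch. The names (march, read, write, patterns) and the toy "faulty" memory are illustrative assumptions. The fault modeled here is a stuck address bit, chosen only as an easy-to-see addressing failure. It is not the write-buffer blunder itself.

```python
def march(read, write, low, high, patterns=((0x00000000, 0xFFFFFFFF),)):
    """Run a simple March test; return a list of (address, expected, actual)."""
    errors = []
    for background, marcher in patterns:
        # Step 1: set memory under test to the background pattern.
        for a in range(low, high + 1):
            write(a, background)
        # Steps 2 and 3: march one pattern through the other, then back again.
        for expect, new in ((background, marcher), (marcher, background)):
            for a in range(low, high + 1):
                actual = read(a)
                if actual != expect:
                    errors.append((a, expect, actual))
                write(a, new)
        # Step 4 corresponds to passing more (background, marcher) pairs.
    return errors


# A healthy 16-word memory.
good_cells = [0] * 16
good_errors = march(good_cells.__getitem__, good_cells.__setitem__, 0, 15)

# A memory whose address bit 2 is stuck at 0, so addresses 4-7 alias 0-3, etc.
bad_cells = [0] * 16

def bad_read(a):
    return bad_cells[a & ~0b100]

def bad_write(a, data):
    bad_cells[a & ~0b100] = data

bad_errors = march(bad_read, bad_write, 0, 15)
```

    The healthy memory reports nothing. The aliased one fails at address 4, the first location whose "zero" was already overwritten through its alias.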

    Simple enough? Certainly straightforward enough that it amazes me the supposedly thorough '3CC' couldn't detect the simple error that "March" detected.

    As I said, I was then the juniorest of the juniors at this memory company.
    I eventually developed some interesting tests, but there was no need for me
    to reinvent March. At least two techs or engineers had mentioned to me
    that simple March was the best memory test of all!
    And this was no secret, making IBM's failure bizarre.

    In addition to the plethora of addressing failure modes it obviously detects, March barrages memory with data switching at high speeds and this provokes other classes of error.

    Eventually I became a close colleague of the company's brilliant
    Chief Scientist. He claimed to be the original inventor of March.
    I alluded indirectly to this man (whom I playfully call the
    "Mad Hungarian") earlier. (He and his pregnant wife snuck under a
    fence to escape Hungary in 1956. I doubt if any old-timers here
    can guess his name -- am I wrong?)


    Cheers,
    James

  • From Al Kossow@3:633/10 to All on Sat Oct 18 13:18:37 2025
    On 10/18/25 12:39 PM, James Dow Allen wrote:

    Eventually I became a close colleague of the company's brilliant
    Chief Scientist. He claimed to be the original inventor of March.
    I alluded indirectly to this man (whom I playfully call the
    "Mad Hungarian") earlier. (He and his pregnant wife snuck under a
    fence to escape Hungary in 1956. I doubt if any old-timers here
    can guess his name -- am I wrong?)


    I'm trying to guess the company
    Advanced Memory Systems? https://www.ithistory.org/db/companies/advanced-memory-systems-ams

    was it in Silicon Valley?

    there was no shortage of third-party memory suppliers at the time,
    including Intel.



  • From Al Kossow@3:633/10 to All on Sat Oct 18 14:31:12 2025
    On 10/18/25 1:58 PM, Stefan Ram wrote:
    Al Kossow <aek@bitsavers.org> wrote or quoted:
    I'm trying to guess the company
    Advanced Memory Systems?

    I think James mentioned AMS here in 2007.



    I remember seeing the postings a long time ago.
    There was something that came up recently about AMS RAM chips
    and then researching the history of the company that
    reminded me of this.

    http://bitsavers.org/components/advancedMemorySystems

  • From Rich Alderson@3:633/10 to All on Sat Oct 18 21:03:55 2025
    James Dow Allen <user4353@newsgrouper.org.invalid> writes:

    James Dow Allen <user4353@newsgrouper.org.invalid> posted:

    ... simple examination of the error reports, and simple
    experiments from the front panel showed what was wrong. Due to a simple wiring blunder on the hand-wired write-buffer card, it was the address
    for the N+1'th request that was used for the N+1'th write instead
    of the older N'th address as intended.

    And *how* could IBM's "Memory Diagnostic" '3CC' miss such a trivial error?

    The simple test that caught the gross error that '3CC' missed was "March":

    The MARCH Test for Memory:
    Step 1: Set memory under test to all zeroes.
    Step 2: for (A = Low; A <= High; A++) {
    Read A and verify it is 0;
    Write FFFFFFFF at A; // "march" 1's through 0's
    }
    Step 3: Repeat step 2, substituting FFFFFFFF for 0, and 0 for FFFFFFFF.
    Step 4: Repeat with at least one more pattern to toggle parity and check bits.

    The two patterns which should follow these two are (in hex) 55555555 and AAAAAAAA.
    (I tend to think of memory tests in octal, so it's 252525252525 and 525252525252.)

    More baroque patterns may be devised as necessary.

    --
    Rich Alderson news@alderson.users.panix.com
    Audendum est, et veritas investiganda; quam etiamsi non assequamur,
    omnino tamen proprius, quam nunc sumus, ad eam perveniemus.
    --Galen

  • From James Dow Allen@3:633/10 to All on Sun Oct 19 02:14:09 2025

    Yes, the merger between AMS and Intersil occurred during my stint there.
    AMS was the stronger company, and purchased Intersil, but adopted the
    latter name. Orion Hoch was the CEO. I once thought of giving my own
    son this unusual name.

    There was some special camaraderie at AMS. "Join AMS and see the world"
    was a meme as engineers flew around the world for installations and repairs. Silicon Valley being what it was, employees often left to go hither and
    yon but often returned for the annual "AMS Alumni Party."

    One executive -- it seems an invasion to mention names -- left to form
    a successful mainframe repair business with his brother; there was some
    talk of my joining him. Both brothers were lost on the infamous KAL 007.



    Rich Alderson <news@alderson.users.panix.com> posted:
    James Dow Allen <user4353@newsgrouper.org.invalid> writes:

    James Dow Allen <user4353@newsgrouper.org.invalid> posted:

    ... simple examination of the error reports, and simple
    experiments from the front panel showed what was wrong. Due to a simple wiring blunder on the hand-wired write-buffer card, it was the address for the N+1'th request that was used for the N+1'th write instead
    of the older N'th address as intended.

    And *how* could IBM's "Memory Diagnostic" '3CC' miss such a trivial error?

    The simple test that caught the gross error that '3CC' missed was "March":

    The MARCH Test for Memory:
    Step 1: Set memory under test to all zeroes.
    Step 2: for (A = Low; A <= High; A++) {
    Read A and verify it is 0;
    Write FFFFFFFF at A; // "march" 1's through 0's
    }
    Step 3: Repeat step 2, substituting FFFFFFFF for 0, and 0 for FFFFFFFF.
    Step 4: Repeat with at least one more pattern to toggle parity and check bits.

    The two patterns which should follow these two are (in hex) 55555555 and AAAAAAAA.
    (I tend to think of memory tests in octal, so it's 252525252525 and 525252525252.)

    "Crosstalk" between data bits in a word was seldom (never!) a problem on
    these systems. Each chip was one-bit wide.

    All patterns mentioned so far have the same parity. We'd want to toggle parity bits also. More important than toggling parity is toggling the ECC check bits, but most of them are also untoggled by the mentioned patterns.
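
    The parity claim is easy to check with a few lines. This sketch assumes plain even/odd parity computed either over a whole 32-bit word or per 8-bit byte. The helper name is made up, and no attempt is made to model the actual ECC check-bit code, which is not given in this thread.

```python
def parity(value):
    """Return 0 if the number of 1 bits in value is even, 1 if it is odd."""
    return bin(value).count("1") & 1


patterns = [0x00000000, 0xFFFFFFFF, 0x55555555, 0xAAAAAAAA]

# Parity over the whole 32-bit word.
word_parities = [parity(p) for p in patterns]

# Parity of each of the four bytes in the word.
byte_parities = [
    [parity((p >> shift) & 0xFF) for shift in (24, 16, 8, 0)]
    for p in patterns
]

# A pattern with a single 1 bit has the opposite parity.
odd_example = parity(0x00000001)
```

    All four patterns come out even, per word and per byte, so a parity bit would sit at the same value through every pass. Something like 00000001 is needed to flip it.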


    Cheers,
    James
