Thursday, January 4, 2018

Rescuing My First Customer

Many years ago, before I was married, when I was just 22 (1981, to be precise), Toga Computer Services had a customer called Pacific Brewers Distributors. They have since been merged, along with brewery distributors from other provinces, into a company called Brewers Distributor Limited. But back then, they just distributed beer for the three major BC breweries: Carling, Labatt, and Molson.

They had a computer system. It was a Microdata 1600 (I believe - not 100% sure on the model) with 64K of core memory and 4 Winchester disk drives of 50 MB capacity each. The disk drives looked like top-loading washing machines. The computer was the size of a large refrigerator. The really amazing thing was that this system, with only 64K of core, ran 16 users. If you do the math, that is 4 kilobytes of memory per user. It didn't really work like that, of course: each user used a lot more than 4K, and the system would page one user's state out to make room for another user to run. (Note: I have 128 million kilobytes - 128 GB - in my phone, and it runs 1 user. It can't even, technically, multi-task!) This system ran a multi-valued operating system called Reality. It was developed with Dick Pick's input, and was a variation of what was known as a Pick system.

Most of those 16 users took orders over the phone. They would enter an order, which would be put into a phantom processing file. Then a background process called a phantom processor would pick up the orders and process them.

Now, there was a problem with the data design. I'll spell it out as simply as I can:

First, Pick predated relational databases. (The dominant data storage technology at the time was ISAM files.) The idea of Pick was that if you had an invoice, a single record would hold all the header information, plus all the detail lines and their options. One record, with multi-values for detail lines, and sub-multi-values (also called sub-values) where the detail lines had multiple options.

This meant a single disk read would get a small-to-moderate invoice into memory, and a single write would write it out. The BASIC extensions for handling all this were very easy to use, so processing an invoice was simple work for a programmer.
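To make that concrete, here is a minimal sketch in Python (the real thing would have been written in Pick BASIC) of how one of these multi-valued records hangs together. The three delimiter characters are the actual Pick marks; the invoice fields themselves are invented for illustration.

    # A Pick-style dynamic array: one string holds the whole invoice,
    # split up by three reserved delimiter characters.
    AM = chr(254)   # attribute mark: separates fields (attributes)
    VM = chr(253)   # value mark: separates multi-values within a field
    SVM = chr(252)  # subvalue mark: separates sub-values within a value

    # A hypothetical invoice: customer, date, detail-line products,
    # and per-line options (this attribute layout is my invention).
    invoice = AM.join([
        "ACME LIQUOR STORE",                       # header: customer
        "1981-11-02",                              # header: order date
        VM.join(["LAGER12", "ALE24"]),             # one value per detail line
        VM.join([SVM.join(["CHILLED", "RUSH"]),    # options for line 1
                 "RUSH"]),                         # options for line 2
    ])

    # One disk read fetches the whole thing; pulling it apart is trivial.
    attrs = invoice.split(AM)
    for n, product in enumerate(attrs[2].split(VM), start=1):
        options = attrs[3].split(VM)[n - 1].split(SVM)
        print(f"line {n}: {product}, options {options}")

In Pick BASIC the equivalent extraction was a one-liner, something like INVOICE<3,2> to get the second detail line, which is why invoice handling felt so effortless.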

Unfortunately, someone had decided to track all orders for a particular brewer in a single record. There was also a consolidated record that tracked all orders for all brewers. This meant that every order had to update two of these four records: the one for its brewer, and the consolidated one.

These records recorded, by date, all orders of all products for that brewer (or for every brewer, in the case of the consolidated record) for all licensed premises and liquor stores in all of BC. The records got very big.

The smallest one was about 16K; the consolidated one was bumping into the 32K limit that Reality imposed on records. Given that core memory was only double that, the restriction was pretty reasonable.

The other thing you might notice, if you are good at simple math, is that just two of these records could take up most of memory: the 32K consolidated record plus even the smallest 16K brewer record is 48K of the 64K core. But there's more!

If you add data to a record in the BASIC language, making it longer, it is likely to outgrow the buffer the BASIC interpreter originally allocated for it. At that point a new, bigger buffer gets allocated, and the data gets copied over to the new buffer along with the changes. Do that with the consolidated record and you have two copies of it in memory, and you have now used up pretty well all of available core. Since some of that memory is needed for other things, the working set can no longer fit in memory at the same time. And that's just the phantom processor: if any other users are trying to get work done, their state has probably been pushed out of memory.
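Here is a back-of-the-envelope sketch of that grow-and-copy moment, with illustrative sizes (I can't vouch for the real interpreter's exact allocation strategy):

    # Illustrative numbers: appending to a near-32K record briefly needs
    # two copies of it live at once, on a machine with 64K of core.
    CORE = 64 * 1024

    record = bytearray(31 * 1024)   # consolidated record, near the 32K limit
    new_order = bytes(200)          # one incoming order's worth of data

    # The old buffer is full, so a bigger one is allocated and filled:
    old_buffer = record                          # still live during the copy
    new_buffer = bytearray(len(record) + len(new_order))
    new_buffer[:len(record)] = record            # copy the old contents
    new_buffer[len(record):] = new_order         # append the new order

    peak = len(old_buffer) + len(new_buffer)
    print(f"peak: {peak} of {CORE} bytes ({peak / CORE:.0%} of core)")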

Note that the read/write time on these old drives was extremely slow by today's standards, there was no caching to speak of (not even track reads at first), and you read or wrote half a kilobyte (512 bytes) at a time. So to read a 30K record, you had to do 60 disk reads. And if the copy the BASIC processor was working with had to be written out to let another user do work, you got to read it all back in before you could touch it again.
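The arithmetic is easy to check:

    import math

    FRAME = 512                  # data moved in 512-byte chunks
    record_bytes = 30 * 1024     # one 30K record

    reads = math.ceil(record_bytes / FRAME)
    print(reads)                 # 60 reads to load it - plus 60 writes,
                                 # then 60 more reads, if it gets paged out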

I won't go into fragmentation or any of the other problems this raises. The key thing is that the system got stuck reading and writing to disk; the industry term is "thrashing". The other problem was that if you let the big record hit the 32K limit, it got truncated, and you had data corruption that would sometimes crash the phantom program. Because it ran in the background, you might not realize it had crashed for quite some time.

The users would enter orders until 5:00 pm, then the phantom process would try to catch up. If the big record hit its size limit, the phantom would crash. On many mornings the order desk could not open at 9:00 because the phantom still had not finished processing the previous day's orders.

So in comes Toga Computer Services, with me (laid off from Fraser Mills Plywood Mill) helping to write a conversion program and to change the order programs to handle a new data design.

The conversion program took the 3 levels of multi-values in each record and wrote them out to 3 different files. We turned 4 records into about 600. We also had to change the order processing programs, both reads and writes, to work against the 3 new files.
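A minimal sketch of the conversion idea, in Python rather than the original Pick BASIC. The key scheme and the exact mapping of levels to files are my invention; the principle is that each level of the old giant record becomes its own file of small, independently readable records:

    AM, VM, SVM = chr(254), chr(253), chr(252)

    brewers, dates, lines = {}, {}, {}   # stand-ins for the three new files

    def convert(brewer_id: str, big_record: str) -> None:
        attrs = big_record.split(AM)
        brewers[brewer_id] = attrs[0]                    # level 1: header
        for d, day in enumerate(attrs[1:]):              # level 2: per date
            day_values = day.split(VM)
            dates[f"{brewer_id}*{d}"] = len(day_values)  # small summary
            for n, line in enumerate(day_values):        # level 3: per line
                lines[f"{brewer_id}*{d}*{n}"] = line.split(SVM)

    convert("CARLING", AM.join(["hdr", VM.join([SVM.join(["LAGER12", "6"]),
                                                SVM.join(["ALE24", "2"])])]))
    print(len(brewers) + len(dates) + len(lines))   # 4 small records, was 1

After the split, processing an order meant reading and rewriting a handful of records of a few hundred bytes each, instead of dragging a 30K blob through memory twice.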

We tested and retested, and finally we did the conversion, in January of 1982, as I recall it.

Instead of flushing all of main memory several times over for each order, the system generally touched less than 1K of data per order. Instead of 60 reads or writes for the consolidated record, we were usually down to just 3.

I was still very rusty and needed a fair bit of help to get it right, but we finally got it good enough to do the conversion in production.

The first day on the new system we had to fix a few bugs, but the performance was amazing: within less than a minute of the order desk closing, the phantom processor had caught up on all the orders! The cost of those massive records had been growing far faster than the records themselves; the fix made a world of difference!

I learned a valuable lesson about data design, and came away with an appreciation of how it works together with disk access, system memory management, and other factors to determine performance. I also had the great pleasure of having the CEO and other executives of the company thank us profusely for saving their system!

These were lessons that have stayed with me over the years!

Next post - Recession Was a Good Teacher...

1 comment:

  1. Wow. This sounds very impressive. I always knew you were a genius.
