net.digest - June 2001

It was a strangely quiet month for the off-topic and wildly off-topic. One of the longer-running threads was about what MPE stands for. This led to a certain amount of "I remember when" musing by some of the gray hairs on the list. In the same vein, someone reported that his 23-year-old HP2648 terminal finally died, so he replaced it with a 21-year-old HP2647. This morphed into a discussion about how PCs are not really built to last because the market is evolving so quickly that it does not make economic sense to do so. This in turn morphed into a discussion of B-17s during World War II and how they were built to last only 100 takeoffs and landings, since no plane had survived that many missions.

Finally, people had a little fun with the runaway train in the US that traveled 70 miles without a driver before stopping. The question was where were all those action heroes who routinely drop from helicopters onto speeding trains, cars, boats, etc. when we needed them?

As always, I would like to hear from readers of net.digest and Hidden Value. Even negative comments are welcome. If you think I’m full of it or goofed, or a horse's behind, let me know. If something from these columns helped you, let me know. If you’ve got an idea for something you think I missed, let me know. If you spot something on HP3000-L and would like someone to elaborate on what was discussed, let me know. Are you seeing a pattern here? You can reach me at john.burke@paccoast.com or john@burke-consulting.com.

What happens when the lights go out? Part one.

With everyone's attention on blackouts this summer, especially in California (though no one can feel completely immune), I thought I'd consolidate a couple of power-related threads.

The first thread started out this way: "I have a hardware question. Yesterday we lost power in our office for a while. When the power came back on, our 987 came right back up (showed power fail on the console screen), but our 928 went into 'coolstart' mode. Both of the HP e3000s are using the same UPS. Why would one do a power fail and the other go into a reboot?"

Doug Werth, Fred Metcalf, Steve Dirickson and Bruce Toback all responded. I've taken the liberty of combining their responses and adding a few comments of my own.

The 987 system has the classic "Recover From Powerfail" capability whereas the 928 does not. The UPS probably kept both systems alive for some period of time depending upon its capabilities. When the UPS ran out, the internal battery on the 987 continued to hold the memory contents. Apparently power was restored before this battery gave out, at which time the system recovered and continued right where it left off. The 928 does not have the internal battery to maintain the state of memory and had to be restarted. Note that the 928 has a UPS serial port on the back to communicate with the UPS.

9x7s and earlier HP 3000s had a battery that maintained memory for a limited time while the power was out. In my experience, this was usually between 30 minutes and an hour. They also depended on special firmware in HP-made HPIB and SCSI drives (I think it was called sector atomicity) to ensure data integrity during a power loss. These earlier systems used so much power that the best you could hope for with a reasonable-sized battery was to keep the memory alive. Today, systems are much less power-hungry, so that same sized battery can keep the whole system alive. The battery just lives in a UPS now instead of inside the cabinet. CSY decided about the time the 9x8 line was being developed that since the newer systems required less power, a small UPS could keep everything running, which seemed superior to just saving memory state. It also meant CSY could use standard disk drives, thus reducing the system cost. This is why HP started shipping UPSs as a system component.

HP-provided UPSs can talk to the systems to let them know when the lights are about to go out (I think they simply cause a system abort of some kind to terminate processing and disk I/O).

And the UPS solution is a much better one. Again, in times past, there wasn't much point in keeping the system alive during a power failure, because all the users, who were probably in the same building, would also be without power. Today, users can be spread all over the country or even the world. To these remote users, memory-only backup is no backup at all. Even if they want to hang around waiting for power to come back to the system, their network connections will be gone when it does.

The only thing we lose is the ability to do that power fail demo. Sure, it was fun, but this really is better for the users. It's a feature that used to distinguish the HP3000 from other systems, but it's a feature that's no longer needed.

Unfortunately, since the vast majority of unplanned power outages last less than 30 minutes (in fact, less than 15), many, if not most, people do not have the battery power to keep systems running for the up to 1 1/2 hours that a rolling blackout can last. This has led to a lot of scrambling in California, with the alternatives generally being more batteries or a backup generator. In our case, we got lucky, though we did not realize how lucky at the time. We parlayed fears of a Y2K disaster into the acquisition of a diesel generator that can power the entire data center and IS offices. It is the way to go if you can afford it and are in a location that will allow a generator.
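To make the gap between a typical UPS and a rolling blackout concrete, here is a back-of-the-envelope sketch in Python. The function name, the 90 percent efficiency figure, and the battery and load numbers are all illustrative assumptions, not measurements from any real system. Real batteries also deliver less energy at high discharge rates, so treat the result as an optimistic estimate.

```python
def ups_runtime_minutes(battery_wh, load_watts, efficiency=0.9):
    """Rough runtime estimate: usable stored energy divided by the load.

    battery_wh  -- nominal battery capacity in watt-hours (assumed value)
    load_watts  -- steady load drawn by the equipment (assumed value)
    efficiency  -- assumed inverter efficiency, hypothetical default
    """
    if load_watts <= 0:
        raise ValueError("load_watts must be positive")
    return battery_wh * efficiency / load_watts * 60


# The "up to 1 1/2 hours" of a rolling blackout, in minutes.
ROLLING_BLACKOUT_MINUTES = 90

# Illustrative figures only: a 500 Wh battery carrying a 1000 W load.
runtime = ups_runtime_minutes(battery_wh=500, load_watts=1000)
print(f"Estimated runtime: {runtime:.0f} minutes")
print("Rides through a rolling blackout:", runtime >= ROLLING_BLACKOUT_MINUTES)
```

Under these assumed numbers the UPS covers a short outage comfortably but falls well short of 90 minutes, which is the situation described above.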

What happens when the lights go out? Part two.

There have been a couple of questions lately about AUTOBOOT. In both cases, the questioner was considering using AUTOBOOT in the event of power fail. One was something like "Does AUTOBOOT only allow for a START RECOVERY?" The other was along the lines of "I created my AUTOBOOT file and set AUTOBOOT ON at the ISL prompt but it does not work and keeps telling me the file is not found. What's wrong?"

Doug Werth, Gilles Schipper, Wesley Setree and Steve Dirickson all contributed to the following.

To answer the first question, whether AUTOBOOT uses START RECOVERY or START NORECOVERY is entirely up to you. Check the contents of your AUTOBOOT file to see which option you have configured.

:sysgen
sysgen> sysfile
sysfile> sh
DISC AUTOBOOT = AUTOBOOT.PUB.SYS <<<<<<<<<<<<<<<<
TAPE AUTOBOOT = NONE
SYSTEM CATALOG = CATALOG.PUB.SYS
CM SL = SL.PUB.SYS
NMCONFIG FILE = NMCONFIG.PUB.SYS
NM LIB = NL.PUB.SYS
sysfile>exit
sysgen>exit
:print AUTOBOOT.PUB.SYS
START NORECOVERY

As for the second question, you have to use SYSGEN to create an SLT that tells the system where the AUTOBOOT file is and then do an UPDATE CONFIG. The complete instructions for setting up AUTOBOOT are:

1. Log on as MANAGER.SYS

2. :EDITOR

1 START NORECOVERY
2 //
...
/C "#" to '13 IN 1 <<changes # to a CR >>
/C "@" to '10 IN 1 <<changes @ to LF >>

3. :SYSGEN

SY
AAUTO FILE=AUTOBOOT.PUB.SYS TYPE=DISC
HOLD
E
KE
EXIT

4. Create a new SLT

5. At the ISL prompt,

ISL> AUTOBOOT ON

6. Boot from the SLT and UPDATE CONFIG

SYSSTART, I thought I knew ye.

This thread started when someone asked how to synchronize startup jobs launched by SYSSTART where one job should not start until one or more other jobs finish. One suggestion was to use the new version of the PAUSE command that allows you to pause until a specific job has completed. Sounded great until it was discovered that PAUSE did not work in SYSSTART. The question then was posed: what commands do work in SYSSTART? It turns out the documentation is not too good on this point. Jeff Vance finally had to resort to looking at the code to determine the following list of commands supported by SYSSTART:

ALTLOG
ALTSEC
COMMENT
CONSOLE
TELL
TELLOP
ACCEPT
ALLOW
DISALLOW
DOWN
DOWNLOAD
HEADOFF
HEADON
JOBFENCE
JOBPRI
JOBSECURITY
LDISMOUNT
LIMIT
LMOUNT
LOG
MRJECONTROL
OUTFENCE
REFUSE
STARTSPOOL
STOPSPOOL
STREAM
STREAMS
SUSPENDSPOOL
TUNE
UP
VMOUNT
WELCOME
STARTSESS
DISCRPS
ALLOCATE
DEALLOCATE
NSCONTROL
NETCONTROL
OPENQ
SHUTQ
SPOOLER
FORMSALIGN
SPUCONTROL
SETCOUNTER

Jeff then noted that not having PAUSE available clearly makes synchronization more difficult. He suggested initially setting the job and session limits to zero and then having all of the start-up jobs log on HIPRI. The last job could then set the job and session limits to normal values. Note, of course, that these jobs all have the PAUSE command available for use.

Ted Ashton, reporting from the trenches, confirmed that this is the technique they use. For what it is worth, this is also the approach I've followed for many years.
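The logic of the zero-limit technique can be modeled in a few lines of Python. This is a toy model, not MPE code: it assumes that a HIPRI job logs on regardless of the job limit, while an ordinary job logs on only when fewer jobs than the limit are executing. The job names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    hipri: bool = False


def admitted(jobs, job_limit):
    """Return the names of jobs allowed to log on under a simple limit model.

    Assumption: HIPRI jobs bypass the limit; ordinary jobs need a free slot.
    """
    running = []
    for job in jobs:
        if job.hipri or len(running) < job_limit:
            running.append(job.name)
    return running


# Hypothetical jobs waiting at boot time.
queue_at_boot = [
    Job("STARTNET", hipri=True),
    Job("STARTDB", hipri=True),
    Job("NIGHTLY"),               # ordinary job, held back while the limit is zero
    Job("OPENUP", hipri=True),    # last start-up job; it would restore the limits
]

print(admitted(queue_at_boot, job_limit=0))    # limits still at zero
print(admitted(queue_at_boot, job_limit=10))   # after the limits are restored
```

With the limit at zero only the start-up jobs get in, so nothing else can run until the last of them restores the normal limits.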

Robert Schlosser noted another technique: "One method that I have seen work to 'pause' the SYSSTART process is to have an empty message file that is written into by the job you wish to wait for, then place

FCOPY FROM=msgfile.group.account;to=$null;subset=(0,1)

in your SYSSTART file. The FCOPY will wait for a record to be written and only then will the SYSSTART process continue."
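The blocking behavior Robert describes can be illustrated outside MPE. The Python below is only an analogy under stated assumptions: a `queue.Queue` stands in for the empty message file, a worker thread stands in for the job being waited on, and the blocking `get()` plays the role of the FCOPY read. All names are hypothetical.

```python
import queue
import threading
import time


def run_startup(job_seconds=0.2):
    """Return the order of events when startup waits on a 'message file'."""
    events = []
    msgfile = queue.Queue()          # stand-in for the empty message file

    def prerequisite_job():
        time.sleep(job_seconds)      # the job does its work
        events.append("job finished")
        msgfile.put("done")          # the job writes one record when it completes

    worker = threading.Thread(target=prerequisite_job)
    events.append("job streamed")
    worker.start()

    msgfile.get()                    # blocks until a record arrives, like the FCOPY
    events.append("startup continues")
    worker.join()
    return events


if __name__ == "__main__":
    print(run_startup())
```

The reader cannot proceed until the writer has added a record, so "startup continues" is always recorded after "job finished", however long the job takes.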

Paul Taffel suggested a variant on the separate job file approach:

"Allow me to point out how anyone can already place any MPE command they like in their SYSSTART file. The only commands that need to be in the STARTUP section are those that set up the spooler, and any other commands that require the user to be OP. The rest of startup processing is performed by a job that is appended to the end of the SYSSTART file. It may look confusing, but it works well for me."

startup
streams 10
openq lp
...
stream sysstart.pub.sys
***
!job sysstart,manager.sys;hipri
!# any desired MPE commands...
!eoj

Wait a minute. I thought NSCONTROL and NETCONTROL could not be called from SYSSTART? To be continued…