Hidden Value – December 2002

Q: My system crashed. Now, when I bring it back up, it behaves strangely and reports that several system files cannot be accessed. I can sign on as MANAGER.SYS, but most of the accounts that used to be on the system cannot be found. When I do a LISTF of PUB.SYS, most of the files have a message associated with them that reads:

*BAD UFID FOR THE FOLLOWING FILE : /SYS/PUB/COMMAND

UFID : 05650002 1B8D7A30 000042D2 18020864 0266B8F5

Bad UFID for the file /SYS/PUB/COMMAND (CIWARN 9165)

I believe the system disk experienced some "difficulties" at some point, and I'm not sure what happened or if it’s repairable. Of course I have a SYSGEN tape, but never having had to use one, I need to know if it contains the SYS account files necessary for me to begin reconstruction and reloading of accounts.

A: Paul Courry replied:

Bad UFID == Bad Universal File IDentifier. In other words your file system is corrupted. You can try running FSCHECK.MPEXL.TELESUP [run with EXTREME care, reading the manual first] but considering the extent of the damage you probably will not be able to recover everything.

Larry Barnes noted:

Your SYSGEN tape may or may not have the SYS account on it. It depends on how the tape was created. You can generate a SYSGEN tape and have it include certain accounts. I usually included SYS and TELESUP on the tape.

Finally, John Clogg replied:

Since you have missing accounts as well as the UFID problem, it seems your system directory is damaged. I think it's a safe bet that your system volume set is clobbered. You need to do an INSTALL from your SLT. This will re-install your operating system and give you a brand new directory. Files, groups, and accounts on private volume sets are still there, but you will need to recreate the system directory entries for those accounts and groups. If you have BULDACCT output, that will make the job easier. It's always a good idea to run BULDACCT periodically and store the result to tape for just this eventuality. [Editor's note: I use BULDACCT as backup in case my primary method to recover directory entries fails for some reason: the DIRECTORY option of STORE.]

You will also need to restore the contents of your system volume set. Make sure you use the KEEP option so you won't lose any files created by the INSTALL. You might want to purge or rename COMMAND.PUB.SYS before the restore, so you get your SETCATALOG definitions restored along with the files.

Q: Does anyone know the pin configurations to go from a DTC RJ45 distribution panel to a Modem?

A: Mark Halstead replied:

I researched this recently. What I got from HP was that the RJ45 ports "don't generate modem signals". If you have a DTC16 or DTC72, you can get an MDP (Modem Distribution Panel) to provide modem ports.

Q: I have an older HP 3000. I want to add a standalone single ended HP DLT drive. Does anyone know what HP DLT4000 external drives are supported on MPE/iX?

A: Denys Beauchemin replied:

A DLT4000 is a DLT4000. Any one will work. HP no longer sells DLT4000 devices. They only sell DLT8000 and probably SDLT, but the latter is not currently supported on MPE. You should be able to find a used DLT4000 most anywhere for a very low price. It will be easy to attach directly to a SCSI port on the system; just make sure it is not a differential (FWD) SCSI port. In SYSGEN, the device ID will be DLT4000. Use the shortest cable you can get away with.

Q: We experienced a power outage Sunday. After bringing everything up I have a couple of serial connected printers (Zebra) that do not want to work properly. Does anyone have any ideas?

A: Jeff Kell replied:

Check first to see if the DTCs are healthy. If they power failed as well, they may not have been downloaded correctly (by the host or by DTC Manager, whichever flavor you use for configuration).

Q: We have been successfully using and recommending DDX for years. We tried MDX way back when it was first introduced and got corrupt databases. We have a real need for it now, however. Is MDX as stable as DDX now? Our 3000s run MPE/iX 6.5. Is this adequate? Are special patches necessary?

A: Guy Paul replied:

The corruption problem you refer to was a serious one from the 5.5 days. It has been patched, and no serious problems like it have popped up that I am aware of. There are some corner cases where corruption can occur, but the latest TI patch, TIXMX73, should fix them. Without MX73, you can get corruption if more than one dataset expansion (DDX or MDX) happens within a single dbxbegin/dbxend transaction and you then roll the transaction back (dbxundo). Fortunately, the corruption is in the user label and not in the data. It would take very complex transactions and very small expansion increments to hit this scenario, so it is a corner case. So, to answer your question: MDX is as stable as DDX.

Q: What should be done to move an HP959 15 feet across the office? Is it possible to just power it down (removing all connections, of course) and roll it across? Or do we need to get HP involved?

A: John Burke replied:

Yep, been there, done that. However, my understanding of HP's official policy is that if the system does not come back up OK, the repair is not covered under your support contract since you moved it.

To which John Clogg added:

In which case, you roll it back to its original location before you call HP. But seriously, rolling machines around in the room is no big deal, and people do it all the time. I would definitely involve HP if moving the machine to a different site, but not for the move you describe.

Q: While waiting for a backup to finish, I began to ponder anew a recommendation from HP. Ever since STM was foisted upon us, and HP Predictive Support was changed since it needs STM, HP has recommended that the JPSMON job should run all of the time. I have wondered why JPSMON couldn't be started, say 10 minutes before Predictive was scheduled to run, and then aborted sometime later when we are sure that Predictive ran successfully. Any thoughts on this topic?

A: No one on HP3000-L had an answer, so the ITRC was consulted:

Your suggestion is good food for thought this morning. I see no reason why your plan would not work. I believe the reason we recommend that JPSMON be run at all times is to guarantee it is running when Predictive runs. Besides, the job takes few if any resources.

Q: I'm doing a report in Query. I can get line breaks after a total, but want a page break. How can I do this?

A: Mark Wonsil and Roy Brown replied:

Your line breaks will be SPACE A [number] or SPACE B [number] I imagine.

Use SKIP A or SKIP B (no numbers) in place of these, to page break. See http://docs.hp.com/mpeix/all/index.html#QUERY/iX for the appropriate manual.

Q: An interesting thing occurred to us this month, when we tried to convert a very large MPE file to bytestream.

record count: 102370

record length: 25158

bytes: 2575424460, or approx. 2.5 GB

Unbeknownst to us, the tobyte program has a 2 GB limit, or so it seems. Using the POSIX command 'tobyte <source> <dest>' always produced a file with a bytecount of 2147483647 (which is one byte less than 2 GB). Moreover, it did not produce a warning or an error, so we didn't initially realize the tobyte-ed file was incomplete. Anyone know why this is the case and if there's a workaround (other than splitting up the original file and converting the split files separately)?

A: Michael Berkowitz replied:

The problem is not with tobyte. The answer lies in the way bytestream files are implemented. The file system is record oriented for all file types, including bytestream files. Bytestream files are simulated by making 1-byte records. However, the maximum number of records that any file can have, which is held in a signed 32-bit integer, has never changed. So the maximum number of bytes a 1 byte/record file can have is 2147483647. This number cannot be made larger, say by going to 64 bits, unless all file intrinsics that reference a record pointer are changed to allow the larger value. That said, the fact that tobyte didn't give an error when converting too large a file should be considered a bug.
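As a quick sanity check on the numbers in this exchange, the following Python sketch multiplies the record count by the record length from the question and compares the result with the largest value a signed 32-bit record count can hold. The constant and function names are illustrative only; nothing here is part of tobyte or MPE/iX.

```python
# Figures quoted in the question.
RECORD_COUNT = 102370
RECORD_LENGTH = 25158  # bytes per record

# Largest value of a signed 32-bit integer. With 1-byte records this is
# also the largest number of bytes a bytestream file can hold.
MAX_RECORDS = 2**31 - 1


def bytestream_fits(total_bytes: int, limit: int = MAX_RECORDS) -> bool:
    """Return True if a file of total_bytes fits within the record limit."""
    return total_bytes <= limit


total_bytes = RECORD_COUNT * RECORD_LENGTH
overflow = total_bytes - MAX_RECORDS

print(total_bytes)                   # 2575424460
print(MAX_RECORDS)                   # 2147483647
print(bytestream_fits(total_bytes))  # False
print(overflow)                      # 427940813
```

The source file is roughly 428 million bytes over the limit, which matches the truncated output size of 2147483647 bytes reported in the question.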

Mark Wonsil added:

All is not lost though. Instead of writing the whole input file to one output file, consider piping the result of tobyte to split. Check out the man page.
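The idea behind that suggestion is to cut the byte stream into pieces that each stay under the limit. The Python sketch below illustrates the technique only; it is not the POSIX split utility, and the function name, the open_part callback and the part naming are all hypothetical.

```python
import io
import os
import tempfile


def split_stream(src, open_part, part_size, block_size=64 * 1024):
    """Copy src into successive parts of at most part_size bytes each.

    src is a readable binary stream; open_part(index) must return a
    writable binary file for part number index (0, 1, 2, ...).
    Returns the list of part sizes in bytes.
    """
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    sizes = []
    out = None
    written = 0  # bytes written to the current part so far
    while True:
        # Never read past the end of the current part.
        block = src.read(min(block_size, part_size - written))
        if not block:
            break
        if out is None:
            out = open_part(len(sizes))
        out.write(block)
        written += len(block)
        if written == part_size:
            # Current part is full: close it and start a new one.
            out.close()
            sizes.append(written)
            out, written = None, 0
    if out is not None:
        # Close the final, partially filled part.
        out.close()
        sizes.append(written)
    return sizes


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        def open_part(index):
            return open(os.path.join(tmp, f"part{index:03d}"), "wb")

        # Ten bytes in parts of at most four bytes each.
        print(split_stream(io.BytesIO(b"x" * 10), open_part, part_size=4))
```

For a real conversion, part_size would be chosen at or below 2147483647 so that every piece fits in a bytestream file.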

Q: A disk drive failed on a user volume. How can I determine the accounts and groups on that user volume?

A: John Clogg replied:

Try REPORT @.@;ONVS=<volset>

Jeff Woods added:

In addition to the suggestion to use ":REPORT @.@;ONVS=volset" (which may fail because it's actually trying to look at the group entries on the volume set if I recall correctly) you can do a ":LISTGROUP @.@" and scan the listing for groups where HOMEVS is your uservolumesetname. The advantage of LISTGROUP is that it uses only the directory entries on the system volume set. You may want to redirect the output of LISTGROUP to a file and then search that rather than trying to scan the listing directly.

Finally, from Larry Barnes:

If you have VeSoft's VEAUDIT, you can say

VEAUDIT listgroup @.@(homevolumeset="vol_name") > grpnames

Q: The SYSINFO program just crashed an N-class system running MPE/iX 7.0 pp1. HP has told us not to run the program. My question is whether the problem is related to the N-class or to MPE/iX 7.0.

A: John Burke replied:

SYSINFO is one of those darling little programs that is available from HP on every system but technically unsupported. The Catch-22 is that various HP documentation suggests you run SYSINFO to check something or other, but HP will not support you if something goes wrong.

SYSINFO in the past was notorious for crashing loaded, multi-processor systems when "all", "mem", "module" or "cpu" commands were called. Been there, done that. As far as I know, this is still a potential problem. It also had the nasty habit of breaking mirrors in a Mirror/iX environment though I believe that has been fixed.

Now as of 6.5 with STM (may it die a horrible death) there are additional complications; for example, “mem” can just start looping chewing up CPU time and never returning information if STM is not running correctly. There are other reports about bogus information being returned.

That said, SYSINFO can be a very useful program for displaying information about your system. However, it must be run with great care.

Q: The question was about ABORTJOB not working.

A: Jeff Vance replied:

The PINFO() CI function can provide many more details than SHOWPROC (try :HELP PINFO to see if it is on your system). An early Communicator article describing the many PINFO options is on Jazz at:

http://jazz.external.hp.com/papers/Communicator/7.0/exp1/ci_enhancements.html

:NSCONTROL KILLSESS=#Snnnn may also prove useful.

Last, there are a script and a UDC on Jazz that try to do an intelligent abortjob. Wildcards are supported. Minusing is supported. It will do an NSCONTROL killsess= and ABORTIOs. See: http://jazz.external.hp.com/src/scripts

Q: Can anyone advise what to do if my system log file appears to be locked up? When I type SWITCHLOG, this is what I get:

:switchlog

NMEV#200@221 System logging is not enabled at the present time.

System Logging message 900

I do not want to reboot the box, but I am stuck.

A: Chris Bartram replied:

This happens on two of my 3000s (running 6.5) two or three times a week! Assuming RESUMELOG does not work, as in my case, I wrote a command file and use the SYSLOG program from Allegro in a job that runs once a day.

!# make sure system logging is enabled
!xeq chekslog.xeq.sys
!if NOT !_systemlogging then
!  continue
!  run syslog.pub.allegro;info="syslog enable"
!endif

You can get syslog from the gurus at Allegro (www.allegro.com). Here's the chekslog.xeq file:

# chekslog: check system logging
# (determine if system logging is enabled or not)
showlog > xx123m
input _lgout1 < xx123m
if lft("!_lgout1",17)="SYSTEM LOG FILE #" then
   echo System Logging Running
   setvar _systemlogging true
else
   echo System Logging *NOT* Running
   setvar _systemlogging false
endif

Paul Christidis noted something to look out for:

The last time something similar happened at our site, it was caused by restarting the machine from the most recent SLT tape. It turned out that when the system logging process tried to open a new log file, a file with the same name was already on the system, and so logging was suspended. See if a log file name conflict is at work and, if so, move that file (and any subsequent ones) out of the way by removing or renaming it. Then you can try the RESUMELOG command.