The results of the 2004 SIB are in. There were few surprises. This
year’s SIB was radically different, as befits the current status of the HP
e3000, MPE/iX and IMAGE. The ballot was split into two, one a ballot on
strategic items and the other on the more traditional tactical items. For the
results, see www.interex.org/advocacy/survey/2004mpe_results.html. We now must
wait for HP’s response. Count on it being reported and analyzed in the 3000
Newswire. As I write this, voting is underway for the Board of Directors of the
recently re-organized OpenMPE. Hopefully you are a member. If not, why not,
since membership is still free? We need to energize the MPE-IMAGE community.
OpenMPE is the best shot for MPE-IMAGE to exist post-2006 in some supportable
and maintainable form. If you have not joined OpenMPE and are not planning to,
contact me and give me a chance to talk you into changing your mind.
March saw more of the now
all-too-familiar off topic threads about Iraq or religion or politics, or Iraq
and religion and politics. This time we added a long thread on taxes. However,
there was still a surprisingly large amount of good technical content, some of
which we summarize below.
I always like to hear from readers of net.digest and Hidden Value. Even negative comments are welcome. If you think I’m full of it, goofed, or am a horse’s behind, let me know. If something from these columns helped you, let me know. If you’ve got an idea for something you think I missed, let me know. If you spot something on HP3000-L and would like someone to elaborate on what was discussed, let me know. Are you seeing a pattern here? You can reach me at john@burke-consulting.com.
If you use hardware mirroring, is there any value in user volumes?
We
got a good argument going between some real MPE pros on this question. In favor
of user volumes were the arguments that, even in a robust hardware mirroring
scenario,
· User volumes give you an extra measure of protection in case of a catastrophic failure requiring a re-install; and,
· User volumes may improve performance, especially on a multi-CPU system.
Against user volumes were the arguments that,
· The overhead of maintaining the account structure with user volumes more than offsets any small performance gain; and,
· The overhead of maintaining user volumes more than offsets the minuscule risk of a catastrophic failure requiring a re-install. Only in the case of hundreds of GB of storage might user volumes make sense.
For
what it is worth, here is what I think. The person who asked the question
already had two Model 20s with all storage configured as part of the system
volume set. Even if you were in favor of user volumes, changing to user volumes
for this customer would entail a re-install and a delicate manipulation of
account structure. Clearly the claimed benefits do not justify such a drastic
measure. Similarly, if you are moving from an unprotected environment where you
used user volumes to a hardware RAID environment, the benefits of going to a
single volume set do not justify the delicate operation required on the account
structure. If you were starting out from scratch with a hardware RAID solution
then I would probably not recommend user volumes unless you were looking at
several hundred GB of total storage and/or more than a dozen or so LUNs.
You’ve GOT to be joking
This is not strictly technical, but it is so funny I could not let it pass. Someone wrote on HP3000-L, “This is the message that I get after I connect to our HP3000 from a newly arrived, but used, HP e3000 and try to issue any FTP command.” What is going on? After several people speculated on possible errors, James Hofmeister of WTEC replied, “This couldn't possibly be coming from the FTPSRVR code? ARGH... OK, joking aside, it is true; this message is coming from the MPE/iX FTPSRVR. I checked the code and verified that the cause of this goofy message is: ‘port.tcp_addr > 1023’. The standard TCP ports for FTP are 20 & 21.”
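Hofmeister's diagnosis boils down to a single comparison gone wrong. Here is a hedged sketch of what that check amounts to — only the comparison itself comes from his posting; the function and constant names are invented for illustration and are not the actual FTPSRVR source:

```python
# Illustrative model only -- NOT the actual MPE/iX FTPSRVR code.
# Hofmeister quoted the offending condition as: port.tcp_addr > 1023
FTP_DATA_PORT = 20     # standard FTP data port
FTP_CONTROL_PORT = 21  # standard FTP control port

def ftpsrvr_complains(tcp_port: int) -> bool:
    """Model of the quoted check.

    Ports 0-1023 are the well-known range; any port above it trips
    the check, even though client-side ephemeral ports are normally
    above 1023 -- hence the goofy message.
    """
    return tcp_port > 1023

# The standard FTP server ports pass the check...
assert not ftpsrvr_complains(FTP_CONTROL_PORT)
assert not ftpsrvr_complains(FTP_DATA_PORT)
# ...but any high-numbered port triggers the complaint.
assert ftpsrvr_complains(1024)
```

Which explains why a perfectly ordinary client connection could produce the message.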
Mike Hornsby of
Beechglen first reported this problem, “The problem surfaces if you attempt to
reboot your system while it is out of permanent space on the system volume set.
The reboot process will hang while attempting to build the next NMLG (network
log) file. In other words, rebooting with no available disk space can render
your system unbootable. Completely filling the permanent disk space in the
system volume set can happen more easily than one might expect: inadvertently
restoring large file sets, batch processes that loop, and enabling low-level
logging events, to name a few.
“One workaround we have developed involves running the stand-alone offline diagnostics after the fact to patch the volume information. This is a delicate and time-consuming process, but it certainly beats the alternative of a re-install. The other workaround is to build a temporary file of sufficient size to ‘reserve’ some space that will be recovered when the system is rebooted:

:BUILD TAKESPAC;DISC=20000,1,1;DEV=1;TEMP

“The simplest place to put this would be in a job like JINETD. Check your HPSWINFO.PUB.SYS file to determine whether you are at risk. Systems that have one of the following patches installed are susceptible.”
Release 6.5:
NMSGDT1A
Release 7.0:
NMSGDT2A
Release 7.5:
NMSGDV1A
James Hofmeister confirmed the problem, “This problem is resolved in beta test patches NMSHD77 (6.5), NMSHD78 (7.0) and NMSHD79 (7.5). You can contact the HP Response Center to request the beta fixes for SR 8606351808.”
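Hornsby's ‘reserve some space’ trick is not MPE-specific, by the way. As a loose Python analogue (the file name and size here are invented, not taken from the BUILD command above), the idea is simply to pre-allocate a sacrificial file you can delete when the system needs breathing room:

```python
import os

# Write real zero blocks: on many filesystems a sparse truncate would
# NOT actually consume disk space, defeating the purpose of the trick.
CHUNK = b"\0" * 65536

def reserve_space(path: str, nbytes: int) -> None:
    """Pre-allocate a placeholder file whose only job is to be
    deleted later, releasing nbytes of disk space in an emergency."""
    with open(path, "wb") as f:
        remaining = nbytes
        while remaining > 0:
            n = min(remaining, len(CHUNK))
            f.write(CHUNK[:n])
            remaining -= n

def release_space(path: str) -> None:
    """Delete the placeholder, freeing the reserved space."""
    os.remove(path)
```

The MPE version is neater because a ;TEMP file vanishes on its own at reboot, which is exactly when the space is needed.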
There was a fascinating thread about XM and memory scanning that brought in a number of MPE heavyweights (no pun intended). You can read the whole thread at (watch the wrap)
http://raven.utc.edu/cgi-bin/WA.EXE?S2=hp3000-l&q=&s=%22internals%2C+and+patches%22&f=&a=&b=
I am going to copy most of the last posting to the thread, which came from Bill Cadier of vCSY. You know you are a geek if you enjoy reading it. “I thought I'd mention that the ‘enhanced checkpoint’ feature (ALTERCHKPTSTAT in VOLUTIL) is not used as of 6.5; the command remains in VOLUTIL (no idea why!) but it does nothing. The feature used a bit map to track changed pages, and that didn't scale with large files. And the ‘system-wide’ semaphore mentioned is also gone, replaced with a more granular object-based locking scheme.
“And I thought I'd also share some historical information about the early days of 6.5 and 7.0 with large memory and large files that might help put the ‘scanning memory’ statement into perspective. Some of this might have been discussed here several years ago. The memory manager scans memory. When XM needs to ensure that pages of files have posted to disk (been made durable) so that it can reuse a log half it calls memory management (MM) routines to do that.
“In addition to handling post requests or fetch (or prefetch) requests, MM has to try to keep memory
organized. These activities include making present pages into ‘recoverable
overlay candidates’, or ‘roc’ pages if they have not been accessed recently.
This may also include starting a background write if the page is dirty. MM will
also try to take pages from the roc list and make them absent (free) if they
remain un-accessed and if their background write has finished. This activity
will become more urgent as the pool of free pages drops below certain
thresholds and can include bumping the priority of background writes so they
complete more quickly and being more aggressive about roc'ing present pages.
“These list
management algorithms scale based on memory size. And unlike on MPE V where
this activity could occur during idle periods because there was far less memory
to manage, on MPE/iX we don't have that luxury. The memory manager has to try
to do some amount of list maintenance almost any time it is called.
“Early in 6.5 on
systems with large memory and large files we found that these algorithms did
not scale as well as we would have liked. They might take too long by trying to
do too much, too frequently. And this may be where the ‘scanning memory’
observation was made.
“We made a number
of enhancements to speed memory management activities. This was several years
ago and by now I'd hope most 6.5 and 7.0 systems have these patches. Here are
some of the more significant of those large memory performance improvement
patches:
“MPELXG6 - The first of two enhancements to the memory manager list maintenance algorithms. This reduced the length of time MM would spend traversing its lists and while doing so, keeping parts of memory locked.
“MPELXH8 - The
second of two, further shortening the amount of time memory manager locks are
held and changing the frequency and location of some of the free page
replenishment activities.
“MPELXF8 -
Enhancement to storage management allowing ‘big’ files of 1GB or more to be
held on a ‘least recently used’ list longer. This list holds ‘GUFD's’ or
‘global unique file descriptors’ of files that are closed and have NO
accessors. The expectation was that normal memory management activity might
whittle away at the file pages, posting them and minimizing the impact of the
post that would have to occur when the ‘GUFD’ structure needed to be reused for
another file.
“MPELXH5 - The
‘whittling’ wasn't happening fast enough so we added code to do that. It's done
in small increments from the end of the file upwards so if the file reopens
before it is fully mapped out we minimize the page faults needed to bring it
back into memory.
“MPELXF2 - An
enhancement to a memory manager internal API called make_pages_roc that allowed
callers to make pages free rather than just recoverable. Rather than letting
the memory manager discover these unneeded pages, it can be told that the pages
are no longer needed and can be tossed out of memory right away.
“MPELXJ9 - A
further improvement to that enhanced API so both present and recoverable pages
would be made free. The initial code change in LXF2 unnecessarily skipped roc
pages.
“These patches are all superseded by others. For 6.5, patch MPEMXE5A will install all these changes and many more. The 7.0 patch is MPEMXC7A. These changes were submitted to 7.5 so there are no 7.5 versions needed.”
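For the geeks who made it this far: the present → roc → free aging that Cadier describes can be modeled very loosely as follows. Every name and detail here is invented for illustration — the real MPE/iX memory manager is vastly more involved (locking, thresholds, priority bumping) — but the basic list-maintenance shape is this:

```python
from collections import deque

# Loose illustrative model of present -> roc -> free page aging as
# described in the thread; all names and behavior here are invented.
class Page:
    def __init__(self, pid):
        self.pid = pid
        self.referenced = False   # touched since the last scan?
        self.dirty = False        # modified, needs a background write?
        self.write_done = True    # has the background write finished?

class MemoryManager:
    def __init__(self, pages):
        self.present = list(pages)  # pages mapped and recently used
        self.roc = deque()          # 'recoverable overlay candidates'
        self.free = []              # page frames ready for reuse

    def scan(self):
        """One maintenance pass: age unreferenced present pages onto
        the roc list, then free roc pages whose writes have finished."""
        still_present = []
        for p in self.present:
            if p.referenced:
                p.referenced = False      # give it another interval
                still_present.append(p)
            else:
                if p.dirty:
                    self.start_background_write(p)
                self.roc.append(p)        # now a recoverable candidate
        self.present = still_present

        # Free only the roc pages whose background write is complete;
        # a roc page can still be recovered cheaply if touched again.
        survivors = deque()
        for p in self.roc:
            if p.write_done:
                self.free.append(p)
            else:
                survivors.append(p)
        self.roc = survivors

    def start_background_write(self, page):
        # Stand-in for queuing an async post; completes instantly here.
        page.write_done = True
        page.dirty = False
```

The scaling problem the patches addressed falls straight out of this shape: scan() walks lists whose length grows with memory size, so on large-memory systems each pass takes longer, and the 6.5/7.0 patch work was largely about holding locks for shorter stretches of those walks.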