GTM errors...


kdt...@gmail.com

Mar 14, 2007, 1:50:55 PM
to Hardhats
Hey all,

I have gotten these error messages, which don't look good to me. Any
suggestions? I was searching through the patient file, and it dropped
to the command prompt...

GTM>w $ZSTATUS
150372922,SCR+1^DIO2,%GTM-E-GVGETFAIL, Global variable retrieval failed. Failure code: ZZZZ.,%GTM-I-GVIS, Global variable: ^DPT(3653,0)
GTM>h

Leaving GT.M, returning to Linux...

[kdt0p@poweredge ~]$ sh rundown
Running down database...
/var/local/OpenVistA_UserData/g/mumps.dat -> File is in use by another
process.
%GTM-W-MUNOTALLSEC, WARNING: not all global sections accessed were
successfully rundown
[kdt0p@poweredge ~]$ sh runvista

GT.M VistA Startup Script
Entering GT.M system now...

GTM>zwr ^DPT(3653,*)
%GTM-E-GVDATAFAIL, Global variable $DATA function failed. Failure
code: ZZZZ.
%GTM-I-GVIS, Global variable: ^DPT(3653)

GTM>


Thanks
Kevin

Michael Zacharias

Mar 14, 2007, 2:36:48 PM
to Hard...@googlegroups.com
It looks like your rundown didn't work. I would try identifying all
active mumps processes and shutting them down. From the Linux prompt, try:

ps -ef | grep mumps

mupip stop pid   (replace pid with the process ID of the mumps
process still running)

then rundown the db.

After that you can try to recheck the global...
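
A minimal sketch of the "find them, then stop them" step, assuming a Linux `ps -ef` layout where the PID is the second column and the command is the eighth. The function names `mumps_pids` and `stop_mumps` are made up for illustration, and the sketch uses `mupip stop`, the command mentioned later in this thread. Matching on the command name rather than piping through grep avoids picking up the grep process itself.

```shell
# Read `ps -ef` output on stdin and print the PID of every process whose
# command name (last path component of column 8) is exactly "mumps".
mumps_pids() {
    awk 'NR > 1 { n = split($8, part, "/"); if (part[n] == "mumps") print $2 }'
}

# Ask each remaining mumps process to shut down cleanly.
# Defined only; call it yourself once you have reviewed the PID list.
stop_mumps() {
    ps -ef | mumps_pids | while read -r pid; do
        mupip stop "$pid"
    done
}
```

Running `ps -ef | mumps_pids` on its own first shows which processes would be stopped before anything is actually signalled.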


Michael

David Whitten

Mar 14, 2007, 4:02:22 PM
to Hard...@googlegroups.com
It appears you only did the in-memory rundown.
That is, you have to run rundown twice:

mupip rundown # to clean in-memory semaphores
mupip rundown -r "*" # to clean disk images

Remember that you need to use ps -C mumps
to verify that no other GT.M processes are running.
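
The sequence above could be wrapped roughly as follows. This is a sketch, not a tested procedure: `rundown_if_idle` is a hypothetical helper, and the `MUPIP` variable is only an illustrative override (not a GT.M convention) so the sequence can be dry-run with `MUPIP=echo`. The two rundown commands are run in the order given in this post.

```shell
# Illustrative override: set MUPIP=echo to print the commands instead of
# running them. Defaults to the real mupip binary on the PATH.
MUPIP=${MUPIP:-mupip}

# $1: PIDs of mumps processes still running (empty string if none).
# Refuses to run down anything while processes remain.
rundown_if_idle() {
    if [ -n "$1" ]; then
        echo "mumps processes still running: $1" >&2
        return 1
    fi
    "$MUPIP" rundown && "$MUPIP" rundown -region "*"
}

# Intended use on the GT.M host (not executed here):
#   rundown_if_idle "$(ps -C mumps -o pid=)"
```

Passing the PID list in as an argument keeps the check and the rundown in one place, so the rundown cannot be run by accident while a process such as Taskman still holds the database.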

Chris

Mar 14, 2007, 7:32:41 PM
to Hardhats
Cool; I didn't know 'mupip rundown' cleaned in-memory semaphores. I
assumed using 'mupip rundown -r "*"' also cleaned the in-memory
semaphores.

My two cents follow...

You may also use 'lsof mumps.dat' to determine which active processes
are using mumps.dat.

If Taskman was running when you received the %GTM-W-MUNOTALLSEC
warning, then that's normal, in the sense that a process (such as
Taskman) is still "using" the database.

Your first error - %GTM-E-GVGETFAIL: "This indicates that a database
lookup of a global variable encountered an error. xxxx contains the
failure codes for the four attempts. It is very likely that the
database may have integrity errors or that the process-private data
structures are corrupted."

You may want to run mupip integ -region "*" to see if everything
checks out clean.
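
If you want a quick pass/fail answer from that run, something like the following might do. It is only a sketch: `integ_has_errors` is a hypothetical helper, and it assumes error-severity messages appear at the start of a line in the usual %GTM-E-... form. Warnings (%GTM-W-...) are deliberately not counted.

```shell
# Succeeds (exit status 0) when the integ output on stdin contains at
# least one error-severity GT.M message, i.e. a line starting %GTM-E-.
integ_has_errors() {
    grep -q '^%GTM-E-'
}

# Intended use on the GT.M host (not executed here):
#   mupip integ -region "*" > integ.log 2>&1
#   if integ_has_errors < integ.log; then echo "integ reported errors"; fi
```

Keeping the output in a file rather than only on screen also leaves a record of when the database last checked out clean.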

Mahalo,
Chris


K.S. Bhaskar

Mar 15, 2007, 2:38:35 AM
to Hard...@googlegroups.com
Kevin --

What was the history of this database? Was it journaled? Was there a
previous system crash or abnormal termination (including process
termination with kill -9)?

You may want to check the Messages and Recovery Procedures manual
(http://www.fidelityinfoservices.com/user_documentation/V44MsgRecProc/index.htm)
for the GVGETFAIL error message.

What is the result of running a mupip integ on the database?

Regards
-- Bhaskar

kdt...@gmail.com

Mar 16, 2007, 9:57:39 PM
to Hardhats
Bhaskar,

Thanks for your reply.

Answers below:


On Mar 15, 2:38 am, "K.S. Bhaskar" <ks.bhas...@fnf.com> wrote:
> Kevin --
>
> What was the history of this database? Was it journaled? Was there a
> previous system crash or abnormal termination (including process
> termination with kill -9)?

No system crash. I am not aware of any kill -9. I use mupip stop #

> You may want to check the Messages and Recovery Procedures manual

> (http://www.fidelityinfoservices.com/user_documentation/V44MsgRecProc/...)


> for the GVGETFAIL error message.

I have found the portions documenting the error. Chris has already
quoted part of the entry for that error. Advised action is to "Report
this database error to the group responsible for database integrity at
your operation." That would be me :-)

> What is the result of running a mupip integ on the database?

Here is the output:

MUPIP> INTEG -FILE /var/local/OpenVistA_UserData/g/mumps.dat
%GTM-W-MUTNWARN, Database file /var/local/OpenVistA_UserData/g/mumps.dat is approaching 4G transaction number limit. Renew database with MUPIP INTEG TN_RESET

%GTM-E-DBRSIZMX,
1D3BB:8 0 Record too large
Directory Path: 1:15, CB47:3D1
Path: 2F5B:1B, 1307E:71B, 1D3BB:8
Keys from ^DPT(3652,.000*) to ^DPT(3669,.12) are suspect.
%GTM-E-DBKEYMX,
1B2A1:8 0 Key too long
Directory Path: 1:15, CB47:DC7
Path: E555:21, 28172:63B, 1B3B9:3F, 1B2A1:8
Keys from ^TIU(8925,48586,"TEXT",15,0) to ^TIU(8925,48587,"TEXT",14,0)
are suspect.
%GTM-E-DBRSIZMX,
1CAF1:8 0 Record too large
Directory Path: 1:15, CB47:DC7
Path: E555:21, 28172:778, 1CBBA:548, 1CAF1:8
Keys from ^TIU(8925,55794,"TEXT",20,0) to ^TIU(8925,55796,"TEXT",48,0)
are suspect.
%GTM-E-DBRSIZMN,
289AC:8 0 Record too small
Directory Path: 1:15, CB47:E29
Path: E5AC:31C, 29F85:D99, 289AC:8
Keys from ^VAT(391.71,"D",93589,93589) to ^VAT(391.71,"D",93931.8) are
suspect.
%GTM-E-DBKEYMX,
1B1FC:8 0 Key too long
Directory Path: 1:15, CB47:F40
Path: 10798:134, 266C6:7DA, 1B1FC:8
Keys from ^XUTL("XUSYS",11493,"M") to ^XUTL("XUSYS",11612,0161*) are
suspect.
%GTM-E-DBKEYMX,
11573:8 0 Key too long
Directory Path: 1:15, CB47:F40
Path: 10798:134, 266C6:F82, 11573:8
Keys from ^XUTL("XUSYS",21564,0156*) to ^XUTL("XUSYS",21613,0161*) are
suspect.
%GTM-E-DBKEYMX,
A721B:8 0 Key too long
Directory Path: 1:15, CB47:F40
Path: 10798:141, 107DC:460, A721B:8
Keys from ^XUTL("XUSYS",28289,0828*) to ^XUTL("XUSYS",28341,"M") are
suspect.
%GTM-W-DBLOCMBINC,
80E00:0 FF Local bit map incorrect
%GTM-W-DBMRKBUSY,
80F75:0 FF Block incorrectly marked busy
%GTM-W-DBMBPFLDLBM,
80E00:0 FF Master bit map shows this map full, agreeing with
disk local map
%GTM-W-DBLOCMBINC,
86400:0 FF Local bit map incorrect
%GTM-W-DBMRKBUSY,
865E7:0 FF Block incorrectly marked busy
%GTM-W-DBMRKBUSY,
865E9:0 FF Block incorrectly marked busy
%GTM-W-DBMRKBUSY,
865EA:0 FF Block incorrectly marked busy
%GTM-W-DBMRKBUSY,
865EC:0 FF Block incorrectly marked busy
%GTM-W-DBMRKBUSY,
865EE:0 FF Block incorrectly marked busy
%GTM-W-DBMRKBUSY,
865EF:0 FF Block incorrectly marked busy
%GTM-W-DBMRKBUSY,
865F0:0 FF Block incorrectly marked busy
%GTM-W-DBMRKBUSY,
865F1:0 FF Block incorrectly marked busy
%GTM-W-DBMRKBUSY,
865F2:0 FF Block incorrectly marked busy
Maximum number of incorrectly busy errors to display: 10, has been
exceeded
18 incorrectly busy errors encountered

Total error count from integ: 30.

Type           Blocks         Records          % Used      Adjacent

Directory           4             355          26.354            NA
Index           56219          966712           7.810         48187
Data           910845       120820331          72.564        485943
Free              932              NA              NA            NA
Total          968000       121787398              NA        534130
%GTM-E-INTEGERRS, Database integrity errors
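
With this many messages, a tally by message mnemonic can make the report easier to read. The following is a rough sketch, assuming the integ output has been saved to a file. `integ_summary` is a made-up name, and it counts only the first %GTM-x-NAME mnemonic on each line.

```shell
# Read integ output on stdin and print "count mnemonic" pairs,
# most frequent first (ties ordered by mnemonic).
integ_summary() {
    awk 'match($0, /%GTM-[EWIF]-[A-Z0-9]+/) {
             count[substr($0, RSTART, RLENGTH)]++
         }
         END { for (m in count) print count[m], m }' |
    sort -k1,1nr -k2,2
}

# Intended use (not executed here):
#   integ_summary < integ.log
```

Note that integ stops displaying "incorrectly marked busy" messages after the first ten, so a tally taken from the displayed lines can undercount that message.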


Now, I had already written a program that ordered through 0 nodes of
all the records of all the files, looking for errors, trapping them
and writing the location. I found these same corrupted records in
^DPT and ^TIU(8925, though missed some of the other errors. As I
write this, I am running INTEG a second time to see if the first run
fixed the problems itself.

I have read about the message:
mumps.dat is approaching 4G transaction number limit. Renew database
with MUPIP INTEG TN_RESET
--But I am concerned that if I have the database rewrite the
transaction numbers on all the blocks, it might create more problems
if I don't get this first problem fixed.


Thanks for your help.

Kevin

Bhaskar, KS

Mar 16, 2007, 11:43:40 PM
to Hard...@googlegroups.com

Kevin --

These errors are characteristic of a crash or other unusual event (e.g., an ipcrm or something bad - can't say based on the evidence).  Assuming you weren't journaling:

1. Repair the errors.

2. Do a TN reset or upgrade to V5 (which has 64 bit transaction numbers vs. 32 bit on V4).
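
To get a feel for how urgent the TN reset is, the remaining headroom on a 32-bit counter can be estimated as below. This is a back-of-the-envelope sketch: the function names are invented, the current transaction number has to be supplied by hand in hex, and treating the ceiling as exactly 0xFFFFFFFF is an assumption (GT.M may warn or stop earlier).

```shell
# Updates left before a 32-bit transaction number runs out.
# $1: current transaction number in hex, without the 0x prefix.
# The 0xFFFFFFFF ceiling is an assumption made for illustration.
tn_remaining() {
    echo $(( 0xFFFFFFFF - 0x$1 ))
}

# Share of the 32-bit range already consumed, as a whole percentage
# (integer arithmetic, rounded down).
tn_percent_used() {
    echo $(( 0x$1 * 100 / 0xFFFFFFFF ))
}
```

For example, `tn_remaining F0000000` prints 268435455 and `tn_percent_used F0000000` prints 93.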

For the repair, you may want to consider dumping the contents of the blocks, marking them as free, and loading the dumped data.  No matter what, it will be tedious and error prone. Diagram what you plan to do before doing it, and take notes.  Read the chapter on database repair in the Admin & Ops Guide before doing anything.

Incidentally, when was the last time you ran an integ before this?  Remember that an error may be detected a long time after it occurs, so the previous integ would time box it when (after you repair) you get to trying to figure out the cause.

If you were journaling, you can use the journal files to recover the database.

Regards
-- Bhaskar

kdt...@gmail.com

Mar 18, 2007, 12:00:05 PM
to Hardhats
I had written a long response to this post about what I have
encountered so far, but then lost it somehow. I won't take up
more message board bandwidth here, and will talk to Bhaskar
offline.

Thanks


