QWCRTVCA & MCH3402

Mr. K.V.B.L.

Sep 11, 2014, 4:25:05 PM
I've had 3 jobs die mysteriously because of a routine I wrote that involves QWCRTVCA. I feel my boss had something to do with it but I need to understand what could have possibly happened. Here is a job log:

5770SS1 V7R1M0 100423 Job Log QQBNA 09/10/14 16:31:52 Page 1
Job name . . . . . . . . . . : DFCUAT_ENG User . . . . . . : DFCUSER Number . . . . . . . . . . . : 938513
Job description . . . . . . : QKQTEST Library . . . . . : DOROOT
MSGID TYPE SEV DATE TIME FROM PGM LIBRARY INST TO PGM LIBRARY INST
CPF1124 Information 00 07/16/14 15:51:43.478996 QWTPIIPP QSYS 04C0 *EXT *N
Message . . . . : Job 938513/DFCUSER/DFCUAT_ENG started on 07/16/14 at
15:51:43 in subsystem QBATCH in QSYS. Job entered system on 07/16/14 at
15:51:43.
CPI1125 Information 00 07/16/14 15:51:43.479146 QWTPCRJA QSYS 0110 *EXT *N
Message . . . . : Job 938513/DFCUSER/DFCUAT_ENG submitted.
Cause . . . . . : Job 938513/DFCUSER/DFCUAT_ENG submitted to job queue
QS36EVOKE in QGPL from job 937396/BEAK/QPADEV0008. Job
938513/DFCUSER/DFCUAT_ENG was started using the Submit Job (SBMJOB) command
with the following job attributes: JOBPTY(5) OUTPTY(5) PRTTXT()
RTGDTA(QCMDB) SYSLIBL(QSYS QSYS2 QHLPSYS QUSRSYS) CURLIB(QGPL)
INLLIBL(TEST_ENGV2 UATPGMSOLD TRAN_ENGV2 DOROOT TFROOT QGPL)
INLASPGRP(*NONE) LOG(4 00 *NOLIST) LOGCLPGM(*NO) LOGOUTPUT(*JOBEND)
OUTQ(/*DEV) PRTDEV(PRT01) INQMSGRPY(*RQD) HOLD(*NO) DATE(*SYSVAL)
SWS(00000000) MSGQ(QUSRSYS/BEAK) CCSID(37) SRTSEQ(*N/*HEX) LANGID(ENU)
CNTRYID(US) JOBMSGQMX(64) JOBMSGQFL(*PRTWRAP) ALWMLTTHD(*YES) SPLFACN(*KEEP)
ACGCDE(*SYS).
*NONE Request 07/16/14 15:51:43.563698 QWTSCSBJ *N QCMD QSYS 0195
From user . . . . . . . . . : BEAK
Message . . . . : -STRTQQENG ENGLIB(TEST_ENGV2) ENGNAME(TRAN_ENGV2)
PORT('00000000000000012349') NUMCHLD('00000000000000000001')
PATH('/home/quikq/1.4/uat')
*NONE Information 07/16/14 15:51:43.785321 *N *N *N QCMD QSYS 01C8
Message . . . . : PATH [/home/quikq/1.4/uat
]
*NONE Information 07/16/14 15:51:43.785390 *N *N *N QCMD QSYS 01C8
Message . . . . : PORT [12349 ]
CPCA08B Completion 00 07/16/14 15:51:43.833889 QP0LCCHC QSYS *STMT *N *N *N
From module . . . . . . . . : QP0LCCHC
From procedure . . . . . . : send_message__FPcT1iT3T1
Statement . . . . . . . . . : 10
Message . . . . : Current directory changed.
Cause . . . . . : The current directory was changed to /home/quikq/1.4/uat.
MCH3402 Escape 40 09/10/14 16:31:51.747910 AiProcess 000820 TRAN_ENGV2 *STMT
To module . . . . . . . . . : GETJBSWTCH
To procedure . . . . . . . : GetCurrentJobSwitches__FRQ2_3std12basic_string
XTcTQ2_3std11char_traitsXTc_TQ2_3std9allocatorXTc__
Statement . . . . . . . . . : 3
Message . . . . : Tried to refer to all or part of an object that no longer
exists.
Cause . . . . . : The most common cause is that a stored address to an
object is no longer correct because that object was deleted or part of the
object was deleted.
MCH3402 Escape 40 09/10/14 16:31:51.812883 VOXEXRUN 004118 QMHSNDPM QSYS 0712
Message . . . . : Tried to refer to all or part of an object that no longer
exists.
Cause . . . . . : The most common cause is that a stored address to an
object is no longer correct because that object was deleted or part of the
object was deleted.
CPC1219 Completion 50 09/10/14 16:31:51.984977 QWTPITP2 QSYS 0636 *EXT *N
Message . . . . : This job ended abnormally.
Cause . . . . . : An error occurred that caused this job to end abnormally.
Recovery . . . : See the previously listed messages in the job log for
this job. Correct the errors and try the request again.
CPF1164 Completion 00 09/10/14 16:31:52.198834 QWTMCEOJ QSYS 014A *EXT *N
Message . . . . : Job 938513/DFCUSER/DFCUAT_ENG ended on 09/10/14 at
16:31:52; .102 seconds used; end code 30 .
Cause . . . . . : Job 938513/DFCUSER/DFCUAT_ENG completed on 09/10/14 at
16:31:52 after it used .102 seconds processing unit time. The job had
ending code 30. The job ended after 1 routing steps with a secondary ending
code of 0. The job ending codes and their meanings are as follows: 0 - The
job completed normally. 10 - The job completed normally during controlled
ending or controlled subsystem ending. 20 - The job exceeded end severity
(ENDSEV job attribute). 30 - The job ended abnormally. 40 - The job ended
before becoming active. 50 - The job ended while the job was active. 60 -
The subsystem ended abnormally while the job was active. 70 - The system
ended abnormally while the job was active. 80 - The job ended (ENDJOBABN
command). 90 - The job was forced to end after the time limit ended
(ENDJOBABN command). Recovery . . . : For more information, see the Work
management topic collection in the Systems management category in the IBM i
Information Center, http://www.ibm.com/systems/i/infocenter/.

Here is my routine. All I'm using this for is to get the current job switches:

#include <cstring>      // strlen
#include <iostream>
#include <string>
#include <sstream>
#include <stdexcept>
#include <qusec.h>      // Qus_EC_t, the API error code structure
#include <qwcrtvca.h>   // QWCRTVCA

using namespace std;

// Error code header plus room for exception data.
typedef struct {
    Qus_EC_t ec_fields;
    char Exception_Data[200];
} error_code_t;

// Receiver variable for format RTVC0100 holding one 8-byte attribute.
typedef _Packed struct {
    int Number_Fields_Rtnd;
    int Length_Field_Info_Rtnd;
    int Key_Field;
    char Type_Of_Data;
    char Reserved[3];
    int Length_Data;
    char switches[8];
} RTVC0100_t;


void GetCurrentJobSwitches(string& sw)
{
    RTVC0100_t jobSwitches;
    int attribKeys[1];
    error_code_t error_code;

    attribKeys[0] = 1006;   // key for the job switches
    error_code.ec_fields.Bytes_Provided = sizeof(error_code_t);

    QWCRTVCA(&jobSwitches, sizeof(jobSwitches), "RTVC0100", 1, &attribKeys[0], &error_code);
    if (error_code.ec_fields.Bytes_Available) {
        // The API reported an error: build a message and throw.
        stringstream msg;
        msg << "GetCurrentJobSwitches() QWCRTVCA() error " << string(error_code.ec_fields.Exception_Id, 7);
        if (strlen(error_code.Exception_Data)) {
            msg << " " << error_code.Exception_Data;
        }
        throw runtime_error(msg.str());
    }

    string switches(jobSwitches.switches, 8);
    sw = switches;
}
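
A side note on the error path above: the error code structure is not cleared before the call, and the exception data the API returns is not NUL-terminated, so strlen(error_code.Exception_Data) can read bytes that were never written. A minimal sketch of a bounded alternative follows. It assumes a stand-in struct, ErrorCodeSketch, that mirrors the layout used in the post so it compiles off-platform; describeApiError is a hypothetical helper name, not something from an IBM header. Exception data can also hold non-text fields, so treating it as printable text is itself an assumption.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string>

// Stand-in for the error code layout used in the post (assumption: the
// 16-byte Qus_EC_t header followed by a 200-byte exception-data buffer).
struct ErrorCodeSketch {
    int  bytesProvided;       // size of the structure passed to the API
    int  bytesAvailable;      // 0 on success, else header + data length
    char exceptionId[7];      // message ID, not NUL-terminated
    char reserved;
    char exceptionData[200];  // message data, not NUL-terminated
};

// Hypothetical helper: build "MSGID data" using only the bytes the API
// could have written, instead of relying on strlen().
std::string describeApiError(const ErrorCodeSketch& ec)
{
    const int header = static_cast<int>(offsetof(ErrorCodeSketch, exceptionData));
    const int capacity = static_cast<int>(sizeof(ec.exceptionData));

    std::string text(ec.exceptionId, sizeof(ec.exceptionId));

    // Bytes returned are limited by both what was available and what was provided.
    int returned = std::min(ec.bytesAvailable, ec.bytesProvided) - header;
    returned = std::max(0, std::min(returned, capacity));

    if (returned > 0) {
        text += ' ';
        text.append(ec.exceptionData, returned);
    }
    return text;
}
```

On IBM i the same arithmetic would apply to error_code.ec_fields.Bytes_Available and error_code.Exception_Data in the routine above.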

Apparently the "throw" is being executed. My mistake: I forgot I had even put that code in there, and I never added the try-catch around the call. MCH3402 is the error that is occurring. What "object" could it be referring to?
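
For reference, the try-catch that was left out might look like the sketch below. GetCurrentJobSwitchesStub and tryGetJobSwitches are hypothetical names; the stub stands in for the real routine so the example compiles without the IBM i headers. This only handles the runtime_error thrown by the routine. The job log does not establish that the throw was actually reached, and catching a C++ exception says nothing about which object MCH3402 refers to.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the routine in the post. When `fail` is true it
// throws the same exception type the original error path throws.
void GetCurrentJobSwitchesStub(std::string& sw, bool fail)
{
    if (fail) {
        throw std::runtime_error("GetCurrentJobSwitches() QWCRTVCA() error");
    }
    sw = "00000000";  // eight switch characters, as the real routine returns
}

// Returns true and fills `sw` on success. On failure, stores the exception
// text in `err` and returns false instead of letting the exception escape.
bool tryGetJobSwitches(std::string& sw, std::string& err, bool fail = false)
{
    try {
        GetCurrentJobSwitchesStub(sw, fail);
        return true;
    } catch (const std::runtime_error& e) {
        err = e.what();
        return false;
    }
}
```

A caller would check the return value and log err, rather than letting an uncaught exception propagate out of the job.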

CRPence

Sep 11, 2014, 6:19:50 PM
On 11-Sep-2014 15:25 -0500, Mr. K.V.B.L. wrote:
> I've had 3 jobs die mysteriously because of a routine I wrote that
> involves QWCRTVCA. I feel my boss had something to do with it but I
> need to understand what could have possibly happened.

Happened just once? Never since?

Had PTFs been applied *IMMED [i.e. not delayed, applied while system
was active] and then perhaps someone had issued a Clear Library (CLRLIB)
request against the Replace Object (QRPLOBJ) library [or either of the
new QPTFOBJ1 or QPTFOBJ2 libraries that are used for PTF processing in
place of QRPLOBJ]?

> Here is a job log:
>
> <<SNIP>>
> CPCA08B Completion 00 07/16/14 15:51:43.833889
> <<SNIP>>
> Cause: The current directory was changed to /home/quikq/1.4/uat.

The prior message is the last, before the failing instruction that is
an apparent invocation, just over 40 minutes later.

> MCH3402 Escape 40 09/10/14 16:31:51.747910
> AiProcess 000820 TRAN_ENGV2 *STMT

msgMCH3402 F/AiProcess x/0820 T/usrpgm

> To module. . . : GETJBSWTCH
> To procedure. : GetCurrentJobSwitches__FRQ2_3std12basic_string
> XTcTQ2_3std11char_traitsXTc_TQ2_3std9allocatorXTc__
> Statement. . . : 3

What is stmt/3 of the user procedure GetCurrentJobSwitches ? I
suspect, from the given source code, that would be the API invocation;
not the "throw"?

> Message: Tried to refer to all or part of an object that no longer
> exists. <<SNIP>>
> MCH3402 Escape 40 09/10/14 16:31:51.812883
> VOXEXRUN 004118 QMHSNDPM QSYS 0712
> Message: Tried to refer to all or part of an object that no longer
> exists. <<SNIP>>
> CPC1219 Completion 50 09/10/14 16:31:51.984977
> QWTPITP2 QSYS 0636 *EXT *N
> Message: This job ended abnormally.
> Cause: An error occurred that caused this job to end abnormally.
> Recovery . . . : See the previously listed messages in the job log
> for this job. Correct the errors and try the request again.
> CPF1164 Completion 00 09/10/14 16:31:52.198834
> QWTMCEOJ QSYS 014A *EXT *N
> Message: Job 938513/DFCUSER/DFCUAT_ENG ended on 09/10/14 at
> 16:31:52; .102 seconds used; end code 30 .
> Cause: Job 938513/DFCUSER/DFCUAT_ENG completed on 09/10/14 at
> 16:31:52 after it used .102 seconds processing unit time. The job had
> ending code 30. <<SNIP>>

Were there any VLic Logs [VLogs] produced at ~16:31:52; most notably
any VL2800####; (2800) major code; a Licensed Internal Code log with
Major Description Activation/Invocation ?
Whatever would be referenced at inst# 0820 of the AiProcess LIC
routine would be most telling in that regard. Because the pointer does
not store an attribute with the current nor a historical /name/ of the
object, one must infer what is the object to which the pointer was
pointing before the object was destroyed. Given the LIC program that
signaled the x2202 error is involved in effecting Activations and
Invocations, one might presume either the storage to a parameter of the
program being activated or the program itself was deleted; thus the
opening inquiry regarding possibility that someone might have deleted a
copy of the system program to which the C program presumably could have
already obtained a pointer for the failing invocation.

--
Regards, Chuck

ex-PFC Wintergreen

Sep 11, 2014, 6:28:49 PM
On 9/11/2014 1:25 PM, Mr. K.V.B.L. wrote:
> I feel my boss had something to do with it but

I'm not laughing at your problem or frustration, but that comment did
make me laugh.

Mr. K.V.B.L.

Sep 12, 2014, 9:56:21 AM
Well, he was going to move some data around from one library to another or something so I thought that had something to do with it. But since these jobs don't involve any SQL or other file access at all then that may have had nothing to do with it. Yesterday when I told him of my suspicions he stated that he didn't even do anything with that data at all, so I can't blame him (yet).

Mr. K.V.B.L.

Sep 12, 2014, 5:01:54 PM
On Thursday, September 11, 2014 5:19:50 PM UTC-5, CRPence wrote:
> On 11-Sep-2014 15:25 -0500, Mr. K.V.B.L. wrote:
> > I've had 3 jobs die mysteriously because of a routine I wrote that
> > involves QWCRTVCA. I feel my boss had something to do with it but I
> > need to understand what could have possibly happened.
>
> Happened just once? Never since?

Just this once ever. No indication yet on why/how it could have happened.

> Had PTFs been applied *IMMED [i.e. not delayed, applied while system
> was active] and then perhaps someone had issued a Clear Library (CLRLIB)
> request against the Replace Object (QRPLOBJ) library [or either of the
> new QPTFOBJ1 or QPTFOBJ2 libraries that are used for PTF processing in
> place of QRPLOBJ]?

Actually, this machine has had no PTFs applied since it began 24/7 production. We will be taking it down for maintenance soon because it has a failing cache battery.

> > Here is a job log:
> >
> > <<SNIP>>
> > CPCA08B Completion 00 07/16/14 15:51:43.833889
> > <<SNIP>>
> > Cause: The current directory was changed to /home/quikq/1.4/uat.
>
> The prior message is the last, before the failing instruction that is
> an apparent invocation, just over 40 minutes later.

Check the date. This job was started July 16, crashed September 10.

I don't see anything incriminating here, but I've never viewed vlogs before. I started SST, chose option 1, then option 5, then option 1, entered 00000000 for Starting Entry ID, and chose the range 09/10/14 00:00:00 - 09/10/14 23:59:59. I don't really know how to read the logs, but I did not see anything in the time range of the job log.