
[Info-Ingres] Ingres 10.1 - iidbms process randomly disappear


Dennis

Feb 5, 2014, 7:17:13 PM
to info-...@kettleriverconsulting.com
We're testing a 10.1 installation and randomly the iidbms processes will
disappear.

There is nothing in the log files (except messages about no server
running) and I can't seem to correlate this with any particular activity
or process. In other words, not enough info to log a ticket.

Just wondering if anyone else has seen this?

II 10.1.0 (a64.lnx/126)NPTL

Dennis
_______________________________________________
Info-Ingres mailing list
Info-...@kettleriverconsulting.com
http://ext-cando.kettleriverconsulting.com/mailman/listinfo/info-ingres

Karl Schendel

Feb 5, 2014, 7:38:37 PM
to Ingres and related product discussion forum

On Feb 5, 2014, at 6:33 PM, Dennis wrote:

> We're testing a 10.1 installation and randomly the iidbms processes will
> disappear.
>
> There is nothing in the log files (except messages about no server
> running) and can't seem to correlate this with any particular activity
> or process. In other words, not enough info to log a ticket.

That sounds a bit like Linux's OOM killer. Do any of the system logs
have useful information?

Karl
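
A minimal sketch of the kind of system-log check suggested above, assuming the kernel messages end up in a text file such as /var/log/messages (the usual place on RHEL, but an assumption here). The phrases matched are ones the Linux kernel is known to print when the OOM killer runs; the exact wording differs between kernel versions, so treat the patterns as a starting point rather than a complete list. The function name find_oom_events is made up for this example.

```python
import re

# Line printed when some process triggers the OOM killer.
OOM_INVOKED = re.compile(r"invoked oom-killer")

# Line naming the victim. Older kernels say "Kill process", newer ones
# "Killed process"; both are accepted here.
OOM_VICTIM = re.compile(r"Out of memory: Kill(?:ed)? process (\d+) \(([^)]+)\)")


def find_oom_events(lines):
    """Return (pid, process_name, raw_line) tuples for OOM-killer lines.

    pid and process_name are None for lines that only report that the
    OOM killer was invoked without naming a victim.
    """
    events = []
    for line in lines:
        victim = OOM_VICTIM.search(line)
        if victim:
            events.append((int(victim.group(1)), victim.group(2), line.rstrip()))
        elif OOM_INVOKED.search(line):
            events.append((None, None, line.rstrip()))
    return events
```

To use it, open the system log (root is usually required) and pass the file object to find_oom_events, then look for entries whose process name is iidbms. An empty result around the time the server vanished points away from the OOM killer.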

Paul White

Feb 5, 2014, 7:40:18 PM
to Ingres and related product discussion forum
Hi Dennis,

Is there a core dump generated?
Paul

Emma K. McGrattan

Feb 5, 2014, 7:47:27 PM
to Ingres and related product discussion forum
In the couple of times I've seen this, it was some kind of system monitoring process that killed the server because it was consuming too much CPU or memory. We scratched our heads trying to figure it out, only to find that it wasn't an Ingres bug. The last time I saw it was with a Vectorwise customer running on RHEL, where the kernel was killing the X100 server due to mem+swap exhaustion.

-----Original Message-----
From: info-ingr...@kettleriverconsulting.com [mailto:info-ingr...@kettleriverconsulting.com] On Behalf Of Dennis
Sent: Wednesday, February 05, 2014 6:33 PM
To: info-...@kettleriverconsulting.com
Subject: [Info-Ingres] Ingres 10.1 - iidbms process randomly disappear

Laframboise, André

Feb 5, 2014, 8:15:55 PM
to paul....@shift7solutions.com.au, Ingres and related product discussion forum
Hi,

Do you have II_DBMS_LOG configured?

I have it set up to create a log file per DBMS server, and it contains more low-level information than errlog.log alone.
This information may also help the support guys figure out the problem.

Dennis

Feb 6, 2014, 12:17:17 AM
to info-...@kettleriverconsulting.com
On 2/5/2014 5:29 PM, Paul White wrote:
> Hi Dennis,
>
> Is there a core dump generated?
> Paul
>

No core dump generated that I could see. Would this be somewhere in the
II_SYSTEM directory structure? None of the other Ingres processes
disappear, just iidbms.

Dennis

Feb 6, 2014, 12:17:17 AM
to info-...@kettleriverconsulting.com
On 2/5/2014 5:36 PM, Emma K. McGrattan wrote:
> In the couple of times I've seen this it was some kind of system monitoring process that killed the
> server because it was consuming too much CPU or Memory resources.
> We scratched our heads trying to figure it out only to find that it
> wasn't an Ingres bug. The last time I saw it was a Vectorwise
> customer running on RHEL and the kernel was killing
> the X100 server due to mem+swap exhaustion.
>

The only thing that seems to be in common here is running on RHEL. As
mentioned in response to Karl, this server has been running 9.x for
quite some time. But I can check with IT to see if they have some kind
of job checking for CPU or memory usage. This is running in a VM.

Dennis

Feb 6, 2014, 12:17:17 AM
to info-...@kettleriverconsulting.com
On 2/5/2014 6:03 PM, Laframboise, André wrote:
> Hi,
>
> Do you have II_DBMS_LOG configured ?
>
> I have it set up to create a log file per DBMS server and it contains more low level information than just the errlog.log.
> This information may also help the support guys figure out the problem.
>

I don't but will check into this.

Dennis

Feb 6, 2014, 12:17:27 AM
to info-...@kettleriverconsulting.com
On 2/5/2014 5:27 PM, Karl Schendel wrote:
>
> On Feb 5, 2014, at 6:33 PM, Dennis wrote:
>
>> We're testing a 10.1 installation and randomly the iidbms processes will
>> disappear.
>>
>> There is nothing in the log files (except messages about no server
>> running) and can't seem to correlate this with any particular activity
>> or process. In other words, not enough info to log a ticket.
>
> That sounds a bit like linux's OOM killer. Do any of the system logs
> have useful information?
>
> Karl
>
>

Good point. The next time this happens I will look through any system
logs to see if they shed any light.

Also, a 9.x installation has been running for quite some time on this
server without this issue. The 9.x installation is shut down while
testing 10.1.

Dennis

Feb 7, 2014, 5:46:54 AM
to info-...@kettleriverconsulting.com
On 2/5/2014 4:33 PM, Dennis wrote:
> We're testing a 10.1 installation and randomly the iidbms processes will
> disappear.
>
> There is nothing in the log files (except messages about no server
> running) and can't seem to correlate this with any particular activity
> or process. In other words, not enough info to log a ticket.
>
> Just wondering if anyone else has seen this?
>
> II 10.1.0 (a64.lnx/126)NPTL
>

So, this happened again and this is what I found in the log file.

HOSTNAME ::[56681 , 13204 , 00002b7fc7562240,
dm0p.c:19482 ]: Thu Feb 6 12:42:36 2014
E_DM9397_CP_INCOMPLETE_ERROR A Buffer Manager protocol error occurred
during processing of a Consistency Point. All buffers could not be
flushed during the CP. The error occurred on buffer #1. Buffer
information: Table_id (514, 0), status: 00000003, cp_count: 0 (5),
ref_count: 0, TCB: 00000000, validation value: 0 (0).
HOSTNAME ::[56681 , 13204 , 00002b7fc7562240,
dmdcheck.c:152 ]: Thu Feb 6 12:42:36 2014 E_DM9449_DMD_CHECK
DMD_CHECK called from file
/devsrc/ingres10r2/b126/src/back/dmf/dmp/dm0p.c line 19483.
0:2b7feb4190d0 [7582e8]( ... )
1:2b7feb4190e0 [75871b]( ... )
2:2b7feb4190f0 [758740]( ... )
3:2b7feb4192e0 [a23d8b]( ... )
4:2b7feb419490 [932fec]( ... )
5:2b7feb4197b0 [a0e1fb]( ... )
6:2b7feb419910 [5a4610]( ... )
7:2b7feb4199d0 [7a2da0]( ... )
8:2b7feb41e0b0 [4898af]( ... )
9:2b7feb426130 [763f1c]( ... )
HOSTNAME ::[56681 , 13204 , 00002b7fc7562240,
dmdcheck.c:173 ]: Thu Feb 6 12:42:36 2014 E_DM942B_CRASH_LOG
Server has written dmd_check() information to crash log file: iicrash.log.
HOSTNAME ::[56681 , 13204 , 00002b7fc7562240,
dmdcheck.c:178 ]: Thu Feb 6 12:42:36 2014 E_DMF024_INCOMPLETE_CP
Fatal error: All modified pages could not be flushed out of buffer
manager during Consistency Point.
HOSTNAME ::[56681 , 13204 , 00002b7fc7562240,
sc0e.c:702 ]: >>>>>Session 00002B7FC7562240:-347969216<<<<<
HOSTNAME ::[56681 , 13204 , 00002b7fc7562240,
sc0e.c:702 ]: DB Name:
(Owned by:
)
HOSTNAME ::[56681 , 13204 , 00002b7fc7562240,
sc0e.c:702 ]: User: <Consistency Pt Thread> (
<Consistency Pt Thread> )
HOSTNAME ::[56681 , 13204 , 00002b7fc7562240,
sc0e.c:702 ]: Session started at 5-Feb-2014 08:44:09 as
user <Consistency Pt Thread>

HOSTNAME ::[56681 , 13204 , 00002b7fc7562240,
sc0e.c:702 ]: Terminal: NONE
HOSTNAME ::[56681 , 13204 , 00002b7fc7562240,
sc0e.c:702 ]: Group Id:
HOSTNAME ::[56681 , 13204 , 00002b7fc7562240,
sc0e.c:702 ]: Role Id:
HOSTNAME ::[56681 , 13204 , 00002b7fc7562240,
sc0e.c:702 ]: Application Code: 00000000
Current Facility: DMF (00000003)
HOSTNAME ::[56681 , 13204 , 00002b7fc7562240,
sc0e.c:702 ]: Description:
HOSTNAME ::[56681 , 13204 , 00002b7fc7562240,
sc0e.c:702 ]: Query:
HOSTNAME ::[56681 , 13204 , 00002b7fc7562240,
sc0e.c:702 ]: Last Query:
HOSTNAME ::[56681 , 13204 , 00002b7fc7562240,
ulx.c:300 ]: Thu Feb 6 12:42:37 2014
E_DM904A_FATAL_EXCEPTION A fatal error has occurred in the DMF Facility.
HOSTNAME ::[56681 , 13204 , 00002b7fc7562240,
ulx.c:320 ]: Thu Feb 6 12:42:37 2014 E_UL0306_EXCEPTION
Unexpected exception number 68120 occurred, but unable to determine the
type.
HOSTNAME ::[56681 , 13204 , 00002b7fc7562240,
dm2t.c:3765 ]: Thu Feb 6 12:42:37 2014
W_DM9C50_DM2T_FIX_NOTFOUND Table Control Block for specified table does
not exist.
HOSTNAME ::[56681 , 13204 , 00002b7fc7562240,
scsdbfcn.c:1322 ]: Thu Feb 6 12:42:37 2014
E_SC0238_FAST_COMMIT_EXIT Fast Commit Thread terminated abnormally.
HOSTNAME ::[56681 , 13204 , 00002b7fc7562240,
]: Thu Feb 6 12:42:37 2014
E_DM004A_INTERNAL_ERROR Internal DMF error detected.
HOSTNAME ::[56681 , 13204 , 00002b7fc7562240,
dmfcall.c:958 ]: Thu Feb 6 12:42:37 2014
E_DM004A_INTERNAL_ERROR Internal DMF error detected.
HOSTNAME ::[56681 , 13204 , 00002b7fc7562240,
scsinit.c:5032 ]: Thu Feb 6 12:42:37 2014
E_SC012B_SESSION_TERMINATE Error terminating Session.
HOSTNAME ::[56681 , 13204 , 00002b7fc7562240,
scsqncr.c:3359 ]: Thu Feb 6 12:42:37 2014
E_SC0241_VITAL_TASK_FAILURE A Server Task thread necessary to the server
has terminated forcing the shutdown of the DBMS server - Server Task
name ' <Consistency Pt Thread>'.
HOSTNAME ::[56681 , 13204 , 00002b7fed806200,
scddbfcn.c:398 ]: Thu Feb 6 12:42:38 2014 E_SC010E_DB_DELETE
Error deleting database. Name: gcd Owner: ingres Added Id ECB9E640
HOSTNAME ::[56681 , 13204 , 00002b7fed806200,
dmfcall.c:958 ]: Thu Feb 6 12:42:38 2014
E_DM004A_INTERNAL_ERROR Internal DMF error detected.
HOSTNAME ::[56681 , 13204 , 00002b7fed806200,
dmfcall.c:958 ]: Thu Feb 6 12:42:38 2014
E_DM004A_INTERNAL_ERROR Internal DMF error detected.
HOSTNAME ::[56681 , 13204 , 00002b7fed806200,
scsdbfcn.c:1126 ]: Thu Feb 6 12:42:38 2014 E_SC0122_DB_CLOSE
Error closing database. Name: gcd Owner: ingres
HOSTNAME ::[56681 , 13204 , 00002b7fed806200,
scsdbfcn.c:1133 ]: Thu Feb 6 12:42:38 2014 E_SC010D_DB_LOCATION
Database Location Name: $default Physical Specification:
/opt/sdo/database2/gcd/ingres/data/default/gcd Flags: 00000003
HOSTNAME ::[56681 , 13204 , 00002b7fed806200,
scdnote.c:183 ]: Thu Feb 6 12:42:38 2014
E_SC0221_SERVER_ERROR_MAX Error count for server has been exceeded.
HOSTNAME ::[56681 , 13204 , 00002b7fed806200,
scdnote.c:183 ]: LQuery: open ~Q cursor for select * from
"ingres"."tableName1" for readonly
HOSTNAME ::[56681 , 13204 , 00002b7fed242200,
scddbfcn.c:398 ]: Thu Feb 6 12:42:38 2014 E_SC010E_DB_DELETE
Error deleting database. Name: dbName Owner: ingres Added Id EC693980
HOSTNAME ::[56681 , 13204 , 00002b7fed242200,
dmfcall.c:958 ]: Thu Feb 6 12:42:38 2014
E_DM004A_INTERNAL_ERROR Internal DMF error detected.
HOSTNAME ::[56681 , 13204 , 00002b7fed242200,
scdnote.c:183 ]: Thu Feb 6 12:42:38 2014
E_SC0221_SERVER_ERROR_MAX Error count for server has been exceeded.
HOSTNAME ::[56681 , 13204 , 00002b7fed242200,
scdnote.c:183 ]: LQuery: open ~Q cursor for select * from
"ingres"."tableName2" WHERE col1 = 'xxxxxx' for readonly
HOSTNAME ::[56681 , 13204 , 00002b7fc7562240,
dmfcall.c:958 ]: Thu Feb 6 12:42:47 2014
E_DM004A_INTERNAL_ERROR Internal DMF error detected.
HOSTNAME ::[56681 , 13204 , 00002b7fc7562240,
scdinit.c:4059 ]: Thu Feb 6 12:42:47 2014 E_SC0235_AVERAGE_ROWS
On 9. select/retrieve statements, the average row count returned was 1521.
HOSTNAME ::[56681 , 13204 , 00002b7fc7562240,
sc0e.c:324 ]: Thu Feb 6 12:42:47 2014
E_SC0127_SERVER_TERMINATE Error terminating Server.
HOSTNAME ::[56681 , 13204 , 00002b7fed806200,
dmfcall.c:958 ]: Thu Feb 6 12:42:47 2014
E_DM004A_INTERNAL_ERROR Internal DMF error detected.
HOSTNAME ::[54019 IIGCN, 12998 , 0000000000000001]:
Thu Feb 6 12:42:47 2014 E_GC0154_GCN_SRV_SHUTDOWN Registered server
shutdown: class INGRES, address 56681
HOSTNAME ::[36100 , 13168 , 00002acf210e8cc0,
lgmisc.c:1441 ]: Thu Feb 6 12:42:48 2014
E_DMA469_PROCESS_HAS_DIED Process (00003394) has died. A process
attached to the logging and locking system has exited without going
through normal cleanup processing. The system will now perform cleanup
processing on behalf of the failed process.
HOSTNAME ::[36100 , 13168 , 00002acf210e8cc0,
lgkinit.c:1138 ]: Thu Feb 6 12:42:48 2014
E_DMA499_DEAD_PROCESS_INFO Process (00003394) died with info '(DEFAULT)'.
HOSTNAME ::[54019 IIGCN, 12998 , 0000000000000003]:
Thu Feb 6 12:42:48 2014 E_GC0155_GCN_SRV_FAILURE Registered server
failure: class INGRES, address 55875
HOSTNAME::[37957 , 13300 , ffffffff]: Thu Feb 6 12:42:49
2014 E_RE000F_DAEMON_GET_DBEVENT Cannot get dbevent. The SQL error code
was -37000.
HOSTNAME ::[54019 IIGCN, 12998 , 0000000000000006]:
Thu Feb 6 12:42:49 2014 E_GC0154_GCN_SRV_SHUTDOWN Registered server
shutdown: class RMCMD, address 37957
HOSTNAME::[37957 , 13300 , ffffffff]: Thu Feb 6 12:42:49
2014 E_RE0002_RMCMD_DOWN Visual DBA RMCMD Server Normal Shutdown.
HOSTNAME ::[54019 IIGCN, 12998 , 0000000000000000]:
Thu Feb 6 12:42:52 2014 E_GC0139_GCN_NO_DBMS No DBMS servers (for the
specified database) are running in the target installation.
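
For a long excerpt like the one above, it can help to reduce the log to the sequence of message identifiers so the first failure stands out. The following is a rough sketch, not an Ingres tool: it assumes identifiers follow the shape seen in this excerpt (a severity letter, a two-letter facility, four hex digits, then a name, for example E_DM9397_CP_INCOMPLETE_ERROR), and the function name message_ids is invented for the example.

```python
import re

# Shape inferred from the excerpt above: E_DM9397_CP_INCOMPLETE_ERROR,
# W_DM9C50_DM2T_FIX_NOTFOUND, E_GC0154_GCN_SRV_SHUTDOWN, and so on.
MSG_ID = re.compile(r"\b[EWI]_[A-Z]{2}[0-9A-F]{4}_[A-Z0-9_]+")


def message_ids(text):
    """Map each message identifier to how often it occurs.

    The dict keeps identifiers in order of first appearance, so the
    first key is the earliest message in the text.
    """
    counts = {}
    for match in MSG_ID.finditer(text):
        ident = match.group(0)
        counts[ident] = counts.get(ident, 0) + 1
    return counts
```

Run over the text of errlog.log from just before the crash, the first key returned is the earliest message in that window, which is usually the one worth quoting in a support ticket; the later repeats are typically fallout from the shutdown.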

Alex Hanshaw

Feb 7, 2014, 6:32:29 AM
to Ingres and related product discussion forum
Hi Dennis

Please log an issue and I can get someone to take a look at this for you.

Thanks

Alex

-----------------------------------
Alex Hanshaw
Director, Engineering
Actian | Engineering
Accelerating Big Data  2.0
O +44 1753 559515
M +44 7793 757929
www.actian.com


-----Original Message-----
From: info-ingr...@kettleriverconsulting.com [mailto:info-ingr...@kettleriverconsulting.com] On Behalf Of Dennis
Sent: 07 February 2014 10:19
To: info-...@kettleriverconsulting.com
Subject: Re: [Info-Ingres] Ingres 10.1 - iidbms process randomly disappear

DavidR

Feb 7, 2014, 6:47:56 AM
to info-...@kettleriverconsulting.com
That trace proves that the DBMS server is crashing and not being killed
by an external monitoring process.

I think you need to contact Actian support for a workaround or patch.

Dennis

Feb 7, 2014, 7:18:01 AM
to info-...@kettleriverconsulting.com
On 2/7/2014 4:19 AM, Alex Hanshaw wrote:
> Hi Dennis
>
> Please log an issue and I can get someone to take a look at this for you.
>
> Thanks
>
> Alex
>

Thanks. I've logged a ticket and attached errlog.log and iicrash.log.

Dennis

James K. Lowden

Feb 7, 2014, 8:01:00 PM
to info-...@kettleriverconsulting.com
On Wed, 5 Feb 2014 19:27:13 -0500
Karl Schendel <sche...@kbcomputer.com> wrote:

> > There is nothing in the log files (except messages about no server
> > running) and can't seem to correlate this with any particular
> > activity or process. In other words, not enough info to log a
> > ticket.
>
> That sounds a bit like linux's OOM killer. Do any of the system logs
> have useful information?

Karl, can I ask a naive question? Why let Ingres start on Linux if
oversubscribed memory is configured? It's terrible for a DBMS in every
way. Why not just fail to start with a message about what to fix, or
use a tiny setuid binary to adjust the setting before starting the
daemon?

--jkl
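
To make the idea of a startup pre-check concrete, here is a small sketch of how a wrapper could inspect the kernel's overcommit policy before launching a server. It is only an illustration and not something Ingres is known to do. The file /proc/sys/vm/overcommit_memory and its values 0, 1 and 2 are standard Linux; the function names and the wording of the descriptions are made up for this example.

```python
from pathlib import Path

OVERCOMMIT_FILE = Path("/proc/sys/vm/overcommit_memory")

# Meaning of the three documented values of vm.overcommit_memory.
MODES = {
    0: "heuristic overcommit (kernel default)",
    1: "always overcommit",
    2: "strict accounting, allocations fail beyond the commit limit",
}


def describe_overcommit(raw):
    """Turn the raw file content (e.g. '0\\n') into (value, description)."""
    value = int(raw.strip())
    return value, MODES.get(value, "unknown mode")


def check_host():
    """Return (value, description) for this host, or None if not on Linux."""
    if not OVERCOMMIT_FILE.exists():
        return None
    return describe_overcommit(OVERCOMMIT_FILE.read_text())
```

A start script could call check_host() and refuse to continue, or just print a warning, when the value is not 2. Whether refusing to start is the right policy is exactly the question being asked here.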

Karl Schendel

Feb 10, 2014, 10:27:48 AM
to Ingres and related product discussion forum

On Feb 7, 2014, at 7:46 PM, James K. Lowden wrote:

>
> Karl, can I ask a naive question? Why let Ingres start on Linux if
> oversubscribed memory is configured? It's terrible for a DBMS in every
> way. Why not just fail to start with a message about what to fix, or
> use a tiny setuid binary to adjust the setting before starting the
> daemon?

You might be trying to run on a box with underconfigured swap.
In that situation, memory might appear to be oversubscribed but
in fact there's no real memory pressure as long as you don't
overload the machine.

When running Vectorwise there's also something goofy about the way
the x100 server allocates memory; I forget the details, but I think
it allocates large (multiple GB) chunks of virtual address space
without necessarily using it all.

Karl
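
The difference between reserving address space and actually using it can be shown with an anonymous memory mapping. This sketch illustrates the general behaviour only and says nothing about how the x100 server really allocates memory. It maps a region and writes to just a few pages; on Linux the kernel normally backs a page with physical memory only once it is first touched. The function name reserve_and_touch is invented for the example.

```python
import mmap

PAGE = mmap.PAGESIZE


def reserve_and_touch(size_bytes, pages_to_touch):
    """Map size_bytes of anonymous memory and write to only a few pages.

    Returns (bytes_mapped, bytes_touched). The mapping is released
    before the function returns.
    """
    region = mmap.mmap(-1, size_bytes)  # -1: anonymous, not backed by a file
    try:
        for i in range(pages_to_touch):
            offset = i * PAGE
            # Writing one byte is enough to touch the whole page.
            region[offset:offset + 1] = b"\x01"
        return len(region), pages_to_touch * PAGE
    finally:
        region.close()
```

Calling reserve_and_touch(1 << 30, 4) asks for 1 GiB of address space but touches only four pages. Tools that look at virtual size will report the full gigabyte, while resident memory grows by only a few pages, which is why a box can look oversubscribed without being under real memory pressure.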