Need Help: MySQL Cluster failed with corrupt schema file error


Deepashree Sathe

Nov 15, 2016, 5:10:15 AM
to codership
We are facing a corrupt schema file problem with MySQL Cluster. The error is: Forced node shutdown completed. Occured during startphase 4. Caused by error 2355: 'Failure to restore schema(Resource configuration error). Permanent error, external action needed'.

We have been using MySQL Cluster (mysql-cluster-gpl-7.4.11-linux-glibc2.5-x86_64) as the primary data store for our services. This setup had been running for over two months (since 1st Sept 2016).
Last week we suddenly hit a major failure that brought down the entire data store, and it has still not recovered.

Here is the series of events leading up to the issue:
1. Initially the cluster had 4 data nodes, 3 API nodes and 2 management nodes.
2. Cluster memory usage reached 65% of the allocated storage, so we added 2 data nodes.
The procedure we followed is the one described in this document: [https://dev.mysql.com/doc/refman/5.7/en/mysql-cluster-online-add-node-example.html]
3. The setup was brought up through a rolling restart and continued to serve requests.
4. We then began reorganising the existing tables. During one of the reorganisations, the cluster went down.
5. It then failed to start, stopping on the invalid schema file error.
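
To illustrate step 4: in the linked procedure, each existing table is redistributed across the new node group with ALTER TABLE ... REORGANIZE PARTITION and then optimised to reclaim space on the original nodes. Below is a minimal Python sketch that only generates those statements, one table at a time. The helper name reorg_statements and the table names are hypothetical, and the exact syntax should be checked against the manual for the cluster version in use.

```python
def reorg_statements(tables):
    """Build the per-table SQL for redistributing data after adding data nodes."""
    statements = []
    for table in tables:
        # Backtick-quote the identifier, doubling any embedded backticks.
        quoted = "`" + table.replace("`", "``") + "`"
        # Redistribute existing rows across all node groups, including the new one.
        statements.append(
            f"ALTER TABLE {quoted} ALGORITHM=INPLACE, REORGANIZE PARTITION;"
        )
        # Reclaim the space freed on the original data nodes.
        statements.append(f"OPTIMIZE TABLE {quoted};")
    return statements


if __name__ == "__main__":
    # Hypothetical table names, for illustration only.
    for stmt in reorg_statements(["orders", "customers"]):
        print(stmt)
```

Generating the statements first and running them one table at a time makes it easier to see which table was being reorganised if the cluster goes down partway through.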

Here are the current configuration parameters from config.ini:
[TCP default]
SendBufferMemory=32M
ReceiveBufferMemory=32M

[NDBD DEFAULT]
TotalSendBufferMemory = 128M
NoOfReplicas=2
DataMemory=12G
IndexMemory=4G
FragmentLogFileSize=256M
NoOfFragmentLogFiles=72
RedoBuffer=128M
MaxNoOfTables=4096
MaxNoOfAttributes=24756
MaxNoOfOrderedIndexes=2048
MaxNoOfUniqueHashIndexes=512
MaxNoOfConcurrentOperations=1000000
Diskcheckpointspeed=10M
Diskcheckpointspeedinrestart=100M
TimeBetweenGlobalCheckpoints=1000
SharedGlobalMemory=384M
TimeBetweenLocalCheckpoints=6
DiskPageBufferMemory=3072MB
MaxNoOfConcurrentScans=500
MaxNoOfLocalScans=50000
MaxNoOfExecutionThreads=8
TransactionDeadlockDetectionTimeout=300000
CrashOnCorruptedTuple=false
TimeBetweenWatchDogCheck=60000
LcpScanProgressTimeout=300
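
As a rough sanity check on the sizes above, here is a small Python sketch of the per-data-node footprint these settings imply. It assumes binary units (K = 1024) and that the redo log consists of NoOfFragmentLogFiles sets of 4 files, each of FragmentLogFileSize. The helper names parse_size and redo_log_bytes are made up for this example.

```python
import re

# Binary multipliers for the size suffixes used in config.ini.
UNITS = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3}


def parse_size(value):
    """Convert strings such as '256M', '12G' or '3072MB' to a byte count."""
    match = re.fullmatch(r"(\d+)([KMG]?)B?", value.strip().upper())
    if match is None:
        raise ValueError(f"unrecognised size: {value!r}")
    number, unit = match.groups()
    return int(number) * UNITS[unit]


def redo_log_bytes(no_of_fragment_log_files, fragment_log_file_size):
    """Total redo log size: sets of 4 files, each FragmentLogFileSize."""
    return no_of_fragment_log_files * 4 * parse_size(fragment_log_file_size)


GIB = 1024**3
# Redo log on disk per data node, in GiB.
print(redo_log_bytes(72, "256M") / GIB)               # 72.0
# DataMemory + IndexMemory per data node, in GiB.
print((parse_size("12G") + parse_size("4G")) / GIB)   # 16.0
```

So with these values each data node needs about 72 GiB of disk for the redo log alone, plus 16 GiB of RAM for DataMemory and IndexMemory before the other buffers are counted.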

We have tried multiple ways to bring it back up, with no luck. It would be great if you could help us with this problem.

Bill G

Mar 27, 2018, 12:57:14 AM
to codership
Hi,

Were you able to figure out a solution to this problem? I ran into the exact same problem yesterday.

Many thanks!