Yusuke Ueda
Sep 5, 2022, 3:34:54 AM
to MariaDB ColumnStore
Hello everyone.
We have set up ColumnStore as a Docker container on a server and are trying to perform large-scale aggregation.
However, when the data for the aggregation is imported continuously, the import fails intermittently.
The data is organized by year, with 200 columns, 200 million records, and 100 tables with 50GB of raw TSV data.
At first, restarting the Docker container and re-running the import succeeds, but after a certain number of tables have been loaded, the import fails even after a restart and never succeeds again.
If I truncate another table that was already imported, the failed import then succeeds, but re-importing the truncated table now fails.
I suspected that we were running out of storage, but there is enough free space.
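For reference, one way to confirm the free space from inside the container is sketched below. This is a minimal Python example, not necessarily how the check was done here; `free_space_report` is a hypothetical helper name, and the path to inspect (for example the `/var/lib/columnstore/` mount from the compose file) is passed in by the caller.

```python
import shutil


def free_space_report(path: str) -> dict:
    """Return total/free space (GB) and free percentage for the filesystem holding `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return {
        "total_gb": usage.total / 1e9,
        "free_gb": usage.free / 1e9,
        "free_pct": 100.0 * usage.free / usage.total,
    }
```

Calling `free_space_report("/var/lib/columnstore/")` inside the container would report on the data volume. It only covers byte-level capacity of that one filesystem.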
I would appreciate it if you could help me to solve the problem.
Details are described below.
Environment:
36 Cores CPU
256GB RAM
5TB SSD
CentOS Stream 8
MariaDB 10.6 / Columnstore 6.x (docker image)
Launch Configuration:
----- docker-compose.yml -----
version: "3.8"
services:
  mcs-container:
    container_name: mcs-container
    hostname: mcs-container
    image: mariadb/columnstore
    restart: always
    volumes:
      - config:/etc/columnstore/
      - data:/var/lib/columnstore/
      - mysql:/var/lib/mysql/
      - log:/var/log/
    ports:
      - 3306:3306
    deploy:
      resources:
        limits:
          cpus: '32'
          memory: '128g'
volumes:
  config:
    driver: local
  data:
    driver: local
  log:
    driver: local
  mysql:
    driver: local
----- docker-compose.yml end -----
Error logs:
Aug 12 11:08:39 mcs-container ddlpackageproc[212]: 39.801352 |376|1140|0| D 23 CAL0041: Start SQL statement: TRUNCATE TABLE mnjk_item_2017;|tabulation|
Aug 12 11:08:41 mcs-container ddlpackageproc[212]: 41.957874 |376|1140|0| D 23 CAL0042: End SQL statement
Aug 12 11:08:42 mcs-container cpimport.bin[33961]: 42.756248 |0|0|0| I 34 CAL0086: Initiating BulkLoad: -L /var/log/mariadb/columnstore/cpimport/ -s \t -P pm1-33961 -T SYSTEM -u5e686595-cc2b-45c6-89ae-d5cf824799f9 tabulation mnjk_item_2017
Aug 12 11:08:42 mcs-container cpimport.bin[33961]: 42.900782 |0|0|0| I 34 CAL0081: Start BulkLoad: JobId-34265; db-tabulation
Aug 12 11:16:30 mcs-container messagequeue[48]: 30.761348 |0|0|0| W 31 CAL0000: Client read close socket for InetStreamSocket::readToMagic: Remote is closed
Aug 12 11:16:30 mcs-container controllernode[48]: 30.761665 |0|0|0| C 29 CAL0000: DBRM Controller: Network error reading from node 1. Reading response to command 44, length 1755. Will see if retry is possible.
Aug 12 11:16:30 mcs-container controllernode[48]: 30.771493 |0|0|0| C 29 CAL0000: DBRM Controller: undo(): warning, could not contact worker number 1#012
Aug 12 11:21:29 mcs-container controllernode[48]: 29.846436 |0|0|0| C 29 CAL0000: A node is unresponsive for cmd = 44, no reconfigure in at least 300 seconds. Setting read-only mode.
Aug 12 11:21:29 mcs-container cpimport.bin[33961]: 29.847104 |0|0|0| E 34 CAL0087: BulkLoad Error: extendColumnNewExtent: error creating BRM extent after column OID-34409; DBRoot-1; part-0; seg-1; newDBRoot-1; newpart-0; a BRM Allocate extent error. [BRM error status: DBRM is in READ-ONLY mode]; Error allocating extent stripe for table 34265; DBRoot: 1
Aug 12 11:21:29 mcs-container cpimport.bin[33961]: 29.859007 |0|0|0| E 34 CAL0087: BulkLoad Error: extendColumnNewExtent: error creating BRM extent after column OID-34406; DBRoot-1; part-0; seg-1; newDBRoot-1; newpart-0; a BRM Allocate extent error.; Previous error allocating extent stripe for table 34265; DBRoot: 1
Aug 12 11:21:29 mcs-container cpimport.bin[33961]: 29.859057 |0|0|0| E 34 CAL0087: BulkLoad Error: extendColumnNewExtent: error creating BRM extent after column OID-34408; DBRoot-1; part-0; seg-1; newDBRoot-1; newpart-0; a BRM Allocate extent error.; Previous error allocating extent stripe for table 34265; DBRoot: 1
Aug 12 11:21:29 mcs-container cpimport.bin[33961]: 29.859113 |0|0|0| E 34 CAL0087: BulkLoad Error: parseDict: error extending column: OID-34406; a BRM Allocate extent error.
Aug 12 11:21:29 mcs-container cpimport.bin[33961]: 29.859143 |0|0|0| E 34 CAL0087: BulkLoad Error: parseDict: error extending column: OID-34409; a BRM Allocate extent error.
Aug 12 11:21:29 mcs-container cpimport.bin[33961]: 29.859176 |0|0|0| E 34 CAL0087: BulkLoad Error: parseDict: error extending column: OID-34408; a BRM Allocate extent error.
Aug 12 11:21:29 mcs-container cpimport.bin[33961]: 29.859210 |0|0|0| E 34 CAL0087: BulkLoad Error: Bulkload Parse (thread 0) Failed for Table tabulation.mnjk_item_2017 during parsing. Terminating this job.
Aug 12 11:21:29 mcs-container cpimport.bin[33961]: 29.859241 |0|0|0| E 34 CAL0087: BulkLoad Error: Bulkload Parse (thread 1) Failed for Table tabulation.mnjk_item_2017 during parsing. Terminating this job.
Aug 12 11:21:29 mcs-container cpimport.bin[33961]: 29.859285 |0|0|0| E 34 CAL0087: BulkLoad Error: Bulkload Parse (thread 2) Failed for Table tabulation.mnjk_item_2017 during parsing. Terminating this job.
Aug 12 11:21:30 mcs-container writeengine[33961]: 30.929164 |0|0|0| I 19 CAL0084: ClearTableLock: Starting bulk rollback for table tabulation.mnjk_item_2017 (OID-34265); lock-1088; initiated by cpimport.bin.
Aug 12 11:21:30 mcs-container writeengine[33961]: 30.929653 |0|0|0| E 19 CAL0085: ClearTableLock: Ending bulk rollback for table tabulation.mnjk_item_2017 (OID-34265); lock-1088; initiated by cpimport.bin. (rollback failed; Bulk rollback for table tabulation.mnjk_item_2017 (OID-34265) not performed; BRM is in read-only state.).
Aug 12 11:21:30 mcs-container writeengine[33961]: 30.929706 |0|0|0| I 19 CAL0085: ClearTableLock: Ending bulk rollback for table tabulation.mnjk_item_2017 (OID-34265); lock-1088; initiated by cpimport.bin. (rollback failed; Bulk rollback for table tabulation.mnjk_item_2017 (OID-34265) not performed; BRM is in read-only state.).
Aug 12 11:21:30 mcs-container cpimport.bin[33961]: 30.929759 |0|0|0| E 34 CAL0087: BulkLoad Error: Error rolling back table tabulation.mnjk_item_2017; Bulk rollback for table tabulation.mnjk_item_2017 (OID-34265) not performed; BRM is in read-only state.
Aug 12 11:21:30 mcs-container cpimport.bin[33961]: 30.929865 |0|0|0| I 34 CAL0082: End BulkLoad: JobId-34265; status-FAILED
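To make the sequence of events easier to follow, the log above can be filtered down to its warning, error, and critical entries. The following is a minimal Python sketch under some assumptions: the single letter before the subsystem number is treated as a severity code (D, I, W, E, C), which is inferred from the lines above rather than taken from ColumnStore documentation, and `LogEntry`, `parse_line`, and `severe_events` are hypothetical names.

```python
import re
from typing import Iterable, List, NamedTuple, Optional

# Matches lines shaped like the syslog excerpts above, e.g.
# "Aug 12 11:16:30 host proc[48]: 30.761665 |0|0|0| C 29 CAL0000: message"
LOG_RE = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+"
    r"(?P<host>\S+)\s+"
    r"(?P<process>[\w.]+)\[(?P<pid>\d+)\]:\s+"
    r"[\d.]+\s+\|\d+\|\d+\|\d+\|\s+"
    r"(?P<severity>[A-Z])\s+\d+\s+"
    r"(?P<code>CAL\d+):\s*(?P<message>.*)$"
)


class LogEntry(NamedTuple):
    timestamp: str
    process: str
    pid: int
    severity: str  # assumed: D, I, W, E or C
    code: str
    message: str


def parse_line(line: str) -> Optional[LogEntry]:
    """Parse one log line; return None if it does not match the expected shape."""
    m = LOG_RE.match(line.strip())
    if m is None:
        return None
    return LogEntry(
        timestamp=m.group("timestamp"),
        process=m.group("process"),
        pid=int(m.group("pid")),
        severity=m.group("severity"),
        code=m.group("code"),
        message=m.group("message"),
    )


def severe_events(lines: Iterable[str], levels=("W", "E", "C")) -> List[LogEntry]:
    """Return parsed entries whose severity is in `levels`, in input order."""
    entries = (parse_line(line) for line in lines)
    return [e for e in entries if e is not None and e.severity in levels]
```

Applied to the excerpt above, the first entries returned are the `messagequeue` and `controllernode` messages at 11:16:30, which precede the `cpimport.bin` errors at 11:21:29.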