Importing data using the cpimport tool fails.


Yusuke Ueda

Sep 5, 2022, 3:34:54 AM
to MariaDB ColumnStore
Hello everyone.

We are running ColumnStore as a Docker container on a server and trying to perform large-scale aggregation.
However, when we import the data for aggregation one table after another, the import fails intermittently.
The data is split by year, with 200 columns, 200 million records, and 100 tables with 50GB of raw TSV data.
Initially, restarting the Docker container and re-running the import succeeds, but after a certain number of tables have been loaded, the import fails even after a restart and never succeeds again.
If I truncate another table that was already imported, the failed import succeeds, but re-importing the truncated table then fails.
I considered that we might be running out of storage, but there is enough free space.
I would appreciate it if you could help me to solve the problem.

Details are described below.



Environment:
36 Cores CPU
256GB RAM
5TB SSD
CentOS Stream 8
MariaDB 10.6 / Columnstore 6.x (docker image)


Launch Configuration:
----- docker-compose.yml -----
version: "3.8"

services:
  mcs-container:
    container_name: mcs-container
    hostname: mcs-container
    image: mariadb/columnstore
    restart: always
    volumes:
      - config:/etc/columnstore/
      - data:/var/lib/columnstore/
      - mysql:/var/lib/mysql/
      - log:/var/log/
    ports:
      - 3306:3306
    deploy:
      resources:
        limits:
          cpus: '32'
          memory: '128g'

volumes:
  config:
    driver: local
  data:
    driver: local
  log:
    driver: local
  mysql:
    driver: local
----- docker-compose.yml end -----


Error logs:
Aug 12 11:08:39 mcs-container ddlpackageproc[212]: 39.801352 |376|1140|0| D 23 CAL0041: Start SQL statement: TRUNCATE TABLE mnjk_item_2017;|tabulation|
Aug 12 11:08:41 mcs-container ddlpackageproc[212]: 41.957874 |376|1140|0| D 23 CAL0042: End SQL statement
Aug 12 11:08:42 mcs-container cpimport.bin[33961]: 42.756248 |0|0|0| I 34 CAL0086: Initiating BulkLoad: -L /var/log/mariadb/columnstore/cpimport/ -s \t -P pm1-33961 -T SYSTEM -u5e686595-cc2b-45c6-89ae-d5cf824799f9 tabulation mnjk_item_2017
Aug 12 11:08:42 mcs-container cpimport.bin[33961]: 42.900782 |0|0|0| I 34 CAL0081: Start BulkLoad: JobId-34265; db-tabulation
Aug 12 11:16:30 mcs-container messagequeue[48]: 30.761348 |0|0|0| W 31 CAL0000: Client read close socket for InetStreamSocket::readToMagic: Remote is closed        
Aug 12 11:16:30 mcs-container controllernode[48]: 30.761665 |0|0|0| C 29 CAL0000: DBRM Controller: Network error reading from node 1.  Reading response to command 44, length 1755.  Will see if retry is possible.        
Aug 12 11:16:30 mcs-container controllernode[48]: 30.771493 |0|0|0| C 29 CAL0000: DBRM Controller: undo(): warning, could not contact worker number 1#012        
Aug 12 11:21:29 mcs-container controllernode[48]: 29.846436 |0|0|0| C 29 CAL0000: A node is unresponsive for cmd = 44, no reconfigure in at least 300 seconds.  Setting read-only mode.        
Aug 12 11:21:29 mcs-container cpimport.bin[33961]: 29.847104 |0|0|0| E 34 CAL0087: BulkLoad Error: extendColumnNewExtent: error creating BRM extent after column OID-34409; DBRoot-1; part-0; seg-1; newDBRoot-1; newpart-0;  a BRM Allocate extent error. [BRM error status: DBRM is in READ-ONLY mode]; Error allocating extent stripe for table 34265; DBRoot: 1
Aug 12 11:21:29 mcs-container cpimport.bin[33961]: 29.859007 |0|0|0| E 34 CAL0087: BulkLoad Error: extendColumnNewExtent: error creating BRM extent after column OID-34406; DBRoot-1; part-0; seg-1; newDBRoot-1; newpart-0;  a BRM Allocate extent error.; Previous error allocating extent stripe for table 34265; DBRoot: 1
Aug 12 11:21:29 mcs-container cpimport.bin[33961]: 29.859057 |0|0|0| E 34 CAL0087: BulkLoad Error: extendColumnNewExtent: error creating BRM extent after column OID-34408; DBRoot-1; part-0; seg-1; newDBRoot-1; newpart-0;  a BRM Allocate extent error.; Previous error allocating extent stripe for table 34265; DBRoot: 1
Aug 12 11:21:29 mcs-container cpimport.bin[33961]: 29.859113 |0|0|0| E 34 CAL0087: BulkLoad Error: parseDict: error extending column:  OID-34406;  a BRM Allocate extent error.
Aug 12 11:21:29 mcs-container cpimport.bin[33961]: 29.859143 |0|0|0| E 34 CAL0087: BulkLoad Error: parseDict: error extending column:  OID-34409;  a BRM Allocate extent error.
Aug 12 11:21:29 mcs-container cpimport.bin[33961]: 29.859176 |0|0|0| E 34 CAL0087: BulkLoad Error: parseDict: error extending column:  OID-34408;  a BRM Allocate extent error.
Aug 12 11:21:29 mcs-container cpimport.bin[33961]: 29.859210 |0|0|0| E 34 CAL0087: BulkLoad Error: Bulkload Parse (thread 0) Failed for Table tabulation.mnjk_item_2017 during parsing.  Terminating this job.
Aug 12 11:21:29 mcs-container cpimport.bin[33961]: 29.859241 |0|0|0| E 34 CAL0087: BulkLoad Error: Bulkload Parse (thread 1) Failed for Table tabulation.mnjk_item_2017 during parsing.  Terminating this job.
Aug 12 11:21:29 mcs-container cpimport.bin[33961]: 29.859285 |0|0|0| E 34 CAL0087: BulkLoad Error: Bulkload Parse (thread 2) Failed for Table tabulation.mnjk_item_2017 during parsing.  Terminating this job.
Aug 12 11:21:30 mcs-container writeengine[33961]: 30.929164 |0|0|0| I 19 CAL0084: ClearTableLock: Starting bulk rollback for table tabulation.mnjk_item_2017 (OID-34265); lock-1088; initiated by cpimport.bin.
Aug 12 11:21:30 mcs-container writeengine[33961]: 30.929653 |0|0|0| E 19 CAL0085: ClearTableLock: Ending bulk rollback for table tabulation.mnjk_item_2017 (OID-34265); lock-1088; initiated by cpimport.bin. (rollback failed; Bulk rollback for table tabulation.mnjk_item_2017 (OID-34265) not performed; BRM is in read-only state.).
Aug 12 11:21:30 mcs-container writeengine[33961]: 30.929706 |0|0|0| I 19 CAL0085: ClearTableLock: Ending bulk rollback for table tabulation.mnjk_item_2017 (OID-34265); lock-1088; initiated by cpimport.bin. (rollback failed; Bulk rollback for table tabulation.mnjk_item_2017 (OID-34265) not performed; BRM is in read-only state.).
Aug 12 11:21:30 mcs-container cpimport.bin[33961]: 30.929759 |0|0|0| E 34 CAL0087: BulkLoad Error: Error rolling back table tabulation.mnjk_item_2017; Bulk rollback for table tabulation.mnjk_item_2017 (OID-34265) not performed; BRM is in read-only state.
Aug 12 11:21:30 mcs-container cpimport.bin[33961]: 30.929865 |0|0|0| I 34 CAL0082: End BulkLoad: JobId-34265; status-FAILED

Ferran Gil Cuesta

Sep 7, 2022, 10:57:06 AM
to MariaDB ColumnStore
Hi Yusuke,

We saw very poor performance running MariaDB/ColumnStore inside a Docker container compared with installing MariaDB/ColumnStore directly on the OS. You may want to give the direct installation a try.

I cannot give you any other advice related to the specific problem, though.

Cheers,
Ferran

drrtuy

Sep 16, 2022, 10:04:43 AM
to MariaDB ColumnStore
Hi Yusuke,

The main reason you see the failures is the default shared memory limit. The error message below points to this, because BRM lives entirely in shared memory (/dev/shm/, to be exact):
Aug 12 11:21:29 mcs-container cpimport.bin[33961]: 29.859113 |0|0|0| E 34 CAL0087: BulkLoad Error: parseDict: error extending column:  OID-34406;  a BRM Allocate extent error
By default, containers have a low shared memory limit, so you need to raise the shm-size value in your compose YAML file. We have seen this issue in our cloud before.
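
To confirm which limit the container actually has, the size of /dev/shm can be checked from inside it. The following is only a minimal sketch in Python, assuming Python 3 is available in the container (the thread does not say so); the function name `shm_usage` is hypothetical. For reference, Docker gives a container 64 MB of /dev/shm unless a larger size is configured, and in Compose this is normally done with the service-level `shm_size` key.

```python
import shutil


def shm_usage(path="/dev/shm"):
    """Return (total, used, free) in MiB for the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)  # standard library call, works on any mounted path
    mib = 1024 * 1024
    return usage.total / mib, usage.used / mib, usage.free / mib
```

Calling `shm_usage()` inside the container before and after a large import shows how close the BRM structures come to the limit; a total near 64 MiB would suggest that the Docker default is still in effect.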

Regards,
Roman


Yusuke Ueda

Sep 20, 2022, 4:50:26 AM
to MariaDB ColumnStore
Hi Ferran, Roman,

After increasing the shm-size as you pointed out, cpimport succeeded.
Could this also be the cause of the slow performance in the Docker environment?

Also, this time we set the shm-size to 16g, which is quite large. What would be an appropriate value for our amount of data?

Regards,
Yusuke


drrtuy

Jan 17, 2023, 2:44:00 AM
to MariaDB ColumnStore
Sorry for the delay,

There are multiple components that use shared memory, but the two that contribute the most are:
- the extent map, which is a linear array of extents (around 128 bytes each). An extent holds 8 million values of a column, so you can take the number of records in the database / 8 million * the number of columns to get an estimate of the number of extents.
- the extent map index, which is a hash map and roughly 75-110% of the extent map (EM) size.

With your data, I don't expect the container to utilize more than 200 MB of RAM.
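
The estimate can be worked through with a short Python sketch. It uses only the figures quoted in this thread (about 128 bytes per extent, 8 million values per extent, an index of 75-110% of the extent map). The function name `estimate_shm_bytes` is hypothetical, and reading the data volume as 100 tables of 200 million records and 200 columns each is an assumption, since the original post does not spell this out.

```python
import math

EXTENT_ROWS = 8_000_000    # "8 mln" column values per extent, as quoted in the thread
EXTENT_ENTRY_BYTES = 128   # approximate size of one extent map entry, as quoted


def estimate_shm_bytes(rows_per_table, columns, tables):
    """Return (extent_map, total_low, total_high) in bytes.

    total_low and total_high add the extent map index at 75% and 110%
    of the extent map size respectively.
    """
    extents_per_column = math.ceil(rows_per_table / EXTENT_ROWS)
    extents = extents_per_column * columns * tables
    em_bytes = extents * EXTENT_ENTRY_BYTES
    index_low = em_bytes * 75 // 100
    index_high = em_bytes * 110 // 100
    return em_bytes, em_bytes + index_low, em_bytes + index_high


# Assumed reading of the data set: 100 tables x 200 columns x 200 million rows.
em, low, high = estimate_shm_bytes(200_000_000, 200, 100)
print(em, low, high)  # 64000000 112000000 134400000
```

Under that assumption the extent map is about 64 MB and the total with the index is roughly 112-134 MB, which is consistent with the figure of under 200 MB given above. Other components also use shared memory, so this is a lower bound rather than a recommended shm-size.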

Regards,
Roman
