MariaDB: rocksdb_commit_in_the_middle + TRANSACTION

106 views
Skip to first unread message

Juan Telleria

unread,
Nov 27, 2018, 9:20:42 AM11/27/18
to MyRocks - RocksDB storage engine for MySQL
    
RocksDB is great for small transactions (in memory), but if sometimes it is necessary to load big tables during nightly loads:  

In such cases, it is mandatory to enable rocksdb_commit_in_the_middle = 1 or rocksdb_bulk_load = 1. But if these loads fail, we do not want to have corrupted data in the tables, nor a loss of data.

START TRANSACTION;
    
    SET 
        rocksdb_bulk_load = 1,
        rocksdb_commit_in_the_middle = 1
    FOR
         CREATE OR REPLACE TABLE database_name.table_name_MyRocks
         AS
              SELECT
                   *
              FROM
                   database_name.table_name_Aria;
    
    
COMMIT; // ROLLBACK;

So my question is:

> If the Transaction fails, and a ROLLBACK is applied, will we be able to recover all of the data in the table, as it was previously to the transactions? 

> Or would the table get modified, and we would get as a result a table with corrupted data?

Thank you!

Juan Telleria

Juan Telleria

unread,
Nov 27, 2018, 9:41:59 AM11/27/18
to MyRocks - RocksDB storage engine for MySQL
* Typo Fix: SET STATEMENT ... FOR

Sergey Petrunya

unread,
Nov 30, 2018, 2:06:25 PM11/30/18
to jtell...@gmail.com, myroc...@googlegroups.com
rocksdb_commit_in_the_middle=1 causes MyRocks to issue a RocksDB commit every rocksdb_bulk_load_size row operations.  There's no way to undo the commit, so the changes that have been commited will stay there, even if the transaction fails.  You may end up with partially loaded data.  The data will look as if the transaction has stopped making modifications after updating N rows.  The data will not be corrupted - the data in primary/secondary indexes will be in agreement, no garbage data, etc.

rocksdb_bulk_load=1 is a different mode. (I'm not sure if it makes any sense to have both enabled together, by the way)... rocksdb_bulk_load=1 will dump all loaded data into SST files, and then inject these SST files into RocksDB. If an error occurs before the injection started, all changes made while in bulk load mode will be discarded (thus, no inconsistencies).  I am not sure if the injection of SST files itself is atomic (or it could fail in the middle and only inject some of the files).

BR
--Sergei Petrunia

Juan Telleria

unread,
Dec 10, 2018, 3:50:09 AM12/10/18
to MyRocks - RocksDB storage engine for MySQL

Thank you Sergei for the clarification!

Based on such definitions, some actions could be taken for RocksDB implementation in MariaDB:

1) Update "RocksDB Server Variables" MariaDB Knowledge Base Webpage (https://mariadb.com/kb/en/library/myrocks-system-variables/), and clarify such variables behavior:
 
rocksdb_commit_in_the_middle=1 causes MyRocks to issue a RocksDB commit every rocksdb_bulk_load_size row operations.  There's no way to undo the commit, so the changes that have been commited will stay there, even if the transaction fails.  You may end up with partially loaded data.  The data will look as if the transaction has stopped making modifications after updating N rows.  The data will not be corrupted - the data in primary/secondary indexes will be in agreement, no garbage data, etc.

rocksdb_bulk_load=1 is a different mode. (I'm not sure if it makes any sense to have both enabled together, by the way)... rocksdb_bulk_load=1 will dump all loaded data into SST files, and then inject these SST files into RocksDB. If an error occurs before the injection started, all changes made while in bulk load mode will be discarded (thus, no inconsistencies).  I am not sure if the injection of SST files itself is atomic (or it could fail in the middle and only inject some of the files).

2) Change how MariaDB works with "rocksdb_bulk_load" and "rocksdb_commit_in_the_middle". For INSERT INTO...; ALTER TABLE; CREATE TABLE t_name AS SELECT ... "big" transactions which cause MariaDB's OOM-killer to kill such queries (for being stored in memory) or hitting "rocksdb_max_row_locks" limit, "rocksdb_bulk_load " could be auto-enabled when "rocksdb_max_row_locks" is passed (Only if "rocksdb_bulk_load" has a transactional behavior).

MariaDB Knowledge base however says for "rocksdb_bulk_load": "Use bulk-load mode for inserts. This disables unique_checks and enables rocksdb_commit_in_the_middle." So I do not know if it is safe at all, or it can cause partial data loads...

I will be moving from RocksDB to InnoDB back again :( As I have some issues with bulk load transactions; RocksDB tables not storing table Creation / Update Date Metadata; and some Index Creation Getting Stuck...

Thank you!
Juan
Reply all
Reply to author
Forward
0 new messages