Write/Insert limitations


__

Jul 15, 2018, 10:03:21 AM
to ClickHouse
Hi there,

I'm sending INSERTs in bulk, but even so I ran into limitations. We have a lot of data points, and as soon as we put more than 200 rows into one bulk insert we exceed some limit, yet at only 200 rows per bulk we do 10-20 bulk inserts per second, which is bad to say the least. Can we increase the bulk size without causing any issues? I'm not sure whether this is some sort of HTTP limit or the max_insert_block_size setting. Any help would be much appreciated.

Denis Zhuravlev

Jul 15, 2018, 11:33:58 AM
to ClickHouse
Not a bulk:
insert into t values (?,?);
insert into t values (?,?);

Bulk:
insert into t values (?,?), (?,?), (?,?), (?,?), (?,?), (?,?), (?,?), (?,?), .... (?,?)*100000 times ;

Try using a Buffer table (Engine = Buffer): https://clickhouse.yandex/docs/en/table_engines/buffer/
Or accumulate the input data in a TSV/CSV file and execute [ cat accumulate.tsv | clickhouse-client -q "insert into t format TSV" ] once a minute.
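The Buffer-table suggestion can be sketched in SQL. The table names t and t_buffer are hypothetical, and the nine Buffer parameters are the example values from the linked documentation page, not tuned recommendations:

```sql
-- t_buffer keeps incoming rows in RAM and flushes them to default.t in the
-- background once any max threshold (100 s, 1000000 rows, or ~100 MB per
-- layer) is reached, so small frequent inserts become large parts in t.
CREATE TABLE t_buffer AS t
ENGINE = Buffer(default, t, 16, 10, 100, 10000, 1000000, 10000000, 100000000);

-- Small, frequent inserts go to the buffer instead of directly to t:
INSERT INTO t_buffer VALUES (1, 2), (3, 4);
```

Note that a Buffer table trades durability for throughput: rows still in the buffer are lost if the server restarts abnormally.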

Jack Gao

Jul 25, 2018, 3:38:06 AM
to ClickHouse
You should increase the bulk size to 10000 or more.

ClickHouse works better with bulk inserts.

ClickHouse uses background merges: each INSERT creates a new data part, and parts are later merged into bigger ones. If each write is too small, the merge process cannot keep up with the insert rate, and ClickHouse will throw a 'Too many parts' exception.

The exception is controlled by the following settings:
parts_to_delay_insert, parts_to_throw_insert, max_delay_to_insert
The merge process is controlled by 'background_pool_size', which defaults to 16; that is fine for most situations.
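The part pressure described above can be watched directly in the system tables. A minimal check, assuming the table is named t as in the earlier examples (the thresholds mentioned in the comment are the defaults in releases contemporary with this thread):

```sql
-- Count active (not yet merged-away) parts for table t. If this number
-- approaches parts_to_delay_insert, ClickHouse starts slowing inserts down,
-- and at parts_to_throw_insert it raises the 'Too many parts' exception.
SELECT count()
FROM system.parts
WHERE active AND table = 't';
```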


On Sunday, July 15, 2018 at 10:03:21 PM UTC+8, __ wrote:

Yijian Huang

Jul 27, 2018, 1:29:44 AM
to ClickHouse
What insert rate, in records per second, can one usually achieve on a single ClickHouse server?

__

Jul 29, 2018, 10:06:51 AM
to ClickHouse
I'm already doing these bulk inserts. I could probably use a different method that doesn't have the 200-per-bulk limitation, but one main issue still remains: I need real-time data, meaning a delay of 30+ seconds is a no-go.

Denis Zhuravlev

Jul 30, 2018, 10:56:06 PM
to ClickHouse
>more than 200 inserts into one call we exceed some limitations
What is the 200 limit you are talking about? ClickHouse is able to insert millions of rows with one INSERT.


On Sunday, 15 July 2018 11:03:21 UTC-3, __ wrote:

__

Jul 31, 2018, 7:10:53 AM
to ClickHouse
I'm using https://github.com/nikepan/clickhouse-bulk, which inserts the data via HTTP requests; I think there is some limit with this HTTP method. Can it be increased?

__

Jul 31, 2018, 7:11:58 AM
to ClickHouse
As soon as I set the bulk size to more than 200, the data is cut off and the entire query fails because it is incomplete. (The inserts are quite long.)

Denis Zhuravlev

Jul 31, 2018, 12:51:52 PM
to ClickHouse
You should ask the author of this tool: https://github.com/nikepan/clickhouse-bulk/issues
ClickHouse itself does not have such limits.
I insert up to 50 million rows with one INSERT using clickhouse-client.
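The accumulate-then-load pattern mentioned earlier in the thread can be sketched as a small shell script. The file name accumulate.tsv and table t come from Denis's message; the clickhouse-client line is commented out because it assumes a running server:

```shell
# Accumulate rows into a TSV file, then load them all in one bulk INSERT.
: > accumulate.tsv                     # start with an empty file
for i in 1 2 3; do
  printf '%d\trow%d\n' "$i" "$i" >> accumulate.tsv
done
wc -l < accumulate.tsv                 # number of rows accumulated so far

# Once a minute, ship the whole batch as a single INSERT:
# cat accumulate.tsv | clickhouse-client -q "INSERT INTO t FORMAT TSV"
```

Each run of the commented clickhouse-client line produces exactly one data part, regardless of how many rows the file holds, which is what keeps the merge backlog small.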

On Sunday, 15 July 2018 11:03:21 UTC-3, __ wrote: