"Too much parts. Merges are processing significantly slower than inserts" error while loading data


Alon Edelman

Feb 22, 2017, 2:32:15 PM
to ClickHouse
Hi,

I am getting a strange error while loading data into a table. I have 300 GB of data that I am trying to load into a single table.
After ~230 GB of data has been loaded into the table, I get this error:
Code: 252. DB::Exception: Received from localhost:9000, 127.0.0.1. DB::Exception: Too much parts. Merges are processing significantly slower than inserts..

The load is done file by file using CSV, and once I hit the error I can't load any more data into the table.
It had already loaded a ton of data into the table:

:) select count(*) from dumptest;

SELECT count(*)
FROM dumptest

┌───count()─┐
│ 247809104 │
└───────────┘

Here is the CREATE TABLE:
CREATE TABLE default.dumptest ( ifa String,  Timestampdata Int32,  win1b Int32,  r1 Int32,  counter1 Int32,  cat_1 Int8,  cat_2 Int8,  cat_3 Int8,  cat_4 Int8,  cat_5 Int8,  cat_6 Int8,  cat_7 Int8,  cat_8 Int8,  cat_9 Int8,  cat_10 Int8,  cat_11 Int8,  cat_12 Int8,  cat_13 Int8,  cat_14 Int8,  cat_15 Int8,  cat_16 Int8,  cat_17 Int8,  cat_18 Int8,  cat_19 Int8,  cat_20 Int8,  cat_21 Int8,  cat_22 Int8,  cat_23 Int8,  cat_24 Int8,  cat_25 Int8,  cat_26 Int8,  cat_27 Int8,  cat_28 Int8,  cat_29 Int8,  cat_30 Int8,  cat_31 Int8,  cat_32 Int8,  cat_33 Int8,  cat_34 Int8,  cat_35 Int8,  cat_36 Int8,  cat_37 Int8,  cat_38 Int8,  cat_39 Int8,  cat_40 Int8,  cat_41 Int8,  cat_42 Int8,  cat_43 Int8,  cat_44 Int8,  cat_45 Int8,  cat_46 Int8,  cat_47 Int8,  cat_48 Int8,  cat_49 Int8,  cat_50 Int8,  cat_51 Int8,  cat_52 Int8,  cat_53 Int8,  cat_54 Int8,  cat_55 Int8,  cat_56 Int8,  cat_57 Int8,  cat_58 Int8,  cat_59 Int8,  cat_60 Int8,  cat_61 Int8,  cat_62 Int8,  cat_63 Int8,  cat_64 Int8,  cat_65 Int8,  cat_66 Int8,  cat_67 Int8,  cat_68 Int8,  cat_69 Int8,  cat_70 Int8,  cat_71 Int8,  cat_72 Int8,  cat_73 Int8,  cat_74 Int8,  cat_75 Int8,  cat_76 Int8,  cat_77 Int8,  cat_78 Int8,  cat_79 Int8,  cat_80 Int8,  cat_81 Int8,  cat_82 Int8,  cat_83 Int8,  cat_84 Int8,  cat_85 Int8,  cat_86 Int8,  cat_87 Int8,  cat_88 Int8,  cat_89 Int8,  cat_90 Int8,  cat_91 Int8,  cat_92 Int8,  cat_93 Int8,  cat_94 Int8,  cat_95 Int8,  cat_96 Int8,  cat_97 Int8,  cat_98 Int8,  cat_99 Int8,  cat_100 Int8,  dayaType Array(String),  stamId Array(Int16),  stamdtId Array(Int16),  multi_1 Array(String),  multi_2 Array(String),  multi_3 Array(String),  multi_4 Array(String),  multi_5 Array(String),  myDate Date) ENGINE = CollapsingMergeTree(myDate, ifa, 12288, Sign)

As for the data: the date is the same in all the rows, the PK is ifa (300M rows), and I need to slice and dice the 300M rows using the cat columns.
If someone can point me in the right direction... :)

10x
Alon

Alex Zatelepin

Feb 28, 2017, 8:23:46 AM
to ClickHouse
Hi!

ClickHouse does not handle a high rate of INSERTs well: each INSERT statement creates a new "table part", and these parts are then merged in the background into bigger parts. If the rate of creating new parts is higher than the rate at which the parts are merged, then at some point INSERTs will fail with the "Too much parts" error. So the client is responsible for throttling and batching INSERTs.
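
For example, you can watch the number of active parts while loading (a quick sketch; the table name comes from this thread, and system.parts is the system table that tracks data parts):

SELECT count() AS active_parts
FROM system.parts
WHERE table = 'dumptest' AND active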

How big are the CSV files that you are uploading, and how many of them do you have? Insertion will perform better if you concatenate them before sending and do a small number of big INSERTs (parallel insertion is OK). See also: https://clickhouse.yandex/reference_en.html#Performance%20on%20data%20insertion.
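
A minimal sketch of that batching, assuming the default HTTP interface on port 8123 and the Python requests package (the file names and batch size are hypothetical):

import glob
import requests

QUERY = "INSERT INTO dumptest FORMAT CSV"
batch = []
for path in sorted(glob.glob("dump_*.csv")):  # hypothetical file names
    with open(path, "rb") as f:  # assumes each file ends with a newline
        batch.append(f.read())
    if len(batch) >= 10:  # send ~10 files' worth of rows per INSERT
        requests.post("http://localhost:8123/", params={"query": QUERY},
                      data=b"".join(batch)).raise_for_status()
        batch = []
if batch:  # flush the remainder
    requests.post("http://localhost:8123/", params={"query": QUERY},
                  data=b"".join(batch)).raise_for_status()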

Alon Edelman

Feb 28, 2017, 11:00:03 AM
to ClickHouse
Alex, 
Thank you for the info. So the problem was that after I got the error I could not load even one more row into the DB.
I also found that the table had 300 parts and the code has a 300-part limit, so no more data for me.
The CSV files were 1.6 GB each and I have 200 of them. But again, once I got the error, not even one new row was allowed.

Once I changed the engine type from CollapsingMergeTree to MergeTree, I was able to load all my data and more.

But still strange.

A.

Alex Zatelepin

Feb 28, 2017, 2:23:01 PM
to ClickHouse
So maybe the problem was that the merges were not completing at all. Glad to hear that you were able to load your data into ClickHouse, but if the problem still bothers you, could you look into your server logs for clues on why the merges were performing badly or not completing at all? I am mostly interested in exception messages.

Stephane Gauvin

Feb 14, 2023, 10:59:30 AM
to ClickHouse
Had the same issue, which led me here. FWIW, I more or less fixed this with a throttling loop (Python script) checking the number of active parts for the table I am loading: sleep 1 second whenever the count is 50 or more. Something like:

import time
from clickhouse_driver import Client  # assuming the clickhouse-driver package; adapt to your client

client = Client('localhost')
while True:
    the_parts = client.execute(
        "SELECT count() FROM system.parts WHERE table LIKE '%my_table%' AND active")[0][0]
    if the_parts < 50:
        break
    time.sleep(1)

From other posts, and the error messages I was getting, it looks like the critical number of active parts is a little more than 300. With a cutoff of 50, a sleep of 1 second, and 48 parallel threads, I see parts going up to close to 200 and then rapidly falling back below 50. I achieve error-free ingestion of 1B rows/s.

Derek Chia

Feb 14, 2023, 10:26:09 PM
to ClickHouse
You should be referring to `parts_to_throw_insert`, which defaults to 300. Take note that this is the number of active parts in a single partition, not across all partitions in a table. You'll need to group by partition if you have one in your table, as in the sketch below.
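
A hedged sketch of both checks (the table name is a placeholder):

-- the configured limit (a MergeTree setting)
SELECT name, value
FROM system.merge_tree_settings
WHERE name = 'parts_to_throw_insert';

-- active parts per partition, which is what the limit applies to
SELECT partition, count() AS active_parts
FROM system.parts
WHERE table = 'my_table' AND active
GROUP BY partition
ORDER BY active_parts DESC;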

Alexey Milovidov

Feb 15, 2023, 8:32:48 AM
to Derek Chia, ClickHouse
You were using a very old ClickHouse version, at least a few years old. You should upgrade immediately.
