How to dump a bigger table?

christian.ho...@wu.ac.at

unread,
Mar 1, 2017, 7:50:22 AM3/1/17
to ClickHouse
I want to export a big table using the following command.


 clickhouse-client --query="SELECT * FROM videos FORMAT CSV"

But I got the following error:

Received exception from server:
Code: 241. DB::Exception: Received from xxx:9000, 137.208.30.97. DB::Exception: Memory limit (for query) exceeded: would use 9.31 GiB (attempt to allocate chunk of 1048576 bytes), maximum: 9.31 GiB: while reading column rec_lat at /opt/clickhouse/data/default/videos.

How can I export bigger tables? The strange thing is that I have a lot of RAM on this machine (512 GB), so it is also weird that I get this error. Is there another way to dump big tables? And why does it need to allocate all this memory?

Yegor Andreenko

unread,
Mar 1, 2017, 12:41:51 PM3/1/17
to ClickHouse
Looks like videos isn't an ordinary table, but a view with a join. You can use the max_memory_usage setting (either at the session level or in the server config) to increase the memory available for the join operation.

Alex Zatelepin

unread,
Mar 1, 2017, 1:11:13 PM3/1/17
to ClickHouse
You can reduce memory usage by reducing the number of threads (option max_threads) used by SELECT:

clickhouse-client --max_threads=1 --query="SELECT * FROM videos FORMAT CSV"

Also, as you have a lot of RAM, you should increase the max_memory_usage option (the maximum memory used by a single query on a single server; 10 GB by default). ClickHouse uses its own memory accounting system, so excessive memory usage results in an exception rather than an OOM kill. This should help other queries too (heavy aggregations, sorts etc. use a lot of memory).

One more thing, not related to performance, but you may find it more convenient: you can dump the result of any query into a file in interactive mode using the INTO OUTFILE clause:

SET max_threads = 1;
SELECT * FROM videos INTO OUTFILE 'videos.csv' FORMAT CSV;

christian.ho...@wu.ac.at

unread,
Mar 2, 2017, 3:32:03 AM3/2/17
to ClickHouse
@Yegor It's not a view, just an "ordinary" table. I created a Log table and used the client to import a 300 GB TSV file.

@Alex Where can I set this option on the server side? I looked at the guide but was not able to find anything. So I tried to put

<max_memory_usage>50GB</max_memory_usage>

into my config.xml as a child of the yandex root element (without success). Is it the unit (should I use bytes instead)?

However, I was able to set it on the client side and it looks good (the query is still running :) ). Thanks for the INTO OUTFILE hint, that is really nice. Is there a possibility to split & compress the output? At the moment I do it like this:

clickhouse-client --max_memory_usage=300000000000 --query="SELECT * FROM videos FORMAT CSV" | split -d -C 4GB --filter='pigz > $FILE.gz' -a 4 - /path/to/output/folder/
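The same split-and-compress pipeline can be tried without a ClickHouse server by feeding it synthetic data; here gzip stands in for pigz, and `seq` stands in for the clickhouse-client query above. It assumes GNU split (coreutils 8.13 or later) for the --filter option:

```shell
# Demo of the split-and-compress pipeline on synthetic data (no server needed).
# gzip replaces pigz here; seq replaces the clickhouse-client CSV dump.
# Requires GNU split (coreutils >= 8.13) for --filter; $FILE is set by split.
seq 1 500000 | split -d -C 1MB --filter='gzip > $FILE.gz' -a 4 - part_
```

Each part_NNNN.gz holds at most 1 MB of uncompressed lines (4 GB in the original command), and `cat part_*.gz | gunzip` reproduces the full stream, since concatenated gzip members decompress cleanly.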


Alex Zatelepin

unread,
Mar 2, 2017, 8:20:45 AM3/2/17
to ClickHouse
Correct, you should use the number of bytes: 
<max_memory_usage>50000000000</max_memory_usage>

There are the following methods to set most session-level settings:
  • Globally in the server config (XML element <max_memory_usage>, placed as a direct descendant of the <yandex> tag)
  • On the client command line (--max_memory_usage=50000000000)
  • In interactive mode using the SET command (SET max_memory_usage=50000000000)
The last two methods are only available if the user profile is not read-only.
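For the first method, a minimal config.xml fragment might look like this (a sketch, assuming <yandex> is the root element as in the versions discussed in this thread; the 50 GB value is the one from this thread):

```xml
<yandex>
    <!-- Per-query memory limit in bytes (50 GB), as a direct child of the root tag -->
    <max_memory_usage>50000000000</max_memory_usage>
</yandex>
```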

Is there a possibility to split & compress the output?

There is no such feature at the moment. 

christian.ho...@wu.ac.at

unread,
Mar 2, 2017, 3:12:21 PM3/2/17
to ClickHouse
Thanks, it works now :) And it uses much less memory than it could! However, it would be nice if you allowed other units in the config; writing such big numbers is error prone.