Best practices when evolving schema

336 views
Skip to first unread message

Maxim Fridental

unread,
Jan 28, 2017, 4:38:18 PM1/28/17
to ClickHouse
Hi there,

in the usual databases, when adding new columns in a table, the value for already existing rows can be set via UPDATE statements. 

How you guys do that in Clickhouse?

Do you create a new table with the same columns plus the new one, then copy all rows from the old table, and then drop the old table?

Or you create a new table with the new column only, and create a view joining the old and the new tables?

Or something else?

Thanks, Maxim

Alex Zatelepin

unread,
Jan 30, 2017, 2:47:41 PM1/30/17
to ClickHouse
Greetings!

Currently there is no easy way to set the values for newly added columns. One complication is that although no values for old rows are present on disk immediately after ALTER, a merge can be triggered at any time that will 'materialize' default values on disk and you won't be able to change them afterwards.

You can add default expression for the new column and calculated values will be gradually written to disk as the parts are merged. Unfortunately, there is no way to force this process (apart from calling OPTIMIZE FINAL for each partition, but that amounts to rewriting the whole table).

For this case ALTER TABLE ... UPDATE COLUMN command (rewrite values for the whole column) can help and there are some plans to implement it.
Reply all
Reply to author
Forward
0 new messages