How to rename columns to make Hive work with Druid?


Andrew

Apr 10, 2019, 11:15:51 AM
to Druid User
Hi!

I'm trying to use Druid and Hive together. The problem is that all column names must be lowercase, because Hive cannot query a column whose name contains any uppercase letter.

I read it here:



And now I can't understand how to rename columns in an existing datasource.

I know that I could use a firehose to make the existing datasource the input for a spec, and I know how to actually rename dimensions from here: https://support.imply.io/hc/en-us/articles/360005727614-How-to-rename-dimension-during-raw-data-ingestion

But my spec doesn't parse floats. Also, I have 400 GB of data and 1600 intervals, and the index task is very slow.

(I don't use the Hadoop index task; I use the plain index task. I also can't use index_parallel because of the JSON firehose input.)

So here is the very slow spec that renames columns and doesn't parse floats:


Can anybody help me?

Thanks! 

jon...@imply.io

Apr 15, 2019, 6:00:24 PM
to Druid User
The ingest spec has `appendToExisting` set to true, which means the new segments will not overwrite the old ones. You'll want to set that to false.

You could also break the task apart into smaller intervals (e.g., reingest a week at a time).
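As a rough sketch of both suggestions, something like the following could generate one `ioConfig` per week-long chunk, each with `appendToExisting` set to false. The helper names `weekly_intervals` and `io_config` and the datasource name `my_datasource` are made up for illustration, and the rest of the spec (the `dataSchema`, including its granularitySpec intervals and the renamed dimensions) is left out. It also assumes the chunk boundaries line up with the datasource's segment granularity.

```python
from datetime import date, timedelta


def weekly_intervals(start, end, days=7):
    """Split [start, end) into consecutive ISO-8601 intervals of at most `days` days."""
    intervals = []
    cursor = start
    while cursor < end:
        upper = min(cursor + timedelta(days=days), end)
        intervals.append(f"{cursor.isoformat()}/{upper.isoformat()}")
        cursor = upper
    return intervals


def io_config(datasource, interval):
    """Build the ioConfig fragment for one reindexing task (hypothetical helper).

    The firehose reads existing segments of `datasource` for `interval`;
    appendToExisting is False so the new segments replace the old ones.
    """
    return {
        "type": "index",
        "firehose": {
            "type": "ingestSegment",
            "dataSource": datasource,
            "interval": interval,
        },
        "appendToExisting": False,
    }


if __name__ == "__main__":
    # "my_datasource" and the dates are placeholders, not values from the thread.
    for interval in weekly_intervals(date(2019, 1, 1), date(2019, 1, 22)):
        print(io_config("my_datasource", interval))
```

Each printed fragment would go into its own task spec, so a failed or slow chunk can be retried without redoing the whole 1600 intervals.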

Andrew

May 12, 2019, 8:17:54 AM
to Druid User
Thanks!!

On Tuesday, April 16, 2019 at 1:00:24 UTC+3, jon...@imply.io wrote: