Has anyone ingested Iceberg tables into Druid before?

Didip Kerabat

Oct 21, 2021, 1:09:26 PM

to Druid User

As-is, index_parallel cannot simply download the Parquet files, because of how Iceberg does versioning.
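To make the versioning issue concrete, here is a toy model in Python. It is only a sketch and does not use the real Iceberg API: the `Snapshot` class, the file names, and `live_files` are all made up for illustration. The assumption it encodes is that files replaced by a later snapshot can remain in storage until old snapshots are expired, so listing the data directory can return files that are no longer part of the current table state.

```python
# Toy model of Iceberg-style versioning. NOT the real Iceberg API;
# all names and file paths here are hypothetical.
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    snapshot_id: int
    data_files: frozenset  # files that are live in this snapshot


# Everything physically present under the table's data directory.
files_on_disk = {"a.parquet", "b.parquet", "b-compacted.parquet", "c.parquet"}

snapshots = [
    Snapshot(1, frozenset({"a.parquet", "b.parquet"})),
    # Snapshot 2 rewrote b.parquet and added c.parquet.
    # The old b.parquet is still on disk but no longer live.
    Snapshot(2, frozenset({"a.parquet", "b-compacted.parquet", "c.parquet"})),
]


def live_files(snapshot_id):
    """Return the set of data files that belong to the given snapshot."""
    for snap in snapshots:
        if snap.snapshot_id == snapshot_id:
            return set(snap.data_files)
    raise KeyError(snapshot_id)


current = live_files(2)
# Files a naive directory listing would pick up but should not ingest.
stale = files_on_disk - current
print(sorted(stale))  # ['b.parquet']
```

In this model, ingesting everything on disk would read the rows of b.parquet twice (once from the old file, once from the compacted one), which is why the file list has to come from the table metadata rather than from a directory listing.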

Samarth Jain

Oct 21, 2021, 2:18:31 PM

to druid...@googlegroups.com

We do, but we use the Hadoop-based Druid indexer. We have APIs that return the files within an Iceberg table snapshot that cover the intervals of data you want to ingest.

It would be good to get this working with native Druid indexers, though.
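As a rough idea of what a native-indexer route might look like, the sketch below takes a list of data file URIs that has already been resolved from an Iceberg snapshot (how that list is obtained is not shown, and is not the internal API mentioned above) and wraps it in an index_parallel spec. The function name, the example URIs, the column names, and the choice of the S3 input source are assumptions for illustration; the Parquet input format also assumes the Druid Parquet extension is loaded.

```python
# Sketch: build a Druid index_parallel spec from an explicit file list.
# The file list is assumed to come from Iceberg scan planning for one snapshot.
import json


def build_index_parallel_spec(datasource, file_uris, timestamp_column,
                              dimensions, intervals):
    """Return an index_parallel task spec that reads exactly the given files."""
    if not file_uris:
        raise ValueError("no data files to ingest")
    return {
        "type": "index_parallel",
        "spec": {
            "dataSchema": {
                "dataSource": datasource,
                "timestampSpec": {"column": timestamp_column, "format": "auto"},
                "dimensionsSpec": {"dimensions": list(dimensions)},
                "granularitySpec": {
                    "type": "uniform",
                    "segmentGranularity": "day",
                    "queryGranularity": "none",
                    "intervals": list(intervals),
                },
            },
            "ioConfig": {
                "type": "index_parallel",
                # Assumption: the table's data files live in S3.
                "inputSource": {"type": "s3", "uris": list(file_uris)},
                "inputFormat": {"type": "parquet"},
            },
            "tuningConfig": {"type": "index_parallel"},
        },
    }


# Hypothetical inputs, for illustration only.
uris = [
    "s3://example-bucket/warehouse/db/events/data/00000-0-a.parquet",
    "s3://example-bucket/warehouse/db/events/data/00001-0-c.parquet",
]
spec = build_index_parallel_spec(
    datasource="events",
    file_uris=uris,
    timestamp_column="event_time",
    dimensions=["user_id", "country"],
    intervals=["2021-10-20/2021-10-21"],
)
print(json.dumps(spec, indent=2))
```

Passing explicit URIs instead of a prefix is the point of the sketch: the task reads only the files that belong to the chosen snapshot and interval, not whatever happens to sit under the table's data directory.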

--
You received this message because you are subscribed to the Google Groups "Druid User" group.
To unsubscribe from this group and stop receiving emails from it, send an email to druid-user+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/druid-user/ed467415-6400-43ef-b5f0-34459ec26c4an%40googlegroups.com.

Didip Kerabat

Oct 21, 2021, 9:51:58 PM

to druid...@googlegroups.com

You can do this with the MapReduce API? Wow, how does it interact with the Iceberg catalog? Did you make modifications to index_hadoop?
