Hi,
We need to migrate the data in our Druid deep storage, which is HDFS, to another deep storage that is again an HDFS cluster. Our metadata storage is MySQL, and we also want to migrate from the existing metadata DB to a new one. We are considering the following steps for the migration:
1. Copying the data from the existing HDFS cluster to the new one using DistCp.
2. Taking an SQL dump of the config, dataSource, supervisors, and segments metadata tables into a file.
3. Changing the segment locations in the segments table of the SQL dump produced in the previous step so they point to the new deep storage location.
4. Importing the SQL dump file into the new metadata DB.
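Step 3 is usually the trickiest part: in the Druid metadata store the segment location is not a plain column but sits inside the JSON payload blob of each row in the segments table (its loadSpec.path field), and mysqldump may emit that blob escaped or hex-encoded, so a naive find-and-replace on the dump file can silently miss rows. A minimal sketch of rewriting one decoded payload, assuming hypothetical old and new NameNode addresses:

```python
import json

# Hypothetical old and new deep-storage prefixes -- substitute your own.
OLD_PREFIX = "hdfs://old-namenode:8020/druid/segments"
NEW_PREFIX = "hdfs://new-namenode:8020/druid/segments"

def rewrite_payload(payload_json: str) -> str:
    """Rewrite the loadSpec path in one segment payload.

    Each row of the Druid segments table carries a JSON payload whose
    loadSpec tells historicals where the segment lives in deep storage.
    Only the path prefix changes; everything else is preserved as-is.
    """
    payload = json.loads(payload_json)
    load_spec = payload.get("loadSpec", {})
    path = load_spec.get("path", "")
    if path.startswith(OLD_PREFIX):
        load_spec["path"] = NEW_PREFIX + path[len(OLD_PREFIX):]
    return json.dumps(payload)

# Abridged example payload (field values are made up for illustration):
example = json.dumps({
    "dataSource": "wikipedia",
    "loadSpec": {
        "type": "hdfs",
        "path": OLD_PREFIX + "/wikipedia/2015-09-12/0/index.zip",
    },
})
print(rewrite_payload(example))
```

An alternative to editing the dump file is to import it unchanged and then run an UPDATE against the payload column in the new DB, verifying afterwards that no payload still references the old prefix.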
The new Druid cluster will be configured with the new deep storage and new metadata DB addresses. The Druid version we currently use is 0.19.0, and we are thinking of setting up the latest version (24.0.0) in the new cluster and then following the steps above for the migration.
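For the new cluster configuration mentioned above, the relevant common runtime properties would look roughly like the following sketch (hostnames, paths, and credentials are placeholders, not values from our setup):

```properties
# _common/common.runtime.properties (placeholders; adjust to your environment)
druid.extensions.loadList=["druid-hdfs-storage", "mysql-metadata-storage"]

druid.storage.type=hdfs
druid.storage.storageDirectory=hdfs://new-namenode:8020/druid/segments

druid.metadata.storage.type=mysql
druid.metadata.storage.connector.connectURI=jdbc:mysql://new-db-host:3306/druid
druid.metadata.storage.connector.user=druid
druid.metadata.storage.connector.password=<password>
```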
We are seeking help with the following queries:
1. Is there a better way to migrate data from one HDFS cluster to another, specifically from a Druid point of view?
2. Will there be a segment compatibility issue between the two Druid versions (0.19.0 and 24.0.0)? This could happen if the segment storage format changed between these versions.
3. What other challenges might we encounter that we should take care of early?
4. Any recommendations for anything here that could be done better?
Thanks,
Saurabh Pande.