I have used writable Voldemort stores for several years now. We have always handled versioning external to Voldemort (just stored bytes in Voldemort).
I just started investigating read-only stores (BnP) and Avro schema evolution. I created a simple Build and Push example using the latest Voldemort code and the "run-bnp.sh" script (i.e. not via Azkaban), and it works as expected. I then tried to add Avro schema evolution. I got it working, but to do so I had to go through the following:
- The initial stores.xml file was created for me.
- I then tried to load a second set of data with a new (compatible) schema, and it failed.
- Version 0 was reused for the new schema.
- I then fetched the stores.xml file, manually updated it (added the version 1 schema), and pushed it back to the Voldemort server.
- Once I did this, a running client (that periodically reads a record) bootstrapped itself and started displaying the default value for the new field.
- With this change I was able to BnP new data with the new schema.
- The client still displayed only the default value for the new field.
- I then added a manual description of both schema versions to the BnP job via the "push.force.schema.value" property and reloaded the new data.
- Now the client started displaying the actual value for the new data.
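For reference, the value-serializer section I ended up with in stores.xml looked roughly like this. The record name and fields below are illustrative placeholders, not my actual schema:

    <value-serializer>
      <type>avro-generic-versioned</type>
      <schema-info version="0">
        {"type": "record", "name": "MyRecord", "fields": [
          {"name": "id", "type": "string"}]}
      </schema-info>
      <schema-info version="1">
        {"type": "record", "name": "MyRecord", "fields": [
          {"name": "id", "type": "string"},
          {"name": "newField", "type": "string", "default": ""}]}
      </schema-info>
    </value-serializer>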
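And the relevant part of the BnP job config (the properties file passed to run-bnp.sh), with the actual schema JSON elided. The store name and Avro field names are illustrative, and the exact format of the forced schema string is what I pieced together by trial and error, so treat this as a sketch rather than a reference:

    push.store.name=test-store
    build.type.avro=true
    avro.key.field=id
    avro.value.field=record
    # manual description of both schema versions, matching the
    # versioned definition in stores.xml (schema JSON elided)
    push.force.schema.value=...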
With all that background, my questions are:
- Is the user required to manually configure the Avro schema versions in both the stores.xml file and the BnP job's config as soon as there is more than one version?
- If you use Azkaban to coordinate the workflow, does it handle any of this configuration work for you?
- My guess is no, but I haven't looked into that approach yet.
- Is there any easier way to do this?