Build And Push And Avro Schema Evolution


James Lent

Sep 6, 2018, 6:34:46 PM
to project-voldemort
I have used writable Voldemort stores for several years now.  We have always handled versioning external to Voldemort (just stored bytes in Voldemort).

I just started investigating read-only stores (BnP) and Avro schema evolution.  I created a simple Build And Push example using the latest Voldemort code and the "run-bnp.sh" script (i.e. not via Azkaban), and it works as expected.  I then tried to configure Avro schema evolution.  I got it working, but getting there took the following steps:
  • The initial stores.xml file was created for me.
  • When I tried to load a second set of data with a new (compatible) schema, the push failed.
    • Version 0 was reused for the new schema.
  • I then fetched the stores.xml file, manually updated it (added the version 1 schema), and pushed it back to the Voldemort server.
    • Once I did this, a running client (which periodically reads a record) bootstrapped itself and started displaying the default value for the new field.
  • With this change I was able to BnP new data with the new schema.
    • The client still displayed only the default value for the new field.
  • I then added a manual description of both schema versions to the BnP job via "push.force.schema.value" and reloaded the new data.
    • Now the client started displaying the actual value of the new field from the new data.
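
For illustration, the manual stores.xml edit described in the steps above could be scripted roughly as follows. This is only a sketch: the two schemas are hypothetical, and the element names (`value-serializer`, `type`, `schema-info` with a `version` attribute) and the `avro-generic-versioned` type name are assumptions that should be checked against the stores.xml actually fetched from the server. The helper `fields_missing_defaults` is likewise hypothetical; it only flags added fields that have no default, which is one common reason a new schema is not compatible with old data.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical schemas for illustration; version 1 adds a field with a default.
SCHEMA_V0 = {
    "type": "record",
    "name": "Example",
    "fields": [{"name": "id", "type": "string"}],
}
SCHEMA_V1 = {
    "type": "record",
    "name": "Example",
    "fields": [
        {"name": "id", "type": "string"},
        {"name": "note", "type": "string", "default": "n/a"},
    ],
}


def versioned_value_serializer(schemas):
    """Build a <value-serializer> element with one <schema-info> per version.

    Element and type names are assumptions; compare with a fetched stores.xml.
    """
    root = ET.Element("value-serializer")
    ET.SubElement(root, "type").text = "avro-generic-versioned"
    for version, schema in enumerate(schemas):
        info = ET.SubElement(root, "schema-info", version=str(version))
        info.text = json.dumps(schema, separators=(",", ":"))
    return root


def fields_missing_defaults(old, new):
    """Return names of fields added in `new` that have no default value."""
    old_names = {field["name"] for field in old["fields"]}
    return [
        field["name"]
        for field in new["fields"]
        if field["name"] not in old_names and "default" not in field
    ]


if __name__ == "__main__":
    if fields_missing_defaults(SCHEMA_V0, SCHEMA_V1):
        raise SystemExit("new schema adds fields without defaults")
    element = versioned_value_serializer([SCHEMA_V0, SCHEMA_V1])
    print(ET.tostring(element, encoding="unicode"))
```

The printed fragment would still have to be merged by hand into the right store definition before pushing stores.xml back to the server.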
With all that background, my questions are:
  • Once a store has more than one Avro schema version, is the user required to manually configure the versions in both the stores.xml file and the BnP job's file?
  • If you use Azkaban to coordinate the workflow, does it handle any of this configuration work for you?
    • My guess is no, but I haven't looked into that approach yet.
  • Is there an easier way to do this?

Felix GV

Sep 6, 2018, 6:55:46 PM
to James Lent, project-...@googlegroups.com
Hi James,

Azkaban would not change much about the process, besides letting you specify certain config settings by default for all jobs.

I don't believe there is an easier way to achieve schema evolution in Voldemort RO at this time, though of course it could be made more convenient with some code changes. If you are looking to contribute to Voldemort, please feel free to do so!

--
Felix GV
Staff Software Engineer
Data Infrastructure
LinkedIn
 
f...@linkedin.com
linkedin.com/in/felixgv



James Lent

Sep 7, 2018, 8:52:04 AM
to project-voldemort
Thanks for the quick reply. My next option to investigate is having the BnP job work with an existing store that uses an "identity" serializer and continue handling Avro serialization and versioning external to Voldemort.
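
As a rough sketch of what handling versioning external to Voldemort might look like, assuming the store simply passes value bytes through untouched: each value carries a one-byte schema version prefix, and the reader fills in defaults for fields added in later versions. JSON stands in here for Avro binary encoding so the example stays self-contained, and `FIELD_DEFAULTS`, `encode`, and `decode` are hypothetical names, not part of Voldemort or Avro.

```python
import json

# Hypothetical defaults for the fields each schema version introduced.
# Version 1 adds "note"; records written with version 0 get the default on read.
FIELD_DEFAULTS = {
    0: {},
    1: {"note": "n/a"},
}
LATEST_VERSION = 1


def encode(record, version=LATEST_VERSION):
    """Prefix the serialized payload with a one-byte schema version."""
    # JSON is a stand-in for Avro binary encoding.
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return bytes([version]) + payload


def decode(blob):
    """Read the version byte, decode the payload, and fill later-added fields."""
    version, payload = blob[0], blob[1:]
    if version not in FIELD_DEFAULTS:
        raise ValueError(f"unknown schema version {version}")
    record = json.loads(payload.decode("utf-8"))
    # Apply defaults only for fields introduced after the writer's version.
    for newer in range(version + 1, LATEST_VERSION + 1):
        for name, default in FIELD_DEFAULTS[newer].items():
            record.setdefault(name, default)
    return record


if __name__ == "__main__":
    old_blob = encode({"id": "a"}, version=0)
    new_blob = encode({"id": "b", "note": "hello"})
    print(decode(old_blob))
    print(decode(new_blob))
```

With this layout the BnP job and the store only ever see opaque bytes, so no schema versions need to be registered in stores.xml; the cost is that every reader and writer has to agree on the prefix convention.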