Hi Chris,
I added an Exporter class to ElephantDB so that you can create an
ElephantDB domain using nothing but MapReduce. The code is available
on GitHub and in the Maven repo at Clojars.
First, create a directory on HDFS containing your key/value pairs in
SequenceFiles (both keys and values must be BytesWritable). Then you
can export the key/value pairs into ElephantDB with a call like the
following:
    Exporter.export(sequenceFileDirPath, edbDomainPath,
                    new DomainSpec(new JavaBerkDB(), 32));
This runs a job that creates a 32-shard Java Berkeley DB domain at
edbDomainPath.
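Since both keys and values have to be BytesWritable, which is Hadoop's
wrapper around a raw byte[], your records need to be turned into bytes
before they go into the SequenceFiles. Here is a minimal sketch of one
way to do that using only the JDK. The class and method names are made
up for illustration, and the encodings (UTF-8 strings, 8-byte
big-endian longs) are just an assumption; use whatever your readers
expect.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Illustrative helpers (hypothetical, not part of ElephantDB or Hadoop):
// turn typed keys and values into raw byte arrays and back again.
class KeyValueBytes {

    // Encode a string as UTF-8 bytes.
    static byte[] encodeString(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Decode UTF-8 bytes back into a string.
    static String decodeString(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Encode a long as 8 big-endian bytes (ByteBuffer's default order).
    static byte[] encodeLong(long n) {
        return ByteBuffer.allocate(Long.BYTES).putLong(n).array();
    }

    // Decode 8 big-endian bytes back into a long.
    static long decodeLong(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getLong();
    }

    public static void main(String[] args) {
        byte[] key = encodeString("user:42");
        byte[] value = encodeLong(1234L);
        // Round-trip both arrays to check the encoding is reversible.
        System.out.println(decodeString(key) + " -> " + decodeLong(value));
    }
}
```

Each array can then be wrapped with new BytesWritable(bytes) when you
write the SequenceFile.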
You can plug in your own incremental updating code by passing in an
Exporter.Args instead of a DomainSpec and setting the updater on it.
Setting the updater to null disables incremental updates, so each new
version of the domain contains only what you export to it.
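To make the difference concrete, here is a toy model that treats each
domain version as an in-memory map. It is not ElephantDB code and the
names are hypothetical. How an updater combines old and new data is up
to the code you plug in; overwriting old values with exported ones is
only one assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of domain versions as in-memory maps. This is NOT ElephantDB
// code; it only illustrates the two behaviors described above.
class VersionSketch {

    // incremental == false models a null updater: the new version holds
    // only the exported pairs. incremental == true models one plausible
    // updater (an assumption) that keeps old pairs and lets exported
    // values overwrite them.
    static Map<String, String> nextVersion(Map<String, String> previous,
                                           Map<String, String> exported,
                                           boolean incremental) {
        Map<String, String> next = new HashMap<>();
        if (incremental) {
            next.putAll(previous);   // start from the previous version
        }
        next.putAll(exported);       // exported pairs win on key conflicts
        return next;
    }

    public static void main(String[] args) {
        Map<String, String> v1 = new HashMap<>();
        v1.put("a", "1");
        v1.put("b", "2");

        Map<String, String> export = new HashMap<>();
        export.put("b", "20");
        export.put("c", "3");

        // Without an updater: only b and c survive.
        System.out.println(nextVersion(v1, export, false));
        // With the overwrite-style updater: a is kept, b is replaced.
        System.out.println(nextVersion(v1, export, true));
    }
}
```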
Hope that helps,
Nathan