I thought I would try to get a conversation started regarding some of
the questions in Michael Giarlo’s kind post on CDL’s microservice
work [1].
| # Is Dflat dependent on ReDD? One would assume not since there's an
| optional property in the dflat-info.txt file for specifying a
| delta scheme. But, say, could you stub out the v001 directory
| (reserved to hold the initial version of a digital object) and use
| a system such as git or bazaar?
| One might argue that these established delta schemes, if you want
| to call them that, have many more developers and users than a
| system such as ReDD and thus should persist longer and have more
| tools built around them. I imagine the micro-service viewpoint
| would acknowledge that point, but counter that the spirit of these
| specs is to avoid dependencies from outside the filesystem?
The relation between ReDD and Dflat, and whether the two can be specified
separately, is still under debate. My own feeling is that Dflat should be
defined with ReDD, while leaving room for a Dflat-like spec whose only
difference is that it does not use ReDD.
| # Is the ReDD specification meaningful outside of a Dflat given that
| any one ReDD directory knows nothing of its successors and
| predecessors, or is it dependent upon Dflat?
I think ReDD could be useful outside of Dflat but I am not sure that
ReDD can stand alone without something around it, e.g. Dflat.
But these are my opinions - I know that John Kunze, for instance, has
different ones.
So if anybody else has opinions about these specs, please speak up!
| # How many tools exist for these specs? I notice there's code in
| CPAN for Pairtree and Namaste, which is a fabulous start. Tools
| are the difference between YAMF (Yet Another Messy Filesystem) and
| reliably managed curation services. Granted, tools such as cp and
| emacs already exist and are part of the appeal of these
| micro-services, but there's also tremendous room for error if
| operations are all done "by hand."
You probably know all the tools by now; the only ones CDL has released
so far are those in John Kunze’s CPAN code. We are working on other code
here but it has not been released yet.
best,
Erik
1. http://lackoftalent.org/michael/blog/2009/09/27/exploring-curation-micro-services/
Thanks for getting the ball rolling, Erik and Mike. First, my answer to
the related question, "is ReDD dependent on Dflat?" is "no". Imagine a
command-line tool, "redd", called like "diff" with two directories:
    redd directory1 directory2
The caller needs to communicate "beforeness" and "afterness" in the way
that it supplies parameters. The fact that there might be directory3,
directory4, ..., directory9 waiting in the wings for their deltas to be
applied or computed is something that only the caller needs to know (e.g.,
the manager of the Dflat). So ReDD can stand alone just like diff/patch
can stand alone; the burden is on the caller to use both sensibly.
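To make the stand-alone point concrete, here is a minimal Python sketch of a
two-directory comparison in the spirit of the imagined "redd" command. It is
an illustration only: the function names (list_files, dir_delta) are
hypothetical, and the result is a plain summary of added, deleted, and changed
files, not the delta layout that the ReDD spec defines. The only notion of
"before" and "after" is the order of the two arguments.

```python
import filecmp
import os


def list_files(root):
    """Return the set of file paths under root, relative to root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            found.add(os.path.relpath(full, root))
    return found


def dir_delta(before, after):
    """Summarize what changed going from `before` to `after`.

    Hypothetical helper, not the ReDD format: it knows nothing about
    any other versions, only the two directories it is handed.
    """
    old, new = list_files(before), list_files(after)
    changed = sorted(
        path
        for path in old & new
        # shallow=False forces a byte-by-byte content comparison
        if not filecmp.cmp(
            os.path.join(before, path),
            os.path.join(after, path),
            shallow=False,
        )
    )
    return {
        "added": sorted(new - old),
        "deleted": sorted(old - new),
        "changed": changed,
    }
```

Calling dir_delta("directory1", "directory2") and dir_delta("directory2",
"directory1") gives mirror-image answers, which is the sense in which the
caller, not the tool, carries the knowledge of version order.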
As for whether Dflat should be defined to use ReDD, I also think "no".
Consider two structures that both conform to the Dflat spec, but for
version chain compression one uses "diff/patch" and the other uses "redd".
Moreover, assume they both declare which compression scheme they use --
explicitly via dflat-info.txt and implicitly via Namaste tags. These are
just two Dflats configured differently. There's always a risk that some
Dflat software will fail with an "I don't support diff/patch" message,
but that's a small price to pay compared to forcing a new name on the
differently configured structure.
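The "I don't support diff/patch" failure can be sketched in a few lines of
Python. Everything specific here is an assumption made for illustration: the
"Key: value" line syntax, the property name "Delta-scheme", and the function
names are hypothetical and are not taken from the Dflat spec. The sketch shows
only the shape of the check, in which a tool reads the declared scheme and
refuses cleanly if it does not implement it.

```python
# Schemes this hypothetical tool implements.
SUPPORTED_SCHEMES = {"redd"}


def read_info(text):
    """Parse assumed 'Key: value' lines into a dict with lowercased keys."""
    info = {}
    for line in text.splitlines():
        if ":" in line:
            key, value = line.split(":", 1)
            info[key.strip().lower()] = value.strip()
    return info


def check_delta_scheme(info_text, key="delta-scheme"):
    """Return the declared scheme, or None if the optional property is absent.

    The property name is a placeholder; substitute whatever the spec uses.
    """
    scheme = read_info(info_text).get(key)
    if scheme is None:
        return None  # no delta scheme declared
    if scheme.lower() not in SUPPORTED_SCHEMES:
        raise NotImplementedError(f"I don't support {scheme}")
    return scheme.lower()
```

Under these assumptions, a Dflat declaring "diff/patch" is still a Dflat; the
tool simply reports that it cannot handle that configuration.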
Note that while I think diff/patch is easily swappable for ReDD in Dflat,
I don't know enough about git or bazaar to make the same claim for their
respective differencing schemes.
-John