to datreant
Hey all,
First, happy new year, and I hope your 2016 is worth living! :D
I wanted to take a moment to lay out my overall vision for
datreant, which has been in near constant flux over the latter
part of 2015, but which I think has now settled to something
coherent enough to share. What once was just the existing core of
MDSynthesis is progressing into a more general-purpose package,
addressing what I consider a fairly universal pain point among
scientists. That is, datreant is giving pythonic form to the
fundamental data structure of scientific research: directory
trees.
Max is
responsible for this broader view. Originally the only thing
datreant did with directories was store data structures using the
`Treant.data` interface, but a more general API is not only
possible, but makes complete sense. Toward this end, the core
functionality of a Treant should be pythonic manipulation of
directory trees and, by extension, the files that live in them.
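To make "pythonic tree manipulation" a bit more concrete, here is a minimal sketch built only on the standard library. The `Tree` class, its indexing behavior, and the `make` and `leaves` names are hypothetical stand-ins for illustration, not datreant's actual API.

```python
import tempfile
from pathlib import Path


class Tree:
    """Hypothetical wrapper around a directory (illustration only)."""

    def __init__(self, path):
        self.path = Path(path)

    def __getitem__(self, name):
        # Indexing returns a wrapper for a subdirectory; nothing is created yet.
        return Tree(self.path / name)

    def make(self):
        # Create the directory (and any missing parents) on disk.
        self.path.mkdir(parents=True, exist_ok=True)
        return self

    @property
    def leaves(self):
        # Sorted names of the files sitting directly in this directory.
        return sorted(p.name for p in self.path.iterdir() if p.is_file())


with tempfile.TemporaryDirectory() as root:
    run = Tree(root)["sims"]["run1"].make()
    (run.path / "energies.txt").write_text("1.0 2.0\n")
    names = run.leaves
    print(names)
```

The point is only that directories and the files in them can be reached with ordinary Python indexing and attributes rather than string path juggling.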
This broader picture means a few big changes. First, the core of
datreant should exclude heavy dependencies with limited appeal,
such as the use of HDF5. This means breaking out the `Treant.data`
limb and others into a separate module
that upon import attaches these interfaces to the appropriate
classes. This makes datreant's core very light, but the same
machinery that makes it possible also makes it easy to build other
modules with future appeal, such as `datreant.blaze`
or `datreant.dask`.
The less-centralized structure keeps us from bloating the core
library, while giving the freedom to experiment.
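One way a separate module could attach an interface to a core class on import is sketched below. This is an assumption about the general pattern, not a description of datreant's machinery: the stand-in `Treant` class, the `_attach_limb` hook, and the `Data` limb are all hypothetical names.

```python
class Treant:
    """Stand-in for a lightweight core class (not datreant's implementation)."""

    def __init__(self, path):
        self.path = path

    @classmethod
    def _attach_limb(cls, name, limb_cls):
        # Expose the limb as a property so each instance gets its own
        # limb object bound to itself.
        def getter(self):
            return limb_cls(self)

        setattr(cls, name, property(getter))


class Data:
    """Hypothetical limb that would live in a separate, heavier package."""

    def __init__(self, treant):
        self._treant = treant

    def describe(self):
        return "data interface for " + self._treant.path


# In a real plugin, this call would sit at module level in the plugin
# package, so that importing the plugin is what attaches the interface.
Treant._attach_limb("data", Data)

t = Treant("experiments/run1")
print(t.data.describe())
```

With this arrangement the core never imports the plugin's dependencies; users who never import the plugin simply never see a `data` attribute.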
This change requires the core of datreant to move to a new
namespace, such as `datreant.core`, since `datreant` itself must
become a namespace package. See this issue
for discussion on that particular change.
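For anyone unfamiliar with namespace packages, the following sketch shows the idea using Python 3's implicit namespace packages (PEP 420), which is only one of several mechanisms and not necessarily the one we would adopt. The package name `demo_ns` and the `dist_a`/`dist_b` directories are made up for the demonstration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Two independent "distributions", each contributing one subpackage
    # to the same top-level name. Note that demo_ns itself has no
    # __init__.py in either location.
    for dist, sub in [("dist_a", "core"), ("dist_b", "data")]:
        pkg = Path(tmp) / dist / "demo_ns" / sub
        pkg.mkdir(parents=True)
        (pkg / "__init__.py").write_text(f"NAME = {sub!r}\n")
        sys.path.insert(0, str(Path(tmp) / dist))

    importlib.invalidate_caches()
    core = importlib.import_module("demo_ns.core")
    data = importlib.import_module("demo_ns.data")
    found = (core.NAME, data.NAME)
    print(found)
```

Both subpackages import under one top-level name even though they live in separate directories, which is what lets separately installed packages share a single namespace.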
What does this all mean for MDSynthesis? On the surface, nothing
will change. Whereas `datreant` will be an à la carte-style
collection of subpackages with a core, domain-specific
packages like MDSynthesis will still come "batteries-included",
and will include any datreant submodules they need as dependencies.
But because of this larger change in datreant, `Sim` objects will
see all the same interfaces `Treant`s can obtain through imports
of datreant subpackages, and of course they'll also get
improvements to the core `Treant` object itself.
One last thing: state files are moving to
a JSON format. This solves many of the issues we had with
using HDF5 for state files, and doesn't come at a real performance
hit for typical `Treant`s, or even MDSynthesis `Sim`s. A script
for converting existing state files in HDF5 format is already available
for use.
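As a rough picture of what a JSON state file round trip looks like, here is a sketch using only the standard library. The file name `Treant.json` and the `tags` and `categories` fields are assumptions made for the example, not the actual schema.

```python
import json
import os
import tempfile

# Hypothetical state contents; the real schema may differ.
state = {"tags": ["production"], "categories": {"temperature": "300K"}}

with tempfile.TemporaryDirectory() as d:
    statefile = os.path.join(d, "Treant.json")

    # Write the state as human-readable JSON.
    with open(statefile, "w") as f:
        json.dump(state, f, indent=2)

    # Read it back; plain dicts, lists, and strings survive unchanged.
    with open(statefile) as f:
        loaded = json.load(f)

print(loaded == state)
```

A plain-text format like this can be inspected and edited by hand and needs no compiled dependencies to read.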
All of these major changes should (finally!!) coalesce into a
series of releases for each of these packages. It's been a long
road, with most of the concerns about releasing stemming from the
fact that we are also supporting a file format which must be made
to change gracefully with time. I think we are finally nearing the
point where we can move to regular and frequent releases!
All of this is open to revision, of course, and I welcome any
alternative ideas anyone has for what our course should be. The
packages should be useful, after all. :D