Hi All
Development has been quiet on the surface for a while now. On my side, the reason was a big change to the code base that I couldn't share until it was reasonably complete.
Here are some key changes I made:
renamed Dataset_File to DataFile. Since I changed a lot of this code anyway, I took the opportunity to fix past mistakes. Django has an unenforced CamelCase naming convention for models, which some pieces of code, such as permissions parsing, assume.
removed Replica, Location and associated models. These models could not be left out when they were not needed, and there was no straightforward way to use different storage backends such as S3. It turned out to be easier to redesign than to adapt.
added DataFileObject, StorageBox, StorageBoxOption and StorageBoxAttribute models. A DataFile holds the metadata; DataFileObjects (DFOs) point to the data, and each DataFile can have several of them. However, DataFile.file_object now always supports reading and writing, and it simply uses the default storage. This means that a lot of code can ignore the abstraction behind DataFiles. StorageBoxes are storage locations that build on django-storage compatible modules and are assigned at the Dataset level. Apart from one default, they are defined in the database.
SFTP. This is a bonus to make the changes go down easier ;) Using the paramiko server component, I built an SFTP server into MyTardis. Run it with bin/django sftpd. For added scalability, I tested that it can be load balanced with HAProxy; just make sure the host key is the same on each node. SaltStack states to set it all up are going to be released eventually as well.
removed anything METS related.
removed tests that were now broken. Many of the tests broke after those changes, and many were unnecessary anyway. More tests are needed though; they just aren't here yet.
formatting changes. MyTardis tries to follow PEP8, and Travis runs a linter already, but with most checks turned off. Since I changed so much code, I also changed most of the files I visited to PEP8 as I went. Why PEP8? Mostly because it includes rules against trailing newlines, superfluous spaces and other diff-polluting things, so by using a PEP8-linting editor we can all avoid that. It is easier to follow all of PEP8 than to discuss which rules we're going to follow. These changes might allow me to include more checks in the pylint run, too.
The model changes come with schema and data migrations, which I have tested on a large deployed database but not against each and every possible configuration, so caution is advised.
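For anyone porting code, here is a rough sketch of how the new models relate to each other. It is not the actual MyTardis code: these are plain Python classes rather than Django models, and the in-memory storage, the field names other than file_object, and the URI scheme are all assumptions made up for illustration. It only shows the idea that a DataFile carries metadata, DataFileObjects point at the stored bytes, and StorageBoxes are assigned per Dataset.

```python
import io
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StorageBox:
    """Hypothetical stand-in for a storage location (local disk, S3, ...)."""
    name: str
    blobs: Dict[str, bytes] = field(default_factory=dict)  # uri -> content

    def save(self, uri: str, content: bytes) -> None:
        self.blobs[uri] = content

    def open(self, uri: str) -> io.BytesIO:
        return io.BytesIO(self.blobs[uri])


@dataclass
class Dataset:
    description: str
    storage_box: StorageBox  # boxes are assigned at the Dataset level


@dataclass
class DataFileObject:
    """Points at one stored copy of the data inside a StorageBox."""
    storage_box: StorageBox
    uri: str


@dataclass
class DataFile:
    """Holds the metadata; the bytes live behind one or more DFOs."""
    dataset: Dataset
    filename: str
    file_objects: List[DataFileObject] = field(default_factory=list)

    @property
    def file_object(self) -> io.BytesIO:
        # Reading: this sketch simply uses the first recorded copy.
        dfo = self.file_objects[0]
        return dfo.storage_box.open(dfo.uri)

    @file_object.setter
    def file_object(self, stream) -> None:
        # Writing: store in the dataset's box and record a new DFO.
        box = self.dataset.storage_box
        uri = f"{self.dataset.description}/{self.filename}"  # assumed scheme
        box.save(uri, stream.read())
        self.file_objects.append(DataFileObject(box, uri))


# Usage: calling code only touches file_object and never the DFOs directly.
default_box = StorageBox("default")
dataset = Dataset("experiment-1", default_box)
datafile = DataFile(dataset, "result.txt")
datafile.file_object = io.BytesIO(b"hello")
content = datafile.file_object.read()
```

The point of the property is that most calling code can read and write through file_object and leave the choice of storage location to the Dataset's StorageBox.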
In other news, Carlo sent a pull request for AAF authentication in MyTardis. I will test and accept it soon.
The near term plans are currently:
Continue development of storagebox functionality, such as copies, ownership, reporting, testing and listing of working backend modules, and squashfs support.
A new stable release with some of the new functionality.
An updated SaltStack deployment system release (it’s in production already).
There are also plans for API v2 that exposes the storagebox logic and is based on Django Rest Framework.
That's all I can think of for now.
Your questions, input and feedback are, as always, most welcome!
If you have any difficulty moving your existing code to the new storage backend, please contact me. I'm happy to help or advise to get it all working.
Cheers