Hello,
I am in the process of building an iRODS infrastructure composed of "unixfilesystem" resources replicated on "tape" resources for disaster recovery.
I want to keep costs as low as possible, so I don't want to use a proprietary HSM. My first choice was to test dsmarc+univMSS [1].
Today, I plan to try "Phobos" to manage the tapes through a univMSS resource.
Phobos is an open-source project that is now part of the io-sea project [2].
Like dsmarc, Phobos cannot rename an archived object: you have to get it and then put it again under the new name, which is impractical with many files.
Phobos accepts metadata on objects, but that metadata cannot be modified afterwards either.
These restrictions exist because most of the information in the database is also written to tape, so that it can be recovered from the tapes if the database gets corrupted.
My problem is that iRODS does not use a persistent, consistent UUID to name files in the backends: when a user renames or moves a file, that file is renamed or moved in the backend as well.
I don't know if there is a way to configure/modify this behavior.
I can choose the name/id of the object when I put it in Phobos, so I'm studying how to get a unique and persistent identifier to use in the univMSS.sh script (syncToArch and stageToCache).
Is the "data_id" database entry persistent for an object?
If so, I suppose I could run a query on each put/get to retrieve this field and compose a Phobos OID like "zoneName+data_id".
However, this solution will not be very efficient by design. Is there a better way to do this?
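To make the idea concrete, here is a minimal sketch of the two helpers I have in mind for univMSS.sh. The function names (lookup_data_id, compose_oid) are hypothetical, and I am assuming that the physical path handed to the script matches the DATA_PATH value stored in the catalog, which I have not verified. It also does not handle paths containing single quotes, or the same physical path being registered on two different resources.

```shell
#!/bin/sh
# Sketch only, not a tested univMSS.sh: the helper names are hypothetical.

# Ask the catalog for the DATA_ID of the replica stored at a physical path.
# Assumption: the path given to the script equals DATA_PATH in the catalog.
# Needs the iquest icommand and a working iRODS client environment.
lookup_data_id() {
    phys_path=$1
    iquest "%s" "select DATA_ID where DATA_PATH = '$phys_path'"
}

# Build an object identifier of the form "zoneName+data_id".
# Rejects anything that is not a plain decimal number, so that an error
# message printed by a failed query never ends up inside an OID.
compose_oid() {
    zone=$1
    data_id=$2
    case $data_id in
        ''|*[!0-9]*)
            echo "invalid data_id: '$data_id'" >&2
            return 1
            ;;
    esac
    printf '%s+%s\n' "$zone" "$data_id"
}
```

In syncToArch and stageToCache I would then call something like `oid=$(compose_oid "$zone" "$(lookup_data_id "$path")")` before handing the OID to Phobos, which means one catalog query per put/get.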
For the moment I'm doing my tests with a "replication" resource, but next I'll try to find a way to do the replication automatically and asynchronously, with a delay (this will be my first step with iRODS rules!).
Regards,
Antoine
[1] https://github.com/KTH-PDC/irods-dsmarc
[2] https://github.com/phobos-storage