irods+phobos = object name must be persistent

Antoine Migeon

Mar 16, 2023, 7:13:19 AM
to iRODS-Chat
Hello,

I am in the process of building an iRODS infrastructure composed of "unixfilesystem" resources replicated on "tape" resources for disaster recovery.

I want to keep costs as low as possible, so I don't want to use a proprietary HSM. My first choice was to test dsmarc+univMSS [1].
Now I plan to try "Phobos" to manage the tapes through a univMSS resource.
Phobos is an open-source project, which is now part of the io-sea project [2].

Like dsmarc, Phobos cannot rename an archived file: you have to get it and then put it again under the new name, which is impractical with many files.
Phobos accepts metadata on an object, but that metadata cannot be modified afterwards either.
These restrictions exist because most of the information in the database is also written to tape, so that it can be recovered from tape if the database is corrupted.

My problem is that iRODS does not use a persistent/consistent UUID to name files in the backends (when a user renames/moves a file, the file is also renamed/moved in the backend).
I don't know if there is a way to configure or modify this behavior.

I can choose the name/id of the object when I put it in Phobos, so I'm studying how to get a unique and persistent identifier to use in the univMSS.sh script (syncToArch and stageToCache).

Is the "data_id" database entry persistent for an object?
If so, I suppose I could run a query on each put/get to retrieve this field, and compose a Phobos OID like "zoneName+data_id".
However, this solution is not very efficient by design, since it adds a catalog query to every transfer. Is there a better way to do this?
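
To make the idea concrete, here is a minimal sketch of the lookup in Python. It assumes the univMSS script can pass the replica's physical path to a helper, that `iquest` runs with an initialized iRODS client environment, and that matching on `DATA_PATH` identifies the data object; the function names and the zone name "tempZone" are hypothetical, and none of this has been tested against Phobos.

```python
import subprocess


def make_phobos_oid(zone_name: str, data_id: int) -> str:
    """Compose a rename-proof object ID of the form 'zoneName+data_id'."""
    if not zone_name or int(data_id) <= 0:
        raise ValueError("zone name and a positive DATA_ID are required")
    return f"{zone_name}+{int(data_id)}"


def build_data_id_query(physical_path: str) -> str:
    """Build the GenQuery string that maps a replica path to its DATA_ID."""
    if "'" in physical_path:
        # Assumption: quoting of embedded single quotes is out of scope here.
        raise ValueError("single quotes in paths are not handled in this sketch")
    return f"select DATA_ID where DATA_PATH = '{physical_path}'"


def parse_data_id(iquest_output: str) -> int:
    """Extract exactly one numeric DATA_ID from iquest output."""
    ids = {line.strip() for line in iquest_output.splitlines()
           if line.strip().isdigit()}
    if len(ids) != 1:
        raise LookupError(f"expected one DATA_ID, got {sorted(ids)}")
    return int(ids.pop())


def lookup_data_id(physical_path: str) -> int:
    """Ask the catalog for the DATA_ID (one query per put/get)."""
    result = subprocess.run(
        ["iquest", "%s", build_data_id_query(physical_path)],
        capture_output=True, text=True,
    )
    return parse_data_id(result.stdout)
```

In syncToArch or stageToCache, the script would call something like `make_phobos_oid("tempZone", lookup_data_id(path))` and use the result as the Phobos OID instead of the file name. The cost is the one I mentioned: one extra catalog query per transfer.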


For the moment I'm doing my tests with a "replication" resource, but next I'll try to find a way to run the replication automatically and asynchronously, with a delay (this will be my first step with iRODS rules!).

Regards,
Antoine

[1] https://github.com/KTH-PDC/irods-dsmarc
[2] https://github.com/phobos-storage

Kory Draughn

Mar 20, 2023, 1:41:30 PM
to irod...@googlegroups.com
Hello Antoine,

Yes, the DATA_ID is persistent for data objects. Of course, please test and make sure it enables what you want.
If you come across a case where it doesn't do what you need, please create an issue in the repo. We're happy to think through solutions.

As for whether there is a more efficient design, that will be difficult to answer without measuring performance and considering other factors.

Thanks,

Kory Draughn
Chief Technologist
iRODS Consortium

