irods+phobos = object name must be persistent


Antoine Migeon

Mar 16, 2023, 7:13:19 AM
to iRODS-Chat
Hello,

I am in the process of building an iRODS infrastructure composed of "unixfilesystem" resources replicated on "tape" resources for disaster recovery.

I want to reduce costs as much as possible so I don't want to use a proprietary HSM, my first choice was to test dsmarc+univMSS [1].
Today, I plan to try "Phobos" to manage the tapes through a univMSS resource.
Phobos is an opensource project, which is now part of the io-sea project [2].

Like dsmarc, Phobos cannot rename an archived file: you have to get it and then put it again under the new name, which is not feasible with many files.
Phobos accepts metadata, but it can't be modified either.
These restrictions exist because most of the information in the database is written to tape in order to be able to retrieve the information on tape in case of database corruption.

My problem is that iRODS does not use a persistent/consistent UUID to name files in the backends (when a user renames/moves a file, the file is also renamed/moved in the backend).
I don't know if there is a way to configure/modify this behavior.

I can choose the name/id of the object when I put it in Phobos, so I'm studying how to get a unique and persistent identifier to use in the univMSS.sh script (syncToArch and stageToCache).

Is the "data_id" database entry persistent for an object?
If so, I suppose I could run a query on each put/get to retrieve this field and compose a Phobos OID like "zoneName+data_id".
However, this solution will not be very efficient by design. Is there a better way to do this?
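
A minimal sketch of the "zoneName+data_id" idea, for illustration only. The function names `phobos_oid` and `lookup_data_id` are hypothetical, and the sketch assumes that `iquest` is available to the account running the univMSS script and that the replica can be found by its physical path (DATA_PATH). It has not been tested against a real zone.

```shell
#!/bin/sh
# Sketch only: derive a stable Phobos object ID from the iRODS catalog.
# phobos_oid and lookup_data_id are hypothetical helper names.

# Compose the OID from the zone name and the catalog DATA_ID.
phobos_oid() {
    zone=$1
    data_id=$2
    printf '%s+%s\n' "$zone" "$data_id"
}

# Look up the DATA_ID of a replica from its physical path with iquest.
# Caveat: a single quote in the path breaks this naive query string,
# so the path would need escaping or a different lookup key.
lookup_data_id() {
    phys_path=$1
    iquest "%s" "select DATA_ID where DATA_PATH = '$phys_path'"
}

# Possible use inside syncToArch (not executed here):
#   oid=$(phobos_oid mesobfc "$(lookup_data_id "$2")")
```

This costs one catalog query per put/get, which is the overhead mentioned above. Whether that matters would have to be measured.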


For the moment I'm doing my tests with a "replication" resource, but next I'll try to find a way to do the replication automatically in asynchronous mode, in a delayed way (this will be my first step with iRODS rules!).

Regards,
Antoine

1 : https://github.com/KTH-PDC/irods-dsmarc
2 : https://github.com/phobos-storage

Kory Draughn

Mar 20, 2023, 1:41:30 PM
to irod...@googlegroups.com
Hello Antoine,

Yes, the DATA_ID is persistent for data objects. Of course, please test and make sure it enables what you want.
If you come across a case where it doesn't do what you need, please create an issue in the repo. We're happy to think through solutions.

As for whether there is a more efficient design, that will be difficult to answer without measuring performance and considering other factors.

Thanks,

Kory Draughn
Chief Technologist
iRODS Consortium


--
--
The Integrated Rule-Oriented Data System (iRODS) - https://irods.org
 
iROD-Chat: http://groups.google.com/group/iROD-Chat
---
You received this message because you are subscribed to the Google Groups "iRODS-Chat" group.
To unsubscribe from this group and stop receiving emails from it, send an email to irod-chat+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/irod-chat/516e7375-5a6a-4238-85e5-330ad1eddb51n%40googlegroups.com.

Antoine Migeon

Apr 7, 2026, 8:40:09 AM
to iRODS-Chat
Hello,

Thank you for your reply. We had put this project on hold, which is why I’m responding so late.

I’m having trouble with “univmss” type compound resources because the filenames are passed as arguments, which causes problems when a filename contains special characters (such as a single quote).

I saw that it’s possible to modify the physical path (acSetVaultPathPolicy) and add a timestamp to the physical object name, which is already helpful and avoids the need for rename/mv operations. But is there a way to change or choose the name of the physical object in the vault?

For example, I’d like to encode the file name in Base64 before passing it to univMSS.
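
To make the idea concrete, here is a small sketch of the encoding step alone. The helper names `b64_encode_name` and `b64_decode_name` are hypothetical, and the sketch assumes a `base64` command that supports `-d` (as in GNU coreutils). It uses the URL-safe alphabet so the result contains neither `/` nor quotes. Where such a step could be hooked into iRODS remains the open question.

```shell
#!/bin/sh
# Sketch only: reversible, quote-free encoding of a file name.
# Assumes a base64 command with -d for decoding (GNU coreutils).

# Encode a name as URL-safe Base64: '+' becomes '-', '/' becomes '_'.
b64_encode_name() {
    printf '%s' "$1" | base64 | tr -d '\n' | tr -- '+/' '-_'
}

# Reverse the mapping and decode back to the original name.
b64_decode_name() {
    printf '%s' "$1" | tr -- '-_' '+/' | base64 -d
}

# Possible use (not executed here):
#   enc=$(b64_encode_name "quote'name.file")
#   orig=$(b64_decode_name "$enc")
```

Note that Base64 makes names about one third longer, so very long file names could exceed a file-name length limit in the backend.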

Regards,
Antoine

Bruno Santos

Apr 8, 2026, 9:03:44 AM
to iRODS-Chat
Hi,

Here is some info on what we have (it might help):

We have this scenario: replication from a unixfilesystem storage to a tape storage.

Handling special chars was one of the issues.
We did it with:
1. acSetVaultPathPolicy set to the random scheme.
2. Custom rule to replicate:
2.1. If the file name has special chars not allowed on tape, we rename it so that it contains only allowed chars.
2.2. Replicate to tape.
2.3. If the file was renamed, we restore the original name. This only updates the logical path in the database. The physical path in the tape resource keeps the chars used in the temporary name.


Example with a file with a space in the name, where the space is replaced with a _ on tape:

$ ils -L 'br_tst bb'
  user001          1 tape;s3-01           31 2025-04-25.14:04 & br_tst bb
    sha2:ChL/1o9AcGBb8iCa8xJlZjWL4eoTVJwvEcQwDcOCn+Y=    generic    /s3-01/irods/5/12/br_tst_bb.718693.1745582691
$
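
The steps above could be sketched with icommands roughly as follows. This is not the actual rule used in production, only an illustration: the helper names `sanitize_name` and `replicate_with_safe_name` are hypothetical, and the set of characters allowed on tape (`A-Za-z0-9._-`) is an assumption. The sketch does not handle the case where the temporary name already exists in the collection.

```shell
#!/bin/sh
# Sketch only: rename, replicate, restore (steps 2.1 to 2.3).
# The allowed character set below is an assumption.

# Replace every byte outside the allowed set with '_'.
sanitize_name() {
    printf '%s' "$1" | LC_ALL=C tr -c 'A-Za-z0-9._-' '_'
}

# Replicate one data object to a tape resource under a safe name.
# $1 = collection, $2 = data object name, $3 = target resource
replicate_with_safe_name() {
    coll=$1
    name=$2
    resc=$3
    safe=$(sanitize_name "$name")

    # 2.1 Rename only if the name contains disallowed chars.
    if [ "$safe" != "$name" ]; then
        imv "$coll/$name" "$coll/$safe" || return 1
    fi

    # 2.2 Replicate to tape, remembering the result.
    irepl -R "$resc" "$coll/$safe"
    status=$?

    # 2.3 Restore the original logical name.
    if [ "$safe" != "$name" ]; then
        imv "$coll/$safe" "$coll/$name" || return 1
    fi
    return "$status"
}
```

With this set, `sanitize_name 'br_tst bb'` gives `br_tst_bb`, which matches the physical path shown in the listing above. Between steps 2.1 and 2.3 clients see the temporary name.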


Regards,
Bruno

Antoine Migeon

Apr 9, 2026, 3:41:33 AM
to iRODS-Chat
Hi Bruno,

I hadn't thought of doing it that way—thanks for the tip!
Just to make sure I understand correctly: you rename the "data_name" (logical name) in a rule (or PEP) before running `msisync_to_arch`, and then, once the write operation is complete, you restore the original data_name.
So during step 2.2, the client sees a file with a different name—is that right?

I think I’m still pretty far from understanding how to do that technically... If anyone has an example of a rule for this, that would be great!

Regards,
Antoine

Kory Draughn

Apr 9, 2026, 10:58:54 AM
to irod...@googlegroups.com
Antoine,

Perhaps this can aid in what you're attempting to do as well. The project looks to be no more than 2 years old. 
Kory Draughn
Chief Technologist
iRODS Consortium

Antoine Migeon

Apr 9, 2026, 12:52:39 PM
to irod...@googlegroups.com
Kory,
Yes, thank you; this link was in my first mail.

My problem does not concern Phobos directly. 
It is more a "univmss" problem.

Antoine


Kory Draughn

Apr 9, 2026, 1:15:13 PM
to irod...@googlegroups.com
Ok. Sounds like you already took a look at the univMSS implementation there.

Kory Draughn
Chief Technologist
iRODS Consortium

Antoine Migeon

Apr 9, 2026, 4:46:06 PM
to iRODS-Chat
Yes, I saw their script. In fact, they created it after I told them I wanted to use Phobos behind iRODS a few years ago.

Sorry, I may not have been clear in my explanation. My main concern is the handling of certain characters like single quotes because they can cause problems when passed as arguments to the "univMSS" script. I'm looking for a way to handle this issue before calling the univMSS script.

# upload file on compound resource

$ iput -R comp2 "quote'name.file"
remote addresses: 193.52.249.65 ERROR: putUtil: put error for /mesobfc/home/antoine/quote'name.file, status = -550000 status = -550000 UNIV_MSS_SYNCTOARCH_ERR

$ ils -l "quote'name.file"
  antoine           0 comp2;cache2            7 2026-04-09.22:28 & quote'name.file
  antoine           1 comp2;archive2            7 2026-04-09.22:28 X quote'name.file


# iRODS logs show that the correct filename is sent

{"log_category":"legacy","log_level":"info","log_message":"execCmd:/var/lib/irods/msiExecCmd_bin/test_univMSS.sh argv:syncToArch '/irods/storage/1/cache2/home/antoine/quote'name.file' '/irods_phobos/home/antoine/quote'name.file'","request_api_name":"FILE_SYNC_TO_ARCH_AN","request_api_number":525,"request_api_version":"d","request_client_user":"antoine","request_host":"193.52.249.65","request_proxy_user":"rods","request_release_version":"rods5.0.2","server_host":"irods-dev-disk-02.mesobfc.fr","server_pid":3539631,"server_timestamp":"2026-04-09T20:28:02.145Z","server_type":"agent","server_zone":"mesobfc"}

{"log_category":"legacy","log_level":"error","log_message":"[-]\t/irods_source/server/api/src/rsFileSyncToArch.cpp:182:int _rsFileSyncToArch(rsComm_t *, fileStageSyncInp_t *, fileSyncOut_t **) :  status [UNIV_MSS_SYNCTOARCH_ERR]  errno [] -- message [fileSyncToArch failed for [/irods_phobos/home/antoine/quote'name.file]]\n\t[-]\t/irods_source/server/drivers/src/fileDriver.cpp:612:irods::error fileSyncToArch(rsComm_t *, irods::first_class_object_ptr, const std::string &) :  status [UNIV_MSS_SYNCTOARCH_ERR]  errno [] -- message [failed to call 'synctoarch']\n\t\t[-]\t/irods_source/plugins/resources/src/univmss.cpp:648:irods::error univ_mss_file_sync_to_arch(irods::plugin_context &, const char *) :  status [UNIV_MSS_SYNCTOARCH_ERR]  errno [] -- message [univ_mss_file_sync_to_arch: copy of [/irods/storage/1/cache2/home/antoine/quote'name.file] to [/irods_phobos/home/antoine/quote'name.file] failed.   stdout buff [0]   stderr buff [0x55d2ca8315b0]  status [-344000]]\n\n","request_api_name":"FILE_SYNC_TO_ARCH_AN","request_api_number":525,"request_api_version":"d","request_client_user":"antoine","request_host":"193.52.249.65","request_proxy_user":"rods","request_release_version":"rods5.0.2","server_host":"irods-dev-disk-02.mesobfc.fr","server_pid":3539626,"server_timestamp":"2026-04-09T20:28:07.152Z","server_type":"agent","server_zone":"mesobfc"}


# but the univMSS script receives the wrong args:

$1 == syncToArch
$2 == /irods/storage/1/cache2/home/antoine/quote
$3 == name.file
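
For reference, a sketch of how such a trace could be produced inside the script, together with a guard that fails early when the source path is not an existing file. The function names `dump_args` and `check_sync_source` are hypothetical and are not part of the stock univMSS.sh.

```shell
#!/bin/sh
# Sketch only: debugging helpers for a univMSS-style script.

# Print every positional argument as "$N == value", one per line.
dump_args() {
    i=1
    for arg in "$@"; do
        printf '$%d == %s\n' "$i" "$arg"
        i=$((i + 1))
    done
}

# Fail early if the source path handed to syncToArch is not a file,
# which is what happens when the path was split at a quote.
check_sync_source() {
    if [ ! -f "$1" ]; then
        printf 'source not found: %s\n' "$1" >&2
        return 1
    fi
}

# Possible use at the top of the script (not executed here):
#   dump_args "$@" >&2
#   [ "$1" = syncToArch ] && { check_sync_source "$2" || exit 1; }
```

The guard does not fix the splitting. It only turns it into an explicit error message instead of a failed copy.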


Sincerely,