Backup option on irepl


joris luijsterburg

Mar 9, 2023, 9:44:57 AM
to iRODS-Chat
Hey all,

I am not sure if I completely understand the -B option on irepl. In the irepl -h it says:
 -B  Backup mode - if a good copy already exists in this
     resource, don't make another copy.

So suppose I have a collection that I want to replicate from resc1 to resc2, with the following contents:
/home/irods/testdata:
  irods             0 resc1            0 2023-03-08.14:20 & 3dfile
  irods             0 resc1            0 2023-03-08.14:21 & 4thfile
  irods             1 resc2            0 2023-03-09.14:26 & 4thfile
  irods             0 resc1            0 2023-03-08.14:16 & somefilehere
  irods             0 resc1            0 2023-03-03.13:43 & somefile.txt

I see that '4thfile' is already on resc2, so if I execute my irepl command I get the following output:

irepl -r -R resc2 /test/home/irods/testdata
remote addresses: 127.0.1.1 ERROR: replCollUtil: replDataObjUtil failed for /test/home/irods/testdata/4thfile. status = -169000 status = -169000 SYS_NOT_ALLOWED
remote addresses: 127.0.1.1 ERROR: replUtil: repl error for /test/home/irods/testdata, status = -169000 status = -169000 SYS_NOT_ALLOWED


The result is that all files are now on both resc1 and resc2, which is what I want. Getting an error also makes sense, because the file was already on resc2. However, I would rather the command not report an error, since the outcome actually satisfies my goal. My guess was that I could use irepl -B for this, since it should ignore that 4thfile is already on resc2. However, when I reset my files to the original state (using itrim) and run the command below, I get the exact same error.
irepl -rB -R resc2 /RDMtest/home/irods/testdata

Am I misunderstanding irepl -B?

I also tried -B without -r, -rU instead of -rB, and adding -S; all of them give some error.
Best regards,
Joris Luijsterburg 

Alan King

Mar 9, 2023, 1:24:06 PM
to irod...@googlegroups.com
Hi,

First, the -B option of irepl should be deprecated and removed. I have created an issue for the deprecation here: https://github.com/irods/irods/issues/6953
I thought we had already done this, but here we are. :)

The -U flag does nothing in iRODS 4.2.9+; it is still accepted only because of its prevalence in existing deployments. The replication API now updates existing replicas and creates new replicas by default, as needed.

This hasn't been published in the docs yet, but we have a write-up for the "rules" of replication in iRODS at a conceptual level: https://github.com/irods/irods_docs/blob/main/docs/system_overview/data_objects.md#replicate

We are in case 4 of the table: replicating a data object from a resource with a good replica to a resource with a good replica is not allowed. As such, the replication API has been implemented to return an error when this is attempted, and that error is what you are seeing from the irepl client.

You are suggesting some way of indicating to the replication API (via some option in the irepl client) that it is not an error to attempt to overwrite a good replica with a good replica and that it should silently do nothing in this case. Is that right?

We have felt pretty strongly that our only course of action from the API perspective is to report errors as they occur and the caller is responsible for dealing with those errors (including ignoring them). However, now that you've brought this up, I'm starting to reconsider just for this one instance... If the replication API is directed to overwrite a good replica, there are a few things to consider:
    1. If we are not going to return an error when overwriting a good replica with a good replica, what SHOULD we do? Silent no-op?
    2. What is the status of the source replica? Should an error be returned if a good replica will be overwritten by a stale replica?
    3. What if the source resource is the same as the destination resource (e.g. -S resc1 -R resc1)?
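
To make the "caller deals with the errors" option concrete, here is a minimal Python sketch of a wrapper that runs irepl and treats SYS_NOT_ALLOWED (-169000) as ignorable. It is an illustration only, not an official tool: the function names are hypothetical, the regular expression assumes the per-object error lines look like the output quoted earlier in this thread, and it assumes irepl exits with a non-zero status when any error occurred.

```python
import re
import subprocess

SYS_NOT_ALLOWED = -169000  # error code taken from the irepl output in this thread

# Assumed shape of a per-object error line (copied from the output above):
#   ... replDataObjUtil failed for /path/to/obj. status = -169000 ...
_OBJ_ERROR = re.compile(
    r"replDataObjUtil failed for (?P<path>.+?)\. status = (?P<code>-?\d+)"
)


def split_repl_errors(error_text, ignorable=(SYS_NOT_ALLOWED,)):
    """Split per-object errors into (ignored_paths, fatal_errors)."""
    ignored, fatal = [], []
    for match in _OBJ_ERROR.finditer(error_text):
        path, code = match.group("path"), int(match.group("code"))
        if code in ignorable:
            ignored.append(path)
        else:
            fatal.append((path, code))
    return ignored, fatal


def replicate_collection(collection, resource):
    """Hypothetical helper: run `irepl -r -R` and tolerate ignorable errors."""
    proc = subprocess.run(
        ["irepl", "-r", "-R", resource, collection],
        capture_output=True,
        text=True,
    )
    ignored, fatal = split_repl_errors(proc.stdout + proc.stderr)
    # Fail on real per-object errors, and also on a non-zero exit that we
    # could not explain through ignorable per-object errors.
    if fatal or (proc.returncode != 0 and not ignored):
        raise RuntimeError(
            f"irepl failed (exit {proc.returncode}): {fatal or proc.stderr.strip()}"
        )
    return ignored  # objects that were skipped, not replicated
```

Calling `replicate_collection(...)` with a collection path and a target resource would then return the list of skipped objects instead of failing the whole run. Note that this ignores every SYS_NOT_ALLOWED, whatever its cause, which may be broader than intended.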

Perhaps also of interest: I attempted to start a discussion about whether to report errors in the presence of the recursive flag a while back: https://github.com/irods/irods/issues/5726

Sorry if that's more than you bargained for. This is an important topic!


--
Alan King
Senior Software Developer | iRODS Consortium

joris luijsterburg

Mar 10, 2023, 3:20:13 AM
to iRODS-Chat
Alan,

Thanks, that does clear things up! And I like these discussions, so your efforts aren't wasted on me! Maybe some background on where my line of thinking comes from. In my regular deployments I use ansible. In ansible every operation is idempotent, so when I run the operation again, it has the exact same outcome. Also, in ansible you typically do not describe the action to take, but rather the desired outcome (e.g. I want /etc/irods to be a directory with mode 0640), and the underlying ansible automation handles the if-else logic. This influences my thinking about automation.

However, at this moment I am not using ansible but trying to incorporate irepl in a rule, or actually msiDataObjRepl and/or msiCollRepl. The problem I have is the following: if I have a rule that performs msiCollRepl on a collection, and for some reason one file in that collection is already replicated, the resulting irepl or msiCollRepl operation reports an error, but the rest of the files are still copied. So naturally, when I investigated and saw the -B option, my line of thought was: great, this looks like the ansible way of working, I want to use that! However, the iRODS way of working is not the ansible way of working; there are different ground rules. I can work either way, as long as I understand exactly what happens, and that is where the table you provided will come in handy. (I am going to have a look at itrim later, so this is a useful link.)

The difficulty here is in the -r part, especially if you use it in a rule, since catching errors on individual files is then not straightforward (or can I somehow retrieve a list of these files from msParam_t *status?), so you would rather have iRODS take care of as much as possible.

For my case I am now thinking more along the lines of just doing the msiCollRepl in my rule and assuming everything went fine. Next to that, I need some monitoring on files that I want to have replicated but that are not, and generic monitoring on not-good files (dirty status != 1).
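
The monitoring idea above could be sketched roughly as follows in Python, by parsing `ils -l` output like the listing at the top of this thread. This is a sketch under assumptions, not a finished monitor: the function name is hypothetical, the regular expression assumes every replica line has the column layout shown in that listing (including a one-character status marker, with `&` meaning a good replica), and it handles a single collection listing only, since object names would collide across subcollections.

```python
import re
from collections import defaultdict

# Assumed layout of one replica line in `ils -l`, e.g.:
#   irods             1 resc2            0 2023-03-09.14:26 & 4thfile
_REPLICA = re.compile(
    r"^\s+(?P<owner>\S+)\s+(?P<num>\d+)\s+(?P<resc>\S+)\s+(?P<size>\d+)\s+"
    r"(?P<mtime>\d{4}-\d{2}-\d{2}\.\d{2}:\d{2})\s+(?P<status>\S)\s+(?P<name>.+)$"
)


def replica_report(listing, target_resc):
    """Return (missing, not_good) for one collection's `ils -l` listing.

    missing:  names without a good ('&') replica on target_resc
    not_good: names with at least one replica whose status is not '&'
    """
    replicas = defaultdict(list)
    for line in listing.splitlines():
        match = _REPLICA.match(line)
        if match:  # the "/collection:" header line does not match and is skipped
            replicas[match.group("name")].append(
                (match.group("resc"), match.group("status"))
            )
    missing = sorted(
        name for name, reps in replicas.items() if (target_resc, "&") not in reps
    )
    not_good = sorted(
        name for name, reps in replicas.items()
        if any(status != "&" for _, status in reps)
    )
    return missing, not_good
```

Feeding it the listing from the first message with `target_resc="resc2"` reports every object except 4thfile as missing. The `missing` list could also serve as input for replicating only those objects, which is closer to the idempotent style described earlier.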

That being said, if you were to implement an option like this, here are my thoughts on your three points:

1. A silent no-op is what I would expect.

2. This is why I ask questions: my thinking about iRODS hadn't gone this far yet, and I am very glad yours has. Row 7 of your table clearly shows something I cannot imagine any iRODS user typically wanting. However, it is in principle a valid use case, for when you have made a wrong update on a replica and want to roll back.

3. It could be a no-op without error, as the replica is already in the resource you want, but replicating a file or a collection to the same resource it came from doesn't really make sense anyway. In automation you might want no error here, but there too it should be an easily preventable edge case.


Best regards,
Joris Luijsterburg