How to detect a file overwrite from the Python client?


Bruno Santos

Dec 12, 2025, 8:35:42 AM
to iRODS-Chat

For context: we want to control data object overwrites and allow them only in certain cases.


For the icommands iput and icp (I still need to check other icommands), it was straightforward: in the PEPs pep_api_data_obj_put_pre and pep_api_data_obj_copy_pre, check for the param 'forceFlag' and, if present, verify the desired conditions.
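A minimal sketch of that check, written as plain Python so it can be tried outside the server. It assumes the PEP has already copied the request's condition keywords into an ordinary dict; `should_reject_put` and `overwrite_permitted` are hypothetical names for illustration, not part of iRODS.

```python
FORCE_FLAG_KW = "forceFlag"  # keyword name as quoted in the post above


def should_reject_put(cond_input: dict, overwrite_permitted: bool) -> bool:
    """Return True when a forced put/copy should be rejected.

    cond_input: the request's condition keywords as a plain dict
                (assumption: the PEP has extracted them into this form).
    overwrite_permitted: result of the site-specific policy check.
    """
    forced = FORCE_FLAG_KW in cond_input  # presence matters, not the value
    return forced and not overwrite_permitted
```

The helper only looks at the presence of the keyword, because a force flag is typically sent with an empty value.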


When it comes to the Python client, the PEPs that need to be looked at are pep_api_data_obj_open_pre and pep_api_data_obj_write_pre. But after analyzing the data available in these PEPs, I'm not able to find the force flag information.

Question 1: Is the force flag info available in these peps?

 

The best I was able to get was in the pep_resource_resolve_hierarchy_pre with:

```
if api_index == '602':  # DATA_OBJ_OPEN_AN
    if rule_args[3] == 'CREATE':
        pass  # put new file
    elif rule_args[3] == 'WRITE':
        pass  # put overwrite file
```

Question 2: Are the assumptions in the pep_resource_resolve_hierarchy_pre to find a file overwrite correct? Are they safe to use?

 

Also, I ran some tests with the Python client and found that the overwrite flag is verified on the client side: when the force flag is False and the file already exists, the request is not sent to the server.

Question 3: Is the force flag check done on the client side?

Question 3.1: If yes, can we assume that all requests to pep_api_data_obj_open_pre are always force flag by default?

 


Regards,

Bruno

Alan King

Dec 19, 2025, 11:15:04 AM
to irod...@googlegroups.com
Hi Bruno,

Question 1: Is the force flag info available in these peps?

The force flag should be available in the PEPs if it is being passed by the client, but its presence does not necessarily indicate that an overwrite is happening.

Question 2: Are the assumptions in the pep_resource_resolve_hierarchy_pre to find a file overwrite correct? Are they safe to use?

Yes, that PEP should reflect the correct operation. If a CREATE is initially requested and then the server finds that the object exists, it will change the operation to a WRITE, so you will know that the object is being opened for write at that point. Please let us know if that turns out not to be the case through your testing.

Question 3: Is the force flag check done on the client side? 

For python-irodsclient, yes: https://github.com/irods/python-irodsclient/blob/94b85941c91f0bf9a8135adeeaf28f8bc8d80cf4/irods/manager/data_object_manager.py#L339-L347 The force flag is checked in the client and then it is actually removed before making the API call. This is why you don't see the force flag in pep_api_data_obj_open_pre (see question 1).

Question 3.1: If yes, can we assume that all requests to pep_api_data_obj_open_pre are always force flag by default?

The DataObjOpen API does not consider the force flag, so that means that it will always open for write assuming a replica can be overwritten and the user has permission to do so. You could think of it as "always" having a force flag. The bottom line is that once the replica is opened for write, the client can seek anywhere in the file/object/replica and start overwriting stuff, just like in POSIX. In that sense, any open for write can be viewed as an overwrite.

The reason the behavior differs from iput is that iput is using the DataObjPut API which enforces the use of the force flag and things of that nature. python-irodsclient uses raw DataObjOpen which means that it needs to emulate the force flag behavior for its DataObjectManager.put operation in the client.
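The client-side emulation described above can be sketched roughly as follows. This is an illustration only, not the actual python-irodsclient code: `prepare_open_options`, `OverwriteWithoutForce`, and the `target_exists` argument are hypothetical names, and the options are modelled as a plain dict.

```python
class OverwriteWithoutForce(Exception):
    """Hypothetical error: target exists and no force flag was given."""


def prepare_open_options(target_exists: bool, options: dict) -> dict:
    """Emulate a force-flag check before a raw open-for-write call.

    Returns a copy of the options with the force flag removed, so the
    server never sees it. Raises if the target exists and the caller
    did not ask for an overwrite.
    """
    opts = dict(options)  # work on a copy, leave the caller's dict alone
    forced = "forceFlag" in opts
    opts.pop("forceFlag", None)  # strip the flag before the API call
    if target_exists and not forced:
        raise OverwriteWithoutForce("target exists; force flag required")
    return opts
```

Because the flag is stripped before the request is built, a server-side PEP on the open call has nothing to inspect, which matches the observation in question 1.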

You may be able to detect an overwrite by looking for O_TRUNC in the open flags. The DataObjectManager.put operation in python-irodsclient opens the replica in "w" mode, which will truncate the data: https://github.com/irods/python-irodsclient/blob/94b85941c91f0bf9a8135adeeaf28f8bc8d80cf4/irods/manager/data_object_manager.py#L582
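A small sketch of the O_TRUNC test, assuming the open flags have already been extracted from the request as an integer and that they use the same numeric values as the platform running the rule code (for example, Linux on both sides). `open_truncates` is a hypothetical helper name.

```python
import os


def open_truncates(open_flags: int) -> bool:
    """Return True if the POSIX-style open flags include O_TRUNC."""
    return bool(open_flags & os.O_TRUNC)
```

For example, `open_truncates(os.O_WRONLY | os.O_CREAT | os.O_TRUNC)` is true, while a plain `os.O_WRONLY | os.O_CREAT` is not. Note that an open for write without O_TRUNC can still overwrite data by seeking and writing, so this only catches truncating opens.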

Hope that helps.

Alan



--
Alan King
Senior Software Developer | iRODS Consortium

Daniel Moore

Dec 19, 2025, 11:51:12 AM
to irod...@googlegroups.com
Bruno,

Your assessment of the client's operation is correct. Also, FORCE_FLAG_KW will assuredly not be seen at the server, because it is popped from the open_options dict object passed to the OPEN API endpoint in the client.
 
In fact, as soon as the "pre" OPEN PEP is invoked during a Python client "put", you can infer that either:

   (1) the data object at the given path did not exist before the object "put" was attempted; or,
   (2) the data object did already exist and the force flag was indeed used in the client "put" call.

Therefore from the server perspective, when you see that PEP run, it would be consistent with the logical assumption that FORCE_FLAG_KW is turned on for the given call, whichever of the above cases was responsible for the server getting that far.

As strange and arbitrary as that may seem, it is also a strong reminder that the client only does streaming data object I/O (i.e. open followed by read or write, and then a close) and should not be considered in any way like a client PUT à la iput.

Hope this helps,

   Daniel Moore, Applications Engineer, iRODS Consortium


Daniel Moore

Dec 20, 2025, 1:56:49 PM
to irod...@googlegroups.com
I'll also second Alan's comments on Question 3.1: you can always assume a force flag was in place. But I think there's a caveat when the PRC is the client involved.

If it was a pre-existing data object and the PRC is indeed the source of the open-for-write in DataObjOpen, you can assume the force flag is on, since the client would have blocked the call otherwise. That is, in the OPEN "pre" PEP you can check for the existence of the object in question and, if it exists, logically deduce that the force flag was used. If the data object does not exist yet in the "pre" PEP, you know it's a CREATE type of open operation, in which case it is irrelevant whether the force flag was set or not.
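The deduction above can be written as a small decision function. This is a sketch under stated assumptions: the caller has already determined, inside the OPEN "pre" PEP, whether the object exists and whether the open is for write. `classify_open` and its return labels are hypothetical.

```python
def classify_open(object_exists: bool, open_for_write: bool) -> str:
    """Classify an open request seen in the OPEN "pre" PEP.

    "overwrite": object already exists and is opened for write, so a
                 PRC put must have been called with the force flag.
    "create":    object does not exist yet; the force flag is irrelevant.
    "read":      not opened for write, so nothing can be overwritten.
    """
    if not open_for_write:
        return "read"
    return "overwrite" if object_exists else "create"
```

A policy could then apply its overwrite conditions only when the result is "overwrite" and let the other two cases through.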

