Hi,
We ran into a strange issue that we can only describe as a file path collision.
Two different data objects (same file name, but in different collections) ended up with the same physical path in iRODS when replicated to an S3 resource.
As background: our iRODS server runs version 4.3.4 and has the random scheme enabled.
Our replication process is custom-made (due to issues with the replication resource type).
Short summary of our process to move files to tapes (names are tape_1 and tape_2):
1. File is placed in local storage (name hot_1) and it has a checksum.
Replicate to tapes:
2. Replication to tape_1 without checksum calculation
3. Replication to tape_2 without checksum calculation
Integrity check:
4. Calculate checksum on tape_1.
5. Compare the checksum with the value from hot_1.
6. Calculate checksum on tape_2.
7. Compare the checksum with the value from hot_1.
Trim:
8. If all checksums match, the hot_1 replica is trimmed.
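For reference, the equivalent steps with plain icommands would look roughly like this (a sketch only, using the paths and resource names from the example below; our real process drives this through rules, not icommands):
```
# Steps 2-3: replicate from hot_1 to both tapes, no checksum verification
irepl -S hot_1 -R tape_1 /ZONE/COLLECTIONS/DATA/ydvJK1/data_object_name.txt
irepl -S hot_1 -R tape_2 /ZONE/COLLECTIONS/DATA/ydvJK1/data_object_name.txt
# Steps 4-7: compute checksums on the tape replicas (compared afterwards
# against the checksum already registered for the hot_1 replica)
ichksum -R tape_1 /ZONE/COLLECTIONS/DATA/ydvJK1/data_object_name.txt
ichksum -R tape_2 /ZONE/COLLECTIONS/DATA/ydvJK1/data_object_name.txt
# Step 8: if all checksums match, trim the hot_1 replica, keeping 2 copies
itrim -S hot_1 -N 2 /ZONE/COLLECTIONS/DATA/ydvJK1/data_object_name.txt
```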
We do the upload and the checksum calculation in separate steps because it is a tape system: we can only read from its cache, and a file is deleted from the cache after a few minutes (I think 5). That is not enough time to checksum a big file. Joris created a discussion about this in the group in the past.
The code used for replication is:
```
ReplicateToTape {
    writeLine("serverLog", "msiDataObjReplWrapper: Replicate *path from *resource_source to *resource_destination");
    msiDataObjRepl(*path, 'destRescName=*resource_destination++++rescName=*resource_source++++irodsAdmin=++++verifyChksum=0', *out_param);
    writeLine("serverLog", "msiDataObjReplWrapper: Replicate *path from *resource_source to *resource_destination done: *out_param");
}
INPUT *resource_destination="tape", *path="/a/data/obj/path", *resource_source="hot_1"
OUTPUT ruleExecOut
```
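We invoke the rule per file with irule, overriding the INPUT defaults, along these lines (an illustrative invocation; the rule file name ReplicateToTape.r is an assumption):
```
irule -F ReplicateToTape.r \
    "*path='/ZONE/COLLECTIONS/DATA/ydvJK1/data_object_name.txt'" \
    "*resource_source='hot_1'" \
    "*resource_destination='tape_2'"
```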
The rule execution outputs the logs (with the server timestamp):
2026-04-01T07:42:26.030Z msiDataObjReplWrapper: Replicate /ZONE/COLLECTIONS/DATA/ydvJK1/data_object_name.txt from hot_1 to tape_2
2026-04-01T07:42:26.357Z msiDataObjReplWrapper: Replicate /ZONE/COLLECTIONS/DATA/ydvJK1/data_object_name.txt from hot_1 to tape_2 done: 0
2026-04-01T07:42:26.030Z msiDataObjReplWrapper: Replicate /ZONE/COLLECTIONS/DATA/cF5g5T/data_object_name.txt from hot_1 to tape_2
2026-04-01T07:42:26.365Z msiDataObjReplWrapper: Replicate /ZONE/COLLECTIONS/DATA/cF5g5T/data_object_name.txt from hot_1 to tape_2 done: 0
The tape resource details:
$ ilsresc tape_2
tape_2:passthru
└── s3-tape-02:s3
$ ilsresc -l s3-tape-02
(...)
context: S3_DEFAULT_HOSTNAME=s3.object-archive.nl;S3_AUTH_FILE=/var/data/s3-tape-02.s3.keypair;S3_REGIONNAME=nlprd-02;S3_RETRY_COUNT=2;S3_WAIT_TIME_SECONDS=3;S3_PROTO=HTTPS;ARCHIVE_NAMING_POLICY=consistent;HOST_MODE=cacheless_attached;S3_CACHE_DIR=/s3/cache;S3_MPU_CHUNK=125
(...)
And the ils for the data objects:
$ ils -L /ZONE/COLLECTIONS/DATA/ydvJK1/data_object_name.txt
irods 1 tape_1;s3-tape-01 86 2026-04-01.09:42 & data_object_name.txt
sha2:YlBCA+WhdUmm81mfGmI0GGCatFaQyWLKm6mOUJa9AB0= generic /s3-tape-01/irods/5/0/data_object_name.txt.1775029345
irods 2 tape_2;s3-tape-02 86 2026-04-01.09:42 & data_object_name.txt
sha2:YlBCA+WhdUmm81mfGmI0GGCatFaQyWLKm6mOUJa9AB0= generic /s3-tape-02/irods/1/13/data_object_name.txt.1775029346
$
$ ils -L /ZONE/COLLECTIONS/DATA/cF5g5T/data_object_name.txt
irods 0 hot_1;local_filestore 86 2026-04-01.09:24 & data_object_name.txt
sha2:ls3PiRia8R9k75L3ZJWFMrnWXVv4o8S2IyAvBR9RJ+4= generic /irods/data/a278a33b-9081-40a3-bfc6-03e412f952e9/data_object_name.txt
irods 1 tape_1;s3-tape-01 86 2026-04-01.09:42 & data_object_name.txt
sha2:ls3PiRia8R9k75L3ZJWFMrnWXVv4o8S2IyAvBR9RJ+4= generic /s3-tape-01/irods/12/12/data_object_name.txt.1775029345
irods 2 tape_2;s3-tape-02 86 2026-04-01.09:42 & data_object_name.txt
sha2:YlBCA+WhdUmm81mfGmI0GGCatFaQyWLKm6mOUJa9AB0= generic /s3-tape-02/irods/1/13/data_object_name.txt.1775029346
$
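The collision is visible directly in the catalog: a GenQuery on the shared physical path should return both data objects (query shown for illustration):
```
$ iquest "SELECT COLL_NAME, DATA_NAME WHERE DATA_PATH = '/s3-tape-02/irods/1/13/data_object_name.txt.1775029346'"
```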
(The server log timestamps above are 2 hours behind the ils timestamps due to timezone.)
Our checksum validation failed because on tape_2 those two data objects point to the same physical path, so they have the same content (they are the same file).
This is the first time we have seen this, and the process has executed thousands of times before without issue.
Any ideas?
Thanks in advance,
Bruno