dcm4chee DB study_on_fs table questions


leogrande

Aug 25, 2013, 10:24:46 PM
While looking for a solution to my problem with deleting studies from NEARLINE storage, I came across the study_on_fs table in the dcm4chee PostgreSQL database.

I do not understand why, after the FileCopy service has successfully copied files to the NEARLINE storage, the study_on_fs table only has records for the source file system, i.e. the one FileCopy copied the files from.
This doesn't make sense to me. The studies are stored on two file systems and, IMO, dcm4chee (at least the FileSystemMgt/Delete Study Service) needs to know where all copies of a study are stored.
Nevertheless, studies can be retrieved from the NEARLINE storage, and the file table shows all file systems correctly.

When a study is deleted by the scheduled deletion procedure, its record is deleted from the study_on_fs table as well. That is why the study's copy on the NEARLINE storage can't be deleted by the scheduled deleter. I would expect either that the study_on_fs table lists all file systems, or that, after the study is deleted from the first file system, its record is updated to point to the file system that holds the copy.
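
The mismatch described above can be reproduced on a toy model. The following is only a sketch: it uses an in-memory SQLite database from Python rather than the real PostgreSQL schema, and the tables are simplified, hypothetical stand-ins (the real dcm4chee schema has more columns and links files to studies through instance and series). It lists every study/file system pair that has files but no study_on_fs record.

```python
import sqlite3

# Simplified, hypothetical stand-in for the dcm4chee tables (assumption:
# files are linked directly to a study here to keep the example short).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE filesystem (pk INTEGER PRIMARY KEY, fs_group_id TEXT);
CREATE TABLE study (pk INTEGER PRIMARY KEY);
CREATE TABLE files (pk INTEGER PRIMARY KEY, study_fk INTEGER, filesystem_fk INTEGER);
CREATE TABLE study_on_fs (pk INTEGER PRIMARY KEY, study_fk INTEGER, filesystem_fk INTEGER);

INSERT INTO filesystem VALUES (1, 'ONLINE_STORAGE'), (2, 'NEARLINE_STORAGE');
INSERT INTO study VALUES (10);
-- The study has a file on each file system (original plus FileCopy copy) ...
INSERT INTO files VALUES (100, 10, 1), (101, 10, 2);
-- ... but only the source file system is recorded in study_on_fs.
INSERT INTO study_on_fs VALUES (1000, 10, 1);
""")

# Study/file system pairs that have files but no study_on_fs record.
MISSING_SQL = """
SELECT DISTINCT f.study_fk, fs.fs_group_id
FROM files f
JOIN filesystem fs ON fs.pk = f.filesystem_fk
LEFT JOIN study_on_fs sof
       ON sof.study_fk = f.study_fk AND sof.filesystem_fk = f.filesystem_fk
WHERE sof.pk IS NULL
ORDER BY f.study_fk, fs.fs_group_id
"""

missing = conn.execute(MISSING_SQL).fetchall()
print(missing)
```

In this toy data set the only pair reported is study 10 on NEARLINE_STORAGE, which is the copy the scheduled deleter cannot see.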

I tried to use the scheduleStudyForDeletion() operation (return type long) to delete the study copy on the second file system, but it returned 0, which is understandable given the above.

I do not know whether this is a bug or not; it just doesn't work the way I expected. I need to be able to delete studies from the NEARLINE storage via the scheduled deletion criteria/Delete Study Service.

Any thoughts?





fleetwoodfc

Aug 27, 2013, 9:22:15 AM
to dcm...@googlegroups.com
This is quite an involved subject and can depend upon what you consider NEARLINE storage to be. I think originally Nearline storage would be something like a CD/DVD Jukebox that is write once read many - and you cannot (easily) delete. 

If you want to remove a study and the corresponding files that are stored on a 'regular' R/W filesystem then take a look at using the dcm4chee.web ContentEdit Service operations.

leogrande

Aug 28, 2013, 8:49:50 AM
to dcm...@googlegroups.com
Thank you for your answer.

I use a mapped Amazon S3 bucket as NEARLINE storage, and it is an RW file system.
I also tried NEARLINE storage on the local file system (RW), and even created an additional archive instance with another ONLINE_STORAGE_2 (RW) and ran FileCopy to that storage, with the same result: studies that were copied by the FileCopy service cannot be deleted by the scheduled deletion procedure. I think the missing records in the study_on_fs table for the second file system are the reason for that behavior.

I tried several options for the FileCopy service setup:
1. No HSM module, just a direct copy to the destination file system
2. The HSM command module with the tar destination
...no luck...

I didn't find anything in ContentEditService that can help with my problem. I do not need to create a secondary archive (another dcm4chee box) and synchronize it with the main one.

I need to delete studies from the NEARLINE_STORAGE, or from the ONLINE_STORAGE (the FileCopy destination with the second archive instance on the same dcm4chee server), after 10 years. And it has to be a scheduled (deleter criteria) procedure.


Is it possible to implement scheduled deletion on storage that holds copies (a FileCopy service destination), or am I doing something wrong?

David Davies

Aug 28, 2013, 9:17:43 AM
to dcm...@googlegroups.com
The ContentEditService has operations that enable the deletion of a study from the system e.g. moveStudyToTrash(). The studies are now in the trash so invoking the emptyTrash() will mark the underlying files for deletion (by the "deletion of orphaned private files" process).


leogrande

Aug 28, 2013, 10:22:15 AM

I do not need to delete studies manually; I know how to do that. Studies stored on the NEARLINE storage are not orphans in terms of query/retrieve; they are normal studies. They are just invisible to the scheduled deletion process.

My question: can studies on the NEARLINE storage (or on any other FileCopy service destination, see my earlier post) be deleted by the scheduled deletion, the way it works for the ONLINE storage (the FileCopy source)?

Thank you.

leogrande

Aug 28, 2013, 7:13:43 PM
to dcm...@googlegroups.com
According to this source, the query "as used by deleter (Oracle)" is:

SELECT * FROM STUDY_ON_FS t0_sof, SERIES t2_s, FILESYSTEM t1_sof_fileSystem, STUDY t7_sof_study
WHERE ((t1_sof_fileSystem.fs_group_id = 'ONLINE_STORAGE' AND t1_sof_fileSystem.fs_status IN (0, 1)
AND t2_s.series_status = 0 AND t0_sof.filesystem_fk=t1_sof_fileSystem.pk))
AND t7_sof_study.pk=t2_s.study_fk AND t0_sof.study_fk=t7_sof_study.pk AND t0_sof.access_time < (sysdate - <NUM_DAYS>);

I believe the deleter query for NEARLINE_STORAGE has to be the same, with ONLINE_STORAGE replaced by NEARLINE_STORAGE.

But when STUDY_ON_FS is empty for that storage, the deleter of course doesn't work.
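
To illustrate the point, here is a rough sketch of the same selection with the file system group as a parameter. It runs against an in-memory SQLite database from Python, not against the real schema: the tables are cut down to the columns the quoted query touches, access_time is stored as ISO text, and the function name deletion_candidates and the cutoff value are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE filesystem (pk INTEGER PRIMARY KEY, fs_group_id TEXT, fs_status INTEGER);
CREATE TABLE study (pk INTEGER PRIMARY KEY);
CREATE TABLE series (pk INTEGER PRIMARY KEY, study_fk INTEGER, series_status INTEGER);
CREATE TABLE study_on_fs (pk INTEGER PRIMARY KEY, study_fk INTEGER,
                          filesystem_fk INTEGER, access_time TEXT);

INSERT INTO filesystem VALUES (1, 'ONLINE_STORAGE', 0), (2, 'NEARLINE_STORAGE', 0);
INSERT INTO study VALUES (10);
INSERT INTO series VALUES (20, 10, 0);
-- Only the ONLINE_STORAGE copy has a study_on_fs record.
INSERT INTO study_on_fs VALUES (1000, 10, 1, '2012-01-01 00:00:00');
""")

def deletion_candidates(conn, fs_group_id, cutoff):
    """Return pks of studies in the given group last accessed before cutoff."""
    sql = """
    SELECT DISTINCT st.pk
    FROM study_on_fs sof, series s, filesystem fs, study st
    WHERE fs.fs_group_id = ?
      AND fs.fs_status IN (0, 1)
      AND s.series_status = 0
      AND sof.filesystem_fk = fs.pk
      AND st.pk = s.study_fk
      AND sof.study_fk = st.pk
      AND sof.access_time < ?
    """
    return [row[0] for row in conn.execute(sql, (fs_group_id, cutoff))]

cutoff = "2013-01-01 00:00:00"  # stands in for sysdate - <NUM_DAYS>
online = deletion_candidates(conn, "ONLINE_STORAGE", cutoff)
nearline = deletion_candidates(conn, "NEARLINE_STORAGE", cutoff)
print(online, nearline)
```

With no NEARLINE_STORAGE record in study_on_fs, the second call has nothing to join on and returns an empty list, whatever the cutoff is.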

I do not know how STUDY_ON_FS gets updated.
I still do not know whether this is by design or whether I am doing something wrong.
I am giving up...





leogrande

Aug 29, 2013, 12:23:34 PM
to dcm...@googlegroups.com

UPDATE.

I found that studies on the NEARLINE_STORAGE appear in the STUDY_ON_FS table only after the files on the ONLINE_STORAGE have been deleted by the scheduled deletion and the study has been retrieved from the NEARLINE_STORAGE for the first time. That retrieval time becomes the access_time in the table, and it is the starting point for the deleter’s settings.


That means, at least I think so, that a study that has never been retrieved from the NEARLINE_STORAGE will never be deleted by the scheduled deletion (there are no records for this storage in the STUDY_ON_FS table), unless deletion is triggered by running out of disk space (DeleterThresholds). I do not know whether that works at all in this situation.

Let's assume that for the ONLINE_STORAGE DeleteStudyIfNotAccessedFor is set to 52w and DeleteStudyOnlyIfNotAccessedFor to 10d, and that deletion was not triggered by running out of disk space during those 52 weeks, with all constraints. 52 weeks after the first access, the study (actually the files of its instances) will be deleted. If the copy that the FileCopy service created on the NEARLINE_STORAGE is never accessed afterwards, the study will stay on the NEARLINE_STORAGE indefinitely, again unless the out-of-disk-space trigger works, and for Amazon S3 that is not an option anyway.

I had already been thinking about creating update triggers in the PostgreSQL DB to create those NEARLINE_STORAGE records in STUDY_ON_FS, but now I see that the application does it itself, although, IMO, in a questionable way.

leogrande

Sep 10, 2013, 7:52:43 PM
to dcm...@googlegroups.com
For test purposes I created an INSERT trigger, with a trigger function that populates the STUDY_ON_FS table with the NEARLINE storage records. The trigger didn't conflict with the dcm4chee inserts/updates, so now I have full control over the deletion of files on the NEARLINE storage.
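
The general idea of such a trigger can be sketched as follows. This is not the actual trigger or trigger function: it is an illustration in SQLite (run from Python), whose trigger syntax differs from PostgreSQL's, and it assumes the trigger fires on inserts into a simplified, hypothetical files table that links directly to a study. The trigger name files_to_study_on_fs is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE filesystem (pk INTEGER PRIMARY KEY, fs_group_id TEXT);
CREATE TABLE files (pk INTEGER PRIMARY KEY, study_fk INTEGER, filesystem_fk INTEGER);
CREATE TABLE study_on_fs (pk INTEGER PRIMARY KEY, study_fk INTEGER,
                          filesystem_fk INTEGER, access_time TEXT);

INSERT INTO filesystem VALUES (1, 'ONLINE_STORAGE'), (2, 'NEARLINE_STORAGE');

-- For every file written to a NEARLINE_STORAGE file system, add one
-- study_on_fs row for that study/file system pair if none exists yet.
CREATE TRIGGER files_to_study_on_fs AFTER INSERT ON files
BEGIN
    INSERT INTO study_on_fs (study_fk, filesystem_fk, access_time)
    SELECT NEW.study_fk, NEW.filesystem_fk, CURRENT_TIMESTAMP
    WHERE EXISTS (SELECT 1 FROM filesystem fs
                  WHERE fs.pk = NEW.filesystem_fk
                    AND fs.fs_group_id = 'NEARLINE_STORAGE')
      AND NOT EXISTS (SELECT 1 FROM study_on_fs sof
                      WHERE sof.study_fk = NEW.study_fk
                        AND sof.filesystem_fk = NEW.filesystem_fk);
END;

-- One file on ONLINE_STORAGE, then two copied files on NEARLINE_STORAGE.
INSERT INTO files VALUES (100, 10, 1);
INSERT INTO files VALUES (101, 10, 2);
INSERT INTO files VALUES (102, 10, 2);
""")

rows = conn.execute("SELECT study_fk, filesystem_fk FROM study_on_fs").fetchall()
print(rows)
```

In this sketch the ONLINE_STORAGE insert is ignored by the trigger, and the two NEARLINE_STORAGE files of the same study produce a single study_on_fs row, whose access_time then gives the deleter a starting point.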