nearline storage configuration help


Ajay

Jul 8, 2013, 5:44:25 AM
to dcm...@googlegroups.com
Hi,

I followed http://www.dcm4che.org/confluence/display/ee2/Using+Amazon+S3+for+NEARLINE+storage to configure nearline storage.
I am not using any HSM tools. Right now my nearline location is 'tar:/storage/nearline'.

The tar files are created and saved in tar:/storage/nearline. I set the criteria below so that a study not accessed for 1 hour is deleted, with the check scheduled every 15 minutes.

a) “group=ONLINE_STORAGE,service=FileSystemMgt”
b) Set DeleteStudyIfNotAccessedFor =  (1h)
c) Set DeleteStudyOnlyIfStorageNotCommited = false
d) Set DeleteStudyOnlyIfCopyOnMedia = false
e) Set DeleteStudyOnlyIfCopyOnReadOnlyFileSystem = false
f) Set ScheduleStudiesForDeletionInterval = (15M)
g) Set DeleteStudyOnlyIfCopyOnFileSystemOfFileSystemGroup = NEARLINE_STORAGE
h) Set DeleteStudyOnlyIfCopyArchived = true 

I checked after one hour, but the studies had not been deleted from the online storage.
I want to test the nearline functionality for any study not accessed for 1 hour, without taking disk space into account as a deletion criterion.

I can see the following lines on the server console:

14:43:15,929 WARN  [FileSystemMgt2Service] No study found for clean up filesystem group ONLINE_STORAGE! Please check your configuration!
14:43:15,929 WARN  [FileSystemMgt2Service] Trigger intervall: 15m
Trigger on SeriesStored:false
All studies not accessed for 1h.
 And studies not accessed for 1h when running out of disk space!
Deleter Criteria:
  1) External Retrievable
  2) Copy on Filesystem Group NEARLINE_STORAGE
  3) Copy must be archived

Did I make a mistake somewhere? How can I remove the condition "And studies not accessed for 1h when running out of disk space!" from my delete criteria?
Please suggest deleter criteria for studies that have not been accessed for 1 hour. (I don't want the disk space restriction to be considered.)

Thank you.








Ajay

Jul 8, 2013, 10:46:42 AM
to dcm...@googlegroups.com
Any help on this, please?

Ajay

Jul 9, 2013, 2:54:49 AM
to dcm...@googlegroups.com
The file system details are as follows:

FileSystem[pk=1, tar:/storage/nearline, groupID=NEARLINE_STORAGE, aet=DCM4CHEE, NEARLINE, RW+, userinfo=null]
FileSystem[pk=2, archive, groupID=ONLINE_STORAGE, aet=DCM4CHEE, ONLINE, RW+, userinfo=null]

Is there anything wrong with this? The files are not getting deleted with the above configuration.
Does dcm4chee consider the available storage space on the online file system when deciding whether to delete a study? If so, how can we skip this check?

Ajay

Jul 9, 2013, 3:22:08 AM
to dcm...@googlegroups.com
I am getting the following error on the server console:

12:36:39,281 INFO  [FileSystemMgt2Service] Check file system group ONLINE_STORAGE for deletion of orphaned private files
12:36:39,284 WARN  [FileSystemMgt2Service] No study found for clean up filesystem group ONLINE_STORAGE! Please check your configuration!
12:36:39,284 WARN  [FileSystemMgt2Service] Trigger intervall: 15m
Trigger on SeriesStored:false
All studies not accessed for 1h.
 And studies not accessed for 1h when running out of disk space!
Deleter Criteria:
  1) Copy on Filesystem Group NEARLINE_STORAGE
  2) Copy must be archived
12:36:39,588 INFO  [HSMCommandModule] queryStatus: mmls null/2013/7/8/12/767BCB9C/767BCB9D-2834728.tar
12:36:39,589 ERROR [HSMCommandModule] Failed to execute mmls null/2013/7/8/12/767BCB9C/767BCB9D-2834728.tar
org.dcm4chex.archive.hsm.module.HSMException: queryStatus failed!
        at org.dcm4chex.archive.hsm.module.AbstractHSMModule.doCommand(AbstractHSMModule.java:140)
        at org.dcm4chex.archive.hsm.module.HSMCommandModule.queryStatus(HSMCommandModule.java:246)
        at sun.reflect.GeneratedMethodAccessor154.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.jboss.mx.interceptor.ReflectedDispatcher.invoke(ReflectedDispatcher.java:155)
        at org.jboss.mx.server.Invocation.dispatch(Invocation.java:94)
        at org.jboss.mx.interceptor.AbstractInterceptor.invoke(AbstractInterceptor.java:133)
        at org.jboss.mx.server.Invocation.invoke(Invocation.java:88)
        at org.jboss.mx.interceptor.ModelMBeanOperationInterceptor.invoke(ModelMBeanOperationInterceptor.java:142)
        at org.jboss.mx.server.Invocation.invoke(Invocation.java:88)
        at org.jboss.mx.server.AbstractMBeanInvoker.invoke(AbstractMBeanInvoker.java:264)
        at org.jboss.mx.server.MBeanServerImpl.invoke(MBeanServerImpl.java:659)
        at org.dcm4chex.archive.hsm.SyncFileStatusService.queryHSM(SyncFileStatusService.java:519)
        at org.dcm4chex.archive.hsm.SyncFileStatusService.check(SyncFileStatusService.java:370)
        at org.dcm4chex.archive.hsm.SyncFileStatusService.check(SyncFileStatusService.java:348)
        at org.dcm4chex.archive.hsm.SyncFileStatusService$1$1.run(SyncFileStatusService.java:105)
        at java.lang.Thread.run(Thread.java:619)
Caused by: java.io.IOException: Cannot run program "mmls": CreateProcess error=2, The system cannot find the file specified
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:460)
        at java.lang.Runtime.exec(Runtime.java:593)
        at java.lang.Runtime.exec(Runtime.java:466)
        at org.dcm4che.util.Executer.<init>(Executer.java:111)
        at org.dcm4che.util.Executer.<init>(Executer.java:104)
        at org.dcm4chex.archive.hsm.module.AbstractHSMModule.doCommand(AbstractHSMModule.java:137)
        ... 17 more
Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified
        at java.lang.ProcessImpl.create(Native Method)
        at java.lang.ProcessImpl.<init>(ProcessImpl.java:81)
        at java.lang.ProcessImpl.start(ProcessImpl.java:30)
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:453)
        ... 22 more

Any suggestions as to why this error occurs?
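
The innermost cause in the trace above is `Cannot run program "mmls"`: HSMCommandModule tried to launch an external command named `mmls`, and Windows could not find an executable with that name. A minimal sketch, in Python, of checking whether a command can be resolved on the PATH of the account that runs the server. The helper name `find_hsm_command` is made up for illustration and is not part of dcm4chee, and `shutil.which` only approximates the lookup that the JVM performs on Windows.

```python
import shutil


def find_hsm_command(name="mmls"):
    """Return the full path of `name` if it is found on the PATH, else None.

    Hypothetical helper: it only mirrors the "can this program be started?"
    question raised by the stack trace, it does not talk to dcm4chee.
    """
    return shutil.which(name)


if __name__ == "__main__":
    path = find_hsm_command("mmls")
    if path is None:
        print("mmls is not on the PATH, so a command-based HSM module cannot run it")
    else:
        print(f"mmls found at {path}")
```

If the command is not found, the command-based HSM module has nothing to execute, which matches the `CreateProcess error=2` message in the log.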

Aaron Boxer

Jul 9, 2013, 10:50:31 PM
to dcm...@googlegroups.com


--
You received this message because you are subscribed to the Google Groups "dcm4che" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dcm4che+u...@googlegroups.com.
To post to this group, send email to dcm...@googlegroups.com.
Visit this group at http://groups.google.com/group/dcm4che.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

Ajay

Jul 10, 2013, 6:39:26 AM
to dcm...@googlegroups.com
Thanks Aaron for the link.

So, can I conclude that to use nearline storage I need to have HSM software installed on my Windows machine?

Can someone suggest HSM software/tools for Windows 7 or Windows Server 2008? Are there any free HSM tools that support the dcm4chee FileCopy and TarRetriever functions?

Thanks,
Ajay


leogrande

Jul 18, 2013, 1:52:27 PM
to dcm...@googlegroups.com
Have you followed "Using Amazon S3 for NEARLINE storage"?

From that article: "...Version Info
FYI - This feature is only available by building the code from SVN right now. http://dcm4che.svn.sourceforge.net/viewvc/dcm4che/dcm4chee/sandbox/dcm4chee-hsm-cloud/.."

Do you have that Amazon S3 HSM Module in the list of services in the JMX console?

Andres Castiblanco

Feb 12, 2016, 10:04:22 AM
to dcm4che
Hi, I don't get it. How do I build the code? Just copy and paste?

Thanks in advance.

matthe...@netscape.net

Feb 12, 2016, 11:22:46 AM
to dcm...@googlegroups.com

Use version 2.18.3.

You don't need to build anything.


Are you on Windows? I found this through a Google search: http://forums.dcm4che.org/jiveforums/thread.jspa?threadID=3401


In this section we will configure the various services and get an understanding of what they do.

  1. Add a file system to the NEARLINE storage group

    1. Open a web browser, and navigate to the JMX console. e.g. http://localhost:8080/jmx-console

    2. Locate the “dcm4chee.archive” section, and click on “group=NEARLINE_STORAGE,service=FileSystemMgt”

    3. Scroll down to the “List of MBean Operations” section and find the “addRWFileSystem()” operation. Enter in a path for nearline storage. This will not actually be used, since we are going to store the files in S3, but we need to configure something here so that the system knows we are using the NEARLINE storage. I have entered: “tar:/storage/nearline”. Note the tar prefix. This tells dcm4chee that all of the files going to this storage group will be tarred up.

    4. Click Invoke to add the file system record to the database.

  2. Configure the FileCopy service
    The FileCopy service is responsible for physically copying files to your nearline storage. This is where we configure our particular plugin.

    1. Specify a value for the DestinationFileSystem. This value should equal the value you specified for your nearline storage file system so that dcm4chee knows that this FileCopy service is associated with that file system configuration. e.g. “tar:/storage/nearline”

    2. Specify a value for HSMModulServicename. This should be the JMX ObjectName of our S3 plugin module, and enables it for use within this service when storing and retrieving files. Enter: “dcm4chee.archive:service=FileCopyHSMModule,type=S3”

    3. Leave the FileStatus set to TO_ARCHIVE. This will be the status of files stored in S3. When the SyncFileStatus service runs and verifies that these files are stored properly, it will change the status to ARCHIVED.

    4. Click Apply Changes.

  3. Configure the TarRetriever service
    This service is responsible for fetching and extracting tar files from the nearline storage during retrieve requests.

    1. Specify a value for HSMModulServicename. This should be the JMX ObjectName of our S3 plugin module, and enables it for use within this service when storing and retrieving files. Enter: “dcm4chee.archive:service=FileCopyHSMModule,type=S3”

    2. Click Apply Changes.

  4. Configure the S3 HSMModule (service=FileCopyHSMModule,type=S3)
    Now we are ready to configure the S3 integration. The main things to configure are:

    1. Amazon S3 bucket name

    2. Amazon AWS Access Key

    3. Amazon AWS Secret Key (this is write only, and you will not see a value after clicking Apply Changes)

    4. The Outgoing and Incoming directories in this configuration are temporary storage areas that are used for tarring and untarring files.

    5. Click Apply Changes.

  5. Configure the SyncFileStatus service
    This service will run periodically and verify the files that have been stored to S3. It will fetch the tar files and ensure that the correct files are contained within. Once it verifies the files, it will update the file status in the database to ARCHIVED.

    1. Specify a value for the MonitoredFileSystem. This value should equal the value you specified for your nearline storage file system so that dcm4chee knows that this service is associated with that file system configuration. e.g. “tar:/storage/nearline”

    2. Specify a value for HSMModulServicename. This should be the JMX ObjectName of our S3 plugin module, and enables it for use within this service when fetching tar files. Enter: “dcm4chee.archive:service=FileCopyHSMModule,type=S3”

    3. Specify a TaskInterval. It is set to NEVER by default, so you should set it to an interval that suits your workflow, preferably one that avoids peak business hours.
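
The steps above tie three services to the same file system and the same HSM module. A minimal sketch, in Python, that writes those values down in one place and checks that they agree. The attribute names and values are the ones given in the steps; the dictionary layout and the `check_consistency` function are purely illustrative and do not read from or write to the JMX console.

```python
# Values taken from the steps above.
NEARLINE_FS = "tar:/storage/nearline"
S3_MODULE = "dcm4chee.archive:service=FileCopyHSMModule,type=S3"

# Hypothetical summary of what each service should be set to.
settings = {
    "FileCopy": {
        "DestinationFileSystem": NEARLINE_FS,
        "HSMModulServicename": S3_MODULE,
        "FileStatus": "TO_ARCHIVE",
    },
    "TarRetriever": {
        "HSMModulServicename": S3_MODULE,
    },
    "SyncFileStatus": {
        "MonitoredFileSystem": NEARLINE_FS,
        "HSMModulServicename": S3_MODULE,
    },
}


def check_consistency(cfg):
    """Return a list of human-readable mismatches (empty if none)."""
    problems = []
    modules = {svc: values.get("HSMModulServicename") for svc, values in cfg.items()}
    if len(set(modules.values())) != 1:
        problems.append(f"HSM module differs between services: {modules}")
    if cfg["FileCopy"]["DestinationFileSystem"] != cfg["SyncFileStatus"]["MonitoredFileSystem"]:
        problems.append("FileCopy and SyncFileStatus point at different file systems")
    return problems


for problem in check_consistency(settings):
    print(problem)
```

With the values from the steps, the check prints nothing. A mismatch such as one service still pointing at a different HSM module would be reported as a single line.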

Summary

At this point you should be able to store DICOM objects to dcm4chee and it will archive them to Amazon S3. The S3 key will be a hierarchical path, which should look familiar to you if you have looked at how dcm4chee stores objects on a file system. For example, here is a screenshot of my Amazon Management Console showing the archived path:



In the database, the files should have a changed file status, and should reflect their tar path as shown in this screenshot:



If you can’t read it in the picture, the filepaths look like this: 2011/8/3/16/745ABFED/CF024730-323397.tar!CF024730/000004A0

Note the “tar” designator in the path. This tells the system that the file is contained within a tar file.
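
To make the path format concrete, here is a small Python sketch that splits such a file path into the tar file and the entry inside it. It assumes, based only on the example path above, that the `!` character separates the two parts; `split_tar_path` is a made-up helper, not dcm4chee's own parser.

```python
def split_tar_path(filepath):
    """Split 'some/archive.tar!entry/name' into (tar_path, entry).

    Assumption: '!' separates the tar file from the entry inside it.
    Returns (filepath, None) when the path contains no '!'.
    """
    tar_path, separator, entry = filepath.partition("!")
    if not separator:
        return filepath, None
    return tar_path, entry


tar_path, entry = split_tar_path(
    "2011/8/3/16/745ABFED/CF024730-323397.tar!CF024730/000004A0"
)
print(tar_path)  # 2011/8/3/16/745ABFED/CF024730-323397.tar
print(entry)     # CF024730/000004A0
```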

Setting up Retention Rules to Remove Studies from Online

We probably don’t want two copies of the study forever, so let’s set up some rules now so that studies are deleted from the ONLINE storage group after a period of time. This will leave the remaining copy on S3.

Note that this is only an example. Your retention/deletion requirements may differ!

  1. Configure deletion of ONLINE studies

    1. Open a web browser, and navigate to the JMX console. e.g. http://localhost:8080/jmx-console

    2. Locate the “dcm4chee.archive” section, and click on “group=ONLINE_STORAGE,service=FileSystemMgt”

    3. Set DeleteStudyIfNotAccessedFor = your retention period (52w or whatever your SLA requires)

    4. Set DeleteStudyOnlyIfStorageNotCommited = false

    5. Set DeleteStudyOnlyIfCopyOnMedia = false

    6. Set DeleteStudyOnlyIfCopyOnReadOnlyFileSystem = false

    7. Set ScheduleStudiesForDeletionInterval = a reasonable time interval for the system to check the database and schedule deletion jobs.

    8. Set DeleteStudyOnlyIfCopyOnFileSystemOfFileSystemGroup = NEARLINE_STORAGE

    9. Set DeleteStudyOnlyIfCopyArchived = true (only delete studies that have been verified by the SyncFileStatus service. If you don’t care about that or are not running that service, you can set this false.)

    10. Click Apply Changes

At this point, dcm4chee will look for studies in ONLINE that meet these criteria and schedule them for deletion. After they are deleted, and the only copy is on S3, a retrieve request will trigger a fetch from Amazon. The tar file(s) will be fetched, images extracted and sent to the destination.
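
The way these criteria combine can be sketched in Python. This is a simplified model under stated assumptions, not dcm4chee's actual logic, which runs inside the server against its database: the `Study` class, its fields, and `parse_interval` are hypothetical, and the interval parser only handles the shapes seen in this thread (`15m`, `1h`, `52w`).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumption: only the unit suffixes that appear in this thread.
_UNITS = {"m": "minutes", "h": "hours", "w": "weeks"}


def parse_interval(text):
    """Turn strings such as '15m', '1h' or '52w' into a timedelta."""
    value, unit = int(text[:-1]), text[-1].lower()
    return timedelta(**{_UNITS[unit]: value})


@dataclass
class Study:
    """Hypothetical stand-in for what the archive knows about a study."""
    last_accessed: datetime
    copy_groups: frozenset  # file system groups that hold a copy
    copy_archived: bool     # True once the copy has status ARCHIVED


def eligible_for_deletion(study, now, not_accessed_for="1h",
                          required_copy_group="NEARLINE_STORAGE",
                          only_if_copy_archived=True):
    """Apply the three criteria from the steps above, in order."""
    if now - study.last_accessed < parse_interval(not_accessed_for):
        return False  # accessed too recently
    if required_copy_group not in study.copy_groups:
        return False  # no copy on the required file system group
    if only_if_copy_archived and not study.copy_archived:
        return False  # copy exists but has not been verified yet
    return True


now = datetime(2013, 7, 8, 16, 0)
study = Study(
    last_accessed=now - timedelta(hours=2),
    copy_groups=frozenset({"NEARLINE_STORAGE"}),
    copy_archived=False,  # copy still TO_ARCHIVE
)
print(eligible_for_deletion(study, now))                               # False
print(eligible_for_deletion(study, now, only_if_copy_archived=False))  # True
```

In this model, a study whose nearline copy is still TO_ARCHIVE is never selected while DeleteStudyOnlyIfCopyArchived is true, however long ago it was last accessed. That is one possible reason for a "No study found for clean up" message when the SyncFileStatus service cannot complete its check.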

That’s it!

Alvaro G. [Andor]

Feb 12, 2016, 11:28:05 AM
to dcm...@googlegroups.com
I'm not really sure, because the post lacks detail, but I think what you are describing below is not a direct connection to S3 but rather:
  • Install Castor (which, AFAIK, is no longer called "Castor", and which I think requires a license and an additional server, doesn't it?)
  • Configure an external storage module for Castor that connects to S3
  • Connect dcm4chee to Castor
Is that right?

On 12/02/16 at 10:16, matthe...@netscape.net wrote:
Use version 2.18.3

No need to build the code.

In this section we will configure the various services and get an understanding of what they do.
1. Add a file system to the NEARLINE storage group
1. Open a web browser, and navigate to the JMX console. e.g. http://localhost:8080/jmx-console
2. Locate the “dcm4chee.archive” section, and click on “group=NEARLINE_STORAGE,service=FileSystemMgt”
3. Scroll down to the “List of MBean Operations” section and find the “addRWFileSystem()” operation. Enter in a path for nearline storage. This will not actually be used, since we are going to store the files in S3, but we need to configure something here so that the system knows we are using the NEARLINE storage. I have entered: “tar:/storage/nearline”. Note the tar prefix. This tells dcm4chee that all of the files going to this storage group will be tarred up.
4. Click Invoke to add the file system record to the database.
2. Configure the FileCopy service
The FileCopy service is responsible for physically copying files to your nearline storage. This is where we configure our particular plugin.
1. Specify a value for the DestinationFileSystem. This value should equal the value you specified for your nearline storage file system so that dcm4chee knows that this FileCopy service is associated with that file system configuration. e.g. “tar:/storage/nearline”
2. Specify a value for HSMModulServicename. This should be the JMX ObjectName of our S3 plugin module, and enables it for use within this service when storing and retrieving files. Enter: “dcm4chee.archive:service=FileCopyHSMModule,type=CAStor ”
3. Leave the FileStatus set to TO_ARCHIVE. This will be the status of files stored in S3. When the SyncFileStatus service runs and verifies that these files are stored properly, it will change the status to ARCHIVED.
4. Click Apply Changes.
3. Configure the TarRetriever service
This service is responsible for fetching and extracting tar files from the nearline storage during retrieve requests.
1. Specify a value for HSMModulServicename. This should be the JMX ObjectName of our S3 plugin module, and enables it for use within this service when storing and retrieving files. Enter: “dcm4chee.archive:service=FileCopyHSMModule,type=CAStor”
2. Click Apply Changes.
4. Configure the CAStor HSMModule (service=FileCopyHSMModule,type=CAStor)
Here are the main things to configure here:
1. IP address of storage node
2. How long to store image (in weeks, 52000 is 100 years)


5. Configure the SyncFileStatus service
This service will run periodically and verify the files that have been stored to S3. It will fetch the tar files and ensure that the correct files are contained within. Once it verifies the files, it will update the file status in the database to ARCHIVED.
1. Specify a value for the MonitoredFileSystem. This value should equal the value you specified for your nearline storage file system so that dcm4chee knows that this service is associated with that file system configuration. e.g. “tar:/storage/nearline”
2. Specify a value for HSMModulServicename. This should be the JMX ObjectName of our CAStor plugin module, and enables it for use within this service when fetching tar files. Enter: “dcm4chee.archive:service=FileCopyHSMModule,type=CAStor”
3. Specify a TaskInterval. It is set to NEVER by default, so you should set it to an interval that suits your workflow, preferably one that avoids peak business hours.
Summary
At this point you should be able to store DICOM objects to dcm4chee and it will archive them to Amazon S3. The S3 key will be a hierarchical path, which should look familiar to you if you have looked at how dcm4chee stores objects on a file system. For example, here is a screenshot of my Amazon Management Console showing the archived path:

In the database, the files should have a changed file status, and should reflect their tar path as shown in this screenshot:

If you can’t read it in the picture, the filepaths look like this: 2011/8/3/16/745ABFED/CF024730-323397.tar!CF024730/000004A0
Note the “tar” designator in the path. This tells the system that the file is contained within a tar file.
Setting up Retention Rules to Remove Studies from Online
We probably don’t want two copies of the study forever, so let’s set up some rules now so that studies are deleted from the ONLINE storage group after a period of time. This will leave the remaining copy on CAStor.
Note that this is only an example. Your retention/deletion requirements may differ!
1. Configure deletion of ONLINE studies
1. Open a web browser, and navigate to the JMX console. e.g. http://localhost:8080/jmx-console
2. Locate the “dcm4chee.archive” section, and click on “group=ONLINE_STORAGE,service=FileSystemMgt”
3. Set DeleteStudyIfNotAccessedFor = your retention period (52w or whatever your SLA requires)
4. Set DeleteStudyOnlyIfStorageNotCommited = false
5. Set DeleteStudyOnlyIfCopyOnMedia = false
6. Set DeleteStudyOnlyIfCopyOnReadOnlyFileSystem = false
7. Set ScheduleStudiesForDeletionInterval = a reasonable time interval for the system to check the database and schedule deletion jobs.
8. Set DeleteStudyOnlyIfCopyOnFileSystemOfFileSystemGroup = NEARLINE_STORAGE
9. Set DeleteStudyOnlyIfCopyArchived = true (only delete studies that have been verified by the SyncFileStatus service. If you don’t care about that or are not running that service, you can set this false.)
10. Click Apply Changes
At this point, dcm4chee will look for studies in ONLINE that meet these criteria and schedule them for deletion. After they are deleted, and the only copy is on S3, a retrieve request will trigger a fetch from Amazon. The tar file(s) will be fetched, images extracted and sent to the destination.
That’s it!




Added by Damien Evans, last edited by Jan Pechanec on May 04, 2011

On Friday, February 12, 2016 at 7:04:22 AM UTC-8, Andres Castiblanco wrote:

matthe...@netscape.net

Feb 12, 2016, 11:31:42 AM
to dcm...@googlegroups.com
I edited that post to give the S3 instructions. Sorry, I posted my notes first.

Please delete the Castor instructions so they don't confuse anyone.

See my edited post.

Thanks



Rady

Jun 6, 2017, 10:21:08 AM
to dcm4che
Dear dcm4chee users, 

I am trying to set up AWS S3 as our NEARLINE storage location. We tried to follow the steps stated in this link; the same steps are also suggested in this thread. We are using dcm4chee 2.18.3 (MySQL edition) from SourceForge. All went fine until we reached "Configure the S3 HSMModule (service=FileCopyHSMModule,type=S3)": we could not locate this service in the JMX console. Please let us know if we are missing something.

with regards
Rady