Using Amazon S3 for NEARLINE storage - dcm4chee-2.x
However, we are not too clear on how exactly to build dcm4chee from SVN. Can anyone please provide step-by-step
instructions? Is there any way to simply add an "S3 module" to an existing dcm4chee archive?
Or has anyone got a better way of setting up nearline storage to archive to S3 or Amazon Glacier?
Thank you.
You can definitely get nearline archiving working on an existing DCM4CHEE archive, but it’s a bit of a challenge. Here’s a digest of my notes:
1. There is no binary available, so you need to check out the source and build it yourself. The URL is: https://svn.code.sf.net/p/dcm4che/svn/dcm4chee/sandbox/dcm4chee-hsm-cloud/
2. To get the source and some of the dependencies, install an SVN client such as TortoiseSVN for Windows.
3. Use the SVN client to check out the project from the above URL. With TortoiseSVN it's just: right-click a folder in Windows Explorer, choose SVN Checkout…, paste in the URL, and click OK.
4. There's no build script, and you need to modify the source code anyway (see step 6), so import the project into your IDE as a standard Java project.
5. The following dependencies are needed to build the project: aws-java-sdk-1.2.5.jar, commons-io-1.3.1.jar, dcm4che.jar, dcm4chee-ejb-client.jar, dcm4chee.jar, httpclient-4.2.3.jar, httpcore-4.2.jar, jboss-common.jar, jboss-j2ee.jar, jboss-jmx.jar, jboss-system.jar, log4j-1.2.16.jar. Many of these can be found under the lib directories of your DCM4CHEE installation. Add them to the Java project's build path.
6. Now you can build your binary, but first you need to fix an issue reported by Jonathan Morra (https://groups.google.com/forum/#!topic/dcm4che/rXbDH4QlJM8): add a call to fetchHSMFileFinished(fsID, filePath, file) at the end of the storeHSMFile method (a sketch of the change appears near the end of this post).
7. Also note carefully from that post that when you deploy the service as described in the instructions, it registers itself as
dcm4chee.archive:service=FileCopyHSMModule,type=CLOUD
-NOT-
dcm4chee.archive:service=FileCopyHSMModule,type=S3
Anywhere the instructions tell you to enter type=S3, use type=CLOUD instead.
8. Now you can finish following the installation instructions.
9. Depending on your deployment, you may want to change the following directory locations so the archive cannot fill the system drive and crash your server:
In the TarRetriever service, set CacheRoot and CacheJournalRootDirectory to a non-system mount point.
For the same reason, in the FileCopyHSMModule service (type=CLOUD), set OutgoingDirectory and IncomingDirectory to a non-system mount point.
10. Some of the jar dependencies above are also needed at runtime, so to make the HSM cloud module work, copy the following into the deployment lib at /dcm4chee-root/server/default/lib: aws-java-sdk-1.2.5.jar, httpclient-4.2.3.jar, httpcore-4.2.jar
11. Once you've set this up, keep in mind that if a study is in both online and nearline storage, the web interface will report only ONLINE availability. You can verify in the files DB table that the study is stored in both locations. To manually move files to S3 as a test, do the following (a scripted version of steps b and e is sketched just after this list):
a. Log in to the JMX console and bring up the FileCopy service
b. Use copyFilesOfStudy() to copy a study, identified by its Study Instance UID, to S3
c. Verify with the AWS system admin that the files made it to the AWS S3 bucket
d. Run a query on the PACS DB to verify that each file of the study now has a record assigned to the new file system and has the keyword 'tar' in the file path
e. Bring up the FileSystemMgt (ONLINE_STORAGE group) and use the scheduleStudyForDeletion() method to schedule the study for deletion from the online group.
f. Upon logging in to the web admin interface, you should now see that the availability of the test study is NEARLINE rather than ONLINE.
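If you want to script this test (steps b and e), and the directory changes from step 9, rather than clicking through the JMX console, something along these lines should work against the JBoss 4.x JMX invoker that dcm4chee 2.x runs on. Treat it as a sketch: the object names follow the dcm4chee.archive:service=... pattern used above, but the operation signatures for copyFilesOfStudy and scheduleStudyForDeletion vary between dcm4chee versions and the directory value is only an example, so check the JMX console for the real parameter lists first. You'll need jbossall-client.jar from the JBoss client directory on the classpath.

import java.util.Properties;
import javax.management.Attribute;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.naming.InitialContext;

public class NearlineSmokeTest {
    public static void main(String[] args) throws Exception {
        // Connect to the JNDI/JMX invoker of the dcm4chee (JBoss 4.x) server.
        Properties env = new Properties();
        env.put("java.naming.factory.initial", "org.jnp.interfaces.NamingContextFactory");
        env.put("java.naming.provider.url", "jnp://localhost:1099");
        InitialContext ctx = new InitialContext(env);
        // In JBoss 4.x the RMI adaptor should implement MBeanServerConnection.
        MBeanServerConnection server =
                (MBeanServerConnection) ctx.lookup("jmx/invoker/RMIAdaptor");

        String studyIUID = "1.2.3.4.5";  // replace with a real Study Instance UID

        // Step 9 (optional): point the tar cache at a non-system mount point (example value).
        server.setAttribute(new ObjectName("dcm4chee.archive:service=TarRetriever"),
                new Attribute("CacheRoot", "/data/dcm4chee/tar-cache"));

        // Step 11 b: copy one study to the S3-backed nearline file system.
        server.invoke(new ObjectName("dcm4chee.archive:service=FileCopy"),
                "copyFilesOfStudy",
                new Object[] { studyIUID },
                new String[] { String.class.getName() });

        // Step 11 e: schedule the study for deletion from the ONLINE_STORAGE group.
        server.invoke(new ObjectName("dcm4chee.archive:service=FileSystemMgt,group=ONLINE_STORAGE"),
                "scheduleStudyForDeletion",
                new Object[] { studyIUID },
                new String[] { String.class.getName() });
    }
}

After running it, carry on with steps c, d and f to verify the copy and the NEARLINE availability.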
That's it. You should now have the ability to schedule studies for copy to S3 and delete from online storage.
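One more note on the fix in step 6: here is roughly what the change looks like. This is only a sketch; apart from the added fetchHSMFileFinished(fsID, filePath, file) call described in Jonathan Morra's post, the method signature, return type and helper below are placeholders, so match them to the actual storeHSMFile in your checkout.

// Sketch only: everything except the added fetchHSMFileFinished(...) call is a placeholder.
public String storeHSMFile(File file, String fsID, String filePath) throws HSMException {
    // ... existing sandbox code that uploads the tar file to the S3 bucket ...
    String destPath = uploadToBucket(file, fsID, filePath);  // placeholder for the existing upload logic
    fetchHSMFileFinished(fsID, filePath, file);               // <-- the line to add (the fix)
    return destPath;
}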
For reference, the related file system service attributes (from the service's JMX attribute table):
| Attribute | Type | Access | Description |
| DefaultAvailability | java.lang.String | RW | Default Availability associated with new file systems added by the addRWFileSystem operation. Enumerated values: "ONLINE", "NEARLINE", "OFFLINE", "UNAVAILABLE". |
| DefaultUserInformation | java.lang.String | RW | Default User Information associated with new file systems added by the addRWFileSystem operation. |
| DefaultStorageDirectory | java.lang.String | RW | Default Storage Directory, used on receipt of the first object if no Storage File System was explicitly configured by the addRWFileSystem operation. A relative path name is resolved relative to <archive-install-directory>/server/default/. Use "NONE" to disable auto-configuration and force failure to receive objects if no Storage File System has been explicitly configured. |
Hi all, are there any other nearline storage options that we can use?