How to get study size in MB or GB in dcm4chee-2.17

Danny Kim

May 29, 2018, 5:06:15 PM5/29/18
to dcm4che
Hello

I am currently running an old dcm4chee (version 2.17) on an Ubuntu system.

I would like to know how I might get the size of a particular study from dcm4chee.

For example, I would like to know how big each of SUBJECT1's studies was:

SUBJECT1_scan1 = 2.17 GB
SUBJECT1_scan2 = 1.8 GB
SUBJECT1_scan3 = 2.18 GB

SUBJECT2_scan1 = 1.75 GB
SUBJECT2_scan2 = 1.45 GB

and so on.

Is there an easy way to accomplish this?

Thanks in advance dear experts,


Danny Kim

Jon Ander Zuccaro

May 29, 2018, 6:31:16 PM5/29/18
to dcm4che
You don't specify what tools/scripts you are willing to use.

If you are willing to run a simple query against the pacsdb database you could try something like this:

This is easier if you don't have more than one configured file system. Query the files table, using instance_fk to get only the files from instances that belong to a particular study (you'll need to join instance -> series -> study), and use the Study Instance UID to uniquely identify the study.

This will basically give you a list of all the files that belong to a study; feed that output to something that calculates the size of each file and then the total sum.
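That "feed the output to something" step can be sketched in shell. This is a minimal, self-contained illustration: `filelist.txt` and the demo files are made-up stand-ins for the real query output and DICOM files, so adapt the names and paths to your setup.

```shell
# Sketch of the size-summing step, assuming the query output was saved
# one path per line to filelist.txt (a hypothetical name). The two demo
# files below stand in for real DICOM objects so this runs as-is.
mkdir -p /tmp/demo_archive
printf 'aaaa'   > /tmp/demo_archive/img1.dcm   # 4 bytes
printf 'bbbbbb' > /tmp/demo_archive/img2.dcm   # 6 bytes
printf '%s\n' /tmp/demo_archive/img1.dcm /tmp/demo_archive/img2.dcm > /tmp/filelist.txt

# stat -c%s prints each file's size in bytes; awk totals the column.
xargs -a /tmp/filelist.txt stat -c%s | awk '{s+=$1} END {print s " bytes"}'
```

With the two demo files this prints `10 bytes`; against real output from the query it would print the study's total size.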

Example:

select filepath
from files, instance, series, study
where instance.pk = files.instance_fk
  and series.pk = instance.series_fk
  and study.pk = series.study_fk
  and study.study_iuid = '1.2.826.0.1.3680043.9.67.1499365157610.33333';

Remember, this only works if you have a single file system. If you have more than one, you'll need to take the filesystem table into account, because the full file path is going to be filesystem.dirpath + files.filepath. The same applies if you are using NEARLINE storage, since the same study could be duplicated in both storages.
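For the multi-file-system case, a sketch of the same query with the filesystem table joined in might look like the following. The filesystem_fk column name is an assumption based on the dirpath + filepath relationship described above, so verify it against your own schema first.

```sql
-- Sketch: prepend each file system's root to the relative file path.
-- Assumes files has a filesystem_fk column pointing at filesystem.pk;
-- check your schema before relying on this.
select concat(filesystem.dirpath, '/', files.filepath) as full_path
from files, filesystem, instance, series, study
where filesystem.pk = files.filesystem_fk
  and instance.pk = files.instance_fk
  and series.pk = instance.series_fk
  and study.pk = series.study_fk
  and study.study_iuid = '1.2.826.0.1.3680043.9.67.1499365157610.33333';
```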

You could always simply take a look at the path where the files are stored from within the GUI (See attachment)

All instances of a study are generally stored in the same folder. They could be spread among many folders if additional series were sent to the PACS at different times, but if you identify a single folder you can simply check its size.
Screenshot from 2018-05-29 18-27-33.png

Docjay

May 30, 2018, 12:14:04 PM5/30/18
to dcm4che
I'm running mysql as my database - -

This may not be very helpful, but under 'Dashboard - Reports' I have this report that I added with this mysql code:

select sum(f.file_size) / 1073741824.0 as "GB Used Today"
from files as f
where f.created_time >= curdate();

It adds up the files stored with today's date and gives you the total as 'GB Used Today'.

Danny Kim

May 30, 2018, 1:24:16 PM5/30/18
to dcm4che
Hi Jon

Thank you for your advice.

Because I'm a beginner at MySQL, I just wanted to confirm that you run that query after connecting to the pacsdb with mysql, like this:

mysql -upacs -ppacs pacsdb

Best,

Danny

Jon Ander Zuccaro

May 30, 2018, 2:03:05 PM5/30/18
to dcm4che
Yes, exactly. But try to avoid writing the password directly into the command (with just -p and no password after it, mysql will prompt for one), and it is probably a good idea to eventually change the default "pacs" password for security reasons if this is a production server.



Good luck

Danny Kim

May 30, 2018, 2:40:16 PM5/30/18
to dcm4che
Thanks Jon and Docjay

I connected to mysql and ran a command that is something like this:
select sum(file_size)
from files, instance, series, study, patient
where instance.pk = files.instance_fk
  and series.pk = instance.series_fk
  and study.pk = series.study_fk
  and patient.pk = study.patient_fk
  and pat_id like 'SUBJECT_1_%';

It appears to be getting the file size fairly quickly.
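Since the original question asked for per-study sizes rather than one total per patient, the same joins can also be grouped by study. This is just a sketch built from the tables already used in the thread:

```sql
-- Sketch: one row per study, size in GB (1 GB = 1073741824 bytes here).
select study.study_iuid,
       sum(files.file_size) / 1073741824.0 as "GB"
from files, instance, series, study, patient
where instance.pk = files.instance_fk
  and series.pk = instance.series_fk
  and study.pk = series.study_fk
  and patient.pk = study.patient_fk
  and pat_id like 'SUBJECT_1_%'
group by study.study_iuid;
```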

Thanks very much!

Danny

Danny Kim

May 30, 2018, 5:05:39 PM5/30/18
to dcm4che
Hi Jon or Docjay

I just had one more question:
when I query the file_size in mysql with just the wildcard character "%", I get the result below:
mysql> select sum(file_size) from files,instance,series,study,patient where instance.pk = files.instance_fk and series.pk = instance.series_fk and study.pk = series.study_fk and patient.pk = study.patient_fk and pat_id LIKE "%";
+----------------+
| sum(file_size) |
+----------------+
|  2649418880718 |
+----------------+
1 row in set (4 min 22.82 sec)

That comes out to about
2467.463613 GB

But on the dashboard in dcm4chee-web3, the used for online_storage says it's at 3765.23GB

That's about 1 TB of unaccounted-for data. Is there anything that gets missed when I'm querying the mysql db?

Best,

Danny

Jon Ander Zuccaro

May 30, 2018, 5:29:58 PM5/30/18
to dcm4che
Check the size of your "archive" folder and see if it matches the total sum of all the file sizes.
I could be mistaken, but I believe the dashboard shows disk usage as in free space and used space for the entire disk where the file system resides.
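One quick way to do that comparison is du on the archive directory. The sketch below runs on a scratch directory so it is self-contained; in practice you would point du at your file system's dirpath (the path shown in the GUI), which varies by install.

```shell
# Demo on a scratch directory; in practice point du at your archive's
# dirpath. -s prints only a summary line; -b reports apparent size in
# bytes, which matches what sum(file_size) counts.
mkdir -p /tmp/du_demo
printf '12345' > /tmp/du_demo/a.dcm   # 5-byte stand-in for a DICOM file
du -sb /tmp/du_demo | cut -f1
```

Note that du also counts the directory entries themselves, so expect the result to be slightly larger than the database total even when everything is accounted for.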

Docjay

May 30, 2018, 5:50:31 PM5/30/18
to dcm4che
I would also write the query so that the result is displayed in GB:

select sum(file_size) / 1073741824.0
from files, instance, series, study, patient
where instance.pk = files.instance_fk
  and series.pk = instance.series_fk
  and study.pk = series.study_fk
  and patient.pk = study.patient_fk
  and pat_id like '%';

