Character set support while creating DICOM File


harithacha...@gmail.com

Apr 14, 2022, 12:54:08 PM
to dcm4che
Hi,

I am trying to convert a PDF to DICOM with the Japanese patient name アーミア using the dcm4che pdf2dcm binary. I have also added the tag 00080005 (Specific Character Set) with the value ISO-IR 13.

But on viewing the DICOM file, the patient name is displayed as ????

Please let me know what can be done to resolve the issue.

Thanks,
Haritha C Mouli

Vrinda Nayak

Apr 15, 2022, 3:20:29 AM
to dcm4che
It depends on how you encoded these characters and on the character set you used while converting the PDF to DICOM. I used the equivalent of the name in your example (アーミア), but encoded with the characters available in JIS X 0201, the single-byte character set of the Shift JIS table for Katakana.
pdf2dcm-JapaneseName.png

harithacha...@gmail.com

Apr 18, 2022, 12:59:10 AM
to dcm4che
Hi Vrinda,

Thanks for your quick response. I tried the same with the dcm4che 5.22.6 binary. Please find attached images of the command and of the dcmdump output for the converted DICOM file.

I also used an XML file that contains all the tag values. Please refer to IWATTPDF.xml for the same.

Could you please let me know where I am going wrong?

Thanks a lot in advance.
IWATTPDF.xml
pdf2dcm-Japanese name.png
pdf2dcm command.png

harithacha...@gmail.com

Apr 18, 2022, 7:05:26 AM
to dcm4che
Also, could you please elaborate on this point:

 I used the same name equivalent (アーミア) as in your example, but as encoded and available in JIS X 0201 - Single-byte character set of Shift JIS table for Katakana

How did you encode it and then use it in the command?

Vrinda Nayak

Apr 19, 2022, 8:01:30 AM
to dcm4che
アーミア are full-width characters. If you try half-width characters for the same Katakana, e.g. ｱｰﾐｱ, you will see the Patient Name correctly encoded in your converted DICOM object. This is because only half width characters of JIS X 0201 are supported in UTF-8

Vrinda Nayak

Apr 19, 2022, 8:24:17 AM
to dcm4che
I need to correct my previous statement:
- This is because only half width characters of JIS X 0201 are supported in UTF-8

The correct reason is that only half-width Katakana characters are part of JIS X 0201, as specified in https://en.wikipedia.org/wiki/JIS_X_0201. Full-width Katakana are not in that character set.
JIS_X_0201-Katakana.png
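
For readers who want to reproduce the full-width versus half-width distinction outside of pdf2dcm, here is a minimal Python sketch. It is not part of dcm4che; the helper names `to_halfwidth` and `encode_jis_x0201` are hypothetical and introduced only for illustration. It assumes the usual JIS X 0201 layout, in which the half-width Katakana U+FF61..U+FF9F sit at bytes 0xA1..0xDF, and it treats 7-bit characters as plain ASCII (strictly, JIS X 0201 Roman replaces backslash and tilde with yen and overline).

```python
import unicodedata

# Unicode half-width Katakana block; JIS X 0201 maps it to bytes 0xA1..0xDF,
# which is a fixed offset of 0xFEC0 from the code point.
HALF_FIRST, HALF_LAST = 0xFF61, 0xFF9F

# Full-width -> half-width table, derived from the Unicode compatibility
# decomposition of each half-width character (e.g. U+FF71 -> U+30A2).
FULL_TO_HALF = {
    unicodedata.normalize("NFKD", chr(cp)): chr(cp)
    for cp in range(HALF_FIRST, HALF_LAST + 1)
}


def to_halfwidth(text: str) -> str:
    """Replace full-width Katakana with half-width equivalents (hypothetical helper).

    NFD first splits voiced kana such as U+30AC into base + combining mark,
    so each part can be mapped separately. Note that NFD also decomposes
    other precomposed characters; this is a sketch, not a general converter.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(FULL_TO_HALF.get(ch, ch) for ch in decomposed)


def encode_jis_x0201(text: str) -> bytes:
    """Encode 7-bit characters and half-width Katakana as single bytes."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            out.append(cp)            # assumption: treat 7-bit range as ASCII
        elif HALF_FIRST <= cp <= HALF_LAST:
            out.append(cp - 0xFEC0)   # U+FF61 -> 0xA1, ..., U+FF9F -> 0xDF
        else:
            raise ValueError(f"U+{cp:04X} is not in JIS X 0201")
    return bytes(out)


if __name__ == "__main__":
    full = "\u30a2\u30fc\u30df\u30a2"   # the full-width name from this thread
    half = to_halfwidth(full)
    print(encode_jis_x0201(half))       # b'\xb1\xb0\xd0\xb1'
```

Calling `encode_jis_x0201` on the full-width string directly raises `ValueError`, which corresponds to the `????` seen in the viewer: the characters simply have no representation in the declared character set.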

harithacha...@gmail.com

Apr 19, 2022, 9:05:28 AM
to dcm4che
Hi Vrinda,

Thanks a lot, it worked!
Another quick question:

I used the character set ISO_IR 192 for the above-mentioned Japanese name and for a German name, and it worked as expected. Can I use ISO_IR 192 for all multi-byte character names?

If not, which character sets should be used for German and Chinese names? And just as there is JIS X 0201 for Japanese, is there something similar for other multi-byte characters?

Gunter Zeilinger

Apr 19, 2022, 9:21:26 AM
to dcm4che
With UTF-8/ISO_IR 192 you can encode characters of potentially all languages, but you should verify that your target environment (in particular modalities and third-party image displays) supports them. In hospitals in Europe there are still older modalities in use which only support ISO 8859-# character sets, and I am not aware of what equipment is (still) in use at healthcare facilities in Japan today.
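
As a rough illustration of the trade-off described above, the following Python sketch checks which of a few Specific Character Set values could represent a given name. It is an assumption-laden sketch, not dcm4che code: the function name `charsets_that_can_encode` is hypothetical, the table covers only three DICOM defined terms, and the pairing of each term with a Python codec (`latin_1`, `gb18030`, `utf_8`) is my own choice for the demonstration. It says nothing about whether a particular modality or viewer actually supports the character set.

```python
# DICOM Specific Character Set defined term -> Python codec used to
# approximate it (assumed pairing, limited to three terms for brevity).
CODECS = {
    "ISO_IR 100": "latin_1",   # ISO 8859-1, Western European
    "GB18030": "gb18030",      # Chinese national standard, covers Unicode
    "ISO_IR 192": "utf_8",     # UTF-8
}


def charsets_that_can_encode(name: str) -> list[str]:
    """Return the defined terms whose codec can encode every character of name."""
    usable = []
    for term, codec in CODECS.items():
        try:
            name.encode(codec)
        except UnicodeEncodeError:
            continue  # at least one character is outside this character set
        usable.append(term)
    return usable


if __name__ == "__main__":
    for name in ("M\u00fcller", "\u30a2\u30fc\u30df\u30a2", "\u738b\u5c0f\u660e"):
        print(name, charsets_that_can_encode(name))
```

A German name with umlauts fits the single-byte ISO_IR 100, so it can be stored in a form that older ISO 8859 equipment understands, whereas the Japanese and Chinese names need a multi-byte character set such as ISO_IR 192.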