Pdf generated by PdfSharp encapsulation problems

Carlos Iván Osorio

Oct 1, 2014, 6:01:19 PM
to fo-d...@googlegroups.com
Good Day,

I'm having trouble encapsulating PDFs into DICOM files. All the images I add appear correctly, but each letter in the document appears rotated 180º about its own axis.

I'm generating the PDFs with PdfSharp (both the GDI+ and WPF builds) to hold a report for endoscopy procedures. The generation method is pretty straightforward, as described below:
- Create Page
- Create Graphics Object
- Create font (Unicode or WinAnsi encoding, embedded or not embedded; all combinations give the same result)
- Add endoscopy images
- Add texts of the report
- Save document.
* No rotation transformations whatsoever.

Then I encapsulate the PDF in a DICOM file, using the method below:

public static string ImportPDF(string file, int seriesNumber)
{
    DicomDataset dataset = new DicomDataset();
    dataset.Add(DicomTag.SOPClassUID, DicomUID.EncapsulatedPDFStorage);
    dataset.Add(DicomTag.StudyInstanceUID, Common.GenerateUid());
    dataset.Add(DicomTag.SeriesInstanceUID, Common.GenerateUid());
    dataset.Add(DicomTag.SOPInstanceUID, Common.GenerateUid());

    dataset.Add(DicomTag.PatientID, "12345");
    dataset.Add(DicomTag.PatientName, string.Empty);
    dataset.Add(DicomTag.PatientBirthDate, "00000000");
    dataset.Add(DicomTag.PatientSex, "M");
    dataset.Add(DicomTag.StudyDate, DateTime.Now);
    dataset.Add(DicomTag.StudyTime, DateTime.Now);
    dataset.Add(DicomTag.AccessionNumber, string.Empty);
    dataset.Add(DicomTag.ReferringPhysicianName, string.Empty);
    dataset.Add(DicomTag.StudyID, "1");
    dataset.Add(DicomTag.SeriesNumber, seriesNumber.ToString());
    dataset.Add(DicomTag.ModalitiesInStudy, "OT");
    dataset.Add(DicomTag.Modality, "OT");

    byte[] fileData = ReadBytesFromFile(file);

    dataset.Add(DicomTag.EncapsulatedDocument, fileData);

    FileInfo fi = new FileInfo(file);
    string directory = Path.Combine(fi.DirectoryName, @"DCM\DICOM");

    if (!Directory.Exists(directory))
        Directory.CreateDirectory(directory);

    string path = Path.Combine(directory, Path.GetFileNameWithoutExtension(fi.Name) + "_" + DateTime.UtcNow.Ticks + ".dcm");

    DicomFile ff = new DicomFile(dataset);

    ff.Save(path);

    return path;
}

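private static byte[] ReadBytesFromFile(string file)
{
    // The using block closes the stream, so no explicit Close() is needed.
    using (FileStream fs = File.OpenRead(file))
    {
        byte[] bytes = new byte[fs.Length];
        int offset = 0;
        // Stream.Read may return fewer bytes than requested, so loop until the buffer is full.
        while (offset < bytes.Length)
        {
            int read = fs.Read(bytes, offset, bytes.Length - offset);
            if (read == 0)
                throw new EndOfStreamException("Unexpected end of file: " + file);
            offset += read;
        }
        return bytes;
    }
}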

I have tried changing the Transfer Syntax, but that doesn't seem to change anything.

If anyone can provide some insight (maybe I'm missing some tag in the encapsulation), or has worked with both technologies, it would be greatly appreciated.

PS: I'm attaching, for reproduction:
- screen capture of the original PDF (pdfImageCapture.png)
- screen capture of the encapsulated DICOM viewed in OpenDicomViewer (dcmImageCapture.png)
- original PDF file (tempfile.pdf)
- encapsulated DICOM file (tempfile.dcm)
PS2: all disturbing images have been stripped from the PDF for your own mental sanity, and patient data has been removed for confidentiality reasons.
dcmImageCapture.png
pdfImageCapture.png
tempfile.dcm
tempfile.pdf

Carlos Iván Osorio

Oct 1, 2014, 6:04:51 PM
to fo-d...@googlegroups.com
PS3:
Not all of the PDF gets rotated, and images appear correctly...
I'm using fo-dicom 1.0.38.0

Chris Horn

Oct 2, 2014, 7:23:38 PM
to fo-d...@googlegroups.com
I'm also using fo-dicom 1.0.38, and the code below works for me in my test app:

private void BuildData()
{
    var pn = new DicomPersonName(DicomTag.PatientName, LastName.Text, FirstName.Text);
    var Sex = GetSex();
    var dob = DateTime.Parse(Picker.Text);

    var dataset = new DicomDataset
    {
        {DicomTag.SOPClassUID, DicomUID.EncapsulatedPDFStorage},
        {DicomTag.StudyInstanceUID, GenerateUid()},
        {DicomTag.SeriesInstanceUID, GenerateSeriesUid()},
        {DicomTag.SOPInstanceUID, GenerateUid()},
        {DicomTag.PatientID, PatientId.Text},
        {DicomTag.PatientName, pn.Get<String>()},
        {DicomTag.PatientSex, Sex},
        {DicomTag.PatientBirthDate, dob},
        {DicomTag.StudyDate, DateTime.Now},
        {DicomTag.StudyTime, DateTime.Now},
        {DicomTag.AccessionNumber, AccessionNo.Text},
        {DicomTag.StudyID, "1"},
        {DicomTag.SeriesNumber, "1"},
        {DicomTag.Modality, "OT"},
        {DicomTag.ReferringPhysicianName, String.Empty},
        {DicomTag.NumberOfStudyRelatedInstances, "1"},
        {DicomTag.NumberOfStudyRelatedSeries, "1"},
        {DicomTag.NumberOfSeriesRelatedInstances, "1"},
        {DicomTag.DocumentTitle, "Results_ENG"},
        {DicomTag.EncapsulatedDocument, ReadBytesFromFile(_filename)},
        {DicomTag.MIMETypeOfEncapsulatedDocument, "application/pdf"}
    };

    var ff = new DicomFile(dataset);
    if (!Directory.Exists("OutputFiles"))
    {
        Directory.CreateDirectory("OutputFiles");
    }
    ff.Save(@"OutputFiles\Test.dcm");

    Dataset = dataset;
}

I have zipped a copy of the test app and attached it.
To use it, click the "Select Pdf" button; a .dcm file will be created in <App start>\OutputFiles.

Click "open & extract": the PDF will be opened in the main window, and a .pdf file is extracted to <App start>\OutputFiles, along with a .jpg for each page.

The "open" button opens an existing DICOM encapsulated PDF file.

Hope this helps
WpfPdf2Dicom_demo.zip

Chris Horn

Oct 2, 2014, 7:35:12 PM
to fo-d...@googlegroups.com
I just tried it with your sample PDF and it seems fine. However, the third-party library I'm using to display the PDF inside WPF (muPdf) seems to be failing to generate the .jpg files and display them in the form...

The encapsulate and extract functions work well.

Chris Horn

Oct 2, 2014, 7:37:55 PM
to fo-d...@googlegroups.com
The extract code I'm using is:
var data = Dataset.Get<Byte[]>(DicomTag.EncapsulatedDocument);

#region save PDF to disk

if (!Directory.Exists("OutputFiles"))
{
    Directory.CreateDirectory("OutputFiles");
}
var writer = new BinaryWriter(File.OpenWrite(@"OutputFiles\Text.dcm.pdf"));
// Write the raw PDF bytes
writer.Write(data);
writer.Flush();
writer.Close();

#endregion
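
One way to confirm that the round trip is lossless is to compare the original PDF with the extracted one byte for byte. The following is only a minimal sketch using the standard library; the class and method names (PdfRoundTripCheck, SameDocument) are hypothetical and not part of fo-dicom. It assumes the only acceptable difference is a single trailing 0x00 byte, since DICOM requires even-length values and a writer may pad an odd-length document.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration; not part of fo-dicom or PdfSharp.
public static class PdfRoundTripCheck
{
    // Returns true when 'extracted' equals 'original', optionally followed
    // by one 0x00 padding byte (DICOM values must have even length).
    public static bool SameDocument(byte[] original, byte[] extracted)
    {
        bool sameLength = extracted.Length == original.Length;
        bool padded = extracted.Length == original.Length + 1;
        if (!sameLength && !padded)
            return false;

        for (int i = 0; i < original.Length; i++)
        {
            if (original[i] != extracted[i])
                return false;
        }

        // If one extra byte is present, it must be the null padding byte.
        return sameLength || extracted[original.Length] == 0x00;
    }

    // Convenience overload that reads both files from disk.
    public static bool SameDocument(string originalPath, string extractedPath)
    {
        return SameDocument(File.ReadAllBytes(originalPath),
                            File.ReadAllBytes(extractedPath));
    }
}
```

If this returns true for the original file and the extracted file, the document bytes survived encapsulation unchanged, which would point to the viewer's PDF rendering rather than the DICOM wrapping as the place to look.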

Carlos Iván Osorio

Oct 3, 2014, 12:58:43 PM
to fo-d...@googlegroups.com
Hello Chris, thank you for your kind response,

The encapsulation method works well, as does the extraction method, but if you check the .dcm file with a viewer, you'll find the letters are still rotated. It's probably a problem with PdfSharp and the MigraDoc library, which apparently (I'm assuming) write the font bytes in a different encoding (probably LE); as a result, each letter is rotated individually, but the letters keep their order.

Here's the code used for generating a Hello World document in MigraDoc, for testing purposes:

using (FileStream fs = new FileStream(Path.Combine(path, "pdf.pdf"), FileMode.Create, FileAccess.Write, FileShare.None))
{
    using (Document doc = new Document())
    {
        using (PdfWriter writer = PdfWriter.GetInstance(doc, fs))
        {
            doc.Open();
            doc.Add(new Paragraph("Hello World"));
            doc.Close();
        }
    }
}

And here's the snippet for generating it with PdfSharp

PdfDocument document = new PdfDocument();
document.Info.Title = "Created with PDFsharp";
PdfPage page = document.AddPage();
XGraphics gfx = XGraphics.FromPdfPage(page);
XFont font = new XFont("Verdana", 20, XFontStyle.BoldItalic, new XPdfFontOptions(PdfFontEncoding.Unicode, PdfFontEmbedding.None));
gfx.DrawString("Hello, World!", font, XBrushes.Black,
new XRect(0, 0, page.Width, page.Height),
XStringFormats.Center);


string filename = Path.Combine(path, "pdf.pdf");
document.Save(filename);

I'm attaching the viewer I'm using to check those encapsulated DCM files.

OpenDicomViewer-0.9.1.zip

Carlos Iván Osorio

Oct 7, 2014, 4:09:55 PM
to fo-d...@googlegroups.com
Apparently, from what I've found in my investigation, PdfSharp embeds certain font information with its bytes reversed, so it just doesn't cut it for DICOM PDF encapsulation. However, I've found no problem working with iTextSharp. Just for future reference: I've got it to work with the iText libraries.

Thanks