[Best Practices] How can I export inline images from the current document?

Daniel Lutz

May 8, 2014, 7:14:55 AM
to pdfne...@googlegroups.com
Hello Support,

I want to know some details about extracting images from a PDF document. I use the following code:

if (element.GetType() != Element.Type.e_image &&
    element.GetType() != Element.Type.e_inline_image) continue;

var ctm = element.GetCTM();
double x2 = 1, y2 = 1;
ctm.Mult(ref x2, ref y2);

var result = (new pdftron.PDF.Image(element.GetXObject())).GetBitmap();


I found a topic in the group saying that element.GetXObject() is not the right way to handle inline images, so I used this instead: Link. The problem is that the exported image is rotated. On the page it looks normal, but the extracted image is flipped. What is the best practice for extracting images in their original, unrotated form?

Regards
Daniel

Aaron

May 8, 2014, 2:07:21 PM
to pdfne...@googlegroups.com
These SDK forum posts describe the correct way to extract inline images:

https://groups.google.com/d/msg/pdfnet-sdk/K1Po0-1vylU/uaeedGmYdKoJ
https://groups.google.com/d/msg/pdfnet-sdk/R_a0P8blPP8/plCls22kZ_AJ

It may be that the document reverses the image in the PDF content stream.  If you forward the document to sup...@pdftron.com, we can take a closer look. 

Another option for you could be to instead rasterize the page with PDFDraw:

http://www.pdftron.com/pdfnet/samplecode.html#PDFDraw

In that case, you would be certain that rotation would be identical to the original.


Aaron

May 14, 2014, 9:40:50 PM
to pdfne...@googlegroups.com
If you extract the image using ImageExtract, it will show you the coordinates of the image.  This is how you could detect that the image is displayed upside-down:

--> Image: 1
    Width: 451
    Height: 294
    BPC: 8
    Coords: x1=137.43509, y1=336.31201, x2=353.51819, y2=195.25781
--> Image: 2
    Width: 440
    Height: 342
    BPC: 8
    Coords: x1=137.3755, y1=483.56018, x2=348.53619, y2=647.90338


The first image is upside-down: its y1 coordinate is larger than its y2 coordinate. If you want to flip such images automatically, you could detect this case and post-process the output. For example, using ImageMagick the command would be:

convert.exe upside_down_image.jpg -flip corrected_image.jpg
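
The coordinate check described above could be sketched in C# roughly as follows. This is a minimal sketch, not part of PDFNet: the class name ImageOrientation, the Mirror enum, and the Classify method are hypothetical, and it assumes that (x1, y1) and (x2, y2) are the page positions of the image's (0,0) and (1,1) corners as printed on the "Coords" line. It does not account for page rotation or for images rotated by 90 degrees.

```csharp
using System;

// Hypothetical helper (not a PDFNet API): classifies how an image is mirrored
// based on the "Coords" values printed by the ImageExtract sample.
public static class ImageOrientation
{
    [Flags]
    public enum Mirror { None = 0, Vertical = 1, Horizontal = 2 }

    // Assumption: (x1, y1) is where the image's (0,0) corner lands on the page
    // and (x2, y2) is where its (1,1) corner lands.
    public static Mirror Classify(double x1, double y1, double x2, double y2)
    {
        Mirror result = Mirror.None;
        if (y1 > y2) result |= Mirror.Vertical;   // displayed upside-down
        if (x1 > x2) result |= Mirror.Horizontal; // displayed mirrored left/right
        return result;
    }
}
```

With the sample output above, Classify(137.43509, 336.31201, 353.51819, 195.25781) reports a vertical mirror for image 1, while the coordinates of image 2 report none. An extracted image flagged this way could then be passed to a post-processing step such as the ImageMagick command.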

Daniel Lutz

May 15, 2014, 4:48:13 AM
to pdfne...@googlegroups.com
Is that the complete interpretation of the coordinates, or are there other possible cases? When y1<y2 and x1<x2 the picture is not rotated, and when x1<x2 and y1>y2 the image is upside-down and must be rotated by 180 degrees? If these are the only possible cases, I can rotate the image after extracting it.

Aaron

May 15, 2014, 4:19:39 PM
to pdfne...@googlegroups.com
> If these are the only possible cases, I can rotate the image after extracting it.

Note also that the coordinates as calculated by the ImageExtract sample fail to account for any page rotation.  It's possible that page rotation could be used in a PDF to compensate for image rotation.  (This should be very rare, but is possible.)  You could compensate for the rotation when calculating coordinates  (https://groups.google.com/d/msg/pdfnet-sdk/4sPgTwkaAoE/shXolsUDUs0J) or simply detect page rotation (http://www.pdftron.com/pdfnet/PDFNet/html/M_pdftron_PDF_Page_GetRotation.htm) and manually inspect the images to determine if further processing is required.

Ryan

Jan 13, 2016, 2:39:30 PM
to PDFTron PDFNet SDK
The C# ImageExtractTest sample has been updated to handle inline images, including flipping. Below are the relevant code snippets.

if (element.GetType() == Element.Type.e_inline_image)
{
    Image2RGB image2rgb = new Image2RGB(element);
    FilterReader image_reader = new FilterReader(image2rgb);
    pdftron.PDF.Image image = pdftron.PDF.Image.Create(doc, image_reader,
        element.GetImageWidth(), element.GetImageHeight(), 8, ColorSpace.CreateDeviceRGB());
    image.Export(filename);
}

// Returns the image's RGB data with the row order reversed (a vertical flip).
static byte[] FlipImage(Element element)
{
    Image2RGB image2rgb = new Image2RGB(element);
    int width = element.GetImageWidth();
    int height = element.GetImageHeight();
    int out_data_sz = width * height * 3; // 3 bytes per RGB pixel
    int stride = width * 3;               // bytes per image row
    FilterReader reader = new FilterReader(image2rgb);
    byte[] image_data = new byte[out_data_sz];
    byte[] flipped_data = new byte[out_data_sz];
    reader.Read(image_data);
    // Copy each source row to the mirrored row position in the output.
    for (int row = 0; row < height; ++row)
    {
        Buffer.BlockCopy(image_data, row * stride, flipped_data, out_data_sz - (stride * (row + 1)), stride);
    }
    return flipped_data;
}
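
To see what the row copy in FlipImage does without needing PDFNet, the same idea can be tried on a plain byte buffer. This is only an illustrative sketch: RowFlipDemo and FlipRows are hypothetical names, not part of the sample, and the buffer is assumed to be tightly packed with no row padding.

```csharp
using System;

// Hypothetical stand-alone version of the row reversal used in FlipImage.
public static class RowFlipDemo
{
    // Reverses the row order of a tightly packed pixel buffer (vertical flip).
    public static byte[] FlipRows(byte[] data, int width, int height, int bytesPerPixel)
    {
        int stride = width * bytesPerPixel; // bytes per image row
        if (data.Length != stride * height)
            throw new ArgumentException("Buffer size does not match the image dimensions.");

        byte[] flipped = new byte[data.Length];
        for (int row = 0; row < height; ++row)
        {
            // Source row 0 becomes the last output row, row 1 the second to last, and so on.
            Buffer.BlockCopy(data, row * stride, flipped, (height - 1 - row) * stride, stride);
        }
        return flipped;
    }
}
```

For a 2x2 RGB image whose bytes are 1 through 12, FlipRows(data, 2, 2, 3) returns the second row (7 through 12) followed by the first row (1 through 6); the pixels within each row keep their left-to-right order.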



Ryan

Mar 19, 2021, 5:00:35 PM
to PDFTron SDK
Attached is the full C# code for the modified ImageExtractTest sample containing the two code blocks above.
ImageExtractTest.cs