I have received a Word doc file containing a logo.
I use openoffice to export it as a PDF file, and then use latex
(pdflatex) with tikz to include an image of the page in a latex
document.
The PDF created by openoffice (or the one generated by pdflatex) gives
two errors with acroread (not with xpdf). acroread then displays a blank
page and somehow hangs X. The errors are:
"too few operands in path"
"an unrecognized token 820.4m was found"
What is the cause and how can I avoid/fix it?
I have verified that exporting as PDF after removing the logo works
fine, so the culprit is the logo.
Also, if I remove the logo and replace it with a roughly equivalent GIF,
the export generates a well-behaved PDF. But I have to play around with
the aspect ratio, and it would be a bit of a pain to do it manually
every time.
There is supposed to be a space between the 820.4 and the m (m is the
PDF moveto operator, and the number before it is one of its operands).
Both messages point to the same error.
> What is the cause and how can I avoid/fix it ?
How to fix it? I don't know. Just going in and adding a space will throw
off the xref table.
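
To illustrate the missing space, here is a minimal Python sketch. It assumes the page content stream has already been decompressed into raw bytes, and that only the m operator is fused to its operand, as in the reported error. The name split_fused_moveto is made up for illustration. The patched stream is longer than the original, so its /Length entry and the xref offsets would still have to be regenerated by some other tool.

```python
import re

# A number immediately followed by the moveto operator, e.g. b"820.4m".
# Assumption: the operator is followed by whitespace or end of data, and the
# number is not part of a name or a longer token.
_FUSED_MOVETO = re.compile(rb"(?<![\w/.])(-?\d+(?:\.\d+)?)m(?=\s|$)")


def split_fused_moveto(stream: bytes) -> bytes:
    """Insert the missing space between a number and a fused 'm' operator."""
    return _FUSED_MOVETO.sub(rb"\1 m", stream)


if __name__ == "__main__":
    broken = b"100 820.4m\n200 300 l\n"
    print(split_fused_moveto(broken))
```

The pattern is deliberately narrow so that correctly spaced operators such as cm are left untouched.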
> I have verified that removing the logo, and exporting as PDF, is
> fine, so the culprit is the logo.
So it must be drawing the logo, rather than embedding it as a bitmap.
> Also if I remove the logo and replace it with a somewhat equivalent
> GIF, the export generates a well behaved PDF.
Because it embeds the gif, rather than trying to draw it.
If the logo is a bitmap, save the file as OOXML (.docx), unzip the file
and copy the image out of the images directory.
If it's a drawing done with Word's drawing tool, forget it. It *might*
be possible to extract the .emf code from the document.xml, and then run
emf_decode.pl (unzip it first if it's a compressed emz image) to turn it
into a dia file, and then run dia to make a png out of it, but usually
it's better if you get it redrawn in something sensible like SVG.
///Peter
> On 21/09/10 13:32, LC's No-Spam Newsreading account wrote:
>> I have received a Word doc file containing a logo.
>
> If the logo is a bitmap, save the file as OOXML (.docx), unzip the file
> and copy the image out of the images directory.
Hmm ... I'm at the Linux end. I'm the one using openoffice and latex,
and I cannot handle docx.
> If it's a drawing done with Word's drawing tool, forget it.
I doubt it was drawn. The logo is our "corporate" logo, which is
supplied as jpg and gifs on our web site. I don't know how the
administration inserted it in their Word document (probably somebody who
is no longer with them did it).
I can make some sort of manual fix like this:
1 open doc with openoffice
2 delete logo
3 insert logo from gif (but I have to remember position and scale
to mimic the one they used; it seems to have a different aspect ratio,
probably the white borders were cropped somehow)
4 export to pdf
5 embed entire pdf in my latex document as image
Ideally I wanted something which could automate steps 1-4. Maybe some
way of telling them how to fix it. Or some macro to replace steps 1-4.
Or some utility (like psutils) or script to run between 4 and 5 to "fix"
the PDF (the PDF is "sort of" OK, in the sense that xpdf reads it, but
acroread doesn't).
I had hoped that "PDF produced with openoffice from word doc containing
logo crashes acroread" was a sort of FAQ with a standard fix (like some
of the problems that psutils fixed for postscript files produced by
common s/w that is not fully compliant with Adobe DSC).
Can you update your Linux packages? OpenOffice 3.2 handles .docx OK.
Alternatively, open the document and save as ODF from OpenOffice. The
ODF format is also a zip file so you can access the images in it. Or
save as HTML, which will force extraction of any images into external files.
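
Since both ODF and OOXML files are zip archives, pulling the images out can be scripted. The following Python sketch copies every image-like member into a directory. It selects members by file extension rather than by directory name, which is an assumption made so that the same code works for either format; the function name extract_images and the extension list are illustrative only.

```python
import zipfile
from pathlib import Path, PurePosixPath

# Assumption: these extensions cover the image types of interest.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif", ".emf", ".wmf", ".svg"}


def extract_images(document, out_dir):
    """Copy every image-like member of a zip-based document into out_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    with zipfile.ZipFile(document) as archive:
        for name in archive.namelist():
            member = PurePosixPath(name)
            if member.suffix.lower() in IMAGE_SUFFIXES:
                # Flatten the path: only the file name is kept.
                target = out / member.name
                target.write_bytes(archive.read(name))
                written.append(target)
    return written


if __name__ == "__main__":
    import sys
    for path in extract_images(sys.argv[1], sys.argv[2]):
        print(path)
```

Run as, for example, "python extract_images.py somefile.odt logo_out" (the script and directory names are placeholders).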
> I doubt it was drawn. The logo is our "corporate" logo, which is
> supplied as jpg and gifs on our web site. I ignore how the
> administration inserted it in their Word document (probably somebody who
> is no longer with them did it).
:-) Not uncommon. Undocumented mods break the business process.
> I can make some sort of manual fix like this:
>
> 1 open doc with openoffice
> 2 delete logo
> 3 insert logo from gif (but I have to remember positions and scale
> to mimic the one they used, seem to have a different aspect ratio,
> probably the white borders were cropped somehow)
> 4 export to pdf
> 5 imbed entire pdf in my latex document as image
>
> Ideally I wanted something which could automatize steps 1-4.
Unfortunately (and IMHO carelessly), OO does not appear to have a
command-line option for saving and exiting. It ought to be possible to
use something like
    oowriter -headless -o somefile.doc -w somefile.odf -exit
so that the process could be scripted. But have a look at
odf-converter-integrator
(http://katana.oooninja.com/w/odf-converter-integrator; .deb available
at sourceforge).
However, Abiword is supposed to be able to do this:
    abiword --to=odf somefile.doc
but in my copy (with the plugins installed and enabled by default) this
simply does not work (does nothing at all).
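
For the kind of scripted conversion wished for above, here is a hedged Python sketch. It assumes a newer LibreOffice or OpenOffice build whose soffice binary accepts --headless, --convert-to and --outdir; the OpenOffice version discussed in this thread may not, so check "soffice --help" first. The function names are made up for illustration.

```python
import subprocess
from pathlib import Path


def build_convert_command(source, target_format="odt", out_dir=".", binary="soffice"):
    """Return the argument list for a headless conversion (nothing is run)."""
    return [
        binary,
        "--headless",
        "--convert-to", target_format,
        "--outdir", str(out_dir),
        str(source),
    ]


def convert(source, target_format="odt", out_dir=".", binary="soffice"):
    """Run the conversion and return the path where the output is expected."""
    command = build_convert_command(source, target_format, out_dir, binary)
    subprocess.run(command, check=True)  # raises CalledProcessError on failure
    return Path(out_dir) / (Path(source).stem + "." + target_format)


if __name__ == "__main__":
    print(convert("somefile.doc", "odt", "converted"))
```

Keeping the command construction separate from the call makes it easy to print and inspect the command before running it.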
You certainly could do exactly what you describe if you are working with
a .docx file, because you could write a script in XSLT to change the
content of the graphics element which holds the image.
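
A cruder alternative to XSLT, sketched here in Python, is to leave the XML alone and swap the bytes of the image member inside the zip, so that position and scale in the document stay as they were. This is a different technique from the one described above. It assumes the replacement image has the same format as the member it replaces, and that the member name (for example one found by listing the archive) is known; replace_member is a made-up name.

```python
import zipfile


def replace_member(src_zip, dst_zip, member_name, new_bytes):
    """Copy src_zip to dst_zip, substituting new_bytes for one member."""
    found = False
    with zipfile.ZipFile(src_zip) as src, zipfile.ZipFile(dst_zip, "w") as dst:
        for info in src.infolist():
            if info.filename == member_name:
                data = new_bytes
                found = True
            else:
                data = src.read(info.filename)
            # Reusing the ZipInfo keeps member order and compression type.
            dst.writestr(info, data)
    if not found:
        raise KeyError(member_name)


if __name__ == "__main__":
    with open("logo.gif", "rb") as handle:
        replace_member("in.odt", "out.odt", "Pictures/logo.gif", handle.read())
```

The file and member names in the usage lines are placeholders, not names taken from the actual document.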
I regularly extract images and reuse them in LaTeX, but I am working
from WordML and OOXML and ODF documents, not the obsolete .doc format.
> Maybe some way of telling them how to fix it.
Tell them they must send you .docx files, not .doc files. The .doc
format is dead.
///Peter