XML has very explicit rules for determining a document's encoding: it
is UTF-8, UTF-16 or UCS-4 unless explicitly stated otherwise, and an
XML processor never tries to guess the encoding from the local
machine's locale. I'm not sure the same holds for HTML, and behaviour
may differ between applications. For reference, see:
http://www.w3.org/TR/REC-html40/charset.html
You don't mention what encoding your HTML is saved in or how it was
edited. I assume that the page renders properly when you load it into
a web browser, but that may just be because the browser's default
happens to match how you created your content, while OpenOffice.org
guesses wrong.
I would try:
a) Testing with a UTF-16 HTML document. That should eliminate any
encoding confusion.
b) Adding a <meta> tag to the HTML to specify the encoding, if it is
saved in a local file. I think this has less chance of being effective.
c) Adjusting the encoding used for the HTML file and/or your locale
settings.
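For a) and b), a minimal Java sketch of producing the test files might look like the following. It uses only the JDK; the class name HtmlEncodingSketch, both method names and the temp-file prefix are made up for illustration, and nothing here is part of jodconverter. Java's "UTF-16" charset writes a byte order mark when encoding, which is what should remove the guesswork in a).

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for producing encoding test files; not part of jodconverter.
class HtmlEncodingSketch {

    // Option b): wrap a body fragment in a page whose <meta> tag names the charset.
    static String wrapWithMeta(String body, Charset charset) {
        return "<html><head>"
                + "<meta http-equiv=\"Content-Type\" content=\"text/html; charset="
                + charset.name() + "\">"
                + "</head><body>" + body + "</body></html>";
    }

    // Options a) and c): write the page to a temporary .html file using an
    // explicitly chosen charset instead of the platform default.
    static Path writeHtml(String html, Charset charset) throws IOException {
        Path file = Files.createTempFile("encoding-test", ".html");
        Files.write(file, html.getBytes(charset));
        return file;
    }
}
```

Calling writeHtml(wrapWithMeta(body, StandardCharsets.UTF_16), StandardCharsets.UTF_16) gives a file for test a); passing StandardCharsets.UTF_8 in both places gives one for test b). Either file can then be opened manually in Writer to see which variant OpenOffice.org reads correctly.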
2009/6/21 Arindam Lahiri <arindam...@gmail.com>
>
> But the file test1234.pdf has the same problem. I also thought that
> maybe there is some encoding issue, so I used oHTMLText.getBytes
> ("UTF-8"), but to no avail. Meanwhile, this strHindiHTML string is
> stored in an Oracle CLOB, retrieved from the database again, and
> renders flawlessly in the browser.
>
Again, start by making sure that the HTML renders correctly in
OpenOffice.org, by opening the file manually in Writer.
Have a look at the FAQ: http://code.google.com/p/jodconverter/wiki/FAQ
Kind regards
Mirko
2009/6/22 Arindam Lahiri <arindam...@gmail.com>
>
> It renders perfectly when opened manually in OpenOffice 3.1.0, and
> converting from HTML to PDF with jodconverter on the command line
> works perfectly too. The conversion is also perfect using
> java.io.File and the Java library of jodconverter 2.2.2, but it does
> not work properly using streams. That is what bothers me the most,
> as this is the way I intend to use it in my web application.
>
Just use files then; OOo works with files anyway. The convert() method
that accepts streams will create temporary files internally.
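In a web application that only has a stream at hand, one way to do this is to spool the stream into a temporary file yourself and hand that file to the converter. The sketch below uses only the JDK; the class name StreamToTempFile, the method name spool and the "jod-input" prefix are made up for illustration. The bytes are copied unchanged, so the file keeps whatever encoding the stream was written in.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.StandardCopyOption;

// Hypothetical helper; not part of jodconverter.
class StreamToTempFile {

    // Copies the stream byte-for-byte into a new temporary file.
    // The suffix (e.g. ".html") gives the file an extension that
    // matches its content.
    static File spool(InputStream in, String suffix) throws IOException {
        File temp = File.createTempFile("jod-input", suffix);
        // createTempFile has already created the file, so replace it.
        Files.copy(in, temp.toPath(), StandardCopyOption.REPLACE_EXISTING);
        return temp;
    }
}
```

The returned file, together with an output file, can then go to the File-based conversion that already works for you. Delete the temporary file once the conversion has finished.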
Kind regards
Mirko