So my question: how can I create a UTF-8 PDF?
I tried the stable version, and I also tried the newest beta.
Thanks in advance.
Even with the upcoming release there are some limitations in character
support. The standard character encoding options in PDF only support a
limited number of characters out of the box. DOMPDF currently only
supports one of these encodings, which is a subset of Windows ANSI. If
you attempt to use the core PDF fonts you'll run into the problem you
noted.
The current recommended method to work around this limitation in
DOMPDF is to "install" a font and enable Unicode support.
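To see which characters of a given text fall outside Windows ANSI, a small check like the following can help. This is only a sketch: the function name chars_outside_windows_1252() is made up for illustration (it is not part of DOMPDF), it assumes the mbstring extension is available, and it treats "Windows ANSI" as Windows-1252.

```php
<?php
/**
 * Hypothetical helper (not part of DOMPDF): return the distinct characters
 * of a UTF-8 string that have no equivalent in Windows-1252.
 */
function chars_outside_windows_1252(string $utf8): array
{
    $outside = [];
    // Split into single characters (the /u flag makes PCRE treat the input as UTF-8).
    foreach (preg_split('//u', $utf8, -1, PREG_SPLIT_NO_EMPTY) as $char) {
        // Round trip: a character that cannot be mapped comes back as "?".
        $ansi = mb_convert_encoding($char, 'Windows-1252', 'UTF-8');
        $back = mb_convert_encoding($ansi, 'UTF-8', 'Windows-1252');
        if ($back !== $char) {
            $outside[$char] = true; // use keys to keep each character once
        }
    }
    return array_keys($outside);
}

echo implode(' ', chars_outside_windows_1252("ŠŘšžě öäüß")), "\n"; // Ř ě
```

Any character this reports would need an installed font with Unicode support enabled, as described above.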
> With PHP I use the function: "utf8-decode()"
I tried the function utf8_decode(), but the first part of the
test string was deleted
(test string:
>öäü߀ÄÖÜ<br>
>öäüßÄÖÜ<br>
>ąčęėįšųūž
the ö is written as the HTML code for ö, and so on)
OK, so I have to install a Unicode font. Do you know one that I
can use (maybe with a web address ^^)?
The font doesn't have to fully support Unicode; it only needs to
support the characters you need. DOMPDF works in Unicode because that's
how you can define extended character encodings in PDF documents. So
you should be able to use any of your system fonts. If you're not
comfortable setting up ttf2ufm on your system, you can try out the
web-based font prep tool:
http://groups.google.com/group/dompdf/browse_thread/thread/9f7bc0162b04d5cf
Also, if I decode the string (utf8_decode) before rendering, all
characters with a caron (inverted circumflex) are replaced with question marks.
Now I have tried a font from fontspace
(http://www.fontspace.com/red-hat-inc/liberation-sans), which normally
supports symbols like 'č'.
I checked it online (you can try the font by typing words into one of
the text boxes) and with the GNOME font viewer.
The letters displayed correctly every time.
So I converted it locally with the load_font.php script (without
errors):
./php ./load_font.php liberation /..../LiberationSans-Regular.ttf /..../LiberationSans-Bold.ttf /..../Desktop/LiberationSans-Italic.ttf /...../LiberationSans-BoldItalic.ttf
Then I changed the default font (define("DOMPDF_DEFAULT_FONT",
"liberation");) and generated a test PDF with this PHP script:
require_once("dompdf_config.inc.php");
$html = "ŠŘšžě";
//$html = mb_convert_encoding($inhalt, "iso-8859-2", "utf8");
$inhalt = utf8_decode($html);
$dompdf = new DOMPDF();
$dompdf->load_html($html);
$dompdf->render();
$dompdf->stream("pdf_file.pdf");
When I use mb_convert_encoding() the result is "©Ø¹¾ì", and
when I use utf8_decode() the result is "?????".
That's really strange o_O
I haven't tried with any of the fonts you specified. I am, however,
able to use Verdana from my own system and the characters display
correctly.
One of the problems here is that you are not supplying a full HTML
document. It may be that even when using mb_convert_encoding() PHP is
reporting the character set of the string incorrectly to DOMPDF.
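One possible reading of the "©Ø¹¾ì" result (an interpretation, not something confirmed in this thread) is that the ISO-8859-2 bytes were produced correctly but then read as ISO-8859-1. The sketch below reproduces that effect with mbstring alone; it assumes the PHP file is saved as UTF-8 and does not involve DOMPDF at all.

```php
<?php
// The test string, as UTF-8 source text (this file is assumed to be UTF-8).
$utf8 = "ŠŘšžě";

// Encode to ISO-8859-2: one byte per character.
$latin2 = mb_convert_encoding($utf8, 'ISO-8859-2', 'UTF-8');
echo strtoupper(bin2hex($latin2)), "\n"; // A9D8B9BEEC

// Interpret the very same bytes as ISO-8859-1 instead.
$misread = mb_convert_encoding($latin2, 'UTF-8', 'ISO-8859-1');
echo $misread, "\n"; // ©Ø¹¾ì
```

The bytes are identical in both cases; only the declared character set differs, which is why stating the charset explicitly matters.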
Create a full document including a meta tag. I've tried this on my
system and it appears to work correctly. Try something like:
require_once("dompdf_config.inc.php");
$html = '<html>
<head>
<meta http-equiv="Content-Type"
content="text/html;charset=ISO-8859-2" />
</head>
<body><p>ŠŘšžě</p></body>
</html>';
$dompdf = new DOMPDF();
$dompdf->load_html($html);
$dompdf->render();
$dompdf->stream("pdf_file.pdf");
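Since the original question was about UTF-8, a UTF-8 variant of the same test might look like the sketch below. It rests on several assumptions: the PHP file is saved as UTF-8, DOMPDF_UNICODE_ENABLED is true, and the font in use covers the characters. The helper build_utf8_document() is a made-up name for illustration, and the DOMPDF calls are left as comments so the snippet runs on its own.

```php
<?php
/**
 * Hypothetical helper: wrap a UTF-8 HTML fragment in a full document that
 * declares its character set in a meta tag.
 */
function build_utf8_document(string $bodyHtml): string
{
    return '<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=UTF-8" />
</head>
<body>' . $bodyHtml . '</body>
</html>';
}

$html = build_utf8_document('<p>ŠŘšžě</p>');

// Sanity check before rendering: the string really is valid UTF-8.
var_dump(mb_check_encoding($html, 'UTF-8'));

// With DOMPDF loaded as in the script above, $html would be passed on
// unchanged, without utf8_decode() or mb_convert_encoding():
// $dompdf = new DOMPDF();
// $dompdf->load_html($html);
// $dompdf->render();
// $dompdf->stream("pdf_file.pdf");
```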
I would definitely not use utf8_decode. That will, essentially, take
any characters that fall outside the ISO-8859-1 character set and
convert them to "?" ... as you have seen.
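The effect is easy to reproduce outside DOMPDF. The sketch below uses mb_convert_encoding() to ISO-8859-1 as a stand-in for utf8_decode() (which is deprecated as of PHP 8.2); it assumes the mbstring extension with its default substitute character, and a source file saved as UTF-8.

```php
<?php
// Two characters that exist in ISO-8859-1 (ö, ß) and two that do not (Š, ě).
$utf8 = "ö ß Š ě";

// Convert to ISO-8859-1; characters without an equivalent become "?".
$latin1 = mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');

// Convert back to UTF-8 only so the result can be printed in a UTF-8 terminal.
echo mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1'), "\n"; // ö ß ? ?
```

Once a character has been replaced by "?", the original cannot be recovered, so the conversion has to be avoided rather than undone.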
Lastly, which version of DOMPDF are you using? Do you have
DOMPDF_UNICODE_ENABLED set to true?
I work on Linux, so I don't really know whether a Verdana font is
preinstalled. When I try to locate it ("locate verdana"), the
program finds only my downloaded font.
To answer your question: yes, I set DOMPDF_UNICODE_ENABLED to true, and I
tried it with dompdf 0.5.x and the 0.6 beta.
At the moment I am installing the beta a second time (in case there
was a problem with my first installation that reinstalling might
solve).
So I don't think the problem lies there.
HTML code:
Ą˘Ł¤ĽŚ§¨ŠŞŤŹŽŻ°ą˛ł´ľśˇ<br>
¸šşťź˝žżŔÁÂĂÄĹĆÇČÉĘËĚÍ<br>
ÎĎĐŃŇÓÔŐÖ×ŘŮÚŰÜÝŢßŕáâă<br>
äĺćçčéęëěíîďđňóôőö÷řůú<br>
űüýţ˙<br>
PDF output:
???¤??§¨Š???Ž?°???´???
¸š????ž??ÁÂ?Ä??Ç?É?Ë?Í
Î????ÓÔ?Ö×??Ú?ÜÝ?ß?áâ?
ä??ç?é?ë?íî???óô?ö÷??ú
?üý??
So the best approach would be to use UTF-8, because in the next few weeks I
have to add a Russian font, too.
What can I do to use all the ISO-8859-2 characters?
This path doesn't look quite right (e.g. it looks like it's
pointing to the system root, then /lib/fonts). Your font and the
*.afm/*.ufm files should all go in your dompdf installation folder,
dompdf/lib/fonts. You can then make the array entry look like this:
'liberation' =>
array (
  'normal' => DOMPDF_FONT_DIR . 'LiberationSans-Regular',
  'bold' => DOMPDF_FONT_DIR . 'LiberationSans-Bold',
  'italic' => DOMPDF_FONT_DIR . 'LiberationSans-Italic',
  'bold_italic' => DOMPDF_FONT_DIR . 'LiberationSans-BoldItalic'
)
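To verify that each entry actually points at a metrics file, a quick check along these lines may help. This is a sketch, not DOMPDF code: missing_font_metrics() is a made-up helper, and the temporary directory merely stands in for DOMPDF_FONT_DIR so the snippet runs on its own.

```php
<?php
/**
 * Hypothetical helper (not part of DOMPDF): list the font entries for which
 * neither a .afm nor a .ufm file exists at the configured base path.
 */
function missing_font_metrics(array $families): array
{
    $missing = [];
    foreach ($families as $family => $styles) {
        foreach ($styles as $style => $basePath) {
            if (!is_file($basePath . '.afm') && !is_file($basePath . '.ufm')) {
                $missing[] = "$family/$style: $basePath";
            }
        }
    }
    return $missing;
}

// Stand-in font directory; in a real setup this would be DOMPDF_FONT_DIR.
$fontDir = sys_get_temp_dir() . '/fonts_demo_' . uniqid() . '/';
mkdir($fontDir);
touch($fontDir . 'LiberationSans-Regular.ufm'); // only the regular style exists

$families = [
    'liberation' => [
        'normal' => $fontDir . 'LiberationSans-Regular',
        'bold'   => $fontDir . 'LiberationSans-Bold',
    ],
];

// Prints one line per entry without a metrics file (here: liberation/bold).
foreach (missing_font_metrics($families) as $line) {
    echo $line, "\n";
}
```

An entry that shows up here has either a wrong path or a font that was never converted.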
This is strange, because I didn't change the path myself.
So I changed all entries to ...=>DOMPDF_FONT_DIR . '... but it's still
not right.
There is another very strange thing:
when I change the default font, for example to courier, and don't set a
CSS style for the body tag, the font doesn't change to Courier in the PDF.
So I think dompdf doesn't use every entry of the
font_family_cache file.
I copied 8 font files (Czech Courier: regular, bold, italic and bold italic,
each as .afm and .pfa) into the /lib/fonts folder
and added this to the font_family_cache:
'newcourier' =>
array (
'normal' => DOMPDF_FONT_DIR . 'Cour',
'bold' => DOMPDF_FONT_DIR . 'Courb',
'italic' => DOMPDF_FONT_DIR . 'Couri',
'bold_italic' => DOMPDF_FONT_DIR . 'Courbd',
),
But when I try to use the font for the body tag (<style>body{....), the
PDF file has the standard font, the same font as if I had deleted the
style tag.
Some of the defines of my dompdf_config.inc.php:
...
define("DOMPDF_DIR", str_replace(DIRECTORY_SEPARATOR, '/',
realpath(dirname(__FILE__))));
...
define("DOMPDF_FONT_DIR", DOMPDF_DIR . "/lib/fonts/");
...
define("DOMPDF_FONT_CACHE", DOMPDF_FONT_DIR);
...
define("DOMPDF_UNICODE_ENABLED", true);
...
define("DOMPDF_DEFAULT_PAPER_SIZE", "a4");
...
define("DOMPDF_DEFAULT_FONT", "courier");
When you use the load_font.php script it updates your file with an
absolute path. I haven't seen it have any path problems when using the
script, but it's a possibility. If you didn't use the script then I'm
not sure why the entries were changed.
> There is another very strange thing:
> when I change the default font, for example to courier, and don't set a
> CSS style for the body tag, the font doesn't change to Courier in the PDF.
>
> So I think dompdf doesn't use every entry of the
> font_family_cache file.
> I copied 8 font files (Czech Courier: regular, bold, italic and bold italic,
> each as .afm and .pfa) into the /lib/fonts folder
> and added this to the font_family_cache:
>
> 'newcourier' =>
> array (
> 'normal' => DOMPDF_FONT_DIR . 'Cour',
> 'bold' => DOMPDF_FONT_DIR . 'Courb',
> 'italic' => DOMPDF_FONT_DIR . 'Couri',
> 'bold_italic' => DOMPDF_FONT_DIR . 'Courbd',
> ),
>
> But when I try to use the font for the body tag (<style>body{....), the
> PDF file has the standard font, the same font as if I had deleted the
> style tag.
That is odd. I would expect that if you had no styles defined or font
tags that DOMPDF would use your default font. Can you post a sample
document?
> Some of the defines of my dompdf_config.inc.php:
> ...
> define("DOMPDF_DIR", str_replace(DIRECTORY_SEPARATOR, '/',
> realpath(dirname(__FILE__))));
> ...
> define("DOMPDF_FONT_DIR", DOMPDF_DIR . "/lib/fonts/");
> ...
> define("DOMPDF_FONT_CACHE", DOMPDF_FONT_DIR);
> ...
> define("DOMPDF_UNICODE_ENABLED", true);
> ...
> define("DOMPDF_DEFAULT_PAPER_SIZE", "a4");
> ...
> define("DOMPDF_DEFAULT_FONT", "courier");
These all look fine to me.
Here is the current example:
require_once("dompdf_config.inc.php");
$html = '<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-2" />
<style>
body{
font-family: newcourier;
}
</style>
</head>
<body>
<p>
Ą˘Ł¤ĽŚ§¨ŠŞŤŹŽŻ°ą˛ł´ľśˇ<br>
¸šşťź˝žżŔÁÂĂÄĹĆÇČÉĘËĚÍ<br>
ÎĎĐŃŇÓÔŐÖ×ŘŮÚŰÜÝŢßŕáâă<br>
äĺćçčéęëěíîďđňóôőö÷řůú<br>
űüýţ˙<br>
The quick brown fox jumps over the lazy dog</p></body>
</html>';
$html = mb_convert_encoding($html, "iso-8859-2", "utf8");
$dompdf = new DOMPDF();
$dompdf->load_html($html);
$dompdf->render();
$dompdf->stream("pdf_file.pdf");
But when I remove the <style> tag from the example, the script does
not use the Courier font.
Otherwise your script looks ok. Do you receive any error messages from
PHP, either when attempting to render your PDF or when you use the
load_font.php script?
Is there no solution for this problem?
Please help.
Have you had any more luck getting things to work?