Hi Siddharth,
Which utility are you using to convert PS to PDF? This seems to
be a known bug. Here's what I found on the net, which may give us a
clue:
http://osdir.com/ml/text.xml.fop.devel/2003-01/msg00297.html
On 22.01.2003 23:55:14 Arnd Beißner wrote:
> Hello there,
>
> after some research I found and fixed a bug in the PS renderer
> that can be a real nuisance.
Yeah, one that I never got round to fixing.
> The problem is as follows: The ascii (and Unicode) minus
> character is mapped to the hyphen character by the PDF
> renderer. The PostScript renderer instead maps it to the
> minus character. This happens because the generated
> PS code reencodes the fonts to ISO Latin 1 encoding, which
> handles ascii code 45 differently from the standard PS font
> encoding.
>
> Typographically, the character at 45 in ISOLatin1 is a real minus,
> and the character at 45 in Standard Encoding is a hyphen, which
> is about half as wide as the minus in your average font. The
> difference in your PS output can be quite destructive, as FOP
> always formats assuming the width of the hyphen character...
>
> A "patch" follows. The reason I'm not yet submitting a real diff
> to Bugzilla is that I am a) extremely overloaded right now and
> b) this really needs to be discussed:
>
> Some thoughts on this
> (by 'FOP' I mean formatter+PDF renderer code):
>
> 1. Who's right and who's wrong?
> Either FOP - or - the PS renderer is right, but who?
I'm sure that the PS renderer is wrong. When I wrote it I used
ISOLatin1 encoding because it got more characters right than with
StandardEncoding. :-) I didn't want to spend too much time on this
because at that time the PS renderer was merely a proof-of-concept.
> 2. If FOP is right, then the PS renderer must be
> fixed. This can be done either by fixing the method
> renderWordArea or by changing the PS procedures.
> However, the latter would increase PS file size
> (can't copy the ISO latin 1 encoding as opposed
> to the standard encoding), so I opted for changing
> renderWordArea.
Not happy with that in the long run. For immediately fixing this it's
ok.
When I rewrite the PS renderer for the redesign I intend to get that
right from the beginning. The problem is not just the hyphen
character. There are others. The problem is that the base14 fonts are
set to WinAnsiEncoding (see org.apache.fop.render.pdf.fonts.Helvetica)
and the PS renderer uses ISOLatin1. So, depending on the characters
used, you get multiple mismatches, not just the hyphen character. What
we probably need is a custom encoding scheme like Acrobat Reader uses
when converting PDF to PostScript (PDFEncoding). That'll be some work...
> 3. If FOP is wrong, then probably someone else
> must fix it - I suppose I won't find the right place
> for the fix easily.
FOP is right.
> Personally I think the PS renderer is wrong, since
> the original Adobe PS character encoding maps
> ascii 45 to the hyphen character and Adobe usually
> knows what they're doing. Still, at that point in time,
> Unicode wasn't there yet, so...
>
> This is an issue that we may possibly want
> to solve before 0.20.5 goes final. Personally, I won't
> have time before the weekend to check with
> the Unicode and/or XSL spec.
>
> Any comments/ideas?
>
> --------------- temp fix that I use ---------------------------
<snip/>
I'll put your fix in but I can't guarantee that it'll be before
Christian does the release.
Jeremias Maerki
- Vikram
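
To make the mismatch from the quoted FOP thread concrete, here is a minimal Python sketch. It records only the glyph that character code 45 selects under each encoding mentioned in the thread, and then compares the width the formatter would assume with the width the PostScript output would actually use. The widths are illustrative assumptions (not read from a real font file), chosen so that the hyphen is roughly half as wide as the minus, as the thread describes. The names GLYPH_AT_45, ASSUMED_WIDTHS and rendered_width are hypothetical and are not part of FOP.

```python
# Glyph name selected by character code 45 under each encoding.
GLYPH_AT_45 = {
    "StandardEncoding": "hyphen",
    "WinAnsiEncoding": "hyphen",
    "ISOLatin1Encoding": "minus",
}

# ASSUMPTION: illustrative glyph widths in 1/1000 em, not taken from a font file.
ASSUMED_WIDTHS = {"hyphen": 333, "minus": 584}


def rendered_width(text, encoding, font_size=10.0, other_width=500):
    """Width of `text` in points when every '-' is looked up via `encoding`."""
    total = 0
    for ch in text:
        if ch == "-":
            total += ASSUMED_WIDTHS[GLYPH_AT_45[encoding]]
        else:
            total += other_width  # ASSUMPTION: flat width for all other glyphs
    return total * font_size / 1000.0


# The formatter lays the text out with the hyphen width ...
laid_out = rendered_width("abc-aaa", "WinAnsiEncoding")
# ... but the re-encoded PostScript font paints a minus instead.
painted = rendered_width("abc-aaa", "ISOLatin1Encoding")
print(f"layout assumed {laid_out} pt, PS output used {painted} pt")
```

With these assumed numbers the painted text is wider than the space reserved for it, which is the kind of layout damage the thread describes.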
On Jan 9, 8:48 pm, Siddharth Deshpande <07.siddha...@gmail.com> wrote:
> Hi Samuel,
>
> Thanks for the reply.
>
> The issue appears once the URL is opened in the browser, as explained below.
> When we click the link in the PDF, a dialogue box is displayed as per PDF
> functionality, and there the link is shown as:
> http://www.abc%E2%88%92aaa.com%00%00
> If we click Allow, Internet Explorer opens with a DNS error as explained
> below, and the URL we get in the IE address bar is:
> http://www.abc-92aaa.com%00%00
> The hyphen in the address bar is noticeably wider than the normal one
> available on the keyboard. So instead of NON-BREAKING HYPHEN (e2 80 91), the
> character actually being printed is MINUS SIGN (e2 88 92) in the chart below,
> and possibly because of this, %00%00 gets appended at the end.
> If we replace this wider hyphen in the address bar with the normal keyboard
> hyphen, remove %00%00 from the end, and then click Refresh, the URL works
> fine without any issues.
> Please help me get this issue resolved; it would be a great help.
>
> Thanks and regards,
> Siddharth A Deshpande
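
The manual fix described above (swap the wide hyphen for a keyboard hyphen and drop the trailing %00%00) can be sketched in Python as follows. This is only a workaround for an already-broken link, assuming the damage is exactly what the message reports; it does not fix the PostScript output itself. The function name repair_url is hypothetical.

```python
from urllib.parse import unquote

MINUS_SIGN = "\u2212"  # UTF-8 bytes e2 88 92


def repair_url(broken):
    """Undo the two symptoms reported in the thread for a single URL."""
    decoded = unquote(broken)         # %E2%88%92 -> U+2212, %00 -> NUL
    decoded = decoded.rstrip("\x00")  # drop the trailing NUL padding
    return decoded.replace(MINUS_SIGN, "-")  # back to the keyboard hyphen


print(repair_url("http://www.abc%E2%88%92aaa.com%00%00"))
```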
>
> On Thu, Jan 8, 2009 at 9:06 PM, Samuel Ma <maluf...@gmail.com> wrote:
> > Hi,
> > From the Unicode website, we can see:
> > − e2 88 92 MINUS SIGN
> > ‐ e2 80 90 HYPHEN
> > ‑ e2 80 91 NON-BREAKING HYPHEN
>
> > If you replace %E2%88%92 with %E2%80%90 or %E2%80%91 in the address bar of
> > the browser and refresh, does it work?
> > Or if you simply replace %E2%88%92 with '-' (hyphen) and refresh, does it
> > work?
>
> > Regards
> > Samuel
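
The byte sequences listed above can be checked with Python's standard library. This is a small sketch that relies on urllib.parse.quote percent-encoding non-ASCII characters as UTF-8 by default; the variable name encoded is just for illustration.

```python
import unicodedata
from urllib.parse import quote

# The three characters from the list above, plus the ASCII keyboard hyphen.
candidates = ("\u2212", "\u2010", "\u2011", "-")

# Map each character's Unicode name to its percent-encoded UTF-8 form.
encoded = {unicodedata.name(ch): quote(ch) for ch in candidates}

for name, escaped in encoded.items():
    print(f"{name:<20} {escaped}")
```

The ASCII hyphen (HYPHEN-MINUS) is left as a plain '-' by quote, which is why only that character survives unchanged in a URL.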
>
> > On Fri, Jan 9, 2009 at 9:29 AM, Siddharth Deshpande <07.siddha...@gmail.com> wrote:
>
> >> Hi,
>
> >> Can anyone help me with the issue I am facing, described below?
>
> >> *Environment details:*
> >> *Database: Oracle 10g*
> >> *Oracle Applications Release: 11.5.10*
> >> *Database Charset: UTF8*
> >> *Report developer used: Oracle Forms/reports 6i.*
>
> >> Business has one URL which needs to be displayed in the report output, and
> >> this URL has a hyphen (-) in it, e.g. http://www.abc-aaa.com
> >> The report displaying this is registered with a Concurrent program whose
> >> output format is PostScript (PS), and this PS output is further
> >> redirected to a third-party PostScript-to-PDF converter, which converts the
> >> PostScript output to PDF. When we check the generated PDF and click on the
> >> URL printed in the report, the hyphen in the URL has been replaced by
> >> a percent-encoded Unicode value, e.g. the above link is then displayed as:
> >> http://www.abc%E2%88%92aaa.com%00%00
> >> So this becomes a garbage URL of no use. If we click Allow in the PDF and
> >> open this URL, it gives a DNS host error because it is an invalid URL. The
> >> observation is: when we check the hyphen in the above link in the browser
> >> address bar, it is not the normal hyphen (-) character available on the
> >> keyboard; it is noticeably wider than the normal hyphen (-) character.
>
> >> When we checked with Oracle, they said this could be a custom issue or
> >> an issue with the third-party tool that converts PostScript output to PDF.
> >> But when we checked the PostScript output by opening it in GhostScript
> >> Viewer, we could clearly see the issue there as well, i.e. the observation
> >> about the hyphen explained above. Oracle did not help further and told us
> >> to find a custom solution. They provided one link (
> >> http://www.utf8-chartable.de/unicode-utf8-table.pl?start=7936&number=...) which