I'm trying to get to grips with Urdu fonts. We've got standard Arabic
working in our software, but our Pakistani partners say they want the
Urdu to display in the Nastaliq style - they expressed a preference
for the font Jameel Noori Nastaleeq.
Has anyone here got any experience with this? I need to know how this
font is organised: it seems to consist of thousands of characters with
no Unicode mapping, but it's useless unless there's some way of
mapping from Unicode characters to these glyphs - I guess they're
basically the same as standard ligatures. Is this information included
in the font somewhere? I guess it must be otherwise other software
wouldn't be able to use it.
Any help would be great!
> I'm trying to get to grips with Urdu fonts. We've got standard Arabic
> working in our software, but our Pakistani partners say they want the
> Urdu to display in the Nastaliq style - they expressed a preference
> for the font Jameel Noori Nastaleeq.
I'm not familiar with the font, but I did once work with Nastaliq, some
years ago now.
> Has anyone here got any experience with this? I need to know how this
> font is organised: it seems to consist of thousands of characters with
> no Unicode mapping, but it's useless unless there's some way of
> mapping from Unicode characters to these glyphs
My experience predates Unicode. Assuming this is the same font style,
there are some significant differences between regular Arabic and
Nastaliq.
To start with there are no word spaces (or any other kind of spaces) in
Nastaliq. Each alphabetic character has up to three forms; initial,
medial and terminal. You know you've reached the end of the word when
you get the terminal form.
Additionally, Nastaliq is cursive, like handwriting rather than printing.
So instead of alphabetic characters, what you used to get was more akin
to a Far Eastern font: each word is a single glyph.
Justification is a problem, because you can't subtly increase the
spacing. Instead the kashida glyph is used. This is a horizontal bar at
the height of the medial forms. Words are actually broken into two
parts, and a kashida can be inserted between them; more spacing requires
more kashidas. Effectively the words are stretched.
These two facts may explain why you have thousands of glyphs, rather
than the dozens more normally found in an alphabetic language like
English.
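As a rough illustration of the kashida stretching described above, here is a minimal Python sketch. It assumes the kashida is represented by the Unicode tatweel character (U+0640), that each word has one known break point, and that all widths are in the same arbitrary units. The function names and the Latin placeholder word are hypothetical, and this is not how any particular typesetting system did it.

```python
TATWEEL = "\u0640"  # ARABIC TATWEEL, the Unicode kashida character


def kashidas_needed(shortfall, kashida_width):
    """Number of whole kashidas that fit into the missing line width."""
    if kashida_width <= 0:
        raise ValueError("kashida_width must be positive")
    return max(0, int(shortfall // kashida_width))


def distribute(count, slots):
    """Spread `count` kashidas over `slots` words as evenly as possible."""
    if slots <= 0:
        raise ValueError("need at least one word to stretch")
    base, extra = divmod(count, slots)
    return [base + (1 if i < extra else 0) for i in range(slots)]


def stretch_word(word, break_at, n):
    """Insert `n` kashidas at the (assumed) break point inside `word`."""
    if not 0 < break_at < len(word):
        raise ValueError("break point must fall inside the word")
    return word[:break_at] + TATWEEL * n + word[break_at:]
```

For example, a line that is 10 units short with a 3-unit kashida needs `kashidas_needed(10, 3)` kashidas, which `distribute` then shares out between the stretchable words on the line.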
> basically the same as standard ligatures. Is this information included
> in the font somewhere? I guess it must be otherwise other software
> wouldn't be able to use it.
Is this a TrueType or PostScript font? I doubt there are Unicode code
points for every possible word; more likely there are tables which
convert multiple Unicode code points into sequences of TrueType glyph
IDs, or PostScript CIDs.
But as I said, my last experience with this subject predates Unicode
adoption, so things may be different now.
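To make the idea of such tables concrete, here is a minimal Python sketch of a two-step lookup: a character map turns each code point into a glyph ID, and a ligature table then replaces runs of glyph IDs with a single ligature glyph, longest match first. This mirrors in spirit what OpenType does with its `cmap` and `GSUB` tables, but the glyph IDs and table contents below are invented for illustration and do not come from any real font.

```python
# Hypothetical character map: Unicode code point -> glyph ID.
CMAP = {0x0644: 10, 0x0627: 11, 0x0628: 12}  # lam, alef, beh

# Hypothetical ligature table: sequence of glyph IDs -> ligature glyph ID.
LIGATURES = {
    (10, 11): 900,      # lam + alef
    (10, 11, 10): 901,  # lam + alef + lam (made up for the example)
}


def shape(text):
    """Map text to glyph IDs, then apply ligatures greedily (longest first)."""
    glyphs = [CMAP[ord(ch)] for ch in text]
    max_len = max(len(key) for key in LIGATURES)
    out = []
    i = 0
    while i < len(glyphs):
        # Try the longest candidate run first, down to length 2.
        for n in range(min(max_len, len(glyphs) - i), 1, -1):
            key = tuple(glyphs[i:i + n])
            if key in LIGATURES:
                out.append(LIGATURES[key])
                i += n
                break
        else:
            # No ligature starts here: keep the plain glyph.
            out.append(glyphs[i])
            i += 1
    return out
```

A font with thousands of whole-word or part-word ligatures would simply have a very large table of this kind, with no Unicode code point assigned to the ligature glyphs themselves.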
Ken
~Tom
"Danny" <dra...@well-spring.co.uk> wrote in message
news:6e87b11a-e79c-449a...@w9g2000yqa.googlegroups.com...
Thanks for your thoughts, Ken.
Most of the issues you describe are true of Arabic text in general
(the cursive rules for initial/medial/final). What seems to be
different here is that as well as changing shape, the letters change
size and position, so that, as you say, the end result is that the
whole word has to be displayed as a single character.
I wasn't aware that they display without word breaks, though:
> To start with there are no word spaces (or any other kind of spaces) in
> Nastaliq. Each alphabetic character has up to three forms; initial,
> medial and terminal. You know you've reached the end of the word when
> you get the terminal form.
So how does this work in terms of logical characters? I understand
that in handwriting we can say 'that was a terminal, so it must be the
end of the word', but on a computer the process is reversed: 'that's
the end of the word, so I need to put the terminal form' - there are
ArabicJoining rules in the Unicode standard to deal with this. When
someone's typing in Urdu, I'm guessing that they'll have to *type* a
space, and then this has to be converted logically into a word break
without a space - does this make sense?
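As a rough illustration of the joining rules mentioned above, here is a simplified Python sketch that picks a contextual form for each letter from its neighbours. It assumes only three joining types (dual-joining "D", right-joining "R", and non-joining "U" for everything else, including the space), uses a tiny hand-written table rather than the full Unicode data, and ignores transparent marks and joiner control characters. Unicode also distinguishes an isolated form, which the sketch reports for letters that join on neither side.

```python
# Tiny hand-written joining table (an assumption, not the full Unicode data).
JOINING = {
    "\u0628": "D",  # beh
    "\u062A": "D",  # teh
    "\u0644": "D",  # lam
    "\u06A9": "D",  # keheh
    "\u0627": "R",  # alef
    "\u0631": "R",  # reh
}


def joining_type(ch):
    """Return 'D', 'R', or 'U' (non-joining, e.g. a space)."""
    return JOINING.get(ch, "U")


def contextual_forms(text):
    """Pick isolated/initial/medial/final for each character in logical order."""
    forms = []
    for i, ch in enumerate(text):
        cur = joining_type(ch)
        prev = joining_type(text[i - 1]) if i > 0 else "U"
        nxt = joining_type(text[i + 1]) if i + 1 < len(text) else "U"
        # A letter connects backwards only if the previous letter joins forward.
        joins_prev = prev == "D" and cur in ("D", "R")
        # Only dual-joining letters connect forward to the next letter.
        joins_next = cur == "D" and nxt in ("D", "R")
        if joins_prev and joins_next:
            forms.append("medial")
        elif joins_prev:
            forms.append("final")
        elif joins_next:
            forms.append("initial")
        else:
            forms.append("isolated")
    return forms
```

In this model the typed space is what ends the word: the letter before it gets a final (or isolated) form because the space is non-joining. Whether the space is then drawn with any width is a separate rendering decision.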
> Justification is a problem, because you can't subtly increase the
> spacing. Instead the kashida glyph is used. This is a horizontal bar at
> the height of the medial forms.
Hold on a second - does this mean I've misunderstood the process? Are
all medial forms written at the same height, with the initial forms on
a higher line and the terminal on a lower? It feels like this ought to
make it much easier.
> Words are actually broken into two
> parts, and a kashida can be inserted between them, more spacing requires
> more kashidas. Effectively the words are stretched.
Ugh. (I mean this technically rather than aesthetically - actually, so
far I think this may be the most beautiful typographic system I've
encountered)
> Is this a TrueType or PostScript font ?
It's TrueType (possibly OpenType, although I'm not entirely sure of
the difference)
> I doubt there are Unicode code
> points for every possible word, more likely there are tables which
> convert multiple Unicode code points into sequences of TrueType glyph
> IDs, or PostScript CIDs.
This is what I was suspecting. As far as I can see this means that
there's no way I can use it: I'm working in Flash which uses Unicode,
and as far as I'm aware I don't have access to any glyphs outside the
Unicode points.
Thanks again, this is really helpful
Best
Danny
When I was working on this, about 20 years back, we used two techniques.
The first was a dictionary lookup. The only words that could be written in
the pleasing Nastaliq form were those for which a word had been digitised.
Since we knew the characters forming all those words, we knew when we
reached the end of a word. Of course, typing the next character might
lengthen the word, in which case we modified the preceding character from
terminal to medial.
Words which we had no Nastaliq form for were handled by spelling in
regular Arabic, for this the keyboard had *lots* of letters. These were
custom keyboards and it was possible to add all the forms.
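The dictionary technique described above can be sketched in a few lines of Python. This is only a guess at the general shape of such a lookup, not a reconstruction of the actual system: the word list, the glyph IDs, and the Latin transliterations standing in for Urdu words are all hypothetical. The input is an unspaced stream of characters; the longest dictionary word starting at each position wins, and anything not in the dictionary falls back to letter-by-letter output.

```python
# Hypothetical dictionary: digitised word -> whole-word glyph ID.
# Latin transliterations stand in for the real Urdu spellings.
WORD_GLYPHS = {"kitab": 5001, "kitaben": 5002, "qalam": 5003}


def segment(stream):
    """Split an unspaced character stream into word glyphs and single letters."""
    max_len = max(len(word) for word in WORD_GLYPHS)
    out = []
    i = 0
    while i < len(stream):
        # Longest match first, so a longer word beats its own prefix.
        for n in range(min(max_len, len(stream) - i), 0, -1):
            candidate = stream[i:i + n]
            if candidate in WORD_GLYPHS:
                out.append(("word", WORD_GLYPHS[candidate]))
                i += n
                break
        else:
            # Not a digitised word: fall back to spelling it out.
            out.append(("letter", stream[i]))
            i += 1
    return out
```

Re-running `segment` after each keystroke reproduces the "lengthening" effect: `segment("kitab")` yields one word glyph, and typing two more letters to get `"kitaben"` replaces it with a different, longer one.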
> > Justification is a problem, because you can't subtly increase the
> > spacing. Instead the kashida glyph is used. This is a horizontal bar at
> > the height of the medial forms.
>
> Hold on a second - does this mean I've misunderstood the process? Are
> all medial forms written at the same height, with the initial forms on
> a higher line and the terminal on a lower? It feels like this ought to
> make it much easier.
This is how it was done in the bad old days, before desktop publishing.
Or at least that's how it was done with the Monotype system. There were
no standards for any of this at the time.
My understanding is that native speakers prefer the 'look' when a single
calligrapher draws the entire font, because it is more uniform. So the
font we used was drawn by a single calligrapher who took extensive care
to make sure that the medial forms in the middle of words could be
broken at a consistent vertical height, for kashida insertion.
> Ugh. (I mean this technically rather than aesthetically - actually, so
> far I think this may be the most beautiful typographic system I've
> encountered)
:-) Technically speaking this was the most challenging 'language' we
worked on, and we dealt with quite a few. Harder than Chinese or
Japanese. (adding to the challenge was the fact that this was a mixed
language typesetting package, English & Nastaliq, and the reading
directions are opposed...)
> It's TrueType (possibly OpenType, although I'm not entirely sure of
> the difference)
TrueType always uses TrueType operators to draw the outline. OpenType
may use TrueType or it may use Adobe's Compact Font Format, which is
closely related to the PostScript Type 1 font operators.
From a user's perspective, there is no real difference.
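If you want to check which kind of outlines a given font file carries, the first four bytes of the file are enough. The following Python sketch inspects that version tag; the function name is made up for this example, and it only looks at the tag, so it says nothing about which layout tables the font contains.

```python
def outline_flavour(header):
    """Classify a font from the first four bytes of the file."""
    tag = header[:4]
    if tag in (b"\x00\x01\x00\x00", b"true"):
        return "TrueType outlines"
    if tag == b"OTTO":
        return "CFF (PostScript-style) outlines"
    if tag == b"ttcf":
        return "font collection"
    return "unknown"
```

Typical usage would be to open the font in binary mode and pass in the result of reading its first four bytes, for example `outline_flavour(open(path, "rb").read(4))`, where `path` is wherever the font file lives on your machine.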
> Thanks again, this is really helpful
Sorry I can't help more, it's been a long time since I did this work....
Ken
> From a user's perspective, there is no real difference.
I'm finding this entire discussion fascinating. Thanks to everyone.
jc
Hi, we are the testers of this Jameel Nastaleeq font. It consists of
25,000+ Noori Nastaliq ligatures. Also, the whole font is completely
OpenType, so it can be used on any platform. The latest version can be
downloaded here:
http://www.fileden.com/files/2008/12/25/2238030/Jameel%20Noori%20Nastaleeq%202.rar
Yeah, it has 25,000+ ligatures and it is fully OpenType!