Uniscribe + ScriptItemize/ScriptTextOut speed problems

James Brown

Jan 31, 2006, 3:17:34 PM

All,

I have a fully working Uniscribe wrapper which renders a line of Unicode
text using the low-level ScriptItemize/Layout/Shape/Place/TextOut calls.
It's working pretty well (very well in fact), but there is still one area
I am not happy with. For a regular string of "English" text (i.e.
non-complex), ScriptItemize always breaks the string into individual
words. For a long line of text containing much white-space and
punctuation, this can result in quite a number of SCRIPT_ITEMs being
returned.

This results in a large number of calls to ScriptTextOut to render the
text, which is where the problem is. Because I am required to call
ScriptTextOut for each "item-run" in the text, this is a fairly slow mode
of operation: a lot slower than calling ExtTextOut for the whole line,
for example. It's not that ScriptTextOut itself is slow; it is just the
sheer number of calls to the OS that is causing the problem.

So my idea is as follows:

After shaping, all of the returned glyph data for every item-run in the
string is stored consecutively in a large buffer. Ordinarily I isolate
each run in this buffer and draw the runs individually with ScriptTextOut.

However, for a "simple" string of text (i.e. one that ScriptIsComplex
recognizes as such), I am proposing to pass the entire buffer of
glyphs/widths etc. to ScriptTextOut in one go. So even if there were 30
runs of text, I would just treat this as one run and call ScriptTextOut
just once: in essence, recombining all script-items into one single unit.
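
The recombining step described above can be sketched in portable C. This is only an illustration under stated assumptions: GlyphRun, DrawFn, draw_runs and count_glyphs are hypothetical names, the callback stands in for ScriptTextOut, and the sketch does not settle which SCRIPT_ANALYSIS the real call should receive.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bookkeeping for one shaped item-run: a slice of the shared
   glyph buffer. These are illustrative names, not Uniscribe types. */
typedef struct GlyphRun {
    int first_glyph;   /* index of the run's first glyph in the buffer */
    int glyph_count;   /* number of glyphs the run produced */
} GlyphRun;

/* Stand-in for the real drawing call (ScriptTextOut in the post). */
typedef void (*DrawFn)(const unsigned short *glyphs, const int *advances,
                       int count, void *ctx);

/* Draws every run. With merge == 0 each run gets its own draw call; with
   merge != 0, runs that sit back to back in the buffer are drawn as one
   span. Returns the number of draw calls issued. */
static int draw_runs(const GlyphRun *runs, int run_count,
                     const unsigned short *glyphs, const int *advances,
                     int merge, DrawFn draw, void *ctx)
{
    int calls = 0;
    int i = 0;

    assert(run_count == 0 || (runs && glyphs && advances && draw));

    while (i < run_count) {
        int first = runs[i].first_glyph;
        int count = runs[i].glyph_count;
        i++;

        /* Extend the span while the next run starts exactly where the
           current span ends. */
        while (merge && i < run_count &&
               runs[i].first_glyph == first + count) {
            count += runs[i].glyph_count;
            i++;
        }

        if (count > 0) {
            draw(glyphs + first, advances + first, count, ctx);
            calls++;
        }
    }
    return calls;
}

/* Example callback that only tallies glyphs, so the sketch runs without
   a device context. */
static void count_glyphs(const unsigned short *glyphs, const int *advances,
                         int count, void *ctx)
{
    (void)glyphs;
    (void)advances;
    *(int *)ctx += count;
}
```

With three contiguous runs (a word, a space, a word), the unmerged path issues three draw calls and the merged path issues one, while the same glyphs are passed either way.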

Assuming for the moment that I am using just one font, does anyone see
any problem with this approach? The only issue I can see is specifying a
correct SCRIPT_ANALYSIS structure (there is a unique structure per run, so
which would I specify?).

I have seen hints that maybe ScriptTextOut performs some trickery prior to
calling ExtTextOut (for complex scripts) and that combining runs prior to
calling it would be bad... but for regular English text (code-points < 255,
for example) would this be OK?
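
The "code-points < 255" guard could look something like the sketch below. The function name and the exact bound (every code unit below 0x100) are assumptions made for illustration; this is a conservative pre-check in the spirit of the question, not a description of what ScriptIsComplex tests internally.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Returns 1 when every code unit in text[0..length) is below 0x100,
   and 0 otherwise. An empty string counts as simple. */
static int all_below_0x100(const wchar_t *text, size_t length)
{
    size_t i;

    assert(text != NULL || length == 0);

    for (i = 0; i < length; i++) {
        /* The cast keeps the comparison safe where wchar_t is signed. */
        if ((unsigned long)text[i] >= 0x100UL)
            return 0;
    }
    return 1;
}
```

A line would only be considered for the single-call path when this returns 1; anything else would stay on the per-run path.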

I have tested this method, and it does seem to work, and it is *much*
faster this way. It would be nice for a Microsoft Uniscribe/typography rep
to comment on this approach.

Thanks
James

www.catch22.net
Free Win32 Source and Tutorials

Michael (michka) Kaplan [MS]

Jan 31, 2006, 11:13:10 PM

Try calling ScriptIsComplex and, when the text is not complex, avoiding
further calls to Uniscribe.

--
MichKa [Microsoft]
NLS Collation/Locale/Keyboard Technical Lead
Globalization Infrastructure, Fonts, and Tools
Blog: http://blogs.msdn.com/michkap

This posting is provided "AS IS" with
no warranties, and confers no rights.

"James Brown" <not@home> wrote in message
news:oN6dneH9_sN...@pipex.net...

Michael (michka) Kaplan [MS]

Jan 31, 2006, 11:14:40 PM

Try calling ScriptIsComplex and, when the text is not complex, avoiding
further calls to Uniscribe. This is precisely what lpk.dll does when
Uniscribe is being called indirectly...
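
The shape of that check can be sketched in portable C as below. All names here (TextHooks, DrawPath, draw_line and the stub_ functions) are hypothetical. On Windows the is_complex hook would presumably wrap ScriptIsComplex, which returns S_OK when the string contains complex-script characters and S_FALSE when it does not; draw_plain would be a single ExtTextOut call and draw_shaped the existing Uniscribe path.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical hooks standing in for the platform calls. */
typedef struct TextHooks {
    int  (*is_complex)(const wchar_t *text, int length);
    void (*draw_plain)(const wchar_t *text, int length, void *ctx);
    void (*draw_shaped)(const wchar_t *text, int length, void *ctx);
} TextHooks;

typedef enum DrawPath { PATH_NONE, PATH_PLAIN, PATH_SHAPED } DrawPath;

/* Chooses the rendering path for one line and reports which one ran. */
static DrawPath draw_line(const TextHooks *hooks, const wchar_t *text,
                          int length, void *ctx)
{
    assert(hooks != NULL && text != NULL);

    if (length <= 0)
        return PATH_NONE;              /* nothing to draw */

    if (hooks->is_complex(text, length)) {
        hooks->draw_shaped(text, length, ctx);
        return PATH_SHAPED;
    }

    hooks->draw_plain(text, length, ctx);
    return PATH_PLAIN;
}

/* Fixed-answer stubs so the sketch runs without a device context. */
static int stub_always_simple(const wchar_t *text, int length)
{
    (void)text; (void)length;
    return 0;
}

static int stub_always_complex(const wchar_t *text, int length)
{
    (void)text; (void)length;
    return 1;
}

/* Counts draw calls instead of drawing. */
static void stub_draw(const wchar_t *text, int length, void *ctx)
{
    (void)text; (void)length;
    ++*(int *)ctx;
}
```

The cost of this design is the one raised in the reply that follows: two rendering paths have to be kept consistent for colouring, wrapping and caret placement.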

--
MichKa [Microsoft]
NLS Collation/Locale/Keyboard Technical Lead
Globalization Infrastructure, Fonts, and Tools
Blog: http://blogs.msdn.com/michkap

This posting is provided "AS IS" with
no warranties, and confers no rights.

"James Brown" <not@home> wrote in message
news:oN6dneH9_sN...@pipex.net...

James Brown

Feb 1, 2006, 5:19:22 AM

Although this does seem a very sensible thing to do, I wanted to avoid
having two text-rendering 'engines' if at all possible. I have to break up
the string to apply text-colouring and other attributes anyway, and the
advantages of using a single Uniscribe code-base are numerous: word
wrapping, cursor placement etc. are all handled for me no matter if the
script is complex or otherwise. In actual fact, moving to Uniscribe has
produced fantastic results for me; it's just a shame that for certain
text files with a lot of mixed punctuation / separate letters, the output
of ScriptItemize results in very fine-grained item-runs.

I was just looking at the results of ScriptItemize on a Thai string of
text, which included spaces between one group of characters. Strangely,
ScriptItemize did not break the string up into runs (there was just one
run for the entire line). However, an English string of text with spaces
in did get itemized into words/spaces etc. Strange.

thanks,

James
www.catch22.net
Free Win32 Source and Tutorials

"Michael (michka) Kaplan [MS]" <mic...@microsoft.online.com> wrote in
message news:eA2p6XuJ...@TK2MSFTNGP11.phx.gbl...