Re: Wordwrapping Demo (5k) (was PS wordwrap)


Rob

Oct 9, 2006, 5:30:03 PM
In an attempt to do no evil, the Word Wrapping demo is cool, but has a
few problems.

The original is too old to reply to directly, but is still at
http://groups.google.com/group/comp.lang.postscript/browse_frm/thread/b4612ab6a1470b20/7b4a68a97e7d3e66?lnk=st&q=postscript+word+wrap+function&rnum=2&hl=en#7b4a68a97e7d3e66

The script had a bug. If there wasn't a space at the end of the line,
it'd drop the last word. The bug is caused by this:
/S { dup spacecount { toofar? { L show show } { show show } ifelse }
repeat pop } bind def

At the end, don't pop the last word off the stack; show it instead. Try
this:
/S { dup spacecount { toofar? { L show show } { show show } ifelse }
repeat show } bind def

But even more than that, there's no need to pre-count the spaces then
loop on the spaces. Try this:

% Is the word too long for the space on the line?
/toofar? {dup stringwidth pop currentpoint pop add RM gt} bind def
% Show a word, line wrap if needed
/showword {toofar? {L} if show} def
% Show text, line wrap if needed
% In a loop, break on spaces and print each word. If there isn't a
% space left, be done.
/S { {( ) search exch showword not {exit} if show} loop} def

Now there's no need for the /spacecount or /find functions. The
/showword function will wrap if needed, using /L to wrap (either page
or line). /S is now effectively a while loop: while a space was found.
It also handles the word after the last space too.

I've also chosen not to bind def showword or S since my L (line/page
break function) uses margins I may change during the course of the
document.
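
For anyone who wants to try the three procedures on their own, here is a
minimal, self-contained sketch. The margin values, the names LM, TOP and
LEAD, and this very simple L are assumptions made up for the example; only
RM, L, toofar?, showword and S come from the code above.

```postscript
%!PS
% Assumed page geometry (not from the original demo): Letter, 1-inch margins.
/LM 72 def        % left margin (hypothetical name)
/RM 540 def       % right margin, as used by toofar?
/TOP 720 def      % first baseline (hypothetical name)
/LEAD 14 def      % line spacing (hypothetical name)

% Assumed line break: return to the left margin, one line down.
/L { LM currentpoint exch pop LEAD sub moveto } def

% Is the word too long for the space left on the line?
/toofar? { dup stringwidth pop currentpoint pop add RM gt } bind def
% Show a word, wrapping first if needed.
/showword { toofar? { L } if show } def
% Split on spaces and show each word; stop when no space is left.
/S { { ( ) search exch showword not { exit } if show } loop } def

/Times-Roman findfont 12 scalefont setfont
LM TOP moveto
(This paragraph has no trailing space, and the last word still prints because the loop shows whatever is left when search fails) S
showpage
```

Note that the string passed to S does not end in a space; the final word
goes through showword on the pass where search returns false.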

Rob

DBW

Oct 10, 2006, 9:52:20 AM
In article <1160429403....@i3g2000cwc.googlegroups.com>, "Rob"
<erob...@yahoo.com> wrote:

> In an attempt to do no evil, the Word Wrapping demo is cool, but has a
> few problems.
>

> The script had a bug. If there wasn't a space at the end of the line,
> it'd drop the last word.

> But even more than that, there's no need to pre-count the spaces then
> loop on the spaces. Try this:
> % Is the word too long for the space on the line?
> /toofar? {dup stringwidth pop currentpoint pop add RM gt} bind def

Rob,
Thank you very much for your improvements, although please note that RM in
the above 'toofar' proc should be lower case 'rm'.

For those interested, I have posted a revised linewrapping PS file
incorporating the suggestions at:
www.cappella.demon.co.uk/resources/linewrap.ps/

The Tinydict Mark-up has grown over the last couple of years to include
auto-hyphenation and I find it indispensable for typesetting lengthy
books.

I can't help thinking that there must be a better solution than mine to
the problem of mixing styles and sizes of font on a fully justified line.
I suspect the answer lies in reading an entire paragraph (like TEX) before
breaking it into lines after word-division. Unfortunately TEX kerns the
width of each space on the same line with a different value so every word
is made a separate string and the PS file is therefore uneditable.
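
One way to keep each run of text as a single editable string on a fully
justified line is widthshow, which widens only the spaces. The sketch below
is an illustration under assumptions, not the Tinydict method: the names
countspaces and justifyruns, the [font string] run format, and the 300 pt
measure are all invented for the example.

```postscript
%!PS
% Count the spaces in a string.  Stack: string -> n
/countspaces { 0 exch { 32 eq { 1 add } if } forall } def

% Justify a line made of runs in different fonts.
% Stack: [ [font string] ... ] targetwidth -> -
/justifyruns {
  5 dict begin
  /target exch def
  /runs exch def
  /natural 0 def                 % total unstretched width
  /spaces 0 def                  % total number of spaces
  % Pass 1: measure every run in its own font.
  runs {
    aload pop exch setfont       % leaves the string on the stack
    dup stringwidth pop natural add /natural exch def
    countspaces spaces add /spaces exch def
  } forall
  % Extra width to add to each space (0 if the line has no spaces).
  /extra spaces 0 gt { target natural sub spaces div } { 0 } ifelse def
  % Pass 2: show every run, widening only character 32 (space).
  runs {
    aload pop exch setfont
    extra 0 32 4 -1 roll widthshow
  } forall
  end
} def

/R /Times-Roman findfont 12 scalefont def
/I /Times-Italic findfont 12 scalefont def
72 700 moveto
[ [R (A line that mixes roman and )] [I (italic type )] [R (in one measure.)] ]
300 justifyruns
showpage
```

Each run stays a readable string in the file, and the same extra space
width is used across all fonts on the line.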

DB-W

Chapman Flack

Oct 10, 2006, 11:19:09 PM
DBW wrote:
> I can't help thinking that there must be a better solution than mine
> to the problem of mixing styles and sizes of font on a fully justified
> line. I suspect the answer lies in reading an entire paragraph (like
> TEX) before

David,

Out of curiosity ... you've not mentioned just what aspect of your
solution you'd like to improve on; in what way does mixing styles/sizes
not now do what you want? (I've not looked at the code recently, so the
answer isn't jumping out at me.)

It strikes me that the question of working line-at-a-time or
paragraph-at-a-time is to a degree independent. IIRC the reason TeX
collects a whole paragraph is to do optimal breaking for the paragraph
as a whole; it can reconsider an earlier break if a lousy one later in
the paragraph would result, using a nifty but computationally demanding
dynamic-programming approach that I don't know if anybody has been brave
enough to try in PS. If anybody's PS markup dictionary does it I'd
guess it would be Don Lancaster's, but I don't even remember clearly
that he did. Meanwhile the line-at-a-time approach can give reasonable
results with a simple implementation, if not quite as beautiful as
TeX's, and I'm not sure why size/style changes should present more of
a problem there ... except for the matter of changing /vertical/
spacing to accommodate outsized items, as TeX does with math. Is that
what you were concerned about?

> breaking it into lines after word-division. Unfortunately TEX kerns
> the width of each space on the same line with a different value so
> every word is made a separate string and the PS file is therefore
> uneditable.

This sounds mostly like dvips not being smart enough to use xshow.
But it's true that writers of big smart application programs that
do all the computation and use PS as a printer output stream often
don't think much about how the resulting PS reads as a program,
and may not clearly see what benefits, for their purposes, could
lie in thinking more about that.
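
To make the xshow point concrete, here is a rough sketch of a whole line
kept as one string, with the per-character advances supplied in an array.
The helper name justxshow and the 2.5 pt extra space are made up for the
example, and this is not a claim about what dvips emits. xshow needs
Level 2.

```postscript
%!PS
% Show a string with xshow, adding `extra` points after every space.
% Stack: string extra -> -
/justxshow {
  3 dict begin
  /extra exch def
  /str exch def
  /ch 1 string def               % scratch string for measuring one character
  str
  [ str {
      dup ch 0 3 -1 roll put     % store the character code in ch
      ch stringwidth pop         % natural advance of that character
      exch 32 eq { extra add } if
    } forall ]
  xshow                          % one string, one array of advances
  end
} def

/Times-Roman findfont 12 scalefont setfont
72 700 moveto
(one editable string per line) 2.5 justxshow
showpage
```

A generator could write the string and the number array directly; the
procedure above only builds the array at run time to keep the example short.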

-Chap

Rob

Oct 31, 2006, 2:12:14 AM
David,

Good catch on RM. In my library, I set the margins as upper-case, then
compare currentpoint to them any time I want to know where I am. In my
line wrap, I also check to see if I'll go beyond the bottom, and page
break if necessary. That seemed like an easy thing to throw in there.
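
Roughly, a line break with that bottom-margin check can be sketched like
this. The names LM, TM, BM and LEAD and their values are assumptions for
the sketch; only RM and L appear earlier in the thread.

```postscript
%!PS
% Assumed upper-case margins and leading (hypothetical values).
/LM 72 def   /RM 540 def        % left and right margins
/TM 720 def  /BM 72 def         % top and bottom margins
/LEAD 14 def                    % line spacing

% Line break: move down one line, or start a new page if the next
% baseline would fall below the bottom margin.
/L {
  currentpoint exch pop LEAD sub      % y of the next baseline
  dup BM lt
    { pop showpage LM TM moveto }     % page break: restart at the top left
    { LM exch moveto }                % ordinary line break
  ifelse
} def

/Times-Roman findfont 12 scalefont setfont
LM TM moveto
(first line) show L
(second line) show
showpage
```

showpage leaves the current point undefined, so the moveto after it is
required; the current font survives the page break.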

Oh, by the way, using the super-spiffy toofar and showword functions,
you don't need to put a space at the end of each paragraph. I see the
note in the .ps file is gone, yet each paragraph still ends in the
dubious extra space. :P

I've noticed that the PostScript engine has to really haul to compute
the length of each word in comparison to the margin though. I've tried
to rough-compute this in awk to avoid hammering Ghostscript so hard.

> I can't help thinking that there must be a better solution than mine to
> the problem of mixing styles and sizes of font on a fully justified line.
> I suspect the answer lies in reading an entire paragraph (like TEX) before
> breaking it into lines after word-division. Unfortunately TEX kerns the
> width of each space on the same line with a different value so every word
> is made a separate string and the PS file is therefore uneditable.

Um, pardon the tongue-in-cheek reference, but for all but the most
obscure jobs, I'd recommend QuarkXPress, Microsoft Word (with generous
use of styles), Adobe Acrobat, etc. For truly polished work, I like
to use a tool that can pre-think hyphenation, kerning, rivers,
paragraph orphan control, and all the other things I'm forgetting based
on a spelling and grammar dictionary, more graphically defined rules,
TrueType fonts, etc, etc. As soon as I get this intense, PostScript
doesn't seem the place for it.

Rob

DBW

Nov 4, 2006, 4:53:53 PM
In article <1162278734.0...@m7g2000cwm.googlegroups.com>, "Rob"
<erob...@yahoo.com> wrote:

> David,
>
> Good catch on RM. In my library, I set the margins as upper-case, then
> compare currentpoint to them any time I want to know where I am. In my
> line wrap, I also check to see if I'll go beyond the bottom, and page
> break if necessary. That seemed like an easy thing to throw in there.

In the Tinydict the pagebreaks are an automatic jump, unless specified
with an 'nj no-jump for chapter headings, etc., so that a different choice
of page format, font, size, or linespacing will repaginate the entire file
without any mousework.

<snipped>


> Um, pardon the tongue-in-cheek reference, but for all but the most
> obscure jobs, I'd recommend QuarkXPress, Microsoft Word (with generous
> use of styles), Adobe Acrobat, etc. For truly polished work, I like
> to use a tool that can pre-think hyphenation, kerning, rivers,
> paragraph orphan control, and all the other things I'm forgetting based
> on a spelling and grammar dictionary, more graphically defined rules,
> TrueType fonts, etc, etc. As soon as I get this intense, PostScript
> doesn't seem the place for it.
>

The Tinydict was designed for four reasons.
1. Using simple codes to mark-up ASCII script into a typeset archive which
remains editable for PostScript printing or PDF conversion irrespective of
operating systems or commercial software.

2. To format up to 500 book pages into a single file for laser printing on
A4 machines as four page folios.

3. To provide in a PostScript dictionary of under 100k the most frequently
used typesetting and picture importing facilities of commercial DTP.

4. It was also my attempt to emulate Bill Bates who compiled in 1985 (in
machine code) JustText, the first professional PostScript typesetting
program (110k).
Unfortunately, JustText was constrained by the 35k file lengths of early
text editors and couldn't place imported fonts accurately.

The current Tinydict with full auto-hyphenation is 62k.

DB-W
