If PS/PDF renders left to right and top to bottom, i.e. char by char,
and the rendering could be single-stepped, then the ASCII char
corresponding to the last-rendered char could be supplied manually,
thereby teaching the system the render-to-ASCII translation.
Presumably each render of the same char uses the same data,
so there are only about 100 char-renders, including italics
and 2 sizes?
The learning process would then write out what it has already been
taught and stop for input only when it meets a new char-render?
What's wrong with this idea?
== TIA.
Well, for starters, your original assumption is invalid. :)
It is *not* guaranteed that PS/pdf renders L->R and T->B.
A page is a page, and you can write stuff to it in any
position at _any_ time. Text written 'as strings' _is_
generally written in the native direction of the language
but even that is *not* guaranteed.
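To make the ordering point concrete, here is a minimal Python sketch.
It assumes you have somehow already captured each drawn fragment
together with its page position; the (x, y, text) tuples, the name
reading_order and the line_tolerance parameter are all hypothetical,
not part of any PS/PDF tool. The fragments arrive in drawing order,
and reading order has to be reconstructed from the coordinates:

```python
from typing import List, Tuple

# (x, y, text). In PostScript's default coordinate system y grows upward,
# so a larger y means higher on the page.
Fragment = Tuple[float, float, str]


def reading_order(fragments: List[Fragment], line_tolerance: float = 2.0) -> str:
    """Rebuild reading order from fragments drawn in arbitrary order.

    Fragments whose y values differ by at most line_tolerance (an
    assumed threshold) are treated as belonging to the same line.
    """
    lines = []  # each entry: [baseline_y, [(x, text), ...]]
    # Visit fragments from the top of the page to the bottom.
    for x, y, text in sorted(fragments, key=lambda f: -f[1]):
        if lines and abs(lines[-1][0] - y) <= line_tolerance:
            lines[-1][1].append((x, text))
        else:
            lines.append([y, [(x, text)]])
    # Within each line, order the fragments left to right.
    return "\n".join(
        " ".join(text for _, text in sorted(parts)) for _, parts in lines
    )


# Drawing order here is deliberately scrambled.
drawn = [(300, 700, "world"), (100, 650, "second"),
         (100, 700, "hello"), (200, 650, "line")]
print(reading_order(drawn))
```

Single-stepping the renderer would hand you the fragments in the
scrambled order of the list, not the order printed by the sketch.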
Another complication is the fact that the pdf file has undergone
at least three stages of processing in an attempt to optimize
the appearance of the text. LaTeX can be a front-end to TeX which
processes to (IIRC) dvi and then to ps. The PostScript files
produced by this chain often contain manually-kerned text where
each word is chopped into pieces to squeeze the little ells
closer together. It is occasionally possible to recover the text
(with loss of formatting and occasional bizarre artifacts from
bits of string that happened to be present in the file but were
not intended to be rendered).
But the pdf might not actually contain text in any recognizable
form (until rendered). It could contain a compressed image of
a scanned printout. No ASCII in sight!
--
lxt
If you're referring to "The Design of a Pretty-printing Library" by
John Hughes, which is the page I landed at when I followed your
link ... the PostScript file available for download there *DOES* have
recognizable English text in it:
1 0 bop 349 194 a Fx(The)22 b(Design)g(of)g(a)g(Prett)n(y-prin)n(ti)q
(ng)j(Library)821 339 y Fw(John)14 b(Hughes)528 426 y
Fv(Chalmers)g(T)m(eknisk)n(a)h(H\177)-19 b(ogsk)o(ola,)14
b(G\177)-19 b(oteb)q(org,)13 b(Sw)o(eden.)183 565 y Fu(1)56
b(In)n(tro)r(duction)183 670 y Fw(On)17 b(what)h(do)q(es)g(the)g(p)q(o)
o(w)o(er)g(of)f(functional)f(programming)e(dep)q(end?)19
which after running through ps2ascii yields:
The Design of a Pretty-printing Library
John Hughes Chalmers Tekniska H"ogskola, G"oteborg, Sweden.
1 Introduction On what does the power of functional programming
depend? Why are functional programs so often a fraction of the size of
equivalent programs in other languages? Why are they so easy to write?
I claim: because functional languages support software reuse extremely
well.
Programs are constructed by putting program components together. When
we discuss reuse, we should ask
Therefore I submit that (if you are indeed talking about this file) it
is your ps-to-pdf "distilling" process which re-encoded the ASCII
strings with new text encodings. This is something that I believe
Adobe Distiller often does, in the interest of producing a smaller
file, unless it is told not to.
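For what it's worth, the chopped-up strings in the excerpt above can be
glued back together with very little code. The following is only a rough
Python sketch, not how ps2ascii works. It rests on assumptions read off
the excerpt itself: operators such as b, e, f, g, h, j show up between
words, operators such as m, n, o, q, r show up inside words, a b
preceded by a negative number (as after H\177) is an accent overlay
rather than a space, and y starts a new line. Extending the word-break
set to all of c..k is an extrapolation. The names extract_text,
WORD_BREAK_OPS and LINE_BREAK_OPS are made up for illustration.

```python
import re

# Assumption inferred from the excerpt: these one-letter operators
# follow a string that ends a word (c..k is an extrapolation).
WORD_BREAK_OPS = set("cdefghijk")
# Assumption: 'y' shows a string and then moves to a new line.
LINE_BREAK_OPS = {"y"}

# A parenthesized string, a signed integer, or an operator/font name.
TOKEN = re.compile(
    r"\((?P<text>(?:\\.|[^()\\])*)\)|(?P<num>-?\d+)|(?P<op>[A-Za-z]+)"
)


def extract_text(ps: str) -> str:
    """Join the string pieces of a dvips-style PostScript fragment."""
    out = []
    last_num = None  # most recent number seen since the last string/operator
    for m in TOKEN.finditer(ps):
        if m.group("text") is not None:
            # Drop octal escapes such as \177 (an accent glyph here),
            # so accents are lost rather than transliterated.
            out.append(re.sub(r"\\[0-7]{1,3}", "", m.group("text")))
            last_num = None
        elif m.group("num") is not None:
            last_num = int(m.group("num"))
        else:
            op = m.group("op")
            if op in LINE_BREAK_OPS:
                out.append("\n")
            elif op == "b":
                # Treat 'N b' as a word space only when N is positive.
                if last_num is None or last_num > 0:
                    out.append(" ")
            elif op in WORD_BREAK_OPS:
                out.append(" ")
            # Everything else (font names, bop, a, m, n, o, q, ...) adds nothing.
            last_num = None
    return "".join(out)


sample = (r"1 0 bop 349 194 a Fx(The)22 b(Design)g(of)g(a)g(Prett)n(y-prin)"
          r"n(ti)q(ng)j(Library)821 339 y Fw(John)14 b(Hughes)528 426 y")
print(extract_text(sample))
```

On the title fragment this prints the title on one line and the author
on the next. It will of course fall over on files where the strings have
been re-encoded, which is the situation described above.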