Compare TeX/LaTeX with FOP?

Peter Davis

Apr 24, 2012, 2:57:28 PM

I've been trying to find comparisons of TeX/LaTeX with Apache FOP as
tools for composing and typesetting document pages. Google turns up very
little that seems relevant, and most of that compares DocBook, rather
than FOP itself.

Does anyone have some relevant information? I'm trying to make a case
for why we should (or should not) use TeX/LaTeX for automated PDF
creation. Features like tables of contents, indices, equations, etc. would
be very useful. Also, the ability to put intra- and inter-document
links in the PDF would be helpful.

Thank you.
-pd

--
----
The Tech Curmudgeon
http://www.techcurmudgeon.com

Peter Flynn

Apr 24, 2012, 5:12:59 PM

On 24/04/12 19:57, Peter Davis wrote:
> I've been trying to find comparisons of TeX/LaTeX with Apache FOP as
> tools for composing and typesetting document pages. Google turns up
> very little that seems relevant, and most of that compares DocBook,
> rather than FOP itself.

A lot of people do seem to confuse the markup of the source document
with the mechanism (XSL/T/FO/LaTeX) used to transform the document into
printable or viewable form.

There are two main ways to do this from XML to PDF, both of which use an
intermediate format (LaTeX or FO):

1. XML doc --> XSLT program --> LaTeX doc --> LaTeX --> PDF

2. XML doc --> XSLT:FO program --> FO doc --> FOP* --> PDF

* Not just FOP: there are several commercial FO processors as well.

> Does anyone have some relevant information? I'm trying to make a
> case for why we should (or should not) use TeX/LaTeX for automated
> PDF creation.

Unattended operation means guarding your workflow from bogus markup and
bogus characters, both in the XML source and in any generated
intermediate format.

XML is often created by people or systems using low-quality software
which allows invalid or non-well-formed documents to be output. A
non-well-formed document simply can't be processed (at all): it must be
sent back to the creator with a request to do it properly, or patched up
before it can be used.
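
A minimal sketch of such a gate at the front of an unattended workflow, assuming Python's standard-library parser is good enough for the check. The function name is hypothetical, and this tests only well-formedness, not validity against a DTD or schema.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return (True, None) if xml_text parses, else (False, error message)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        # The message includes line and column, useful when sending the
        # document back to its creator.
        return False, str(err)
    return True, None
```

A document that fails this check would be rejected or queued for repair before any XSLT or LaTeX step runs.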

LaTeX may gag on its 10 special characters (# $ % & ~ _ ^ \ { }), plus
< | > if they occur outside math mode. Any system creating LaTeX must
therefore guarantee that raw special characters (perfectly innocuous in
XML apart from < and &) never find their way into the LaTeX source
unescaped. Unlike XML, LaTeX may also have problems with some Unicode
characters and require manual fixes before the document will compile.
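
One way to give that guarantee is to escape every text node before it is written out. The following is a sketch only: the function name is hypothetical, the replacements are the usual text-mode commands, and a real workflow would also need a policy for Unicode characters the chosen engine cannot handle.

```python
# Map each LaTeX special character to a text-mode replacement.
# str.translate makes a single pass, so the braces introduced by
# \textbackslash{} are not escaped a second time.
_LATEX_ESCAPES = str.maketrans({
    "\\": r"\textbackslash{}",
    "{": r"\{",
    "}": r"\}",
    "$": r"\$",
    "&": r"\&",
    "#": r"\#",
    "%": r"\%",
    "_": r"\_",
    "^": r"\textasciicircum{}",
    "~": r"\textasciitilde{}",
    "<": r"\textless{}",
    ">": r"\textgreater{}",
    "|": r"\textbar{}",
})


def escape_latex(text):
    """Escape characters that LaTeX treats specially in running text."""
    return text.translate(_LATEX_ESCAPES)
```

This is meant for running text only; content destined for math mode or verbatim environments would need different handling.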

> Features like tables of contents, indices, equations, etc. are very
> much useful.

ToC:

LaTeX is a strictly sequential processor, so it uses a two-pass process,
gathering the ToC data on the first pass and using it on the second.

XSL (both XSLT and XSL:FO) can "walk around" the document, picking
information from here and there, as well as processing it in order, so
it can create a ToC up front simply by looking through the document to
find and collate the section headings before it starts on the normal
body of processing.
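
The same "look through the whole document first" idea, sketched in Python rather than XSLT for brevity. The element names section and title are assumptions (DocBook-like), and collect_toc is a hypothetical helper, not part of any stylesheet mentioned here.

```python
import xml.etree.ElementTree as ET


def collect_toc(parent, depth=1):
    """Return [(depth, title), ...] for all nested <section> elements."""
    entries = []
    for section in parent.findall("section"):
        title = section.findtext("title", default="(untitled)")
        entries.append((depth, title))
        # Recurse so subsections follow their parent in document order.
        entries.extend(collect_toc(section, depth + 1))
    return entries
```

Because the whole tree is in memory, the list can be produced before any body content is emitted, which is why no second pass is needed on this route.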

Indexes (which is what I think you mean by "indices") -- same applies:

LaTeX gathers it first, makeindex sorts and collates it, and the second
pass of LaTeX prints it.

XSL, as before, can "look back" once it gets to the end of the document,
and pick out all the index marks, sort and collate them, and then output
them.

In both the above, LaTeX already has robust and well-established
mechanisms for ToCs (and LoFs and LoTs) and indexing and bibliographic
referencing and cross-referencing. In XSL, you pretty much need to write
your own routines from scratch, unless you are using a well-known
document type like DocBook or TEI, for which XSL code is readily available.

Equations: it depends on the markup; let us assume MathML:

I don't know anyone who uses semantic MathML or has ever tried to
convert it to anything (either FO or LaTeX), so I can't help there.

Presentation MathML is more tractable, but because a math expression can
be arbitrarily complex, any XSL code to handle it must be capable of
coping with arbitrary complexity. This is hard to write comprehensively,
so it's often done by limiting the code to handle just the math that is
used in the document or group of documents being processed. Here (I
think) is the MathML for E=mc² (this particular form is Content, i.e.
semantic, markup rather than Presentation markup):

<m:apply xmlns:m="http://www.w3.org/1998/Math/MathML">
  <m:eq/>
  <m:ci>E</m:ci>
  <m:apply>
    <m:times/>
    <m:ci>m</m:ci>
    <m:apply>
      <m:power/>
      <m:ci>c</m:ci>
      <m:cn base="10">2</m:cn>
    </m:apply>
  </m:apply>
</m:apply>

The twist is that this form of MathML (Content markup) uses prefix
notation (the = precedes the Emc², and the × precedes the mc, and the
power precedes the c²). David Carlisle has written excellent XSL code to
handle this.
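
To show what handling prefix notation involves, here is a rough Python sketch (not David Carlisle's XSL code) that turns the small subset of MathML used in the E=mc² example into LaTeX. It knows only eq, times and power, adds no parentheses, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "{http://www.w3.org/1998/Math/MathML}"


def to_latex(node):
    """Convert a tiny subset of prefix-notation MathML to infix LaTeX."""
    tag = node.tag.replace(MATHML_NS, "")
    if tag in ("ci", "cn"):
        return (node.text or "").strip()
    if tag != "apply":
        raise ValueError(f"unsupported element: {tag}")
    # In prefix notation the first child is the operator,
    # and the remaining children are its operands.
    operator, *operands = list(node)
    op = operator.tag.replace(MATHML_NS, "")
    args = [to_latex(child) for child in operands]
    if op == "eq":
        return " = ".join(args)
    if op == "times":
        return " ".join(args)
    if op == "power":
        base, exponent = args
        return f"{base}^{{{exponent}}}"
    raise ValueError(f"unsupported operator: {op}")
```

Every operator added to the subset needs its own rule for precedence and bracketing, which is where the "arbitrary complexity" problem comes from.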

> Also, the ability to put intra- and inter-document links
> in the PDF would be helpful.

Cross-references are done the same way in LaTeX and XML: LaTeX uses
\label{foo} and \ref{foo}; XML uses xml:id="foo" and xxx="foo" (where
xxx is an attribute of type IDREF or IDREFS). On the LaTeX route, the
XSLT has to emit the relevant \label and \ref commands for
cross-references, and \href{...} for hypertext links, and make sure the
hyperref package is loaded. XSL:FO has its own facilities to create
internal and external link sources and targets.

HOWEVER...

XML-XSLT-LaTeX-PDF

PRO: there is one inestimable advantage: the built-in features (eg
footnoting, sectioning, floating, environments, etc) and the 4000+
packages of LaTeX. Most of the transformations I have done have
benefitted from being able to solve virtually all the formatting
requirements simply by adding in the relevant packages; typeset quality
far exceeds that of most FO processors.

CON: will gag on unheralded special characters and unknown Unicode
characters; usually generates warnings and errors which need manual
attention; extra font installation needs expert attention; very few
commercial systems using the XSLT-LaTeX route (possibly because they are
unaware of it)

XML-XSL:FO-FOP-PDF

PRO: generally does not break on unusual characters; can probably use
all system-installed fonts as-is. There are MANY commercial PDF
production systems based on the FO route (possibly because all the big
vendors do this for Windows first).

CON: FOP is incomplete: for a complete implementation you need to buy a
commercial FO processor; unless you can use the prewritten code for
well-known document types, or you are working with a toolset which
includes such extras, you have to reinvent the wheel each time for all
formatting; typographic quality is office-standard, probably not
publication-standard unless under typographic supervision.

My personal preference is for the XSLT-LaTeX route, but that's because I
already knew LaTeX before XSLT came along. There was -- still is -- the
original SGML-DSSSL-Jade-TeX path, the Omnimark SGML/XML processor, and
any number of homebrew solutions based on onsgmls and awk or Perl.

YMMV
///Peter

Manuel Collado

Apr 24, 2012, 6:13:47 PM

El 24/04/2012 23:12, Peter Flynn escribió:
> On 24/04/12 19:57, Peter Davis wrote:
>> I've been trying to find comparisons of TeX/LaTeX with Apache FOP as
>> tools for composing and typesetting document pages. Google turns up
>> very little that seems relevant, and most of that compares DocBook,
>> rather than FOP itself.
>
> A lot of people do seem to confuse the markup of the source document
> with the mechanism (XSL/T/FO/LaTeX) used to transform the document into
> printable or viewable form.
>
> There are two main ways to do this from XML to PDF, both of which use an
> intermediate format (LaTeX or FO):
>
> 1. XML doc --> XSLT program --> LaTeX doc --> LaTeX --> PDF
>
> 2. XML doc --> XSLT:FO program --> FO doc --> FOP* --> PDF
>
> * Not just FOP: there are several commercial FO processors as well.
>
>> Does anyone have some relevant information? I'm trying to make a
>> case for why we should (or should not) use TeX/LaTeX for automated
>> PDF creation.

>.......

>
> HOWEVER...
>
> XML-XSLT-LaTeX-PDF
>
> PRO: there is one inestimable advantage: the built-in features (eg
> footnoting, sectioning, floating, environments, etc) and the 4000+
> packages of LaTeX. Most of the transformations I have done have
> benefitted from being able to solve virtually all the formatting
> requirements simply by adding in the relevant packages; typeset quality
> far exceeds that of most FO processors.
>
> CON: will gag on unheralded special characters and unknown Unicode
> characters; usually generates warnings and errors which need manual
> attention; extra font installation needs expert attention; very few
> commercial systems using the XSLT-LaTeX route (possibly because they are
> unaware of it)

Some of these CONs can be alleviated by using TeXML as an intermediate
step (for special characters), and using XeLaTeX+fontspec as the TeX
processor (for Unicode and native system fonts):

XML-XSLT-TeXML-XeLaTeX-PDF

I use this toolchain frequently.

IMHO, the main disadvantage of (Xe)LaTeX is the lack of automatic layout
of tables.

BTW, cheap FO processors, like FOP, also suffer from this limitation.

--
Manuel Collado - http://lml.ls.fi.upm.es/~mcollado

Oleg Paraschenko

Apr 25, 2012, 6:05:41 AM

Hello Peter,

On 24 Apr., 20:57, Peter Davis <p...@pfdstudio.com> wrote:
> ... I'm trying to make a case
> for why we should (or should not) use TeX/LaTeX for automated PDF
> creation. Features like tables of contents, indices, equations, etc. are
> very much useful. Also, the ability to put intra- and inter-document
> links in the PDF would be helpful.

I consider that XSL-FO vs TeX is just a matter of taste. The amount of
work to implement the whole process "from something to PDF" should be
the same.

Among technical challenges:

* correct sorting ("cote < côte < coté < côté" or "cote < coté < côte
< côté").
* Part N of K.
* In index, the page numbers "23, 24, 24, 24, 25, 37" are better given
as 23-25, 37

Good luck with xsl-fo.
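
The page-range item in the list above is at least straightforward to handle in whatever program generates the markup. A minimal sketch in Python with a hypothetical function name; it uses a plain hyphen where a typeset index would normally use an en dash.

```python
def collapse_pages(pages):
    """Collapse page numbers into ranges, dropping duplicates."""
    ranges = []
    for page in sorted(set(pages)):
        if ranges and page == ranges[-1][1] + 1:
            # Page directly follows the current range: extend it.
            ranges[-1][1] = page
        else:
            ranges.append([page, page])
    return ", ".join(
        str(start) if start == end else f"{start}-{end}"
        for start, end in ranges
    )
```

The sorting challenge is harder, since the correct order of accented words depends on the language's collation rules.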

As a pair of fresh TeX PROs, I'd suggest:

* With TeX, it is possible to correct the layout by hand when something
goes wrong.
* XSL-FO is a mix of code and layout. With TeX, one can separate the
work of a designer (.sty) and a coder (.xsl).

More detail: "XML to paper publishing with manual intervention":

http://uucode.com/download/2010/xata2010/XmlToPaperWorkflow.pdf
http://uucode.com/download/2010/xata2010/xmltopaper-talk.pdf

>
> Thank you.
> -pd

--
Oleg Parashchenko olpa@ http://uucode.com/
http://uucode.com/blog/ XML, TeX, Python, Mac, Chess

Oleg Paraschenko

Apr 25, 2012, 6:08:41 AM

Hello Manuel,

On 25 Apr., 00:13, Manuel Collado <m.coll...@domain.invalid> wrote:
...

> IMHO, the main disadvantage of (Xe)LaTeX is the lack of automatic layout
> of tables.

I don't know what you mean by "automatic layout of tables", but
maybe the package "cals" fits your needs:

http://ctan.org/pkg/cals

...
> --
> Manuel Collado -http://lml.ls.fi.upm.es/~mcollado

Robin Fairbairns

Apr 25, 2012, 7:29:52 AM

Oleg Paraschenko <ole...@gmail.com> writes:

> On 25 Apr., 00:13, Manuel Collado <m.coll...@domain.invalid> wrote:
> ...
>> IMHO, the main disadvantage of (Xe)LaTeX is the lack of automatic layout
>> of tables.
>
> I don't know what do you mean with "automatic layout of tables", but
> maybe the package "cals" fits your needs:
>
> http://ctan.org/pkg/cals

it's a nice package, which i've never got around to trying.

ime, people mean by "automatic layout" that there should not need to be
a column declaration (the lrc in \begin{tabular}{lrc})

html doesn't require it. the first simple table in your demo code on
ctan has it: \colwidths{{3cm}{4cm}}

personally, i think the html ability is overrated, but i tend to be
suspicious of html anyway.
--
Robin Fairbairns, Cambridge
sorry about all this posting. i'll go back to sleep in a bit.

Peter Davis

Apr 25, 2012, 10:12:23 AM

On 4/25/2012 6:05 AM, Oleg Paraschenko wrote:
> Hello Peter,
>
> On 24 Apr., 20:57, Peter Davis<p...@pfdstudio.com> wrote:
>> ... I'm trying to make a case
>> for why we should (or should not) use TeX/LaTeX for automated PDF
>> creation. Features like tables of contents, indices, equations, etc. are
>> very much useful. Also, the ability to put intra- and inter-document
>> links in the PDF would be helpful.
>
> I consider that XSL-FO vs TeX is just a matter of taste. The amount of
> work to implement the whole process "from something to PDF" should be
> the same.

Thanks to all for the comments and suggestions so far. I should
clarify a few things:

1) I'm not constrained to using XML in the workflow at all. We will be
creating software which traverses our own data structures and objects
and generates document markup in whatever format we select. We're using
XSL-FO and FOP currently, but are not obligated to continue that
workflow. However, I will have to justify the learning curve for
switching to a TeX/LaTeX workflow (my own preference.)

Also, we can suitably quote special characters, convert between string
formats, etc. as needed.

2) We are a cross-platform solution. Relying on different FO processors
for Windows, OS-X and Linux (various distros) would be a potential
liability. For this, and potential licensing concerns, a commercial FO
processor is probably not an option. That's why I'm specifically
comparing TeX/LaTeX with FOP.

3) Running multiple passes would not be a problem, as long as it can be
automated. I don't think speed will be an issue.

4) Since our software will allow generating documents on-the-fly, we
will have to bundle our document composer with the product, so we will
need licensing that allows binary redistribution. We won't modify
anything, but still, the many licensing terms of various LaTeX packages
would be something of a disadvantage here.

I was thinking of XeTeX/XeLaTeX with fontspec, though I'm open to other
ideas.

Thank you!

Peter Flynn

Apr 25, 2012, 4:28:53 PM

On 25/04/12 12:29, Robin Fairbairns wrote:
> Oleg Paraschenko <ole...@gmail.com> writes:
>
>> On 25 Apr., 00:13, Manuel Collado <m.coll...@domain.invalid> wrote:
>> ...
>>> IMHO, the main disadvantage of (Xe)LaTeX is the lack of automatic layout
>>> of tables.
>>
>> I don't know what do you mean with "automatic layout of tables", but
>> maybe the package "cals" fits your needs:
>>
>> http://ctan.org/pkg/cals
>
> it's a nice package, which i've never got around to trying.
>
> ime, people mean by "automatic layout" that there should not need to be
> a column declaration (the lrc in \begin{tabular}{lrc})
>
> html doesn't require it.

Ah, but HTML is SGML/XML...a processor can "look forward" as I
described, to see what the types and contents of cells are, before
deciding how it is going to lay out the table.

To do that in LaTeX would require the table to be processed twice, to
find out what's in it.
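
When the LaTeX is generated by a program, that look-ahead can happen in the generator rather than in LaTeX itself. A rough sketch, assuming rows arrive as equal-length lists of strings with any header row excluded; the function name and the rule "numeric columns are right-aligned" are illustrative assumptions, not taken from any package.

```python
def infer_column_spec(rows):
    """Return a tabular preamble such as 'lrr' by inspecting every cell first."""

    def is_number(cell):
        try:
            float(cell)
        except ValueError:
            return False
        return True

    # zip(*rows) transposes the table so each tuple is one column.
    return "".join(
        "r" if all(is_number(cell) for cell in column) else "l"
        for column in zip(*rows)
    )
```

The result would go straight into \begin{tabular}{...}; choosing column widths automatically is a harder problem and is not attempted here.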

CALS tables allow for a COLSPEC element type which fills the same
function as the table preamble argument in the LaTeX tabular environment.

> the first simple table in your demo code on
> ctan has it: \colwidths{{3cm}{4cm}}
>
> personally, i think the html ability is overrated, but i tend to be
> suspicious of html anyway.

It's made harder for browsers which have to cope with many users' (and
systems') pseudoHTML output, which is basically a random assemblage of
pointy brackets masquerading as tags.

///Peter

Peter Flynn

Apr 25, 2012, 4:33:03 PM

On 25/04/12 15:12, Peter Davis wrote:
> On 4/25/2012 6:05 AM, Oleg Paraschenko wrote:
>> Hello Peter,
>>
>> On 24 Apr., 20:57, Peter Davis<p...@pfdstudio.com> wrote:
>>> ... I'm trying to make a case
>>> for why we should (or should not) use TeX/LaTeX for automated PDF
>>> creation. Features like tables of contents, indices, equations, etc. are
>>> very much useful. Also, the ability to put intra- and inter-document
>>> links in the PDF would be helpful.
>>
>> I consider that XSL-FO vs TeX is just a matter of taste. The amount of
>> work to implement the whole process "from something to PDF" should be
>> the same.
>
> Thanks to all for of the comments and suggestions so far. I should
> clarify a few things:
>
> 1) I'm not constrained to using XML in the workflow at all. We will be
> creating software which traverses our own data structures and objects
> and generates document markup in whatever format we select. We're using
> XSL-FO and FOP currently, but are not obligated to continue that
> workflow. However, I will have to justify the learning curve for
> switching to a TeX/LaTeX workflow (my own preference.)

The learning curve is steep, but both methods are stable (in the sense
that the software is reliable and well-documented).

> Also, we can suitably quote special characters, convert between string
> formats, etc. as needed.
>
> 2) We are a cross-platform solution. Relying on different FO processors
> for Windows, OS-X and Linux (various distros) would be a potential
> liability. For this, and potential licensing concerns, a commercial FO
> processor is probably not an option. That's why I'm specifically
> comparing TeX/LaTeX with FOP.

Thanks for the explanation.

> 3) Running multiple passes would not be a problem, as long as it can be
> automated. I don't think speed will be an issue.

Shouldn't be, on a modern machine, unless you are trying to process
gigabyte-sized files.
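
A minimal sketch of automating the passes, assuming xelatex is on the PATH. build_pdf and its parameters are hypothetical names. It reruns until the .aux file stops changing, and it does not run makeindex or a bibliography tool, which an indexed document would need between passes.

```python
import hashlib
import subprocess
from pathlib import Path


def _digest(path):
    """Hash a file's contents, or return None if it does not exist yet."""
    if not path.exists():
        return None
    return hashlib.sha256(path.read_bytes()).hexdigest()


def build_pdf(tex_file, max_passes=4, run=subprocess.run):
    """Rerun xelatex until the .aux file is stable; return the pass count."""
    tex_path = Path(tex_file)
    aux_path = tex_path.with_suffix(".aux")
    command = [
        "xelatex",
        "-interaction=nonstopmode",  # never stop to prompt the user
        "-halt-on-error",            # fail fast instead of limping on
        tex_path.name,
    ]
    for passes in range(1, max_passes + 1):
        before = _digest(aux_path)
        run(command, cwd=tex_path.parent, check=True)
        if _digest(aux_path) == before:
            return passes
    return max_passes
```

The run parameter exists only so the loop can be exercised without a TeX installation; in production it would be left at its default.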

> 4) Since our software will allow generating documents on-the-fly, we
> will have to bundle our document composer with the product, so we will
> need licensing that allows binary redistribution. We won't modify
> anything, but still, the many licensing terms of various LaTeX packages
> would be something of a disadvantage here.

Others are better placed to answer that.

///Peter

Peter Davis

Apr 25, 2012, 4:54:44 PM

We have lawyers who can wrangle the legal issues. What I'm really
trying to determine is: Is it worth engaging those lawyers on this? Do
TeX and LaTeX offer enough, in terms of features or quality of output,
to justify switching away from FOP?

Your earlier remarks offered some pros and cons for each, which was very
helpful. Thanks! When you say that LaTeX's "typeset quality far exceeds
that of most FO processors," what do you mean? I see similar comparisons
(and I certainly believe them), but I can't make a case for
re-architecting/implementing some software unless I can get specific or,
ideally, quantify the differences.

Is there anything I flat out can't do with either TeX/LaTeX or FOP? (I
know ... TeX can do anything, since it's a programming language.)

Thanks again!

Manuel Collado

Apr 26, 2012, 7:04:08 AM

El 25/04/2012 22:28, Peter Flynn escribió:
> On 25/04/12 12:29, Robin Fairbairns wrote:
>> Oleg Paraschenko<ole...@gmail.com> writes:
>>
>>> On 25 Apr., 00:13, Manuel Collado<m.coll...@domain.invalid> wrote:
>>> ...
>>>> IMHO, the main disadvantage of (Xe)LaTeX is the lack of automatic layout
>>>> of tables.
>>>
>>> I don't know what do you mean with "automatic layout of tables", but
>>> maybe the package "cals" fits your needs:
>>>
>>> http://ctan.org/pkg/cals
>>
>> it's a nice package, which i've never got around to trying.
>>
>> ime, people mean by "automatic layout" that there should not need to be
>> a column declaration (the lrc in \begin{tabular}{lrc})

Yes. This is what I meant.

>>
>> html doesn't require it.
>
> Ah, but HTML is SGML/XML...a processor can "look forward" as I
> described, to see what the types and contents of cells are, before
> deciding how it is going to lay out the table.
>
> To do that in LaTeX would require the table to be processed twice, to
> find out what's in it.

Well, it seems that this is what the tabulary package does. I've tested
it, but the final result can be far less than acceptable even in some
common cases.

>
> CALS tables allow for a COLSPEC element type which fills the same
> function as the table preamble argument in the LaTeX tabular environment.
>
>> the first simple table in your demo code on
>> ctan has it: \colwidths{{3cm}{4cm}}
>>
>> personally, i think the html ability is overrated, but i tend to be
>> suspicious of html anyway.
>
> It's made harder for browsers which have to cope with many users' (and
> systems') pseudoHTML output, which is basically a random assemblage of
> pointy brackets masquerading as tags.

Properly used, (X)HTML can be an acceptable document authoring language.
Certainly simpler than DocBook.

I personally use a slightly customized XHTML to write (almost) all my
documentation. Even a complete 466-page book, automatically converted
to LaTeX for production.

Robin Fairbairns

Apr 26, 2012, 8:49:05 AM

Manuel Collado <m.co...@domain.invalid> writes:

> El 25/04/2012 22:28, Peter Flynn escribió:
>> On 25/04/12 12:29, Robin Fairbairns wrote:
>>> Oleg Paraschenko<ole...@gmail.com> writes:
>>>
>>>> On 25 Apr., 00:13, Manuel Collado<m.coll...@domain.invalid> wrote:
>>>> ...
>>>>> IMHO, the main disadvantage of (Xe)LaTeX is the lack of automatic layout
>>>>> of tables.
>>>>
>>>> I don't know what do you mean with "automatic layout of tables", but
>>>> maybe the package "cals" fits your needs:
>>>>
>>>> http://ctan.org/pkg/cals
>>>
>>> it's a nice package, which i've never got around to trying.
>>>
>>> ime, people mean by "automatic layout" that there should not need to be
>>> a column declaration (the lrc in \begin{tabular}{lrc})
>
> Yes. This is what I meant.
>
>>> html doesn't require it.
>>
>> Ah, but HTML is SGML/XML...a processor can "look forward" as I
>> described, to see what the types and contents of cells are, before
>> deciding how it is going to lay out the table.
>>
>> To do that in LaTeX would require the table to be processed twice, to
>> find out what's in it.
>
> Well, it seems that this is what the tabulary package does. I've
> tested it, but the final result can be far less than acceptable even
> in some common cases.

tabulary does a particular job; apply it outside the parameters of that
job, and the result may quite reasonably be rubbish. the main problem
with multiple passes in tex is side-effects. (the common problem with
footnotes in captions is such a one.)

>> CALS tables allow for a COLSPEC element type which fills the same
>> function as the table preamble argument in the LaTeX tabular environment.
>>
>>> the first simple table in your demo code on
>>> ctan has it: \colwidths{{3cm}{4cm}}
>>>
>>> personally, i think the html ability is overrated, but i tend to be
>>> suspicious of html anyway.
>>
>> It's made harder for browsers which have to cope with many users' (and
>> systems') pseudoHTML output, which is basically a random assemblage of
>> pointy brackets masquerading as tags.
>
> Properly used, (X)HTML can be an acceptable document authoring
> language. Certainly simpler that DocBook.
>
> I personally use a slightly customized XHTML to write (almost) all my
> documentation. Even a complete 466 pages book, automatically converted
> to LaTeX for production.

jooi, do you use mathml in there? i've no editor support for that in
anything i use, and i find mathml mind-bogglingly difficult to get right.

Peter Davis

Apr 26, 2012, 9:20:39 AM

On 4/26/2012 7:04 AM, Manuel Collado wrote:
> El 25/04/2012 22:28, Peter Flynn escribió:
>>>
>>> ime, people mean by "automatic layout" that there should not need to be
>>> a column declaration (the lrc in \begin{tabular}{lrc})
>
> Yes. This is what I meant.

Just FYI, this is *not* a requirement for what we're doing, although
it's certainly useful to know. In our situation, we know what each
table's columns and headers will be.

Thanks,

Manuel Collado

Apr 26, 2012, 11:41:43 AM

El 26/04/2012 14:49, Robin Fairbairns escribió:
> Manuel Collado<m.co...@domain.invalid> writes:
>> ...

>> Properly used, (X)HTML can be an acceptable document authoring
>> language. Certainly simpler that DocBook.
>>
>> I personally use a slightly customized XHTML to write (almost) all my
>> documentation. Even a complete 466 pages book, automatically converted
>> to LaTeX for production.
>
> jooi, do you use mathml in there? i've no editor support for that in
> anything i use, and i find mathml mind-bogglingly difficult to get right.

It was a textbook on computer programming. A lot of listings, but very
little math. Simple expressions with just sub- and superscripts were typed
as inline text. More complex formulas were composed in OpenOffice,
exported as PDF, and inserted in LaTeX with \includegraphics.

BTW, as you probably already know, OpenOffice can also import and export
formulas as mathml markup.

William F. Adams

Apr 26, 2012, 12:24:28 PM

On Apr 25, 4:54 pm, Peter Davis <p...@pfdstudio.com> wrote:
> On 4/25/2012 4:33 PM, Peter Flynn wrote:
>
> > On 25/04/12 15:12, Peter Davis wrote:
>
> >> 4) Since our software will allow generating documents on-the-fly, we
> >> will have to bundle our document composer with the product, so we will
> >> need licensing that allows binary redistribution. We won't modify
> >> anything, but still, the many licensing terms of various LaTeX packages
> >> would be something of a disadvantage here.
>
> > Others are better placed to answer that.
>
> We have lawyers who can wrangle the legal issues. What I'm really
> trying to determine is: Is it worth engaging those lawyers on this? Do
> TeX and LaTeX offer enough, in terms of features or quality of output,
> to justify switching away from FOP.

No need to include lawyers, just use the MIT/BSD-licensed TeX
distribution KerTeX:

http://www.kergis.com/en/kertex.html

Might need some up-dating to meet your needs.

>Is there anything I flat out can't do with either TeX/LaTeX or FOP? (I
>know ... TeX can do anything, since it's a programming language.)

The limits on TeX are:

- available processing power
- available storage space
- human ingenuity

William

Peter Flynn

Apr 27, 2012, 7:05:11 PM

On 25/04/12 21:54, Peter Davis wrote:
> On 4/25/2012 4:33 PM, Peter Flynn wrote:
>> On 25/04/12 15:12, Peter Davis wrote:
>>
>>> 4) Since our software will allow generating documents on-the-fly, we
>>> will have to bundle our document composer with the product, so we will
>>> need licensing that allows binary redistribution. We won't modify
>>> anything, but still, the many licensing terms of various LaTeX packages
>>> would be something of a disadvantage here.
>>
>> Others are better placed to answer that.
>
> We have lawyers who can wrangle the legal issues. What I'm really
> trying to determine is: Is it worth engaging those lawyers on this? Do
> TeX and LaTeX offer enough, in terms of features or quality of output,
> to justify switching away from FOP.

Personally, I think yes. But a lot depends on the type of work.

> Your earlier remarks offered some pros and cons for each, which was very
> helpful. Thanks! When you say that LaTeX's "typeset quality far exceeds
> that of most FO processors," what do you mean? I see similar comparisons
> (and I certainly believe them), but I can't make a case for
> re-architecting/implementing some software unless I can get specific or,
> ideally, quantify the differences.

Quantifying quality is hard, and in this case it's best done by
side-by-side comparison. Get someone (yourself, perhaps) to take some
examples of material you are currently processing through FOP, and get
it done through XSLT and LaTeX as well. You probably only need a few pages.

> Is there anything I flat out can't do with either TeX/LaTeX or FOP? (I
> know ... TeX can do anything, since it's a programming language.)

TeX cannot append to an existing file, which is IMHO a small defect, and
one that has never troubled me. TeX copes badly when flung arbitrary bad
characters; FOP never sees them, since they will gag the parser long
before that stage. There are plenty of things I might *prefer* to do in
XSLT rather than in LaTeX, simply because the two languages are so
different, but I'm not sure I've ever come across something that simply
can't be done at all...but I work in a few narrow fields that won't
expose me to every possibility (fortunately :-)

///Peter

Khaled Hosny

Apr 27, 2012, 7:23:50 PM

On Thursday, April 26, 2012 6:24:28 PM UTC+2, William F. Adams wrote:
> On Apr 25, 4:54 pm, Peter Davis wrote:
> > On 4/25/2012 4:33 PM, Peter Flynn wrote:
> >
> > > On 25/04/12 15:12, Peter Davis wrote:
> >
> > >> 4) Since our software will allow generating documents on-the-fly, we
> > >> will have to bundle our document composer with the product, so we will
> > >> need licensing that allows binary redistribution. We won't modify
> > >> anything, but still, the many licensing terms of various LaTeX packages
> > >> would be something of a disadvantage here.
> >
> > > Others are better placed to answer that.
> >
> > We have lawyers who can wrangle the legal issues. What I'm really
> > trying to determine is: Is it worth engaging those lawyers on this? Do
> > TeX and LaTeX offer enough, in terms of features or quality of output,
> > to justify switching away from FOP.
>
> No need to include lawyers, just use the MIT/BSD-licensed TeX
> Distribution Kerkis:
>
> http://www.kergis.com/en/kertex.html

Which is only a set of binaries; no macro packages are included. TeX Live has been extensively legally reviewed (including by Red Hat's lawyers for its inclusion in Fedora), and anything included should be free to use commercially, both the binaries and the macro packages.

Peter Davis

Apr 30, 2012, 11:44:52 AM

On 4/27/2012 7:05 PM, Peter Flynn wrote:
> Quantifying quality is hard,

Ok, I suppose "quantify" was a poor choice. I was thinking more of
something like a feature comparison (e.g., LaTeX supports
micro-typography and hanging punctuation, LaTeX supports more complex
scripting languages, LaTeX supports better kerning, etc.)

> and in this case it's best done by
> side-by-side comparison. Get someone (yourself, perhaps) to take some
> examples of material you are currently processing through FOP, and get
> it done through XSLT and LaTeX as well. You probably only need a few pages.

Side-by-side comparison will only convince those who are typographically
astute. Also, it's hard to compare hand-tweaked LaTeX with
machine-generated DocBook->XSL-FO->PDF.

In any case, thanks to you and everyone else for the helpful feedback
and interesting discussion. I'm not sure how I'm going to proceed with
this, but I'll let everyone know.

unruh

Apr 30, 2012, 1:05:14 PM

That kind of comparison is also extremely open to manipulation. I have
seen comparisons in which brand A outshone brand B by supporting 50
features while B only supported 3. When you looked at the features you
realised they were all minor stuff that nobody would care about anyway.
But it looks good!

Peter Flynn

Apr 30, 2012, 5:34:50 PM

On 30/04/12 18:05, unruh wrote:
> On 2012-04-30, Peter Davis <p...@pfdstudio.com> wrote:
>> On 4/27/2012 7:05 PM, Peter Flynn wrote:
>>> Quantifying quality is hard,
>>
>> Ok, I suppose "quantify" was a poor choice. I was thinking more of
>> something like a feature comparison (e.g., LaTeX supports
>> micro-typography and hanging punctuation, LaTeX supports more complex
>> scripting languages, LaTeX supports better kerning, etc.)
>>
>>> and in this case it's best done by
>>> side-by-side comparison. Get someone (yourself, perhaps) to take some
>>> examples of material you are currently processing through FOP, and get
>>> it done through XSLT and LaTeX as well. You probably only need a few pages.
>>
>> Side-by-side comparison will only convince those who are typographically
>> astute.

They are often the ones who need convincing.

>> Also, it's hard to compare hand-tweaked LaTeX with
>> machine-generated DocBook->XSL-FO->PDF.

Uhuh. No hand-tweaking allowed, on either side. Have each example done
by a proponent.

>> In any case, thanks to you and everyone else for the helpful feedback
>> and interesting discussion. I'm not sure how I'm going to proceed with
>> this, but I'll let everyone know.
>
> That kind of comparison is also extremely open to manipulation. I have
> seen comparisons in which brand A outshone brand B by supporting 50
> features while B only supported 3. When you looked at the features you
> realised they were all minor stuff that nobody would care about anyway.
> But it looks good!

This is called a "Ben Franklin Balance Sheet", and is a well-known sales
aid.

It can be avoided by having the examples and the lists prepared
independently by their relevant proponents.

///Peter

Peter Davis

May 1, 2012, 11:06:17 AM

On 4/30/2012 5:34 PM, Peter Flynn wrote:
>>>
>>> Side-by-side comparison will only convince those who are typographically
>>> astute.
>
> They are often the ones who need convincing.

That has not been my experience. The ones who need convincing are the
ones who think Word is perfectly adequate.

>
>>> Also, it's hard to compare hand-tweaked LaTeX with
>>> machine-generated DocBook->XSL-FO->PDF.
>
> Uhuh. No hand-tweaking allowed, on either side. Have each example done
> by a proponent.

The machine-generated DocBook->XSL-FO-PDF path already exists. The
trick is to show that it's worth the additional effort of building a
LaTeX->PDF path ... without having to actually implement it first.

> It can be avoided by having the examples and the lists prepared
> independently by their relevant proponents.

Unfortunately, it's not a matter of comparing two equally palatable
alternatives. It's a matter of being an advocate for changing the status
quo. It's a bit more of an uphill battle.

Thanks!

Peter Flynn

May 1, 2012, 3:04:10 PM

On 01/05/12 16:06, Peter Davis wrote:
> On 4/30/2012 5:34 PM, Peter Flynn wrote:
>>>>
>>>> Side-by-side comparison will only convince those who are
>>>> typographically
>>>> astute.
>>
>> They are often the ones who need convincing.
>
> That has not been my experience. The ones who need convincing are the
> ones who think Word is perfectly adequate.

That is also true, but I don't waste time trying to convert them with
LaTeX. Those who see the light will migrate of their own accord. The
rest can only be converted by a typographically synchronous interface
that does not yet exist (although LyX and BaKoMa TeX are a good start).

> The machine-generated DocBook->XSL-FO-PDF path already exists. The
> trick is to show that it's worth the additional effort of building a
> LaTeX->PDF path ... without having to actually implement it first.

I think someone has to implement enough of it to demonstrate it.

>> It can be avoided by having the examples and the lists prepared
>> independently by their relevant proponents.
>
> Unfortunately, it's not a matter of comparing two equally palatable
> alternatives. It's a matter of being an advocate for changing the status
> quo. It's a bit more of an uphill battle.

OK, so you *do* need to list the advantages to you of using XSLT and LaTeX.

///Peter