TeX parser


Waldek Hebisch

Jan 4, 2026, 4:06:10 PM
to fricas...@googlegroups.com
I have now committed a TeX parser. It has limited capabilities, but
currently it seems to handle all our docstrings OK. Maybe you
remember that I wrote here about a package which directly
(avoiding sphinx) generates .html API pages. The parser is
an improved version of the parser used by this package (generation
of API pages needs changes, I will write about this when
ready).

A bigger goal for TeX would be to produce an HTML/XML version of
the FriCAS Book. ATM the parser is too weak for such a job,
but hopefully small extensions will be enough. The most notable
unhandled thing is the LaTeX 'verbatim' environment, but some
other things may show up.

--
Waldek Hebisch

Ralf Hemmecke

Jan 5, 2026, 4:04:33 PM
to fricas...@googlegroups.com
On 1/4/26 22:06, Waldek Hebisch wrote:
> I have now committed a TeX parser.

I guess for the restricted TeX macros that appear in docstrings, your
idea might work.

> Bigger goal for TeX would be to produce HTML/XML version of
> FriCAS Book.
That task is very ambitious, but should also be possible.
I just tried to use pandoc to translate the book to HTML. That does not
work, because some commands rely on a catcode change to treat their
argument verbatim, and pandoc cannot deal with catcode changes.

If I remember correctly, mostly the commands starting with "\spad..."
are of this type. Maybe piping all the .tex files through some filter
that adds the necessary escape sequences to the argument text would
produce something that pandoc can handle. However, I have no experience
with pandoc. Is there someone on the list who has? In particular, we
would probably like nice-looking HTML.
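
A minimal sketch of such a filter, in Python, might look like the following. Everything in it is an assumption made for illustration: that the relevant commands are control words starting with "\spad" followed directly by a single brace-delimited argument, that braces inside the argument are balanced, and that the escape table below is what pandoc needs. The names ESCAPES, COMMAND and escape_spad_arguments are hypothetical and do not come from FriCAS or pandoc. A backslash-escaped brace inside an argument would confuse the brace counting, so this is not a complete solution.

```python
import re

# Assumed mapping from TeX special characters to escaped forms.
ESCAPES = {
    "\\": r"\textbackslash{}",
    "#": r"\#",
    "$": r"\$",
    "%": r"\%",
    "&": r"\&",
    "_": r"\_",
    "^": r"\textasciicircum{}",
    "~": r"\textasciitilde{}",
    "{": r"\{",
    "}": r"\}",
}

# Assumed shape of the commands: "\spad" plus optional letters, then "{".
COMMAND = re.compile(r"\\spad[A-Za-z]*\{")


def find_closing_brace(text, start):
    """Return the index of the brace closing a group whose content starts
    at `start`, or -1 if the group is never closed."""
    depth = 1
    for i in range(start, len(text)):
        if text[i] == "{":
            depth += 1
        elif text[i] == "}":
            depth -= 1
            if depth == 0:
                return i
    return -1


def escape_spad_arguments(text):
    """Escape special characters inside the argument of each \\spad... command."""
    out = []
    pos = 0
    while True:
        match = COMMAND.search(text, pos)
        if match is None:
            out.append(text[pos:])
            break
        end = find_closing_brace(text, match.end())
        if end < 0:
            # Unbalanced argument: leave the rest of the input untouched.
            out.append(text[pos:])
            break
        out.append(text[pos:match.end()])
        out.append("".join(ESCAPES.get(c, c) for c in text[match.end():end]))
        out.append("}")
        pos = end + 1
    return "".join(out)
```

With this sketch, escape_spad_arguments(r"Use \spad{a_1 % b} here") yields the text "Use \spad{a\_1 \% b} here", while text outside "\spad..." arguments is passed through unchanged.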

Ralf

Dima Pasechnik

Jan 5, 2026, 4:11:02 PM
to fricas...@googlegroups.com
Pandoc is quite extensible; it has its own filter facility.
There is also the latex+raw_tex mode (i.e. "pandoc -f latex+raw_tex"),
which, they say, allows catcode changes.
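
For anyone who wants to try this, a small Python wrapper around the command line could look like the sketch below. The file names "book.tex" and "book.html" and the function names are hypothetical. The options used (-f, -t, -s, -o) are standard pandoc options, but whether the raw_tex extension actually helps with the catcode-dependent commands when writing HTML is an open question, not something this sketch demonstrates.

```python
import shutil
import subprocess


def pandoc_command(source, target):
    """Build a pandoc command line: LaTeX input with the raw_tex
    extension enabled, standalone HTML output."""
    return ["pandoc", "-f", "latex+raw_tex", "-t", "html", "-s",
            "-o", target, source]


def convert(source, target):
    """Run pandoc if it is installed. Return True on success and
    False if pandoc is missing or exits with an error."""
    if shutil.which("pandoc") is None:
        return False
    result = subprocess.run(pandoc_command(source, target))
    return result.returncode == 0
```

Calling convert("book.tex", "book.html") runs the equivalent of "pandoc -f latex+raw_tex -t html -s -o book.html book.tex".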


Kurt Pagani

Jan 5, 2026, 6:19:50 PM
to fricas...@googlegroups.com
It might be simpler to convert the PDF file to HTML by using

* https://pdf2htmlex.github.io/pdf2htmlEX/
* https://github.com/pdf2htmlEX/pdf2htmlEX

It's also available via apt:

pdf2htmlex/now 0.0.18.8.rc1.master.bionic.20200630-0 amd64 [installed,local]
Converts PDF to HTML without losing format

I did convert the book some months ago:

https://nilqed.github.io/book.html

Zooming possible with +/- ...

Looks not bad, albeit the produced HTML is quite huge ...



Ralf Hemmecke

Jan 5, 2026, 6:54:03 PM
to fricas...@googlegroups.com
> There is also the latex+raw_tex mode (i.e. "pandoc -f latex+raw_tex"),
> which, they say, allows catcode changes.

Interesting, but it seems not to be available for html output, if I am
not wrong. And I doubt that commonmark can handle the tex macros from
fricas.sty.

Ralf

Ralf Hemmecke

Jan 5, 2026, 6:57:50 PM
to fricas...@googlegroups.com
> I did convert the book some months ago:
>
>    https://nilqed.github.io/book.html
>
> Zooming possible with +/- ...
>
> Looks not bad, albeit the produced HTML is quite huge ...

Yep.

Honestly, I wouldn't count that output as "proper" HTML.
It is just the PDF hidden in HTML syntax.
Maybe tex4ht can do the job, but configuring tex4ht for our case is
quite a challenge.

Ralf

Kurt Pagani

Jan 5, 2026, 7:07:20 PM
to fricas...@googlegroups.com


On 06/01/2026 00:57, 'Ralf Hemmecke' via FriCAS - computer algebra
system wrote:
>> I did convert the book some months ago:
>>
>>     https://nilqed.github.io/book.html
>>
>> Zooming possible with +/- ...
>>
>> Looks not bad, albeit the produced HTML is quite huge ...
>
> Yep.
>
> Honestly, I wouldn't count that output as "proper" HTML.

LOL. What do you want? "Proper" HTML (IMO html is html) or a readable
book? You may apply another CSS, then any similarity to the pdf may
disappear :)

Ralf Hemmecke

Jan 5, 2026, 7:16:05 PM
to fricas...@googlegroups.com
>> Honestly, I wouldn't count that output as "proper" HTML.
>
> LOL. What do you want? "Proper" HTML (IMO html is html)

Yes, yes, of course, you are right, but I guess, you know what I mean.

> or a readable book?
I am not actually sure whether it is worth the effort to produce an HTML
format from the .tex files of the book in such a way that one can easily
refer to certain parts by giving a URL.

I don't think I will invest much time into it. There are too few
questions here on the mailing list to justify the amount
of work.

Ralf

Dima Pasechnik

Jan 5, 2026, 7:33:58 PM
to fricas...@googlegroups.com
On Mon, Jan 5, 2026 at 5:54 PM 'Ralf Hemmecke' via FriCAS - computer
algebra system <fricas...@googlegroups.com> wrote:
>
> > There is also the latex+raw_tex mode (i.e. "pandoc -f latex+raw_tex"),
> > which, they say, allows catcode changes.
>
> Interesting, but it seems not to be available for html output, if I am
> not wrong.
Did you try? I thought pandoc maps any input format to any output format.

> And I doubt that commonmark can handle the tex macros from
> fricas.sty.