
Problem converting manual in PDF to HTML for on screen viewing


Richard Owlett

Mar 1, 2021, 5:50:04 AM
A program I wish to use distributes its manual as a PDF file.
Due to vision problems I wish to convert it to HTML for onscreen viewing.

pdf2htmlEX is an inappropriate tool as it is too focused on maintaining
the format of the original. Its rigidity results in a fixed number of
*characters* per line regardless of the chosen font size. As
nothing is formatted as a table, this restriction is inappropriate.

The only relevant links in the document are from the "Table of Contents"
to the appropriate section. pdf2htmlEX *IGNORES* them!

A suggested tool?
TIA

Alexander V. Makartsev

Mar 1, 2021, 7:30:04 AM
On 01.03.2021 15:44, Richard Owlett wrote:
> A suggested tool?
I can't suggest a converter, but maybe a sane PDF viewer, like "evince", will solve your problems.
It can change font size and page layout, and has other useful capabilities like index/bookmarks and text search.
For me reading documents in PDFs is much more convenient than in clunky html pages.

-- 
With kindest regards, Alexander.

⢀⣴⠾⠻⢶⣦⠀ 
⣾⠁⢠⠒⠀⣿⡁ Debian - The universal operating system
⢿⡄⠘⠷⠚⠋⠀ https://www.debian.org
⠈⠳⣄⠀⠀⠀⠀ 

Richard Owlett

Mar 1, 2021, 8:00:05 AM
On 03/01/2021 06:22 AM, Alexander V. Makartsev wrote:
> On 01.03.2021 15:44, Richard Owlett wrote:
>> A suggested tool?
> I can't suggest a converter, but maybe a sane PDF viewer, like "evince",
> will solve your problems.
> It can change font size and page layout, and has other useful capabilities
> like index/bookmarks and text search.
> For me reading documents in PDFs is much more convenient than in clunky
> html pages.
>

That is on my system but the available "Help" says nothing about the
options you mentioned.
That may be because I'm using the MATE desktop. I've had "Help"
weirdness on other programs IIRC.

Celejar

Mar 1, 2021, 8:00:05 AM
You can try pdftohtml, in poppler-utils. I don't know whether it will
suffer from the same problem you have with pdf2htmlEX.

Celejar
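
For anyone wanting to try this suggestion, here is a minimal sketch of one way to drive pdftohtml from Python. The flags used (-noframes, -s, -i) are the ones listed in the poppler-utils man page; the helper names build_pdftohtml_cmd and convert and the file names are made up for illustration, and whether the output reflows any better than pdf2htmlEX's is not something this sketch can promise.

```python
import shutil
import subprocess


def build_pdftohtml_cmd(pdf_path, out_base, single_page=True, skip_images=True):
    """Build an argv list for pdftohtml (hypothetical helper, not part of poppler)."""
    cmd = ["pdftohtml", "-noframes"]  # -noframes: generate no frames
    if single_page:
        cmd.append("-s")  # -s: single document that includes all pages
    if skip_images:
        cmd.append("-i")  # -i: ignore images
    cmd += [pdf_path, out_base]
    return cmd


def convert(pdf_path, out_base):
    """Run pdftohtml if it is installed; otherwise just report the command."""
    cmd = build_pdftohtml_cmd(pdf_path, out_base)
    if shutil.which("pdftohtml") is None:
        print("pdftohtml not found; would run:", " ".join(cmd))
        return False
    subprocess.run(cmd, check=True)
    return True
```

Calling convert("manual.pdf", "manual") (assumed file names) should leave the HTML next to the PDF; the result can then be opened in a browser to check whether the lines re-wrap when the font size changes.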

Dan Ritter

Mar 1, 2021, 8:10:04 AM
Richard Owlett wrote:
> A program I wish to use distributes its manual as a PDF file.

They certainly don't write it that way, so perhaps they have a
source website where the original text is written?

-dsr-

Richard Owlett

Mar 1, 2021, 8:20:04 AM
With that in mind I had already posted to a list followed by the
programmer. I've a gut feel the answer will be effectively no.

Richard Owlett

Mar 1, 2021, 9:00:05 AM
I had initially ignored pdftohtml as the man page starts with:
> This manual page documents briefly the pdftohtml command. This manual page was
> written for the Debian GNU/Linux distribution because the original program does
> not have a manual page.

When it says "brief" it meant "options will be only *listed*"!

I just tried it and it apparently suffers from the same end result.
If someone knows of some functional documentation for it, I'll try again.
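
One possible workaround for the fixed line length, sketched under stated assumptions: extract plain text from the PDF first (for example with pdftotext, also in poppler-utils), then re-join the hard-wrapped lines into HTML paragraphs so the browser does the wrapping at whatever font size is chosen. The function name reflow_to_html is hypothetical, and the sketch assumes paragraphs are separated by blank lines, which may not hold for every manual. It does not restore headings or the Table of Contents links.

```python
import html


def reflow_to_html(text):
    """Turn hard-wrapped plain text into HTML paragraphs the browser can re-wrap."""
    paragraphs = []
    current = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped:
            current.append(stripped)
        elif current:
            # A blank line ends the paragraph collected so far.
            paragraphs.append(" ".join(current))
            current = []
    if current:
        paragraphs.append(" ".join(current))
    # Escape &, < and > so the text cannot be mistaken for markup.
    return "\n".join("<p>" + html.escape(p) + "</p>" for p in paragraphs)
```

Words hyphenated across line ends would still need separate handling, so treat this as a starting point rather than a finished converter.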

Nate Bargmann

Mar 1, 2021, 9:00:05 AM
Evince has a "zoom" capability that enlarges the document including
fonts and images. I use it quite a bit to see detail in scanned
electronics schematics.

- Nate

--

"The optimist proclaims that we live in the best of all
possible worlds. The pessimist fears this is true."

Web: https://www.n0nb.us
Projects: https://github.com/N0NB
GPG fingerprint: 82D6 4F6B 0E67 CD41 F689 BBA6 FB2C 5130 D55A 8819


Richard Owlett

Mar 1, 2021, 10:20:05 AM
I was wrong. I didn't understand what an info file was.
texi2html worked like a charm! <*GRIN*>

Stefan Monnier

Mar 1, 2021, 11:40:05 AM
> A program I wish to use distributes its manual as a PDF file.
> Due to vision problems I wish to convert it to HTML for onscreen viewing.

I think what you're trying to do is quite difficult.
It's a bit easier than OCR, but still hard enough that it's not well
supported by any tool that I know of. So the better option is to try to
find some source for that PDF and then convert *that* to a format
you like.


Stefan