Problem converting manual in PDF to HTML for on-screen viewing
Richard Owlett
Mar 1, 2021, 5:50:04 AM
A program I wish to use distributes its manual as a PDF file.
Due to vision problems I wish to convert it to HTML for onscreen viewing.
pdf2htmlEX is an inappropriate tool as it is too focused on maintaining
the format of the original. Its rigidity results in a fixed number of
*characters* per line regardless of the chosen font size. As there is
nothing formatted as a table, this restriction is inappropriate.
The only relevant links in the document are from the "Table of Contents"
to the appropriate section. pdf2htmlEX *IGNORES* them!
A suggested tool?
TIA
Alexander V. Makartsev
Mar 1, 2021, 7:30:04 AM
On 01.03.2021 15:44, Richard Owlett wrote:
> A suggested tool?
I can't suggest a converter, but maybe a sane PDF viewer, like
"evince", will solve your problems.
It can change font size and page layout, and has other useful
capabilities like index/bookmarks and text search.
For me, reading documents as PDFs is much more convenient than in
clunky HTML pages.
--
With kindest regards, Alexander.
⢀⣴⠾⠻⢶⣦⠀
⣾⠁⢠⠒⠀⣿⡁ Debian - The universal operating system
⢿⡄⠘⠷⠚⠋⠀ https://www.debian.org
⠈⠳⣄⠀⠀⠀⠀
Richard Owlett
Mar 1, 2021, 8:00:05 AM
On 03/01/2021 06:22 AM, Alexander V. Makartsev wrote:
> On 01.03.2021 15:44, Richard Owlett wrote:
>> A suggested tool?
> I can't suggest a converter, but maybe sane PDF viewer, like "evince",
> will solve your problems.
> It can change font size, page layout and has other useful capabilities
> like index\bookmarks and text search.
> For me reading documents in PDFs is much more convenient than in clunky
> html pages.
>
That is on my system but the available "Help" says nothing about the
options you mentioned.
That may be because I'm using the MATE desktop. I've had "Help"
weirdness on other programs IIRC.
Celejar
Mar 1, 2021, 8:00:05 AM
You can try pdftohtml, in poppler-utils. I don't know whether it will
suffer from the same problem you have with pdf2htmlEX.
Celejar
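For anyone wanting to try this, here is a minimal sketch of driving pdftohtml from Python. The flags `-s` (single HTML document), `-noframes` and `-i` (ignore images) are listed in the poppler-utils man page; the helper names and the file names `manual.pdf` and `manual` are placeholders invented for illustration, and nothing here guarantees the output reflows any better than pdf2htmlEX.

```python
import shutil
import subprocess


def build_pdftohtml_cmd(pdf_path, out_base):
    """Return the argument list for a simple (non-complex) pdftohtml run."""
    return [
        "pdftohtml",
        "-s",         # one HTML document covering all pages
        "-noframes",  # no frameset wrapper
        "-i",         # skip images; only the text matters here
        pdf_path,
        out_base,
    ]


def convert(pdf_path, out_base):
    """Run pdftohtml if it is installed; return True on success."""
    if shutil.which("pdftohtml") is None:
        return False  # poppler-utils does not appear to be installed
    result = subprocess.run(build_pdftohtml_cmd(pdf_path, out_base))
    return result.returncode == 0
```

Calling `convert("manual.pdf", "manual")` would run the command and report whether it exited cleanly. Leaving out `-c` (complex mode) is deliberate, since that mode tries to preserve the page layout, which is the opposite of what is wanted here.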
Dan Ritter
Mar 1, 2021, 8:10:04 AM
Richard Owlett wrote:
> A program I wish to use distributes its manual as a PDF file.
They certainly don't write it that way, so perhaps they have a
source website where the original text is written?
-dsr-
Richard Owlett
Mar 1, 2021, 8:20:04 AM
With that in mind I had already posted to a list followed by the
programmer. I've a gut feeling the answer will be effectively no.
Richard Owlett
Mar 1, 2021, 9:00:05 AM
I had initially ignored it as the man page starts with:
> This manual page documents briefly the pdftohtml command. This manual page was
> written for the Debian GNU/Linux distribution because the original program does
> not have a manual page.
When it says "brief" it means "options will only be *listed*"!
I just tried it and it apparently produces the same end result.
If someone knows of some functional documentation for it, I'll try again.
Nate Bargmann
Mar 1, 2021, 9:00:05 AM
Evince has a "zoom" capability that enlarges the document including
fonts and images. I use it quite a bit to see detail in scanned
electronics schematics.
- Nate
--
"The optimist proclaims that we live in the best of all
possible worlds. The pessimist fears this is true."
I was wrong. I didn't understand what an info file was.
texi2html worked like a charm! <*GRIN*>
Stefan Monnier
Mar 1, 2021, 11:40:05 AM
> A program I wish to use distributes its manual as a PDF file.
> Due to vision problems I wish to convert it to HTML for onscreen viewing.
I think what you're trying to do is quite difficult.
It's a bit easier than OCR, but still hard enough that it's not well
supported by any tool that I know of. So the better option is to try and
find some source for that PDF and then try to convert *that* to a format
you like.
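To make the difficulty concrete, here is a rough sketch of the crudest possible reflow step, assuming the manual's text has already been extracted to plain text and that paragraphs are separated by blank lines. Both assumptions often fail for real PDFs, and the function name `reflow_to_html` is invented for illustration. It joins hard-wrapped lines so that a browser can rewrap them at any font size; it does not recover headings, lists, or the table-of-contents links.

```python
import html


def reflow_to_html(text):
    """Join hard-wrapped lines into paragraphs and wrap each in <p>.

    Assumes paragraphs are separated by blank lines, which is not
    true of every PDF text extraction.
    """
    paragraphs = []
    current = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped:
            current.append(stripped)
        elif current:
            # A blank line ends the paragraph collected so far.
            paragraphs.append(" ".join(current))
            current = []
    if current:
        paragraphs.append(" ".join(current))
    # Escape <, > and & so the extracted text cannot break the markup.
    return "\n".join(f"<p>{html.escape(p)}</p>" for p in paragraphs)
```

Words hyphenated across line ends, page headers, and multi-column layouts would all need extra handling, which is roughly why starting from the source of the PDF is the easier route.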