
Example of dumper (along with references) using Pod::Simple?


Martin Quinson

May 25, 2020, 7:00:04 AM
to pod-p...@perl.org, de...@lists.po4a.org
Hello,

I'm one of the authors of the https://po4a.org/ tool, which eases the
translation of documentation in various formats. The idea is to parse
the documentation, keep the structure and replace the content in
English with the translated content. See
https://po4a.org/man/man1/po4a.1.php for more information.

(beware, the mailing lists of both projects are CCed)


po4a is written in Perl, and uses Pod::Parser so far, but I'm
considering switching to Pod::Simple instead, thus this email.

I seem to understand that Pod::Parser is kinda deprecated and
Pod::Simple superior (I guess you agree?). In addition, my biggest
concern with Pod::Parser is that the reported line numbers are wrong.
The library reports fanciful locations for the blocks it finds.
For example, the following page reports that the displayed text is in
file po4a, line 2 (there is a link at the bottom right).
https://hosted.weblate.org/translate/po4a/po4a-doc/en/?checksum=439ba747adcb37ed
But when you click on that link, you see that it's not the right
location at all. That's line 2 of the POD chunks, not of the whole file.

I had a look at Pod::Simple to see how to bring po4a up to date, but I
would appreciate some guidance, please. For reference, our code is here:
https://github.com/mquinson/po4a/blob/master/lib/Locale/Po4a/Pod.pm

For now, we override initialize(), command(), verbatim() and
textblock() and then call parse_from_file() on the provided file.

First question: which of the many Pod::Simple examples should I look
at to adapt this code? Sorry for this silly question, but you really
have many examples...

Second question: How do you retrieve the line number in the original
text in your parsers? Will I have the same issue with the reported
line numbers ignoring the cut parts of the file?

Third question: Is it possible to get some information from the
external blocks? (the parts that are not POD in the file) I would like
to "see" the comments added to these parts, searching for specific
comments that are intended for the translators. This is to implement
something similar to --add-command of xgettext.


Summarizing the first questions, I'm looking for a Pod::Simple example
that would dump the content to stdout along with the line numbers at
which the blocks are found.

And if you feel that I'm off track and that my questions are not
relevant, I really need your guidance...

Thanks in advance,
Mt.

--
Never trust a programmer in a suit.

James E Keenan

May 25, 2020, 11:00:03 AM
to pod-p...@perl.org, Martin Quinson
On 5/21/20 11:45 AM, Martin Quinson wrote:
> Hello,
>
> I'm one of the authors of the https://po4a.org/ tool, which eases the
> translation of documentation in various formats. The idea is to parse
> the documentation, keep the structure and replace the content in
> English with the translated content. See
> https://po4a.org/man/man1/po4a.1.php for more information.
>
> (beware, the mailing lists of both projects are CCed)
>
>
> po4a is written in Perl, and uses Pod::Parser so far, but I'm
> considering switching to Pod::Simple instead, thus this email.
>
> I seem to understand that Pod::Parser is kinda deprecated and
> Pod::Simple superior (I guess you agree?). In addition, my biggest
> concern with Pod::Parser is that the reported line numbers are wrong.
> The library reports fanciful locations for the blocks it finds.
> For example, the following page reports that the displayed text is in
> file po4a, line 2 (there is a link at the bottom right).
> https://hosted.weblate.org/translate/po4a/po4a-doc/en/?checksum=439ba747adcb37ed
> But when you click on that link, you see that it's not the right
> location at all. That's line 2 of the POD chunks, not of the whole file.
>
> I had a look at Pod::Simple to see how to bring po4a up to date, but I
> would appreciate some guidance, please. For reference, our code is here:
> https://github.com/mquinson/po4a/blob/master/lib/Locale/Po4a/Pod.pm

I forked and cloned this repo, then ran 'perl ./Build.PL'. I was informed:

#####
Checking prerequisites...
test_requires:
! SGMLS is not installed
recommends:
* Locale::gettext is not installed
* SGMLS is not installed
* Text::WrapI18N is not installed
#####

I installed Locale::gettext and Text::WrapI18N without incident. I then
installed the Debian/Ubuntu package 'libsgmls-perl' in the hope that
that would satisfy the SGMLS pre-requisite. It did not. Moreover,
'./Build installdeps' told me:

#####
Skipping SGMLS because I couldn't find a matching namespace.
#####

How should I proceed?


>
> For now, we override initialize(), command(), verbatim() and
> textblock() and then call parse_from_file() on the provided file.
>
> First question: which of the many Pod::Simple examples should I look
> at to adapt this code? Sorry for this silly question, but you really
> have many examples...
>
> Second question: How do you retrieve the line number in the original
> text in your parsers? Will I have the same issue with the reported
> line numbers ignoring the cut parts of the file?
>
> Third question: Is it possible to get some information from the
> external blocks? (the parts that are not POD in the file) I would like
> to "see" the comments added to these parts, searching for specific
> comments that are intended for the translators. This is to implement
> something similar to --add-command of xgettext.
>
>
> Summarizing the first questions, I'm looking for a Pod::Simple example
> that would dump the content to stdout along with the line numbers at
> which the blocks are found.
>
> And if you feel that I'm off track and that my questions are not
> relevant, I really need your guidance...
>
> Thanks in advance,
> Mt.
>

Thank you very much.
Jim Keenan

Russ Allbery

May 25, 2020, 1:00:03 PM
to Martin Quinson, pod-p...@perl.org, de...@lists.po4a.org
Martin Quinson <martin....@ens-rennes.fr> writes:

> I had a look at Pod::Simple to see how to bring po4a up to date, but I
> would appreciate some guidance, please. For reference, our code is here:
> https://github.com/mquinson/po4a/blob/master/lib/Locale/Po4a/Pod.pm

> For now, we override initialize(), command(), verbatim() and
> textblock() and then call parse_from_file() on the provided file.

Pod::Simple has a very different organizational structure than Pod::Parser
and it will probably require some reworking or a compatibility layer to
adapt Pod::Parser code to it. To understand how Pod::Simple natively
thinks about passing information to you, you'll want to look at:

Pod::Simple::Subclassing

and then the three major layers over Pod::Simple:

Pod::Simple::Methody
Pod::Simple::PullParser
Pod::Simple::SimpleTree

which give you method callbacks, a token stream, and a tree respectively.

I have to admit that when converting Pod::Man and Pod::Text, I ended up
writing my own layer to call into the structure that I was already using
with Pod::Parser. A similar approach may work for you since you're coming
from a Pod::Parser world. Take a look at the code that starts here:

https://github.com/rra/podlators/blob/master/lib/Pod/Text.pm#L149

It's fairly short.
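
To make the idea of such a layer concrete, here is a minimal sketch of
an adapter that collects Pod::Simple events and hands them to
Pod::Parser-style callbacks. It is not the podlators code. The package
name ParserStyleAdapter, the on_command/on_textblock/on_verbatim
callbacks and the adapter_* hash keys are all made up for this
illustration; only _handle_element_start, _handle_text,
_handle_element_end and parse_string_document come from Pod::Simple.

```perl
package ParserStyleAdapter;
use strict;
use warnings;
use parent 'Pod::Simple';

# Block elements we forward, and the (hypothetical) callback for each.
my %CALLBACK_FOR = (
    head1 => 'on_command',   head2    => 'on_command',
    head3 => 'on_command',   head4    => 'on_command',
    Para  => 'on_textblock', Verbatim => 'on_verbatim',
);

# Remember the block being opened, including its line in the source.
sub _handle_element_start {
    my ($self, $name, $attrs) = @_;
    return unless $CALLBACK_FOR{$name};
    $self->{adapter_block} =
        { name => $name, line => $attrs->{start_line}, text => '' };
    return;
}

# Accumulate plain text; formatting codes such as B<> are flattened here.
sub _handle_text {
    my ($self, $text) = @_;
    $self->{adapter_block}{text} .= $text if $self->{adapter_block};
    return;
}

# When the block closes, dispatch it to the matching callback.
sub _handle_element_end {
    my ($self, $name) = @_;
    my $block = $self->{adapter_block};
    return unless $block && $block->{name} eq $name;
    delete $self->{adapter_block};
    my $callback = $CALLBACK_FOR{$name};
    $self->$callback($block->{name}, $block->{text}, $block->{line});
    return;
}

# Stand-ins for the old command()/textblock()/verbatim() overrides.
sub on_command   { my ($self, @args) = @_; push @{ $self->{adapter_events} }, [ command   => @args ]; return }
sub on_textblock { my ($self, @args) = @_; push @{ $self->{adapter_events} }, [ textblock => @args ]; return }
sub on_verbatim  { my ($self, @args) = @_; push @{ $self->{adapter_events} }, [ verbatim  => @args ]; return }

package main;

my $pod = join "\n",
    '=head1 NAME', '', 'demo - a B<tiny> example', '',
    '    verbatim line', '', '=cut', '';

my $parser = ParserStyleAdapter->new;
$parser->parse_string_document($pod);

my @events = @{ $parser->{adapter_events} || [] };
printf "%-9s %-8s line %d: %s\n", @{$_}[0, 1, 3, 2] for @events;
```

This sketch drops inline markup and ignores lists, so a real converter
would need to rebuild the formatting codes instead of concatenating
text.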

> Second question: How do you retrieve the line number in the original
> text in your parsers? Will I have the same issue with the reported
> line numbers ignoring the cut parts of the file?

It depends on the interface that you use, but in general you'll get a
reference to an attrs hash passed into your code, and the start_line
element of that hash is the line number. I suspect the line numbers will
be more accurate, although I haven't tested this myself.
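
As an illustration of reading start_line, here is a small sketch of the
kind of dumper asked about, using the token-stream interface
(Pod::Simple::PullParser). The sample source text is invented for the
example, and the sketch assumes that only block-level elements carry a
start_line attribute.

```perl
use strict;
use warnings;
use Pod::Simple::PullParser;

# A made-up source file: three non-POD lines, then a POD section.
my $source = join "\n",
    '#!/usr/bin/perl',
    'print "not POD";',
    '',
    '=head1 NAME',
    '',
    'demo - a tiny example',
    '',
    '=cut',
    '';

my $parser = Pod::Simple::PullParser->new;
$parser->set_source(\$source);

# Walk the token stream and keep every start tag that knows its line.
my @blocks;
while (my $token = $parser->get_token) {
    next unless $token->is_start;
    my $line = $token->attr('start_line');
    next unless defined $line;    # e.g. inline formatting codes
    push @blocks, [ $token->tagname, $line ];
}

printf "%-10s starts at line %d\n", @$_ for @blocks;
```

Running this against a file that mixes code and POD is a quick way to
check whether the reported numbers count the non-POD lines as well.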

> Third question: Is it possible to get some information from the external
> blocks? (the parts that are not POD in the file) I would like to "see"
> the comments added to these parts, searching for specific comments that
> are intended for the translators. This is to implement something similar
> to --add-command of xgettext.

Yes, call the code_handler() method and pass it a callback. That callback
will be called for each non-POD line in the source file (along with its
line number), and can then do what it wants with them.
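
A minimal sketch of that approach follows. It assumes, per the
Pod::Simple documentation, that the callback receives the line text, its
line number and the parser object. The "# TRANSLATORS:" marker and the
QuietParser package are assumptions made for this example, not something
po4a or Pod::Simple defines.

```perl
package QuietParser;    # hypothetical do-nothing subclass for the demo
use strict;
use warnings;
use parent 'Pod::Simple';

package main;

# A made-up source file with comments aimed at translators.
my $source = join "\n",
    '#!/usr/bin/perl',
    '# TRANSLATORS: keep the program name untranslated',
    'print "hello";',
    '',
    '=head1 NAME',
    '',
    'demo - a tiny example',
    '',
    '=cut',
    '',
    '# TRANSLATORS: second note',
    'exit 0;',
    '';

my @notes;
my $parser = QuietParser->new;

# Register the callback before parsing; it sees only non-POD lines.
$parser->code_handler(sub {
    my ($line, $line_number, $pod_parser) = @_;
    if ($line =~ /^\s*#\s*TRANSLATORS:\s*(.*\S)/) {
        push @notes, [ $line_number, $1 ];
    }
    return;
});

$parser->parse_string_document($source);

printf "line %d: %s\n", @$_ for @notes;
```

Each note could then be attached to the next POD block that follows it,
which is roughly what xgettext does with tagged comments.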

--
#!/usr/bin/perl -- Russ Allbery, Just Another Perl Hacker
$^=q;@!>~|{>krw>yn{u<$$<[~||<Juukn{=,<S~|}<Jwx}qn{<Yn{u<Qjltn{ > 0gFzD gD,
00Fz, 0,,( 0hF 0g)F/=, 0> "L$/GEIFewe{,$/ 0C$~> "@=,m,|,(e 0.), 01,pnn,y{
rw} >;,$0=q,$,,($_=$^)=~y,$/ C-~><@=\n\r,-~$:-u/ #y,d,s,(\$.),$1,gee,print