CPAN vs. POD outside of .pm (.pl) files?

Ivan Shmakov

Jun 17, 2013, 10:20:10 AM

I see that CPAN automagically extracts the POD documentation out
of the .pm and .pl files and presents it as HTML.

However, I've now decided to split the documentation off the .pm's.
How do I request CPAN to extract my documentation out of
stand-alone POD files instead? (and associate it with the
respective .pm's?)

TIA.

--
FSF associate member #7257 np. Strange Highways -- Dio

Ben Morrow

Jun 17, 2013, 11:12:03 AM

Quoth Ivan Shmakov <onei...@gmail.com>:
> I see that CPAN automagically extracts the POD documentation out
> of the .pm and .pl files and presents it as HTML.

What do you mean by 'CPAN'? The CPAN shell doesn't normally do this. Do
you mean search.cpan.org?

> However, I've now decided to split the documentation off the .pm's.
> How do I request CPAN to extract my documentation out of
> stand-alone POD files instead? (and associate it with the
> respective .pm's?)

search.cpan.org already displays a list of all the .pod files in a
distribution under the 'Documentation' section. If a .pm file has no
Pod, and there is a .pod file next to it, it moves the .pod link up into
the 'Modules' section. See for example Net::SSLeay.

It's probably important to get the NAME section of the Pod right. I
don't exactly know how the search.cpan.org/perldoc?foo links it uses for
L<> work, but I suspect they're indexed based on the NAME section.
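
For reference, a stand-alone .pod conventionally starts along these
lines (Foo::Bar is a made-up name; only the NAME layout matters here):

    =head1 NAME

    Foo::Bar - frob bars and report the results

    =head1 SYNOPSIS

        use Foo::Bar;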

Ben

Ivan Shmakov

Jun 17, 2013, 11:39:34 AM

>>>>> Ben Morrow <b...@morrow.me.uk> writes:
>>>>> Quoth Ivan Shmakov <onei...@gmail.com>:

>> I see that CPAN automagically extracts the POD documentation out of
>> the .pm and .pl files and presents it as HTML.

> What do you mean by 'CPAN'? The CPAN shell doesn't normally do this.
> Do you mean search.cpan.org?

Yes, I meant http://search.cpan.org/ specifically, even
though I inaccurately referenced the whole cpan.org
infrastructure.

I didn't mean cpan(1).

>> However, I've now decided to split the documentation off the .pm's. How
>> do I request CPAN to extract my documentation out of stand-alone POD
>> files instead? (and associate it with the respective .pm's?)

> search.cpan.org already displays a list of all the .pod files in a
> distribution under the 'Documentation' section. If a .pm file has no
> Pod, and there is a .pod file next to it, it moves the .pod link up
> into the 'Modules' section.

(Which makes me wonder: where is it documented?)

> See for example Net::SSLeay.

> It's probably important to get the NAME section of the Pod right. I
> don't exactly know how the search.cpan.org/perldoc?foo links it uses
> for L<> work, but I suspect they're indexed based on the NAME
> section.

ACK, thanks! Hopefully, such indexing won't insist on the use
of a HYPHEN-MINUS (U+002D) there, instead of the arguably more
appropriate EN DASH (U+2013).

(FWIW, http://search.cpan.org/perldoc?Net::SSLeay appears to
work. Yet it uses the conventional HYPHEN-MINUS.)

Ben Morrow

Jun 17, 2013, 4:03:37 PM

Quoth Ivan Shmakov <onei...@gmail.com>:
> >>>>> Ben Morrow <b...@morrow.me.uk> writes:
> >>>>> Quoth Ivan Shmakov <onei...@gmail.com>:
>
> >> I see that CPAN automagically extracts the POD documentation out of
> >> the .pm and .pl files and presents it as HTML.
>
> > What do you mean by 'CPAN'? The CPAN shell doesn't normally do this.
> > Do you mean search.cpan.org?
>
> Yes, I meant http://search.cpan.org/ specifically, even
> though I inaccurately referenced the whole cpan.org
> infrastructure.
>
> I didn't mean cpan(1).

OK. I asked because I think it's possible to configure at least cpanp to
install HTML documentation, and IIRC ActiveState have or used to patch
their CPAN.pm to do the same.

> >> However, I've now decided to split the documentation off the .pm's. How
> >> do I request CPAN to extract my documentation out of stand-alone POD
> >> files instead? (and associate it with the respective .pm's?)
>
> > search.cpan.org already displays a list of all the .pod files in a
> > distribution under the 'Documentation' section. If a .pm file has no
> > Pod, and there is a .pod file next to it, it moves the .pod link up
> > into the 'Modules' section.
>
> (Which makes me wonder: where is it documented?)

I don't think it is. search.cpan.org is not part of the CPAN
infrastructure per se, it was just a useful website written by Graham
Barr which was given a domain under cpan.org. I believe the intention is
that it should index things in the same way as CPAN.pm and perldoc.

> > See for example Net::SSLeay.
>
> > It's probably important to get the NAME section of the Pod right. I
> > don't exactly know how the search.cpan.org/perldoc?foo links it uses
> > for L<> work, but I suspect they're indexed based on the NAME
> > section.
>
> ACK, thanks! Hopefully, such indexing won't insist on the use
> of a HYPHEN-MINUS (U+002D) there, instead of the arguably more
> appropriate EN DASH (U+2013).

I wouldn't muck about with the formatting of the NAME section. pod2man
in particular is quite picky about it, and there are other tools which
rely on the format being right.

Ben

Ivan Shmakov

Jun 27, 2013, 1:42:10 AM

>>>>> Ben Morrow <b...@morrow.me.uk> writes:
>>>>> Quoth Ivan Shmakov <onei...@gmail.com>:
>>>>> Ben Morrow <b...@morrow.me.uk> writes:

[...]

>>> What do you mean by 'CPAN'? The CPAN shell doesn't normally do
>>> this. Do you mean search.cpan.org?

>> Yes, I meant http://search.cpan.org/ specifically, even though
>> I inaccurately referenced the whole cpan.org infrastructure.

>> I didn't mean cpan(1).

> OK. I asked because I think it's possible to configure at least
> cpanp to install HTML documentation, and IIRC ActiveState have or
> used to patch their CPAN.pm to do the same.

That's interesting. Thanks.

[...]

>>> search.cpan.org already displays a list of all the .pod files in a
>>> distribution under the 'Documentation' section. If a .pm file has
>>> no Pod, and there is a .pod file next to it, it moves the .pod link
>>> up into the 'Modules' section.

>> (Which makes me wonder: where is it documented?)

> I don't think it is. search.cpan.org is not part of the CPAN
> infrastructure per se, it was just a useful website written by Graham
> Barr which was given a domain under cpan.org.

Which seems to make it quite a "part," at least for a casual
user like me.

> I believe the intention is that it should index things in the same
> way as CPAN.pm and perldoc.

ACK, thanks.

>>> It's probably important to get the NAME section of the Pod right.
>>> I don't exactly know how the search.cpan.org/perldoc?foo links it
>>> uses for L<> work, but I suspect they're indexed based on the NAME
>>> section.

>> ACK, thanks! Hopefully, such indexing won't insist on the use of a
>> HYPHEN-MINUS (U+002D) there, instead of the arguably more
>> appropriate EN DASH (U+2013).

> I wouldn't muck about with the formatting of the NAME section.

FWIW, http://search.cpan.org/ seems to handle it just fine.
Consider, e. g.:

http://search.cpan.org/perldoc?Tree::Range::base
http://search.cpan.org/perldoc?Tree::Range::RB

Presumably, it just indexes the PODs by the filename.

> pod2man in particular is quite picky about it, and there are other
> tools which rely on the format being right.

As it seems, ExtUtils::MakeMaker assumes (SPACE, HYPHEN-MINUS,
SPACE) for the delimiter while handling ABSTRACT_FROM, and I'm
considering it a bug (yet to be filed.)
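
Here's a minimal Makefile.PL for reference (the module name and
paths are made up):

    use ExtUtils::MakeMaker;

    WriteMakefile(
        NAME          => 'Foo::Bar',
        VERSION_FROM  => 'lib/Foo/Bar.pm',
        # ABSTRACT_FROM scans the named file for a NAME section of
        # the form "Foo::Bar - abstract text", i. e. with the
        # (SPACE, HYPHEN-MINUS, SPACE) delimiter, and takes the text
        # after the dash as the abstract.
        ABSTRACT_FROM => 'lib/Foo/Bar.pod',
    );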

I've not seen any problem with pod2man(1) vs. NAME as of yet.
What should I take note of?

(It appears to assume that --quotes= is a string of two
/octets/, not two /characters,/ though.)

Ben Morrow

Jun 27, 2013, 3:12:00 AM

Quoth Ivan Shmakov <onei...@gmail.com>:
> >>>>> Ben Morrow <b...@morrow.me.uk> writes:
> >>>>> Quoth Ivan Shmakov <onei...@gmail.com>:
>
> >> ACK, thanks! Hopefully, such indexing won't insist on the use of a
> >> HYPHEN-MINUS (U+002D) there, instead of the arguably more
> >> appropriate EN DASH (U+2013).
>
> > I wouldn't muck about with the formatting of the NAME section.
>
> FWIW, http://search.cpan.org/ seems to handle it just fine.
> Consider, e. g.:
>
> http://search.cpan.org/perldoc?Tree::Range::base
> http://search.cpan.org/perldoc?Tree::Range::RB
>
> Presumably, it just indexes the PODs by the filename.

Presumably.

> > pod2man in particular is quite picky about it, and there are other
> > tools which rely on the format being right.
>
> As it seems, ExtUtils::MakeMaker assumes (SPACE, HYPHEN-MINUS,
> SPACE) for the delimiter while handling ABSTRACT_FROM, and I'm
> considering it a bug (yet to be filed.)

It's not a bug. It's part of the syntax of a properly-formatted perldoc.
Pod::Checker looks for a hyphen as well.

As I said, don't muck about with the formatting, there's no point. Note
that Pod::Man (at least) converts that hyphen into the roff escape
sequence for an endash (along with other instances of " - "), so if you
don't get endashes in the output it's because your formatter doesn't
know how to produce them. 'groff -man -Tps' at least will get them
right.

> I've not seen any problem with pod2man(1) vs. NAME as of yet.
> What should I take note of?
>
> (It appears to assume that --quotes= is a string of two
> /octets/, not two /characters,/ though.)

Since the roff emitted by pod2man is normally ASCII-only, what's the
difference?

Ben

Ivan Shmakov

Jun 27, 2013, 4:50:12 AM

>>>>> Ben Morrow <b...@morrow.me.uk> writes:
>>>>> Quoth Ivan Shmakov <onei...@gmail.com>:
>>>>> Ben Morrow <b...@morrow.me.uk> writes:

[...]

>>> pod2man in particular is quite picky about it, and there are other
>>> tools which rely on the format being right.

>> As it seems, ExtUtils::MakeMaker assumes (SPACE, HYPHEN-MINUS,
>> SPACE) for the delimiter while handling ABSTRACT_FROM, and I'm
>> considering it a bug (yet to be filed.)

> It's not a bug. It's part of the syntax of a properly-formatted
> perldoc.

Which I see no reason /not/ to extend.

> Pod::Checker looks for a hyphen as well.

> As I said, don't muck about with the formatting, there's no point.
> Note that Pod::Man (at least) converts that hyphen into the roff
> escape sequence for an endash (along with other instances of " - "),

Frankly, I consider the unconditional replacement of " - " to be
a hack by itself.

Why, I've seen a Usenet poster who'd use groff to format his
messages. Guess what he'd end up with when quoting code?

> so if you don't get endashes in the output it's because your
> formatter doesn't know how to produce them.

Apparently, the HTML formatter at http://search.cpan.org/
doesn't know how to produce EN DASHes, either.

--cut: http://search.cpan.org/perldoc?Digest --
NAME ^-

Digest - Modules that calculate message digests
--cut: http://search.cpan.org/perldoc?Digest --

Note the HYPHEN-MINUS propagated to the resulting HTML.

... Indeed, my first thought was to use DocBook or XHTML for the
documentation right from the start, so to completely avoid all
those 40 years of formatting mess. Somehow, however, I became
assured that persuading http://search.cpan.org/ to allow for
XHTML documentation would be next to impossible a task, which is
why I've ended up following the mainstream.

Not that I'm particularly happy with it.

(A reminder to myself: suggest updates to [1].)

[1] http://www.tldp.org/HOWTO/DocBook-Demystification-HOWTO/

> 'groff -man -Tps' at least will get them right.

Please note that -Tps produces not a document, but a program to
be executed (by a PostScript interpreter, in this case), which
has implications to both security and software freedom.

(Not unlike HTML "adorned" with JavaScript, Java, or
Adobe Flash, which became such a commonplace on the Web.)

Therefore, unless there's a very good reason to use PostScript,
my suggestion would be to always stick to PDF. (Or perhaps SVG,
as long as single-page vector graphics is concerned.)

>> I've not seen any problem with pod2man(1) vs. NAME as of yet. What
>> should I take note of?

>> (It appears to assume that --quotes= is a string of two /octets/,
>> not two /characters,/ though.)

> Since the roff emitted by pod2man is normally ASCII-only, what's the
> difference?

First of all, pod2man(1) supports --utf8. Then, even if
ASCII-only roff code is requested, pod2man(1) should try to
convert the --quotes= characters to the appropriate roff
escapes, just as it's claimed it does for non-ASCII sources:

[...] Many *roff implementations cannot handle non-ASCII
characters, so this means all non-ASCII characters are converted
either to a *roff escape sequence that tries to create a properly
accented character (at least for troff output) or to "X".
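
FWIW, the same options are exposed by the underlying Pod::Man
module. A rough sketch (file names made up):

    use Pod::Man;

    my $parser = Pod::Man->new (utf8   => 1,     # emit UTF-8 output
                                quotes => q{"}); # character(s) for C<> quoting
    $parser->parse_from_file ('lib/Foo/Bar.pod', 'Foo-Bar.3pm');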

Ben Morrow

Jun 27, 2013, 7:32:16 AM

Quoth Ivan Shmakov <onei...@gmail.com>:
> >>>>> Ben Morrow <b...@morrow.me.uk> writes:
> >>>>> Quoth Ivan Shmakov <onei...@gmail.com>:
>
> >> As it seems, ExtUtils::MakeMaker assumes (SPACE, HYPHEN-MINUS,
> >> SPACE) for the delimiter while handling ABSTRACT_FROM, and I'm
> >> considering it a bug (yet to be filed.)
>
> > It's not a bug. It's part of the syntax of a properly-formatted
> > perldoc.
>
> Which I see no reason /not/ to extend.

'I see no reason not to extend the syntax of HTML to allow Unicode
quotes as well as ASCII. They're *so* much prettier.' (cf. xthread.)

> > Pod::Checker looks for a hyphen as well.
>
> > As I said, don't muck about with the formatting, there's no point.
> > Note that Pod::Man (at least) converts that hyphen into the roff
> > escape sequence for an endash (along with other instances of " - "),
>
> Frankly, I consider the unconditional replacement of " - " to be
> a hack by itself.

Pod is, by design, a somewhat loosely-specified format, mostly or
(originally) entirely in ASCII, which relies on the formatter to make
things look pretty where that's necessary. The format has been tightened
up a little recently (it's no longer considered appropriate for the
formatter to turn random references like 'printf(3)' into L<>, for
instance), but this sort of intuition about punctuation is entirely
expected. Inconsistency is also, necessarily, expected.

> > so if you don't get endashes in the output it's because your
> > formatter doesn't know how to produce them.
>
> Apparently, the HTML formatter at http://search.cpan.org/
> doesn't know how to produce EN DASHes, either.

Apparently not. Apparently whoever wrote the relevant bit of code didn't
think it was terribly important.

> ... Indeed, my first thought was to use DocBook or XHTML

Ewww, yuck. Formats designed by and for pedants.

> for the
> documentation right from the start, so as to completely avoid all
> those 40 years of formatting mess. Somehow, however, I became
> assured that persuading http://search.cpan.org/ to allow for
> XHTML documentation would be next to impossible a task, which is
> why I've ended up following the mainstream.
>
> Not that I'm particularly happy with it.

Heh. I can just picture the conversation... Not to mention that
command-line perldoc would no longer function, making your modules
unusable.

> > 'groff -man -Tps' at least will get them right.
>
> Please note that -Tps produces not a document, but a program to
> be executed (by a PostScript interpreter, in this case), which
> has implications to both security

That is a relevant concern under some circumstances; this is not one of
them. (I have, somewhat reluctantly, moved to using PDF instead of
PostScript almost exclusively. I like PostScript: it's comfortingly
insane. (The same could be said about Perl.))

> and software freedom.

Don't be so ridiculous.

> Therefore, unless there's a very good reason to use PostScript,
> my suggestion would be to always stick to PDF. (Or perhaps SVG,
> as long as single-page vector graphics is concerned.)

In this particular case I had a rather good reason: my version of groff
doesn't have a -Tpdf device.

> >> I've not seen any problem with pod2man(1) vs. NAME as of yet. What
> >> should I take note of?
>
> >> (It appears to assume that --quotes= is a string of two /octets/,
> >> not two /characters,/ though.)
>
> > Since the roff emitted by pod2man is normally ASCII-only, what's the
> > difference?
>
> First of all, pod2man(1) supports --utf8.

That's new(ish), and not particularly well-supported.

> Then, even if
> ASCII-only roff code is requested, pod2man(1) should try to
> convert the --quotes= characters to the appropriate roff
> escapes, just as it's claimed it does for non-ASCII sources:

The characters passed to --quotes aren't the characters as they will
appear in the output, they are roff escapes. (I don't really understand
roff, but the characters you pass are inserted directly into a .ds
line.) Remember, all this stuff comes from perl 5.000, before perl (or
groff, I expect) had any sort of Unicode support.

Ben

Ivan Shmakov

Jun 27, 2013, 8:48:53 AM

>>>>> Ben Morrow <b...@morrow.me.uk> writes:
>>>>> Quoth Ivan Shmakov <onei...@gmail.com>:
>>>>> Ben Morrow <b...@morrow.me.uk> writes:
>>>>> Quoth Ivan Shmakov <onei...@gmail.com>:

>>>> As it seems, ExtUtils::MakeMaker assumes (SPACE, HYPHEN-MINUS,
>>>> SPACE) for the delimiter while handling ABSTRACT_FROM, and I'm
>>>> considering it a bug (yet to be filed.)

>>> It's not a bug. It's part of the syntax of a properly-formatted
>>> perldoc.

>> Which I see no reason /not/ to extend.

> 'I see no reason not to extend the syntax of HTML to allow Unicode
> quotes as well as ASCII. They're *so* much prettier.' (cf. xthread.)

The HTML syntax /allows/ Unicode quotes. Inside the payload,
that is. (Which the text of the NAME section certainly is.)

The good thing about HTML is that it doesn't try to parse
anything outside of (roughly) the <tags /> and &entities;.

Ever.

[...]

>> Frankly, I consider the unconditional replacement of " - " to be a
>> hack by itself.

> Pod is, by design, a somewhat loosely-specified format, mostly or
> (originally) entirely in ASCII, which relies on the formatter to make
> things look pretty where that's necessary. The format has been
> tightened up a little recently (it's no longer considered appropriate
> for the formatter to turn random references like 'printf(3)' into
> L<>, for instance), but this sort of intuition about punctuation is
> entirely expected. Inconsistency is also, necessarily, expected.

... And so is unpredictability.

>>> so if you don't get endashes in the output it's because your
>>> formatter doesn't know how to produce them.

>> Apparently, the HTML formatter at http://search.cpan.org/ doesn't
>> know how to produce EN DASHes, either.

> Apparently not. Apparently whoever wrote the relevant bit of code
> didn't think it was terribly important.

But I do. So, assuming that my intent is to provide quality
documentation (both the contents and the form) for the users of
the software I develop, should I satisfy the NAME convention, at
the cost of having to host the /proper/ HTML renditions of the
documentation by myself? Or should I instead disregard the
convention -- used by the developer's tools I won't use myself
anyway, -- to ensure that certain well-known Web resource will
have the documentation rendered properly?

>> ... Indeed, my first thought was to use DocBook or XHTML

> Ewww, yuck. Formats designed by and for pedants.

... Remind me not to ask you about TEI, then...

>> for the documentation right from the start, so as to completely avoid
>> all those 40 years of formatting mess. Somehow, however, I became
>> assured that persuading http://search.cpan.org/ to allow for XHTML
>> documentation would be next to impossible a task, which is why I've
>> ended up following the mainstream.

>> Not that I'm particularly happy with it.

> Heh. I can just picture the conversation... Not to mention that
> command-line perldoc would no longer function, making your modules
> unusable.

"Command-line perldoc"? What's it?

>>> 'groff -man -Tps' at least will get them right.

>> Please note that -Tps produces not a document, but a program to be
>> executed (by a PostScript interpreter, in this case), which has
>> implications to both security

> That is a relevant concern under some circumstances; this is not one
> of them.

It's a valid concern whenever the code comes from a generally
untrusted source. Such as from a Web site its author put it to.
(Which is how the documentation for free software packages is
often distributed.)

> (I have, somewhat reluctantly, moved to using PDF instead of
> PostScript almost exclusively. I like PostScript: it's comfortingly
> insane. (The same could be said about Perl.))

Still, I don't quite understand why one might want to use an
ad-hoc graphics language, when there're general-purpose ones,
with a number of graphics libraries to choose from? (And that
includes Perl, BTW.)

Pretty much the same applies to the ad-hoc formatter languages,
such as roff or TeX. Or to the usual hacks, like having the
"document conversion" chain run as follows:

document.foo -> (conversion) -> program -> (interpreter) -> document.bar

>> and software freedom.

> Don't be so ridiculous.

Well, looking at the license the software I use to produce
PostScript may attach to the pieces of the code which end up in
the resulting "document" isn't exactly the thing I'd like to
spend my time on.

The same applies to the license for the JavaScript code the Web
sites I visit employ. Which is one more reason to prefer Lynx.

>> Therefore, unless there's a very good reason to use PostScript, my
>> suggestion would be to always stick to PDF. (Or perhaps SVG, as
>> long as single-page vector graphics is concerned.)

> In this particular case I had a rather good reason: my version of
> groff doesn't have a -Tpdf device.

To me, it looks much more like a very good reason to update the
particular groff install.

[...]

>>>> (It appears to assume that --quotes= is a string of two /octets/,
>>>> not two /characters,/ though.)

>>> Since the roff emitted by pod2man is normally ASCII-only, what's
>>> the difference?

>> First of all, pod2man(1) supports --utf8.

> That's new(ish), and not particularly well-supported.

The more the effort, the better the support. And
identifying (and reporting) bugs is part of such an effort.

(Unless the agreement would be to just drop POD altogether, and
move on to the better tools. Which I still hope for, even
understanding all the improbability of such a decision.)

>> Then, even if ASCII-only roff code is requested, pod2man(1) should
>> try to convert the --quotes= characters to the appropriate roff
>> escapes, just as it's claimed it does for non-ASCII sources:

> The characters passed to --quotes aren't the characters as they will
> appear in the output, they are roff escapes. (I don't really
> understand roff, but the characters you pass are inserted directly
> into a .ds line.)

Which means that it may require a more thorough code change to
fix the issue. (Thanks for the pointer, BTW.)

> Remember, all this stuff comes from perl 5.000, before perl (or
> groff, I expect) had any sort of Unicode support.

How could this justify having the bug remain unfixed?

Ben Morrow

Jun 27, 2013, 9:53:41 AM

Quoth Ivan Shmakov <onei...@gmail.com>:
> >>>>> Ben Morrow <b...@morrow.me.uk> writes:
> >>>>> Quoth Ivan Shmakov <onei...@gmail.com>:
> >>>>> Ben Morrow <b...@morrow.me.uk> writes:
> >>>>> Quoth Ivan Shmakov <onei...@gmail.com>:
>
> >>>> As it seems, ExtUtils::MakeMaker assumes (SPACE, HYPHEN-MINUS,
> >>>> SPACE) for the delimiter while handling ABSTRACT_FROM, and I'm
> >>>> considering it a bug (yet to be filed.)
>
> >>> It's not a bug. It's part of the syntax of a properly-formatted
> >>> perldoc.
>
> >> Which I see no reason /not/ to extend.
>
> > 'I see no reason not to extend the syntax of HTML to allow Unicode
> > quotes as well as ASCII. They're *so* much prettier.' (cf. xthread.)
>
> The HTML syntax /allows/ Unicode quotes. Inside the payload,
> that is. (Which the text of the NAME section certainly is.)

No, it isn't. That's exactly my point. The content of the NAME section
is syntax, and it needs to look like

<title> - <abstract>

with an ASCII hyphen. You may not like this, but that's the way it is.

> >>> so if you don't get endashes in the output it's because your
> >>> formatter doesn't know how to produce them.
>
> >> Apparently, the HTML formatter at http://search.cpan.org/ doesn't
> >> know how to produce EN DASHes, either.
>
> > Apparently not. Apparently whoever wrote the relevant bit of code
> > didn't think it was terribly important.
>
> But I do. So, assuming that my intent is to provide quality
> documentation (both the contents and the form) for the users of
> the software I develop, should I satisfy the NAME convention, at
> the cost of having to host the /proper/ HTML renditions of the
> documentation by myself? Or should I instead disregard the
> convention -- used by the developer's tools I won't use myself
> anyway, -- to ensure that certain well-known Web resource will
> have the documentation rendered properly?

The former. The assumption about NAME formatting is widespread, and you
don't know what types of systems people might be using your module on or
what sort of Pod-formatting tools they might have. Portability is more
important than typographical niceties.

> >> for the documentation right from the start, so as to completely avoid
> >> all those 40 years of formatting mess. Somehow, however, I became
> >> assured that persuading http://search.cpan.org/ to allow for XHTML
> >> documentation would be next to impossible a task, which is why I've
> >> ended up following the mainstream.
>
> >> Not that I'm particularly happy with it.
>
> > Heh. I can just picture the conversation... Not to mention that
> > command-line perldoc would no longer function, making your modules
> > unusable.
>
> "Command-line perldoc"? What's it?

Are you serious? Run 'perldoc Pod::Man' from your shell prompt.

More generally, providing HTML documentation means you *only* provide
HTML documentation. Pod can be converted to a great many formats (that's
the point).

> >>> 'groff -man -Tps' at least will get them right.
>
> >> Please note that -Tps produces not a document, but a program to be
> >> executed (by a PostScript interpreter, in this case), which has
> >> implications to both security
>
> > That is a relevant concern under some circumstances; this is not one
> > of them.
>
> It's a valid concern whenever the code comes from a generally
> untrusted source. Such as from a Web site its author put it to.
> (Which is how the documentation for free software packages is
> often distributed.)

Perl documentation distributed in any format other than Pod is
worthless, since perldoc can't find it.

> > (I have, somewhat reluctantly, moved to using PDF instead of
> > PostScript almost exclusively. I like PostScript: it's comfortingly
> > insane. (The same could be said about Perl.))
>
> Still, I don't quite understand why one might want to use an
> ad-hoc graphics language, when there're general-purpose ones,
> with a number of graphics libraries to choose from? (And that
> includes Perl, BTW.)

Because my printer speaks PostScript? (Actually it doesn't, but
historically that was the reason for using it.)

> >> and software freedom.
>
> > Don't be so ridiculous.
>
> Well, looking at the license the software I use to produce
> PostScript may attach to the pieces of the code which end up in
> the resulting "document" isn't exactly the thing I'd like to
> spend my time on.

Me either. What makes you think the Turing-completeness of the language
used makes any difference to that, though? I very much doubt any such
licences are enforceable, in any case; certainly not if you're just using
the document as a document, and not trying to pick it apart and use the
bits to write your own PostScript driver.

> >> Therefore, unless there's a very good reason to use PostScript, my
> >> suggestion would be to always stick to PDF. (Or perhaps SVG, as
> >> long as single-page vector graphics is concerned.)
>
> > In this particular case I had a rather good reason: my version of
> > groff doesn't have a -Tpdf device.
>
> To me, it looks much more like a very good reason to update the
> particular groff install.

That would be the FreeBSD base system. The groff there is not going to
be updated, it's going to be replaced, because newer groffs are GPLv3.

> (Unless the agreement would be to just drop POD altogether, and
> move on to the better tools. Which I still hope for, even
> understanding all the improbability of such a decision.)

We're not going to do that. We *like* Pod. Personally, when I'm writing
technical documentation, I would write in Pod for choice. I appreciate
its lack of clutter.

> >> Then, even if ASCII-only roff code is requested, pod2man(1) should
> >> try to convert the --quotes= characters to the appropriate roff
> >> escapes, just as it's claimed it does for non-ASCII sources:
>
> > The characters passed to --quotes aren't the characters as they will
> > appear in the output, they are roff escapes. (I don't really
> > understand roff, but the characters you pass are inserted directly
> > into a .ds line.)
>
> Which means that it may require a more thorough code change to
> fix the issue. (Thanks for the pointer, BTW.)

...no, it means that changing it would break backcompat, so it's
probably not going to happen. (Not to mention that the whole question of
dealing with non-ASCII command-line arguments is largely unsolved on
non-Win32 systems.)

> > Remember, all this stuff comes from perl 5.000, before perl (or
> > groff, I expect) had any sort of Unicode support.
>
> How could this justify having the bug remain unfixed?

Because no one in a position to change it thinks it's a bug, or thinks
it's worth fixing? We're talking about pod2man here; I doubt anyone's
run it with --quotes for a very long time.

Ben

Ivan Shmakov

Jun 27, 2013, 11:20:08 AM

>>>>> Ben Morrow <b...@morrow.me.uk> writes:
>>>>> Quoth Ivan Shmakov <onei...@gmail.com>:
>>>>> Ben Morrow <b...@morrow.me.uk> writes:

[...]

>>> The characters passed to --quotes aren't the characters as they
>>> will appear in the output, they are roff escapes. (I don't really
>>> understand roff, but the characters you pass are inserted directly
>>> into a .ds line.)

>> Which means that it may require a more thorough code change to fix
>> the issue. (Thanks for the pointer, BTW.)

> ... no, it means that changing it would break backcompat, so it's
> probably not going to happen.

How would it? Currently, the use of --quotes= with arguments
other than the two-octet ones and "none" results in an error.
Having pod2man interpret multi-octet sequences as two-character
ones looks like a "pure" extension.

Besides, that appears to contradict your own claim below that
"pod2man [is not being run] with --quotes for a very long time."

> (Not to mention that the whole question of dealing with non-ASCII
> command-line arguments is largely unsolved on non-Win32 systems.)

On POSIX systems, I'd expect non-ASCII command-line arguments to
be passed as octet sequences, in the encoding specified by the
LC_CTYPE category.

It has worked for me so far, BTW.
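
Decoding them accordingly on the Perl side is straightforward
enough (a rough sketch; error handling omitted):

    use POSIX qw(setlocale LC_CTYPE);
    use I18N::Langinfo qw(langinfo CODESET);
    use Encode qw(decode);

    setlocale (LC_CTYPE, '');          # adopt the locale from the environment
    my $codeset = langinfo (CODESET);  # e. g. "UTF-8"
    my @argv    = map { decode ($codeset, $_) } @ARGV;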

>>> Remember, all this stuff comes from perl 5.000, before perl (or
>>> groff, I expect) had any sort of Unicode support.

>> How could this justify having the bug remain unfixed?

> Because no one in a position to change it thinks it's a bug, or thinks
> it's worth fixing? We're talking about pod2man here; I doubt
> anyone's run it with --quotes for a very long time.

Which appears to make the compatibility concerns irrelevant.

Ivan Shmakov

Jun 27, 2013, 11:25:19 AM

>>>>> Ben Morrow <b...@morrow.me.uk> writes:
>>>>> Quoth Ivan Shmakov <onei...@gmail.com>:
>>>>> Ben Morrow <b...@morrow.me.uk> writes:
>>>>> Quoth Ivan Shmakov <onei...@gmail.com>:

[...]

>> The HTML syntax /allows/ Unicode quotes. Inside the payload, that
>> is. (Which the text of the NAME section certainly is.)

> No, it isn't. That's exactly my point. The content of the NAME
> section is syntax, and it needs to look like

> <title> - <abstract>

> with an ASCII hyphen. You may not like this, but that's the way it
> is.

I don't like it, and it won't happen on my Pods.

It's free software, though. Anyone's free to take the Pods and
improve (or "improve") them as one sees fit.

[...]

>> But I do. So, assuming that my intent is to provide quality
>> documentation (both the contents and the form) for the users of the
>> software I develop, should I satisfy the NAME convention, at the
>> cost of having to host the /proper/ HTML renditions of the
>> documentation by myself? Or should I instead disregard the
>> convention -- used by the developer's tools I won't use myself
>> anyway, -- to ensure that certain well-known Web resource will have
>> the documentation rendered properly?

> The former. The assumption about NAME formatting is widespread, and
> you don't know what types of systems people might be using your
> module on or what sort of Pod-formatting tools they might have.
> Portability is more important than typographical niceties.

Well, let's see if there'd be any actual bug reports...

[...]

>>> Heh. I can just picture the conversation... Not to mention that
>>> command-line perldoc would no longer function, making your modules
>>> unusable.

>> "Command-line perldoc"? What's it?

> Are you serious? Run 'perldoc Pod::Man' from your shell prompt.

$ perldoc Pod::Man
You need to install the perl-doc package to use this program.
$

So?

But if you mean that perldoc is just a fancy way to extract the
Pods out of the sources, convert them into executable roff code,
interpret them with roff, and produce a kind of "extended" ASCII
document, -- then I'd like to note that the intermediate roff
code is already present on my system, and M-x woman in Emacs
shows it without actually executing it with a roff interpreter.

> More generally, providing HTML documentation means you *only* provide
> HTML documentation.

It doesn't. Arguably, a profile of even the good old
HTML 4.0 Strict that matches the expressiveness of Pod would be
easier to convert into a variety of formats than Pod itself.

However, I understand that it's a common misconception.
Hopefully, I'll be able to prepare some counter-examples for the
SFD this September.

> Pod can be converted to a great many formats (that's the point).

So can DocBook.

[...]

>>>> Please note that -Tps produces not a document, but a program to
>>>> be executed (by a PostScript interpreter, in this case), which
>>>> has implications to both security

>>> That is a relevant concern under some circumstances; this is not
>>> one of them.

>> It's a valid concern whenever the code comes from a generally
>> untrusted source. Such as from a Web site its author put it to.
>> (Which is how the documentation for free software packages is often
>> distributed.)

> Perl documentation distributed in any format other than Pod is
> worthless, since perldoc can't find it.

? How does this relate to my suggestion to avoid PostScript?

>>> (I have, somewhat reluctantly, moved to using PDF instead of
>>> PostScript almost exclusively. I like PostScript: it's
>>> comfortingly insane. (The same could be said about Perl.))

>> Still, I don't quite understand why one might want to use an ad-hoc
>> graphics language, when there're general-purpose ones, with a number
>> of graphics libraries to choose from? (And that includes Perl,
>> BTW.)

> Because my printer speaks PostScript? (Actually it doesn't, but
> historically that was the reason for using it.)

Nowadays, the majority of printers speak neither PostScript nor
PDF. It's the host the printer's attached to that does. And
I'd argue that the host speaks PDF more often than PostScript.

>>>> and software freedom.

>>> Don't be so ridiculous.

>> Well, looking at the license the software I use to produce
>> PostScript may attach to the pieces of the code which end up in the
>> resulting "document" isn't exactly the thing I'd like to spend my
>> time on.

> Me either. What makes you think the Turing-completeness of the
> language used makes any difference to that, though?

It doesn't. Yet I fail to understand for what purpose the software
would embed some "hidden" non-code /creative/ work of its
developer into the resulting document.

> I very much doubt any such licences are enforceable, in any case;

Perhaps; IANAL.

> certainly not if you're just using the document as a document, and
> not trying to pick it apart and use the bits to write your own
> PostScript driver.

If /I/ release the document in question under, say, CC BY-SA,
how is the recipient (licensee) expected to know that some parts
of the document's own digital representation are in fact under
some other license, issued by a third party?

[...]

>> To me, it looks much more like a very good reason to update the
>> particular groff install.

> That would be the FreeBSD base system. The groff there is not going
> to be updated, it's going to be replaced, because newer groffs are
> GPLv3.

I was unaware that the FreeBSD developers are opposing GPLv3.
And why do they, BTW? (It was always my impression that FreeBSD
is more lax regarding the licenses than, say, Debian.)

Anyway, isn't it possible to install an (additional) groff
instance from the ports?

>> (Unless the agreement would be to just drop POD altogether, and move
>> on to the better tools. Which I still hope for, even understanding
>> all the improbability of such a decision.)

> We're not going to do that. We *like* Pod.

Slightly tangential to the discussion is the question whether
the "total manpower" of /we/ is currently rising or diminishing?

> Personally, when I'm writing technical documentation, I would write
> in Pod for choice. I appreciate its lack of clutter.

... And also predictability, structure, the ease to run
structured searches against, etc.?

[...]

Tim McDaniel

Jun 27, 2013, 2:43:08 PM

In article <0fkt9a-...@anubis.morrow.me.uk>,
Ben Morrow <b...@morrow.me.uk> wrote:
>Formats designed by and for pedants.

"You say that like it's a bad thing."

--
Tim McDaniel, tm...@panix.com

Ben Morrow

Jun 27, 2013, 5:21:29 PM

Quoth Ivan Shmakov <onei...@gmail.com>:
> >>>>> Ben Morrow <b...@morrow.me.uk> writes:
> >>>>> Quoth Ivan Shmakov <onei...@gmail.com>:
>
> >> "Command-line perldoc"? What's it?
>
> > Are you serious? Run 'perldoc Pod::Man' from your shell prompt.
>
> $ perldoc Pod::Man
> You need to install the perl-doc package to use this program.

Your perl install is broken; it has been mangled by over-zealous
packagers looking to save a few bytes. You need to find which magic
package to install to give you the whole thing.

> But if you mean that perldoc is just a fancy way to extract the
> Pods out of the sources, convert them into executable roff code,
> interpret them with roff, and produce a kind of "extended" ASCII
> document, -- then I'd like to note that the intermediate roff
> code is already present on my system, and M-x woman in Emacs
> shows it without actually executing it with a roff interpreter.

OK, if that works for you. A lot of Perl programmers use perldoc, and
rely on it working properly, so providing documentation perldoc can't
find is not helpful.

[PostScript]
> >>
> >> It's a valid concern whenever the code comes from a generally
> >> untrusted source. Such as from a Web site its author put it to.
> >> (Which is how the documentation for free software packages is often
> >> distributed.)
>
> > Perl documentation distributed in any format other than Pod is
> > worthless, since perldoc can't find it.
>
> ? How does this relate to my suggestion to avoid PostScript?

Since Perl documentation is distributed in Pod format, the only reason
you would have it in PS is because you have generated it yourself,
presumably with tools you trust not to put anything malicious in the
output.

> >> Well, looking at the license the software I use to produce
> >> PostScript may attach to the pieces of the code which end up in the
> >> resulting "document" isn't exactly the thing I'd like to spend my
> >> time on.
>
> > Me either. What makes you think the Turing-completeness of the
> > language used makes any difference to that, though?
>
> It doesn't. Yet I fail to understand for what purpose the software
> would embed some "hidden" non-code /creative/ work of its
> developer into the resulting document.

Fonts? And they usually have quite restrictive licences, from a 'reusing
in other documents' point of view.

> > I very much doubt any such licences are enforceable, in any case;
>
> Perhaps; IANAL.
>
> > certainly not if you're just using the document as a document, and
> > not trying to pick it apart and use the bits to write your own
> > PostScript driver.
>
> If /I/ release the document in question under, say, CC BY-SA,
> how is the recipient (licensee) expected to know that some parts
> of the document's own digital representation are in fact under
> some other license, issued by a third party?

By applying common sense. Your copyright and therefore your licence
applies to what you created, that is, to the content of the document.

> >> To me, it looks much more like a very good reason to update the
> >> particular groff install.
>
> > That would be the FreeBSD base system. The groff there is not going
> > to be updated, it's going to be replaced, because newer groffs are
> > GPLv3.
>
> I was unaware that the FreeBSD developers are opposing GPLv3.
> And why do they, BTW? (It was always my impression that FreeBSD
> is more lax regarding the licenses than, say, Debian.)

I don't know the details; all I know is the GPLv3 has been deemed too
restrictive for the base system. Unlike Debian, the ports collection
still contains software with all licences, open-source and commercial;
though of course the prebuilt packages are only available where the
licence allows it.

> Anyway, isn't it possible to install an (additional) groff
> instance from the ports?

Why would I want to do that? The only thing I use groff for is to read
manpages. Why would I want them as PDFs? (Or as PostScript, for that
matter; I was only using it as an example of a format I expected to get
the endashes right.)

> >> (Unless the agreement would be to just drop POD altogether, and move
> >> on to the better tools. Which I still hope for, even understanding
> >> all the improbability of such a decision.)
>
> > We're not going to do that. We *like* Pod.
>
> Slightly tangential to the discussion is the question whether
> the "total manpower" of /we/ is currently rising or diminishing?

If Perl is dying (which it isn't, by the way), it's not because we don't
use DocBook.

> > Personally, when I'm writing technical documentation, I would write
> > in Pod for choice. I appreciate its lack of clutter.
>
> ... And also predictability, structure, the ease to run
> structured searches against, etc.?

No, mostly I appreciate its lack of clutter. Pod makes writing
documentation no harder than writing comments, which means I actually do
write documentation, even for code I don't think anyone else will ever
see. This is a Good Thing.

(Besides, what do I need structured searches for? I've got ack.)

Ben

Martijn Lievaart

Jun 28, 2013, 6:38:58 AM

On Thu, 27 Jun 2013 22:21:29 +0100, Ben Morrow wrote:

> Quoth Ivan Shmakov <onei...@gmail.com>:
>> >>>>> Ben Morrow <b...@morrow.me.uk> writes: Quoth Ivan Shmakov
>> >>>>> <onei...@gmail.com>:
>>
>> >> "Command-line perldoc"? What's it?
>>
>> > Are you serious? Run 'perldoc Pod::Man' from your shell prompt.
>>
>> $ perldoc Pod::Man You need to install the perl-doc package to use this
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>> program.
>
> Your perl install is broken; it has been mangled by over-zealous
> packagers looking to save a few bytes. You need to find which magic
> package to install to give you the whole thing.

Maybe that magic package would be perl-doc? :-)

M4

Ivan Shmakov

Jun 28, 2013, 7:58:38 AM

>>>>> Ben Morrow <b...@morrow.me.uk> writes:
>>>>> Quoth Ivan Shmakov <onei...@gmail.com>:
>>>>> Ben Morrow <b...@morrow.me.uk> writes:
>>>>> Quoth Ivan Shmakov <onei...@gmail.com>:

[...]

>>>> Well, looking at the license the software I use to produce
>>>> PostScript may attach to the pieces of the code which end up in
>>>> the resulting "document" isn't exactly the thing I'd like to spend
>>>> my time on.

>>> Me either. What makes you think the Turing-completeness of the
>>> language used makes any difference to that, though?

>> It doesn't. Yet I fail to understand the purpose the software may
>> embed some "hidden" non-code /creative/ work of its developer into
>> the resulting document.

> Fonts? And they usually have quite restrictive licences, from a
> 'reusing in other documents' point of view.

Indeed. Though in practice, I fail to recall using any fonts
that fail to meet DFSG for my documents recently.

[...]

>>> certainly not if you're just using the document as a document, and
>>> not trying to pick it apart and use the bits to write your own
>>> PostScript driver.

>> If /I/ release the document in question under, say, CC BY-SA, how
>> is the recipient (licensee) expected to know that some parts of the
>> document's own digital representation are in fact under some other
>> license, issued by a third party?

> By applying common sense. Your copyright and therefore your licence
> applies to what you created, that is, to the content of the document.

And how is the recipient intended to know that?

The same applies to the other kinds of works, though. But for
the images and such, I'd expect the copyrights to be properly
stated in the /visible/ part of the document. For the fonts, it
may seem a bit excessive. Still, for the embedded code parts,
-- I wouldn't expect it to happen at all.

Ivan Shmakov

Jun 28, 2013, 8:04:45 AM

>>>>> Ben Morrow <b...@morrow.me.uk> writes:
>>>>> Quoth Ivan Shmakov <onei...@gmail.com>:
>>>>> Ben Morrow <b...@morrow.me.uk> writes:
>>>>> Quoth Ivan Shmakov <onei...@gmail.com>:

>>>> "Command-line perldoc"? What's it?

>>> Are you serious? Run 'perldoc Pod::Man' from your shell prompt.

>> $ perldoc Pod::Man

>> You need to install the perl-doc package to use this program.

> Your perl install is broken;

"It's not a bug."

> it has been mangled by over-zealous packagers looking to save a few
> bytes. You need to find which magic package to install to give you
> the whole thing.

Fortunately, with the error message quoted above, the packagers
already made that a trivial thing to do.

JFTR: the purpose of such splits is to avoid the installation of
the documentation on the hosts that merely /run/ existing code,
and aren't used for the actual development. The installs I use
for Perl development /ought/ to contain "perl-doc," but
I forgot to do it for this particular one, and I'm not quite
inclined to touch it in the foreseeable future. In the
meantime, I use http://search.cpan.org/perldoc? and
http://perldoc.perl.org/. (Besides, Lynx is almost as fancy as
the Emacs' own WoMan browser.)

>> But if you mean that perldoc is just a fancy way to extract the Pods
>> out of the sources, convert them into executable roff code,
>> interpret them with roff, and produce a kind of "extended" ASCII
>> document, -- then I'd like to note that the intermediate roff code
>> is already present on my system, and M-x woman in Emacs shows it
>> without actually executing it with a roff interpreter.

> OK, if that works for you. A lot of Perl programmers use perldoc,
> and rely on it working properly, so providing documentation perldoc
> can't find is not helpful.

Nowhere have I opposed the use of perldoc. What I contest is
the use of perldoc as the ultimate judge of the documentation
format.

To put it another way: if perldoc can't find the documentation,
-- it's clearly a bug. But it's not necessarily a bug in the
/documentation/ itself.

That being said, I'm (obviously) not familiar with perldoc.
Thus, I'm curious if there's any compelling reason for perldoc
/not/ to support, say, DocBook (or DITA, HTML, TEI, etc.)?

[...]

>> Anyway, isn't it possible to install an (additional) groff instance
>> from the ports?

> Why would I want to do that? The only thing I use groff for is to
> read manpages.

ACK. Although I've found some other uses for groff on
occasions, as I've stated before, I don't usually use it for
reading the manpages, either.

[...]

>>>> (Unless the agreement would be to just drop POD altogether, and
>>>> move on to the better tools. Which I still hope for, even
>>>> understanding all the improbability of such a decision.)

>>> We're not going to do that. We *like* Pod.

>> Slightly tangential to the discussion is the question whether the
>> "total manpower" of /we/ is currently rising or diminishing?

> If Perl is dying (which it isn't, by the way), it's not because we
> don't use DocBook.

I didn't assert either of those. (Even though there're several
definitions of "dying," I guess I understand what you mean.)

My point is that I know of no reason for a programmer looking
for a new language to learn to choose Perl, and I'm not actually
seeing a lot of newcomers joining this group lately, either.
(As compared to, say, news:comp.lang.javascript and
news:comp.lang.python... Why, should they get a Perl course on
Coursera, wouldn't it be rightful to call it "The glorious, and
overly long, history of the Perl programming language", or
something like that?)

The other point to note is that even though I've been using Perl for
almost a decade and a half now (on and off), I still can't make
head or tail of it at times. By contrast, while I have put
virtually no effort into learning Python whatsoever, I seem to
understand the code written in it quite well.

So, there're two reasons for me to stick to Perl. First of all,
it has a rich set of (quality) libraries (although Go, and
perhaps Python, Racket, etc. may surpass it in the near future,
if they haven't already), which appear to cover my demands well.

The other reason is that Perl isn't of the "one size fits all"
type. Contrast it with Python ("one indentation fits all"), or
Racket ("one package format fits all"), or Go ("one
documentation format fits all")...

... Or is it?

Rainer Weikusat

Jun 28, 2013, 9:29:49 AM

Ben Morrow <b...@morrow.me.uk> writes:
> Quoth Ivan Shmakov <onei...@gmail.com>:
>> >>>>> Ben Morrow <b...@morrow.me.uk> writes:
>> >>>>> Quoth Ivan Shmakov <onei...@gmail.com>:
>>
>> >> "Command-line perldoc"? What's it?
>>
>> > Are you serious? Run 'perldoc Pod::Man' from your shell prompt.
>>
>> $ perldoc Pod::Man
>> You need to install the perl-doc package to use this program.
>
> Your perl install is broken; it has been mangled by over-zealous
> packagers looking to save a few bytes. You need to find which magic
> package to install to give you the whole thing.

"Please note that whatever precisely constitutes 'the whole thing'
might be subject to change with little or no advance warning based on
what 'certain developers' presently do or don't consider fashionable",
possibly based on outright irrational reasons (or intentionally
disingenuous mock arguments. It is not always possible to distinguish
which is which) such as

I find no small irony in a few posts asking whether there's
any reason to use Moose or Mouse or Moo when you can write
your own object accessors by hand (I wrote my own templating
system by hand too. No more.)

Side remark: it should be noted that the suspicion that what was
supposed to be the be-all and end-all of YARFPOO has already again
'logically' splintered into three different semi-compatible
implementations with mutually exclusive design goals is correct. More
to follow as time goes by and old non-solutions to non-problems are
abandoned because their non-maintainers get bored with them and people
discover more aspects in which all of the existing YARFPOOs are deficient
in this or that way and hence, set forth to - once and for all this
time! - nail the jello to the tree in the perfect way 'from scratch'
all over again. Structural similarities to daily soap opera
installments are unintentional but very likely not accidental.

I wonder if somebody ever really asked such a thoroughly stupid question
when considering the scope of the moose mouse that mooed versus
'writing accessors'. This looks suspiciously like a strawman. 'writing
accessors' is not a particularly sensible activity in its own right:
Objects should provide behaviour and not hierarchically structured
storage. If a particular 'behaviour' can sensibly be abstracted away
from a specific way of handling state information, it shouldn't be
tied to one: The implementation should be capable of working with all
kinds of objects providing a compatible interface and not be part of
any of them. Comparing 'writing accessors' to 'writing a template
system' is again totally bogus: The tasks are of vastly differing
technical complexity.

As a rule of thumb, people resort to sophisms when marketing their
opinions when they can't think of any better way to further their
causes. This may be because they themselves know that they are wrong
or - considering that 'web development' is what the guy who runs the
advertising agency dabbles in - because they're really marketing
specialists to whom all this 'programming stuff' is part of the cost
of displaying advertisements they'd rather (and totally rationally
in this case) want to get rid of.

The problem with this is that computers do more (and much more
sophisticated) things than "being your plastic pal who's fun to be
with" and what minimizes the per-case workload of the guy who decides
on the colour said 'plastic pal' should have today (and hence, helps
him to maximize his income for a given time period) might not be the
most sensible way to construct 24x7 autonomously operating software
system people rely on in order to get some (not inherently
computer-related) job done.

Ivan Shmakov

Jun 28, 2013, 9:45:05 AM

>>>>> Rainer Weikusat <rwei...@mssgmbh.com> writes:

[...]

> As a rule of thumb, people resort to sophisms when marketing their
> opinions when they can't think of any better way to further their
> causes. This may be because they themselves know that they are wrong
> or - considering that 'web development' is what the guy who runs the
> advertising agency dabbles in - because they're really marketing
> specialists to whom all this 'programming stuff' is part of the cost
> of displaying advertisements they'd rather (and totally rationally in
> this case) want to get rid of.

> The problem with this is that computers do more (and much more
> sophisticated) things than "being your plastic pal who's fun to be
> with" and what minimizes the per-case workload of the guy who decides
> on the colour said 'plastic pal' should have today (and hence, helps
> him to maximize his income for a given time period) might not be the
> most sensible way to construct 24x7 autonomously operating software
> system people rely on in order to get some (not inherently
> computer-related) job done.

Even though I cannot quite parse these two paragraphs (is it
some kind of Perl code, BTW?), I seem to wholeheartedly agree
with this way of thought itself.

Rainer Weikusat

Jun 28, 2013, 10:13:53 AM

Hmm ... I admit that I'm at least partially guilty of the same thing
(I was criticizing) :-). The important meta-bit would be "Think for
yourself".

Rainer Weikusat

Jun 29, 2013, 2:37:12 PM

Rainer Weikusat <rwei...@mssgmbh.com> writes:
> Ivan Shmakov <onei...@gmail.com> writes:
>>>>>>> Rainer Weikusat <rwei...@mssgmbh.com> writes:
>> > As a rule of thumb, people resort to sophisms when marketing their
>> > opinions when they can't think of any better way to further their
>> > causes.

[...]

> Hmm ... I admit that I'm at least partially guilty of the same thing
> (I was criticizing) :-).

OTOH, I can't help thinking that it is a very strange coincidence that
Moose is positively unusable for CGI programs, provided that what is
reported about the time needed to compile it is correct (or - for that
matter - for anything which isn't 'a long-running application'[*]) and
that outspoken proponents of it want to make it more difficult for
people to write CGI programs by removing CGI.pm from the Perl
distribution.

[*] Imagine that people actually write modularized and
object-oriented system configuration tools ...

Ivan Shmakov

unread,
Jun 30, 2013, 3:56:52 AM6/30/13
to
>>>>> Rainer Weikusat <rwei...@mssgmbh.com> writes:

[...]

> OTOH, I can't help thinking that it is a very strange coincidence
> that Moose is positively unusable for CGI programs, provided that
> what is reported about the time needed to compile it is correct (or -
> for that matter - for anything which isn't 'a long-running
> application' [*]) and that outspoken proponents of it want to make it
> more difficult for people to write CGI programs by removing CGI.pm
> from the Perl distribution.

CGI.pm appears to mix up CGI support and HTML generation, which
is a thing I see no good reason to do. So, I tend to advocate
in favor of replacing it with CGI::Simple whenever possible.
(My Web pages are generated with XML::LibXML::toFH (), anyway.)
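For illustration only (a minimal sketch; the 'name' parameter and the
page-generation helper are made up), CGI::Simple deals with the request
while the markup comes from elsewhere:

    use strict;
    use warnings;
    use CGI::Simple;

    my $q    = CGI::Simple->new;               # parse the CGI request
    my $name = $q->param('name') // 'world';   # hypothetical parameter

    print $q->header(-type => 'text/html; charset=UTF-8');
    print generate_page($name);   # markup built elsewhere, e. g. via XML::LibXML

    sub generate_page { return "<p>Hello, $_[0]!</p>\n" }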

Moreover, thanks to Fast CGI (and FCGI.pm), it's possible to
serve multiple HTTP requests without restarting the CGI code.
And, it may also allow for more flexible privilege separation
(than mod_suexec, anyway.)
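The usual FCGI.pm loop is tiny (a sketch; the listening socket is
assumed to be handed over by the HTTP server or by spawn-fcgi):

    use FCGI;

    my $request = FCGI::Request();        # defaults: STDIN/STDOUT/STDERR, %ENV
    while ($request->Accept() >= 0) {     # one iteration per HTTP request
        print "Content-Type: text/plain\r\n\r\n";
        print "Served by PID $$\n";
    }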

As for Moose, I've scanned through the documentation, but didn't
quite grasp its utility as of yet.
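From what I gather, the main selling point is declarative attribute
handling, roughly along these lines (an untested sketch):

    package Point;
    use Moose;    # also enables strict and warnings

    has 'x' => (is => 'rw', isa => 'Int', default => 0);
    has 'y' => (is => 'rw', isa => 'Int', default => 0);

    sub distance_to_origin {
        my ($self) = @_;
        return sqrt($self->x ** 2 + $self->y ** 2);
    }

    __PACKAGE__->meta->make_immutable;
    1;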

> [*] Imagine that people actually write modularized and
> object-oriented system configuration tools ...

Rainer Weikusat

unread,
Jun 30, 2013, 9:05:52 AM6/30/13
to
Ivan Shmakov <onei...@gmail.com> writes:
>>>>>> Rainer Weikusat <rwei...@mssgmbh.com> writes:
>
> [...]
>
> > OTOH, I can't help thinking that it is a very strange coincidence
> > that Moose is positively unusable for CGI programs, provided that
> > what is reported about the time needed to compile it is correct (or -
> > for that matter - for anything which isn't 'a long-running
> > application' [*]) and that outspoken proponents of it want to make it
> > more difficult for people to write CGI programs by removing CGI.pm
> > from the Perl distribution.
>
> CGI.pm appears to mix up CGI support and HTML generation, which
> is a thing I see no good reason to do.

The 'good reason' would be that CGI programs usually both consume
input and produce output. While I'm not particularly fond of the HTML
generation support in CGI.pm, the basic idea of generating HTML using
a procedural interface has something going for it: In this way, it is
possible to use a 'high-level interface' whose parts represent the
logical structure of a form. This makes modifications of this logical
structure much easier than when having to deal with some kind of
proto-HTML markup language with an HTML-like syntax and some typically
fairly 'dumb' (insofar as programmability goes) support for value
interpolation. 'Circumstances' forced me to get some first-hand
experiences with JSF and RichFaces and the 'template pages' generated
in this way are generally huge, repetitive angle bracket swamps. I
understand that 'the typical programmer' never uses a loop (or -
heaven forbid - a subroutine) when he can just copy'n'paste identical
code fifteen times in a row (and thus minimize the amount of work
needed for each of the slightly different fifteen individual cases)
but the result of this is an unwieldy and rigid structure which
reminds me of a frozen rubbish heap (extended by throwing more stuff
onto it and waiting for it to freeze but never modified in any other
way as this would be prohibitively expensive).
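To make that concrete, the 'loop instead of copy'n'paste' version looks
something like this with CGI.pm's own HTML shortcuts (field names made
up):

    use CGI qw(:standard);

    my @fields = qw(name street city zip country);
    print start_form(),
          table(map { Tr(td($_), td(textfield(-name => $_))) } @fields),
          submit(),
          end_form();

Adding a sixteenth field is a one-word change rather than yet another
pasted block.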

> So, I tend to advocate
> in favor of replacing it with CGI::Simple whenever possible.
> (My Web pages are generated with XML::LibXML::toFH (),
> anyway.)

I've had a look at that. A missing feature I absolutely need jumped at
me immediately: Support for accessing uploaded file data without
creating a temporary file first. Neither CGI.pm nor CGI::Simple seems
to be 'maintained' in the sense that anybody bothers to deal with CPAN
bug reports; however, this here

https://rt.cpan.org/Public/Bug/Display.html?id=64160

is an absolute showstopper for me: I have to humour people with a
seriously high level of professional paranoia (yes, I do mean that) and
'open CVEs' are rat poison in this respect. I can, of course, maintain
the code myself but I could as well just write it myself and the
result would very likely be less buggy and perform better for my use
cases.

> Moreover, thanks to Fast CGI (and FCGI.pm), it's possible to
> serve multiple HTTP requests without restarting the CGI code.

... 'we can just write a long-running application instead' ... well,
yes. I've also done that in the past, although based on mod_perl (the
mod_perl I have even behaves like its documentation says it should
because I forced it to ...). But if I can get by with the more
'UNIX(*)-style' approach of using relatively small, independent
cooperating processes, I prefer to do that. Maybe because I'm old
enough that my first impression of this world wasn't the 'designed for
Windows 98' logo but young enough to feel no haste to chase whatever
happens to be modern now because it happens to be modern now (lest I
could be ... left behind !!1) but so be it.

> And, it may also allow for more flexible privilege separation
> (than mod_suexec, anyway.)

Being able to switch UIDs on UNIX(*) means 'running with elevated
privileges' and if I don't absolutely have to use some 'huge' piece of
software for that whose innards are essentially unknown to me, I
prefer to avoid that. And this means small, setuid-0 C programs which
don't perform any function except 'uid switching', usually, to a
hard-coded persona, and only executable by the user supposed to execute
them.

Shmuel Metz

unread,
Jun 30, 2013, 8:37:16 AM6/30/13
to
In <877ghe7...@violet.siamics.net>, on 06/28/2013
at 12:04 PM, Ivan Shmakov <onei...@gmail.com> said:

> That being said, I'm (obviously) not familiar with perldoc.
> Thus, I'm curious if there's any compelling reason for perldoc
> /not/ to support, say, DocBook (or DITA, HTML, TEI, etc.)?

Is there any compelling reason for DocBook et al. to not support
perldoc? I'd be happy if the Perl community magically switched to
DocBook, but as a practical matter the cost of converting everything
would be too high. A mixture of two different documentation formats
could get very ugly very quickly.

> My point is that I know of no reason for a programmer looking
> for a new language to learn to choose Perl,

CPAN

> The other point to note is that even though I'm using Perl for
> almost decade and a half now (on and off), I still can't make
> head or tail of it at times. On the contrary, while I have put
> virtually no effort to learn Python whatsoever, I seem to
> understand the code written in it quite well.

The most arcane part of Perl is the regex syntax; Python and Ruby are
in the same tradition. I'd really prefer something more in the
tradition of Icon, SNOBOL and Wylbur.

> The other reason is that Perl isn't of the "one size fits all"
> type. Contrast it with Python ("one indentation fits all"),

"We hates it, precious, we hates the nasty thing." I'll take
semicolons and a prettyprinter, TYVM.

--
Shmuel (Seymour J.) Metz, SysProg and JOAT <http://patriot.net/~shmuel>

Unsolicited bulk E-mail subject to legal action. I reserve the
right to publicly post or ridicule any abusive E-mail. Reply to
domain Patriot dot net user shmuel+news to contact me. Do not
reply to spam...@library.lspace.org

Ivan Shmakov

unread,
Jun 30, 2013, 1:57:20 PM6/30/13
to
>>>>> Shmuel (Seymour J ) Metz <spam...@library.lspace.org.invalid> writes:
>>>>> Ivan Shmakov <onei...@gmail.com> said:

>> That being said, I'm (obviously) not familiar with perldoc. Thus,
>> I'm curious if there's any compelling reason for perldoc /not/ to
>> support, say, DocBook (or DITA, HTML, TEI, etc.)?

> Is there any compelling reason for DecBook et al to not support
> perldoc? I'd be happy if the Perl community magically switched to
> DocBook, but as a practical matter the cost of converting everything
> would be to high.

Yes.

> A mixture of two different documentation formats could get very ugly
> very quickly.

Why?

To note is that there already /is/ a mixture, assuming that one
uses an operating system which isn't written entirely in Perl.

So, we have pure-roff; roff generated from a variety of sources
(including both DocBook and Pod); AsciiDoc; Markdown; plain text
(mainly READMEs; in ASCII, UTF-8, and, occasionally, other
encodings); Texinfo; and what not. And all of that appears to
work reasonably well in practice.

>> My point is that I know of no reason for a programmer looking for a
>> new language to learn to choose Perl,

> CPAN

... Which I've actually mentioned (kind of.) But then, take a
look at, say, [1, 2].

... But it may be a reason worth considering. We're currently
preparing for the local SFD event, and I guess we may invest
some time in writing a dozen or so of blog entries on free
software (and digital freedom in general.) Hopefully, I'd be
able to write something reasonably decent on Perl. Naturally,
CPAN would be the first feature to mention.

[1] http://pypi.python.org/pypi
[2] http://code.google.com/p/go-wiki/wiki/Projects

>> The other point to note is that even though I'm using Perl for
>> almost decade and a half now (on and off), I still can't make head
>> or tail of it at times. On the contrary, while I have put virtually
>> no effort to learn Python whatsoever, I seem to understand the code
>> written in it quite well.

> The most arcane part of Perl is the Regex sybtax;

To me, the most arcane part of Perl is that it behaves as if it
actually /is parsed/ with a bunch of REs.

--cut: https://en.wikipedia.org/wiki/Perl --
[...] One consequence of this is that Perl is not a tidy language.
It includes many features, tolerates exceptions to its rules, and
employs heuristics to resolve syntactical ambiguities. [...]
--cut: https://en.wikipedia.org/wiki/Perl --
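A popular example of such a heuristic, easily verified with perl -w
(which warns "print (...) interpreted as function"):

    print (1 + 2) * 3;    # prints 3; the "* 3" is computed and discarded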

The "dualvar" scalars I've recently discovered also do not look
like a particularly clever concept.
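(For the record, such values can be produced with Scalar::Util, and the
errno variable $! is the best-known specimen:

    use Scalar::Util qw(dualvar);

    my $v = dualvar(42, "forty-two");
    print $v + 0, "\n";    # 42        -- numeric context
    print "$v\n";          # forty-two -- string context

)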

> Python and Ruby are in the same tradition. I'd really prefer
> something more in the tradition of Icon, SNOBOL and Wylbur.

Could you please show some example, comparing the approaches?

>> The other reason is that Perl isn't of the "one size fits all" type.
>> Contrast it with Python ("one indentation fits all"),

> "We hates it, precious, we hates the nasty thing." I'll take
> semicolons and a prettyprinter, TYVM.

So will I.

And the same for Go, which allows for:

    foo = (42 +
           bar +
           hello);

but not (my preference):

    foo = (42
           + bar
           + hello);

(Precisely because a newline after a non-operator is taken for
an "implied semicolon".)

Rainer Weikusat

unread,
Jun 30, 2013, 2:44:43 PM6/30/13
to
Ivan Shmakov <onei...@gmail.com> writes:

[...]

>> it has been mangled by over-zealous packagers looking to save a few
>> bytes. You need to find which magic package to install to give you
>> the whole thing.
>
> Fortunately, with the error message quoted above, the packagers
> already made that a trivial thing to do.
>
> JFTR: the purpose of such splits is to avoid the installation of
> the documentation on the hosts that merely /run/ existing code,
> and aren't used for the actual development.

In order to accomplish what? According to dpkg --print-avail, the size
of perl-doc is about 6.9M. Except in rare special cases, the
inconvenience of not having the documentation at hand in some 'strange
situation' by far outweighs the possible 'space saving' here except
that some people presumably feel that 'documentation' is dead weight
because they would never read it, anyway[*].

[*] Nice little anecdote about that: A former colleague of mine used
to boast that 'only newbies read documentation' ("Nur Anfaenger lesen
Dokumentation"). Once upon a time, he and my boss went to China in
order to perform some demos there for some prospective 'large
customers'. By this time, the server part of the then-product was
usually installed on SuSE Linux systems because that was what said
former colleague always used. Consequently, he went to China with a
brand new 'free SuSE CD'. Nobody ever bothered to test this new
version together with our software and since no 'newbies' were
involved here, nobody bothered to read through the release notes for
incompatible changes, either. The end result of that was that I was
woken by an "It doesn't work and we don't know what to do" phone call
around 3am, had to go to the office and read the documentation for
him in order to determine what the problem was (MySQL default date
output format changed) and to change the software to be able to deal
with that (of course, this guy still makes more money than I do ...).

[...]

> My point is that I know of no reason for a programmer looking
> for a new language to learn to choose Perl,

It is a highly useful programming language whose 'crudely implemented
Lisp subset' is fairly complete -- you'll even get run-time modifiable
symbol tables and symbols (called globs) -- with support for automatic
management of all kinds of resources and more than decent
performance. Eg, I use OO-Perl to make real-time WWW content-filtering
decisions and the latency of that is in the order of at most about a
dozen 0.0001s --- that's something Java developers don't even dream
of (OTOH, it is presumably possible to force perl down to
JBoss/Hibernate/SEAM levels by adding enough 'CPAN frameworks' to it).

Ben Morrow

unread,
Jul 1, 2013, 8:29:29 AM7/1/13
to

Quoth Martijn Lievaart <m...@rtij.nl.invlalid>:
> On Thu, 27 Jun 2013 22:21:29 +0100, Ben Morrow wrote:
>
> > Quoth Ivan Shmakov <onei...@gmail.com>:
> >>
> >> $ perldoc Pod::Man You need to install the perl-doc package to use this
> >> program.
> >
> > Your perl install is broken; it has been mangled by over-zealous
> > packagers looking to save a few bytes. You need to find which magic
> > package to install to give you the whole thing.
>
> Maybe that magic package would be perl-doc? :-)

I didn't mean just the docs; I don't know what else might have been
stripped out, and you really need all of it. I presume there is some
perl-complete package you can install which will pull in everything the
CPAN tarball would give you.

Ben

Ben Morrow

unread,
Jul 1, 2013, 8:26:57 AM7/1/13
to

Quoth Rainer Weikusat <rwei...@mssgmbh.com>:
> Ivan Shmakov <onei...@gmail.com> writes:
>
[FCGI]
> > And, it may also allow for more flexible privilege separation
> > (than mod_suexec, anyway.)
>
> Being able to switch UIDs on UNIX(*) means 'running with elevated
> privileges' and if I don't absolutely have to use some 'huge' piece of
> software for that whose innards are essentially unknown to me, I
> prefer to avoid that.

I'm not sure what you're saying here. One of the major advantages of
FCGI/HTTP proxying rather than plain CGI is that it is straightforward
to have the webserver and the application itself running under separate
uids, both started (ultimately) from init, without having anything
setuid or running as root. This is impossible with CGI, since the
application is invoked by the webserver, so either it runs under the
webserver's uid or the webserver has to be able to switch uid.

> And this means small, setuid-0 C programs which
> don't perform any function except 'uid switching', usually, to a
> hard-coded persona, and only executable by the user supposed to execute
> them.

I try to avoid setuid altogether. Daemons are started from /etc/rc or
managed with daemontools, and once that process tree has switched down
from root it never goes back.

Ben

Peter J. Holzer

unread,
Jul 1, 2013, 11:19:08 AM7/1/13
to
On 2013-07-01 12:29, Ben Morrow <b...@morrow.me.uk> wrote:
> Quoth Martijn Lievaart <m...@rtij.nl.invlalid>:
>> On Thu, 27 Jun 2013 22:21:29 +0100, Ben Morrow wrote:
>> > Quoth Ivan Shmakov <onei...@gmail.com>:
>> >> $ perldoc Pod::Man You need to install the perl-doc package to use
>> >> this program.
>> >
>> > Your perl install is broken; it has been mangled by over-zealous
>> > packagers looking to save a few bytes.

:-) Push the right button and Ben sounds like Rainer.

>> > You need to find which magic package to install to give you the
>> > whole thing.
>>
>> Maybe that magic package would be perl-doc? :-)
>
> I didn't mean just the docs; I don't know what else might have been
> stripped out, and you really need all of it. I presume there is some
> perl-complete package you can install which will pull in everything the
> CPAN tarball would give you.

Maybe, maybe not. In 10+ years of using Debian (and even more of using
Redhat, which has a similar packaging philosophy) I've never missed
having a package or meta-package which included "everything the CPAN
tarball would give me". I have a working perl installation and when a
feature is missing, it is usually straightforward to find out which
package provides it and install that. I don't care whether a module is
part of the core or not.
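For example (assuming apt-file has been installed and its cache
updated):

    $ dpkg -S Pod/Man.pm          # which installed package owns the file
    $ apt-file search Pod/Man.pm  # which package would provide it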

hp

--
_ | Peter J. Holzer | The curse of electronic word processing:
|_|_) | Sysadmin WSR | one keeps filing away at a text until the
| | | h...@hjp.at | parts of the sentence no longer fit
__/ | http://www.hjp.at/ | together. -- Ralph Babel

Rainer Weikusat

unread,
Jul 1, 2013, 1:12:19 PM7/1/13
to
Ben Morrow <b...@morrow.me.uk> writes:
> Quoth Rainer Weikusat <rwei...@mssgmbh.com>:
>> Ivan Shmakov <onei...@gmail.com> writes:
>>
> [FCGI]
>> > And, it may also allow for more flexible privilege separation
>> > (than mod_suexec, anyway.)
>>
>> Being able to switch UIDs on UNIX(*) means 'running with elevated
>> privileges' and if I don't absolutely have to use some 'huge' piece of
>> software for that whose innards are essentially unknown to me, I
>> prefer to avoid that.
>
> I'm not sure what you're saying here.

In the given context, that I wouldn't want to use mod_suexec for
anything as this would necessarily imply that 'the CGI executor' (or
at least some part of it) had to run with elevated privileges by
default and consequently, that mentioning it here is somewhat out of
place.

> One of the major advantages of FCGI/HTTP proxying rather than plain
> CGI is that it is straightforward to have the webserver and the
> application itself running under separate uids, both started
> (ultimately) from init, without having anything setuid or running as
> root.

One of the disadvantages of separating 'a web application' into a 'web
server component' and 'an application server component' is that this
means that one more permanently running daemon must be configured to
run as an untrusted user and possibly even that it must be
specifically told that it shouldn't bind its network sockets to the
wildcard address (like JBoss) and possibly, it will even create
listening sockets bound to the wildcard address nevertheless (like
JBoss 5). Even if said application server doesn't kindly offer its own
'attack surface' (slight misuse of the term), it is still a bunch
more code reachable from the network.

> This is impossible with CGI, since the application is invoked by the
> webserver, so either it runs under the webserver's uid or the
> webserver has to be able to switch uid.

If 'the application is invoked from the webserver', this means it will
- by default - run as an unprivileged user with no additional
cost. This also includes that any access restrictions affecting this user
will - by default - affect the application as well (with no additional
cost).

Ivan Shmakov

unread,
Jul 1, 2013, 1:27:25 PM7/1/13
to
>>>>> Rainer Weikusat <rwei...@mssgmbh.com> writes:
>>>>> Ben Morrow <b...@morrow.me.uk> writes:

[Cross-posting to news:comp.infosystems.www.servers.unix.]

[...]

>> One of the major advantages of FCGI/HTTP proxying rather than plain
>> CGI is that it is straightforward to have the webserver and the
>> application itself running under separate uids, both started
>> (ultimately) from init, without having anything setuid or running as
>> root.

> One of the disadvantages of separating 'a web application' into a 'web
> server component' and 'an application server component' is that this
> means that one more permanently running daemon must be configured to
> run as an untrusted user and possibly even that it must be
> specifically told that it shouldn't bind its network sockets to the
> wildcard address (like JBoss)

It's certainly not impossible to fork a process which creates a
network socket bound to a wildcard address from a "classic" CGI
process.

Not to mention that when properly implemented, the only socket
used by the "application server" would be a "Unix domain" one.
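(A sketch of the latter, with a made-up socket path; FCGI::OpenSocket
creates and listens on the Unix-domain socket itself:

    use FCGI;

    my $socket  = FCGI::OpenSocket('/run/example-app/fcgi.sock', 5);
    my $request = FCGI::Request(\*STDIN, \*STDOUT, \*STDERR, \%ENV, $socket);

    while ($request->Accept() >= 0) {
        print "Content-Type: text/plain\r\n\r\nOK\n";
    }
    FCGI::CloseSocket($socket);

)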

[...]

>> This is impossible with CGI, since the application is invoked by the
>> webserver, so either it runs under the webserver's uid or the
>> webserver has to be able to switch uid.

> If 'the application is invoked from the webserver', this means it
> will - by default - run as an unprivileged user with no additional
> cost. This also includes that any access restrictions affecting this
> user will - by default - affect the application as well (with no
> additional cost).

... However, if "a bunch of applications are invoked from the
Web server," this means that all of them will use the same
unprivileged user, thus ruining the privilege separation.

And why, even a single "classic" CGI application has enough
permissions to kill () its parent -- the HTTP server. Or to
mangle its logs, screw its runtime data (if any), etc.

Naturally, it's the very kind of problem which is easy to avoid
by having the "payload" process' lineage completely separated
from that of the HTTP server.

Rainer Weikusat

unread,
Jul 1, 2013, 1:45:04 PM7/1/13
to
Ivan Shmakov <onei...@gmail.com> writes:

[...]

Unless this moves again into a different direction, I'm going to
ignore it. I'm interested in programming problems, here, programming
problems specifically relating to Perl, and not in this kind of "who's
the most fastly rotating humming top" 'system architecture theory
wars' (this is not supposed to be offensive, I just see no value in
such a discussion).

Ben Morrow

unread,
Jul 1, 2013, 1:38:58 PM7/1/13
to

Quoth "Peter J. Holzer" <hjp-u...@hjp.at>:
> On 2013-07-01 12:29, Ben Morrow <b...@morrow.me.uk> wrote:
> > Quoth Martijn Lievaart <m...@rtij.nl.invlalid>:
> >> On Thu, 27 Jun 2013 22:21:29 +0100, Ben Morrow wrote:
> >> > Quoth Ivan Shmakov <onei...@gmail.com>:
> >> >> $ perldoc Pod::Man You need to install the perl-doc package to use
> >> >> this program.
> >> >
> >> > Your perl install is broken; it has been mangled by over-zealous
> >> > packagers looking to save a few bytes.
>
> :-) Push the right button and Ben sounds like Rainer.

Touché :).

> >> > You need to find which magic package to install to give you the
> >> > whole thing.
> >>
> >> Maybe that magic package would be perl-doc? :-)
> >
> > I didn't mean just the docs; I don't know what else might have been
> > stripped out, and you really need all of it. I presume there is some
> > perl-complete package you can install which will pull in everything the
> > CPAN tarball would give you.
>
> Maybe, maybe not. In 10+ years of using Debian (and even more of using
> Redhat, which has a similar packaging philosophy) I've never missed
> having a package or meta-package which included "everything the CPAN
> tarball would give me". I have a working perl installation and when a
> feature is missing, it is usually straightforward to find out which
> package provides it and install that. I don't care whether a module is
> part of the core or not.

That's certainly a sensible attitude if you rely on OS-packaged perl
modules: one would assume that they pull in bits of the 'core' package
as needed, and that they have been tested with whatever bits of code
they do actually pull in. However, once you start installing modules
directly from CPAN (which I don't recommend with an OS-installed perl in
any case), or developing your own, you start to find that Perl
developers assume 'perl vX means you have module Y in core', or 'header
H', or 'Unicode data file U', and these assumptions no longer
necessarily hold.
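One defensive habit is to declare even nominally-core prerequisites
explicitly, so a stripped-down vendor perl at least fails at install
time rather than at run time. A sketch (the module name is a
placeholder):

    # Makefile.PL
    use ExtUtils::MakeMaker;

    WriteMakefile(
        NAME      => 'Hypothetical::Module',
        VERSION   => '0.01',
        PREREQ_PM => {
            'Pod::Usage'  => 0,    # 'core', but split out by some vendors
            'Time::HiRes' => 0,
        },
    );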

I am speaking from the background of rather a lot of bug reports which
end up like this

'...I don't have ___.'
'You ought to, it's in the core.'
'...Oh, right, my OS package doesn't include that...'

not to mention the 'but perldoc doesn't work' people we get here from
time to time.

Ben

Ben Morrow

unread,
Jul 1, 2013, 1:52:16 PM7/1/13
to

Quoth Rainer Weikusat <rwei...@mssgmbh.com>:
> Ben Morrow <b...@morrow.me.uk> writes:
> > Quoth Rainer Weikusat <rwei...@mssgmbh.com>:
> >> Ivan Shmakov <onei...@gmail.com> writes:
> >>
> > [FCGI]
> >> > And, it may also allow for more flexible privilege separation
> >> > (than mod_suexec, anyway.)
> >>
> >> Being able to switch UIDs on UNIX(*) means 'running with elevated
> >> privileges' and if I don't absolutely have to use some 'huge' piece of
> >> software for that whose innards are essentially unknown to me, I
> >> prefer to avoid that.
> >
> > I'm not sure what you're saying here.
>
> In the given context, that I wouldn't want to use mod_suexec for
> anything as this would necessarily imply that 'the CGI executor' (or
> at least some part of it) had to run with elevated privileges by
> default and consequently, that mentioning it here is somewhat out of
> place.

I see. In that case, I entirely agree. I don't like the concept of
mod_suexec either.

> > One of the major advantages of FCGI/HTTP proxying rather than plain
> > CGI is that it is straightforward to have the webserver and the
> > application itself running under separate uids, both started
> > (ultimately) from init, without having anything setuid or running as
> > root.
>
> One of the disadvantages of separating 'a web application' into a 'web
> server component' and 'an application server component' is that this
> means that one more permanently running daemon must be configured to
> run as an untrusted user and possibly even that it must be
> specifically told that it shouldn't bind its network sockets to the
> wildcard address (like JBoss) and possibly, it will even create
> listening sockets bound to the wildcard address nevertheless (like
> JBoss 5). Even if said application server doesn't kindly offer its own
> 'attack surface' (slight misuse of the term), it is still a bunch
> more code reachable from the network.

Well, I didn't say the application had to be written incompetently :).
Nevertheless, if you are obliged to run an app which is, it's much
easier to lock it down (with jails/MAC/whatever) if it's running as a
separately-managed process tree than if it's invoked ad-hoc by the
webserver.

> > This is impossible with CGI, since the application is invoked by the
> > webserver, so either it runs under the webserver's uid or the
> > webserver has to be able to switch uid.
>
> If 'the application is invoked from the webserver', this means it will
> - by default - run as an unprivileged user with no additional
> cost. This also includes that any access restrictions affecting this user
> will - by default - affect the application as well (with no additional
> cost).

I don't want my applications running as the web server user. My apps
necessarily have access to files and other resources (database sockets
and so on) which I don't want to give the web server user access to. The
web server user is 'extremely untrusted', because it's running a process
which is directly exposed to the Internet; ideally, the system as a
whole would remain entirely secure even if the webserver process was
entirely compromised. (I don't suppose I reach that ideal in practice, I
do what I can in that direction.)

Ben

Mart van de Wege

unread,
Jul 2, 2013, 4:13:25 AM7/2/13
to
Ben Morrow <b...@morrow.me.uk> writes:

> However, once you start installing modules directly from CPAN (which I
> don't recommend with an OS-installed perl in any case), or developing
> your own, you start to find that Perl developers assume 'perl vX means
> you have module Y in core', or 'header H', or 'Unicode data file U',
> and these assumptions no longer necessarily hold.
>
> I am speaking from the background of rather a lot of bug reports which
> end up like this
>
> '...I don't have ___.'
> 'You ought to, it's in the core.'
> '...Oh, right, my OS package doesn't include that...'
>
> not to mention the 'but perldoc doesn't work' people we get here from
> time to time.
>
Which is why Debian has helper tools to create your own packages from
CPAN modules.

In the case of most modules, a CPAN module can be installed with one
command. The only thing it doesn't do yet is automatic dependency
management, but that is something you don't get with manual installs
either.
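Typically something along these lines (the module name is a
placeholder; I believe dh-make-perl is the helper in question):

    $ dh-make-perl --build --cpan Some::Module
    $ sudo dpkg -i libsome-module-perl_*.deb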

And perldoc is a bit of a red herring. Debian users know that Debian
doesn't package docs by default and will see it is a 'Suggested'
package, so they can install it when needed.

The problem is Debian derivatives that stupidly install a subset of
Debian packages without looking at the Suggests for their default build,
and then sell themselves as a ready-made solution for the (beginning)
Linux developer. Yes, I do mean Ubuntu.

Mart
--
"We will need a longer wall when the revolution comes."
--- AJS, quoting an uncertain source.

Dave Saville

unread,
Jul 2, 2013, 4:54:43 AM7/2/13
to
On Tue, 2 Jul 2013 08:13:25 UTC, Mart van de Wege <mvd...@mail.com>
wrote:
I just fell over that one with Ubuntu and I was going to ask sometime.
I am not a great fan of Ubuntu so what distro do you perl hackers
recommend so *I* get control of such things?

TIA

--
Regards
Dave Saville

Ben Morrow

unread,
Jul 2, 2013, 6:42:10 AM7/2/13
to

Quoth "Dave Saville" <da...@invalid.invalid>:
> On Tue, 2 Jul 2013 08:13:25 UTC, Mart van de Wege <mvd...@mail.com>
> wrote:
>
> > Which is why Debian has helper tools to create your own packages from
> > CPAN modules.
> >
> > In case of most modules, a CPAN module can be installed with one
> > command. The only thing it doesn't do yet is automatic dependency
> > management, but that is something you don't get with manual installs
> > either.

The various CPAN clients will pull in dependencies as needed. What you
don't get natively is uninstall; MakeMaker has an uninstall target, but
it's never really been very reliable, because it has no way of tracking
if a file has been overwritten since it was installed, and because some
modules traditionally use INSTALLDIRS=>"perl" which means you end up
with part of the core missing if you then uninstall them.
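(For reference, the construct in question looks roughly like this in a
Makefile.PL; 'perl' directs the files into the core library tree rather
than into site_perl:

    use ExtUtils::MakeMaker;

    WriteMakefile(
        NAME        => 'Hypothetical::Module',
        VERSION     => '0.01',
        INSTALLDIRS => 'perl',    # install over the core lib, not site_perl
    );

)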

But that doesn't help if a module has an undeclared dependency on a bit
of the core which isn't installed (undeclared because it's core, so it
ought to always be there). Nothing automated can work out those deps.

> > And perldoc is a bit of a red herring. Debian users know that Debian
> > doesn't package docs by default and will see it is a 'Suggested'
> > package, so they can install it when needed.

This subthread started with a Debian user who didn't know that.

> > The problem is Debian derivatives that stupidly install a subset of
> > Debian packages without looking at the Suggests for their default build,
> > and then sell themselves as a ready-made solution for the (beginning)
> > Linux developer. Yes, I do mean Ubuntu.
>
> I just fell over that one with Ubuntu and I was going to ask sometime.
> I am not a great fan of Ubuntu so what distro do you perl hackers
> recommend so *I* get control of such things?

FreeBSD. (Though I don't hate Debian itself, despite my complaints, on
the occasions when I need to use Linux specifically.)

Ben

Ivan Shmakov

unread,
Jul 2, 2013, 7:17:36 AM7/2/13
to
>>>>> Ben Morrow <b...@morrow.me.uk> writes:
>>>>> Quoth "Dave Saville" <da...@invalid.invalid>:
>>>>> On Tue, 2 Jul 2013 08:13:25 UTC, Mart van de Wege wrote:

[...]

>>> And perldoc is a bit of a red herring. Debian users know that
>>> Debian doesn't package docs by default and will see it is a
>>> 'Suggested' package, so they can install it when needed.

> This subthread started with a Debian user who didn't know that.

Even though I think I've already cleared this up in [1], I'd
like to reiterate it once more: I had known about the perl-doc
package since well before this discussion, and it was indeed the
first package I thought of for a "command-line perldoc."

My point was that I've never used the latter, and I feel neither a
loss because of that nor an urge to start using it, like, now.

[1] news:877ghe7...@violet.siamics.net

[...]

PS. And should anyone ask my advice about reading Perl documentation,
it's highly unlikely that perldoc(1) would come to my mind as
something deserving a mention.

Rainer Weikusat

unread,
Jul 2, 2013, 7:37:59 AM7/2/13
to
Ivan Shmakov <onei...@gmail.com> writes:

[...]

> PS. And should anyone ask my advice about reading Perl documentation,
> it's highly unlikely that perldoc(1) would come to my mind as
> something deserving a mention.

perldoc -f is quite useful for looking up descriptions of individual
builtin functions/operators. I also use perldoc -q, although less
frequently.
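E.g.:

    $ perldoc -f sprintf    # the perlfunc entry for a single builtin
    $ perldoc -q shuffle    # perlfaq entries whose questions match the pattern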

Shmuel Metz

unread,
Jul 1, 2013, 5:38:11 PM7/1/13
to
In <87li5r4...@violet.siamics.net>, on 06/30/2013
at 05:57 PM, Ivan Shmakov <onei...@gmail.com> said:

> > A mixture of two different documentation formats could get very
> > ugly very quickly.

> Why?

Tool sets.

> To note is that there already /is/ a mixture,

Not within CPAN.

> > Python and Ruby are in the same tradition. I'd really prefer
> > something more in the tradition of Icon, SNOBOL and Wylbur.
> Could you please show some example, comparing the approaches?

As an example, in Perl I would concisely write [aeiou] while in Icon I
would write the more verbose any('aeiou'). In general, the Perl syntax
for a regex relies heavily on special characters while the other
languages I mentioned rely more heavily on words used as, e.g.,
function names. Icon in particular is nice because it has a cset[1]
(character set) data type and because patterns can be easily built from
procedures.

[1] I vaguely recall that Perl 6 may have something similar.
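For the Perl side, trivial examples of what I mean by 'concise':

    my $text   = "documentation";
    my @vowels = $text =~ /([aeiou])/g;    # every vowel, via a character class
    my $count  = ($text =~ tr/aeiou//);    # tr/// merely counts here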

Ivan Shmakov

unread,
Jul 2, 2013, 10:34:32 AM7/2/13
to
>>>>> Rainer Weikusat <rwei...@mssgmbh.com> writes:
>>>>> Ivan Shmakov <onei...@gmail.com> writes: [...]

>> PS. And should anyone ask my advice about reading Perl
>> documentation, it's highly unlikely that perldoc(1) would come to my
>> mind as something deserving a mention.

> perldoc -f is quite useful for looking up descriptions of individual
> builtin functions/operators.

How is it different to pointing one's browser at, say,
http://perldoc.perl.org/functions/NAME.html?

> I also perldoc -q, although less frequently.

Indeed, a sensible application. (And it certainly looks like a
kind of "structured" search facility.)

But what makes me wonder is: was it added to work around the
whole issue of the Perl FAQ being split into several manpages?

Jürgen Exner

unread,
Jul 2, 2013, 11:18:44 AM7/2/13
to
Ivan Shmakov <onei...@gmail.com> wrote:
>>>>>> Rainer Weikusat <rwei...@mssgmbh.com> writes:
>>>>>> Ivan Shmakov <onei...@gmail.com> writes: [...]
>
> >> PS. And should anyone ask my advice about reading Perl
> >> documentation, it's highly unlikely that perldoc(1) would come to my
> >> mind as something deserving a mention.
>
> > perldoc -f is quite useful for looking up descriptions of individual
> > builtin functions/operators.
>
> How is it different to pointing one's browser at, say,
> http://perldoc.perl.org/functions/NAME.html?

There is no need for an additional tool, i.e. a browser. There is no
need for an Internet connection. And the documentation is actually
matching the version of Perl that I am using.

jue

Ivan Shmakov

unread,
Jul 2, 2013, 11:23:38 AM7/2/13
to
>>>>> Jürgen Exner <jurg...@hotmail.com> writes:

[...]

>>> perldoc -f is quite useful for looking up descriptions of
>>> individual builtin functions/operators.

>> How is it different to pointing one's browser at, say,
>> http://perldoc.perl.org/functions/NAME.html?

> There is no need for an additional tool, i. e. a browser.

For me, it's perldoc(1) that's an "additional" tool.

> There is no need for an Internet connection.

Indeed. However, a developer is likely to have one, anyway.

> And the documentation is actually matching the version of Perl that I
> am using.

OTOH, the documentation packaged with the version of Perl used
may still have bugs already resolved in the newer version served
from http://perldoc.perl.org/.

Peter J. Holzer

unread,
Jul 2, 2013, 11:38:43 AM7/2/13
to
On 2013-07-02 15:18, Jürgen Exner <jurg...@hotmail.com> wrote:
> Ivan Shmakov <onei...@gmail.com> wrote:
>> How is it different to pointing one's browser at, say,
>> http://perldoc.perl.org/functions/NAME.html?
>
> There is no need for an additional tool, i.e. a browser. There is no
> need for an Internet connection. And the documentation is actually
> matching the version of Perl that I am using.

And it works for all perl-related stuff I have installed: Perl core,
CPAN modules, modules and scripts developed in-house ...

http://perldoc.perl.org/ only covers the core. HTML-formatted docs for
CPAN modules are on http://search.cpan.org. And your own stuff is
wherever you (or your co-workers) put it. So that's at least three
different places. With perldoc it's only one.

Ben Morrow

unread,
Jul 2, 2013, 11:37:42 AM7/2/13
to

Quoth Ivan Shmakov <onei...@gmail.com>:
> >>>>> Rainer Weikusat <rwei...@mssgmbh.com> writes:
> >>>>> Ivan Shmakov <onei...@gmail.com> writes: [...]
>
> >> PS. And should anyone ask my advice about reading Perl
> >> documentation, it's highly unlikely that perldoc(1) would come to my
> >> mind as something deserving a mention.
>
> > perldoc -f is quite useful for looking up descriptions of individual
> > builtin functions/operators.
>
> How is it different to pointing one's browser at, say,
> http://perldoc.perl.org/functions/NAME.html?

It doesn't hit the network? It doesn't involve leaving your terminal
window and using a whole nother program? (I realise this may be less of
a concern if you live inside Emacs, as I gather you do.) It's
considerably older, so many of us don't carry perldoc.perl.org in our
conscious mind (indeed, if I found myself needing to read the core docs
on a machine without perl installed, I'd probably find the perl tarball
on search.cpan.org instead).

Also, and perhaps more importantly, it gives you the documentation for
the version of perl you are running.

> > I also perldoc -q, although less frequently.
>
> Indeed, a sensible application. (And it certainly looks like a
> kind of a "structured" search facility.)
>
> But what makes me wonder is: was it added to work-around the
> whole issue of the Perl FAQ being split into several manpages?

Probably it was added to 'work around' the fact that the perl FAQ is
rather large, and searching with less(1) would come up with lots of
false positives. (And some systems don't even have less.)

perldoc -X *ought* to be useful, but in practice the perl infrastructure
hasn't yet got on top of building the index and keeping it up to date.
At some point someone went through all the core docs and put in a whole
lot of X<> entries, so it seems someone wants it to be useful.
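(For reference, such entries look like this in the Pod source, and
produce no visible output in most formatters, only index data:

    =head2 Opening files

    X<open> X<file, opening>

    The C<open> function takes a filehandle and ...

)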

Ben

Rainer Weikusat

unread,
Jul 2, 2013, 12:20:00 PM7/2/13
to
Ivan Shmakov <onei...@gmail.com> writes:
>>>>>> Rainer Weikusat <rwei...@mssgmbh.com> writes:
>>>>>> Ivan Shmakov <onei...@gmail.com> writes: [...]
>
> >> PS. And should anyone ask my advice about reading Perl
> >> documentation, it's highly unlikely that perldoc(1) would come to my
> >> mind as something deserving a mention.
>
> > perldoc -f is quite useful for looking up descriptions of individual
> > builtin functions/operators.
>
> How is it different to pointing one's browser at, say,
> http://perldoc.perl.org/functions/NAME.html?

It prints a help message telling me that perldoc -f needs an argument
(and some other information) instead of 'The requested URL
/functions/NAME.html was not found on this server' (SCNR).

I consider it more convenient to use: It is easier to type than the
WWW-based version, it uses 'my default pager' whose search
facilities are more comfortable to use and more powerful than those of
firefox, it displays less stuff I don't care for and it is faster. It
is also more reliable because it works regardless of the conditions
'on the internet' ATM; using local computing facilities in order to
access the documentation is nicer to the people who generously provide
the perldoc WWW-service; and - last but not least - it is better for
national security because it doesn't inject additional noise into the
world's largest and technically most sophisticated facility for storing
p0rn trailers and 'penis enlargement pills!' spam mails, which some very
duteous people built using taxpayers' money so that they can sift
through all this trash to ensure that there are no terrorists hiding
somewhere in it.

Possibly offensive content below the page break.

Also, imagine the frustration of someone who is on a holy mission to
locate child pornography who only ever gets Perl documentation. Surely
the cause of quite a few nervous breakdowns ...


Charlton Wilbur

unread,
Jul 2, 2013, 12:08:56 PM7/2/13
to
>>>>> "IS" == Ivan Shmakov <onei...@gmail.com> writes:

IS> How is [perldoc -f] different to pointing one's browser at, say,
IS> http://perldoc.perl.org/functions/NAME.html?

perldoc -f uses the perl documentation associated with your
installation. perldoc.perl.org takes extra steps and brainpower to
ensure that they match.

Charlton


--
Charlton Wilbur
cwi...@chromatico.net

Eric Pozharski

unread,
Jul 3, 2013, 2:41:23 AM7/3/13
to
with <87vc4t2...@violet.siamics.net> Ivan Shmakov wrote:
*SKIP*
> OTOH, the documentation packaged with the version of Perl used
> may still have bugs already resolved in the newer version served
> from http://perldoc.perl.org/.

See how I resist the urge to post a troll-o-meter.

--
Torvalds' goal for Linux is very simple: World Domination
Stallman's goal for GNU is even simpler: Freedom

Ivan Shmakov

unread,
Jul 3, 2013, 8:59:34 AM7/3/13
to
>>>>> Ben Morrow <b...@morrow.me.uk> writes:
>>>>> Quoth "Peter J. Holzer" <hjp-u...@hjp.at>:

> Content-type: text/plain; charset=UTF-8

[...]

>> :-) Push the right button and Ben sounds like Rainer.

> Touch\351 :).

As for the pedantry, I'm curious, since when has a lone \351
acquired a defined meaning in UTF-8? (As opposed to:
ISO-8859-1.)

[...]

Well, I have to admit that this message is more a response to
news:51aec8bf$11$fuzhry+tra$mr2...@news.patriot.net (posted in
news:news.software.readers, where I hereby cross-post), which I
didn't manage to answer back then, so I'm doing it now.

> Newsgroups: soc.culture.russian, news.software.readers
> From: Shmuel (Seymour J.) Metz <spam...@library.lspace.org.invalid>
> Date: Wed, 05 Jun 2013 01:12:31 -0400
> Subject: Re: [OT] Russian language

> In <87li6w1...@violet.siamics.net>, Ivan Shmakov said:

[...]

>> Unfortunately, an RFC isn't a magic spell scroll to prevent
>> subscribers from using any of the non-compliant software, ever.

(And the last time I've checked, which was perhaps a decade ago,
though, trn had no MIME support whatsoever.)

> That doesn't mean that the "American Usenet" (whatever that is) isn't
> affected by the standards, or that there are a higher percentage of
> nyekulturniy users in the USA than in Russia. Both countries have
> people who follow the rules and people who don't.

My point is that it's not the /lack/ of culture, it's the people
who still adhere to the considerably older culture of "Usenet
without MIME."

PS. Not to mention that the "American Usenet" has seen an extra decade
of "MIME-less" operation as compared to the "Russian Usenet."

Ben Morrow

unread,
Jul 3, 2013, 12:49:43 PM7/3/13
to
[news.software.readers removed, since I don't read it and don't believe
this response belongs there.]

Quoth Ivan Shmakov <onei...@gmail.com>:
> >>>>> Ben Morrow <b...@morrow.me.uk> writes:
> >>>>> Quoth "Peter J. Holzer" <hjp-u...@hjp.at>:
>
> > Content-type: text/plain; charset=UTF-8
>
> >> :-) Push the right button and Ben sounds like Rainer.
>
> > Touch\351 :).
>
> As for the pedantry, I'm curious, since when has a lone \351
> acquired a defined meaning in UTF-8? (As opposed to:
> ISO-8859-1.)

Sorry, something-or-other made Vim decide to write the file out in
8859-1 rather than UTF-8. I considered posting an apology, but decided
it would just be unnecessary noise.

Ben
