I see this going wrong with =heading level 1 already. I like the numbers
in =headN too, by the way, as it makes inconsistencies easier to spot.
> And then replaced [...] and [=...] and /.../ and *...* with their more
> POD-like: L[...], C[...], I[...] and B[...] with a bare [foo] working as
> a "I have no idea what I'm linking to, but do the right thing" sort of
> wikiness, where L[...] is a more structured, POD-like link. For example:
L[] C[] I[] B[] are all hard to read. With <>, the weight is evenly
distributed, while with [], the weight is on the outside, next to that
capital letter that is just as large.
Visual comparison:
L[] C[] I[] B[] # I is worst
L<> C<> I<> B<>
So if [] is going to be used, may I suggest using lc letters with it
then?
l[] c[] i[] b[]
L[] C[] I[] B[]
L<> C<> I<> B<>
Still not great, but better IMO. Why are <> bad, by the way? Can't we
just change the meaning to be qq-like, that is: with nested content?
That means only for non-unicode >><< you need extra angle brackets.
Or maybe we introduce [] as an alternative for <>.
Also, how is C[@*INC[-1]] parsed? # I find this very hard to parse,
# visually
Likewise, %?INC{something}?
Two possible sources of inspiration for the whole documentation thing:
* Text::MetaMarkup
* Paragraphs CAN begin with a block level html tag, "h1: heading"
* Inline HTML tags can be used as "{b:bold}"
* Paragraph starting with * is a list
* Paragraph starting with # is comment
* Verbatim paragraphs simply start with "pre:"
* No support for tables yet
* PodTables
* See http://pugs.kwiki.org/?PodTables
Juerd
--
http://convolution.nl/maak_juerd_blij.html
http://convolution.nl/make_juerd_happy.html
http://convolution.nl/gajigu_juerd_n.html
Upon reading this, it is unclear to me whether you have read about the
Kwid format or you are simply guessing that Kwid is the same syntax
used by Kwiki.
It is not the same format at all. Kwid is merely /inspired/ by Kwiki,
which in turn is inspired by the (more usenix) features of modern wiki
languages. It is more fair to say that Kwid is much more inspired by Pod
than Kwiki.
Please read what is considered to be the de facto spec:
http://svn.openfoundry.org/pugs/ext/Pugs-Documentation/perlkwid.kwid
carefully and recomment.
A few notes.
To create Kwid I carefully studied the POD information model (the
semantic tree that POD parses to). Kwid uses the exact same info model.
This means that switching between the two without loss of information
is possible.
This makes the Kwid experiment much less risky, since it will be trivial to
convert in either direction.
As to the syntax, care has been taken to ensure that all the corner
cases are covered. And also covered elegantly.
Some people have argued that Kwid is only a syntactic change to Pod. I
would argue that they are correct. :) But this does not mean it is not
an important change. Kwid has an emphasis on minimizing the markup, and
using the markup one might use to discuss programming in everyday email.
This is hard to do in POD, but very easy to do in Kwid. Thus a big win.
It is also wrong to imply that important semantic changes cannot be made
in the future. Things like introspection and transclusion. But that is
not the current concern.
In reality, Kwid does vary ever so slightly in semantics from POD. But only in
cases where POD seemed to have a wart. For instance Kwid allows named
hyperlinks: [The Pugs Source|http://svn.openfoundry.org/pugs].
Cheers, Brian
On 15/03/05 11:46 -0500, Aaron Sherman wrote:
> Wherein I propose (to the wrong list, sigh) a re-envisioning of Kwid in
> a more POD-like form.
>
> I did leave out some POD markup forms. Assume that, if I did not mention
> them, then I think they should keep the same prefix character (e.g. X<>)
>
> From: Aaron Sherman <a...@ajs.com>
> Date: Tue, 15 Mar 2005 11:43:39 -0500
> To: Stevan Little <ste...@iinteractive.com>
> Cc: perl6-c...@perl.org
> Subject: Re: [RFC] A more extensible/flexible POD (ROUGH-DRAFT)
>
> On Tue, 2005-03-15 at 09:37, Stevan Little wrote:
> > On Mar 15, 2005, at 12:54 AM, Nigel Hamilton wrote:
>
> > > There is a need for a higher level 'structural' documentation that
> > > hypertext is well suited to cover - something that spans more than one
> > > module. This will be especially important for CPan6 and connecting
> > > versions, and modules into bigger 'packages'.
> >
> > Agreed as well. It would be nice if CPAN6 or CP6AN or FreePAN (or
> > whatever it will eventually be called) have a more sophisticated
> > linking/documentation system which goes beyond the actual single
> > module. I even think this would be possible in the current CPAN if we
> > could get the L<> construct fixed, but that is another issue.
>
> Actually, I don't think that's at all another issue. It's the core of
> what you're talking about. L<> gives you the ability to link, and in
> several different ways. It's also broken in Perl 5, which makes a
> replacement sound attractive, but fixing it solves for much of that
> need.
>
> Taking a cue from the wiki world makes sense to me. Kwid is almost ideal
> as far as I can tell in that it:
>
> a) Does everything POD does
> b) Is backward compatible with Perl 5 in that it can be ignored by the
> parser in the same way.
> c) Makes many things easier
>
> Now, it does need some tweaking, I think, but nothing too severe. It
> just makes a few things harder, and a few other things non-POD-like for
> no particular reason. I like C<POD> for the ease of including keywords
> in C<perl> documentation. It's also B<trivial> to I<recognize> all
> markup quickly (visually or programmatically).
>
> Kwid /on the other hand/ makes it a bit harder to [=find] that markup,
> and is *thus* not quite as easy to de-parse visually.
>
> I'd be thrilled if we just changed the "."-introduced things to
> "="-introduced things:
>
> = heading level 1
> == heading level 2
> =begin list
> * You don't really need the begin
> * But it doesn't hurt
> * and it allows
> some(code())
> to appear inside a list item.
> * Hmmm
> =end list
>
> And then replaced [...] and [=...] and /.../ and *...* with their more
> POD-like: L[...], C[...], I[...] and B[...] with a bare [foo] working as
> a "I have no idea what I'm linking to, but do the right thing" sort of
> wikiness, where L[...] is a more structured, POD-like link. For example:
>
> = Proposed Kwid Changes
> == Introduction
> It is my I[goal] to introduce an easier to use (for [POD] users)
> version of [Kwid], and impose it B[mercilessly] on the heathen
> masses!
>
> Markup can consist of C[[]]-delimited text such as C[[Kwid]] or
> a prefixed C[[]]-delimited text such as C[C[Kwid]]. Possible
> prefixes are:
> =begin list
> *= L
> A structured link ala POD C[L<>]
> *= B, I
> Bold or italics
> *= C
> Code
> =end list
> Anywhere a C[[]] can be used, a C[{}] can also be used. This is
> useful when you need to enclose unbalanced C[{], C[}], C{[},
> or C{]} characters.
>
> All formatting is introduced with C[=for] just as in POD, so:
>
> =for html <hr />
>
> works as you might expect. C[=begin] is similar, but introduces
> a block, ended by C[=end]
>
> =for html,xhtml,xml <img src="foo.png" alt="A foo!" />
> =begin !html,xhtml,xml
> You can't see the image, but it would be pretty!
> =end !html,xhtml,xml
>
> Notice the use of C[!format[,format...]] to indicate all formats
> not listed.
>
> "comment" is the null format, so you can always introduce a
> C[=for comment] or C[=begin comment], but lines which start with
> C[#] are always treated as comments anyway.
>
> Lists are introduced with C[=begin list], which is a special
> format. A list can be numbered, bulleted or definition-style.
> Each type is introduced differently, e.g.:
>
> *1 numbered
> * bullets
> *= term
> definition
>
> Only a C[1] can follow the C[*]. So, your numbered list would
> look like:
>
> *1 First
> *1 Second
> *1 Third
>
> This tells Kwid to number your items, but does not allow strange
> things like:
>
> *2 First prime
> *3 Second prime
> *5 Third prime
>
> For that, you need C[*=]
>
> Thoughts?
>
> > Well, not everyone likes HTML (although I can't imagine why).
>
> * It's hard for humans to read
> * It imposes too much display-think on what should be content-think
> * It is not a proper super-set of the other documentation formats.
>
> XHTML addresses some of this, but still provides far too much
> display-oriented power for a high-level markup like POD or Kwid.
>
> --
> Aaron Sherman <a...@ajs.com>
> Senior Systems Engineer and Toolsmith
> "It's the sound of a satellite saying, 'get me down!'" -Shriekback
>
Actually, I quite like <> as the bracketing characters. They are
visually distinctive, they connect well with their adjacent C/X/L/etc
without visually merging into it (compare L<foo> with L[foo]), and in
the circumstance that you want to bracket an unbalanced bracket, you
just double (triple, whatever) up and add some space:
C<< $x > $y >>
Looks pretty clear to me.
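
To make the doubling rule concrete, here is a rough Python sketch of how the body of a single formatting code could be pulled out. It is only an illustration under stated assumptions: the function name pod_code_body is made up, nested codes are ignored, and a real POD parser does considerably more.

```python
import re

def pod_code_body(markup: str) -> str:
    """Return the body of one formatting code such as C<...> or C<< ... >>."""
    m = re.match(r"[A-Z](<+)", markup)
    if m is None:
        raise ValueError("not a POD formatting code: %r" % markup)
    width = len(m.group(1))          # number of opening angle brackets
    rest = markup[m.end():]
    if width == 1:
        # Naive single-bracket form: the first '>' ends the code.
        end = rest.find(">")
        if end == -1:
            raise ValueError("missing closing '>'")
        return rest[:end]
    # Doubled (or tripled, ...) form: the closer is whitespace followed by
    # the same number of '>' characters, so a lone '>' in the body is safe.
    closer = re.search(r"\s" + ">" * width, rest)
    if closer is None:
        raise ValueError("missing closing delimiter")
    return rest[:closer.start()].strip()

print(pod_code_body("C<< $x > $y >>"))
print(pod_code_body("C<$foo->bar>"))
```

With this naive scan the doubled form keeps the inner ">" intact, while the single-bracket form is cut short at the first ">" it meets.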
--Dks
I read the Kwid documentation from the Pugs distribution in depth.
> To create Kwid I carefully studied the POD information model
> (the semantic tree that POD parses to). Kwid uses the exact same
> info model. This means that switching between the two without
> loss of information is possible.
I noted that in my original message.
> This makes the Kwid experiment much less risky
Risk was not my concern. My concern was a Wiki-like model which is
inconsistent with many of the goals of POD. POD is intended to be PLAIN
OLD documentation. Kwid breaks this model of simplicity by introducing
unique boundary characters for many types of operations, and by making
the overall presentation more complex.
While I appreciate several features of Kwid, I feel that it should not
replace POD without first adopting a POD-like simplicity.
> Some people have argued that Kwid is only a syntactic change to
> Pod. I would argue that they are correct.
They are demonstrably wrong. You cannot parse Kwid correctly by changing
the syntax of a POD parser.
For example, the behavior of
* foo
bar
Is totally dependent on what context it is enclosed in (.list or
top-level). This cannot be emulated with simple syntactic changes to a
POD parser.
Interestingly, this is one of the main benefits of Kwid, IMHO.
Re-read what I wrote and think about it. I think you'll find that it
avoids some of the major pitfalls of old-POD and incorporates all of the
useful features of Kwid while maintaining a simplicity that is as
elegant as POD's.
POD 1:1 mappings:

    POD                My Kwid Proposal
    x<...>             x[...] or x{...} (where x is C, B, I, X, L, etc)
    =item *            *
    =item foo          *= foo
    =item n            *1
    =for someformat    =for someformat
    =over n            =begin list
The one obvious thing to POD users is the replacement of <> with [] or
{}. Why is this? Because < and > are used in un-balanced ways in a large
number of situations, so they should not be the primary bracketing
constructs. Also because the visual cue to many users of POD is that
it's SGML-like, and that way lies danger, since POD's <>-bracketed
constructs are not intended to be balanced.
Also, there should be a "=begin block" which in HTML would be
<blockquote>, and in POD would be "=over n".
Kwid 1:1 mappings:

    Kwid               My Kwid Proposal
    *                  *
    .list              =begin list ... =end list
    .someformat        =begin someformat ... =end someformat
    .comment           =begin comment ... =end comment
    [generic]          [generic]
Extensions I proposed:

    =begin x,y,z       Section used only in the given formats
    =begin !x,y,z      Section NOT used in the given formats
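
As a rough illustration of the format-selection extension, the check a formatter would make can be sketched in a few lines of Python. This assumes one reading of the proposal: a comma-separated list selects those formats, and a leading "!" negates the whole list. The function name block_applies is hypothetical.

```python
def block_applies(spec: str, target: str) -> bool:
    """Decide whether a '=begin SPEC' block is rendered for format `target`.

    Assumed semantics: 'x,y,z' means "only these formats"; '!x,y,z' means
    "every format except these".
    """
    negate = spec.startswith("!")
    if negate:
        spec = spec[1:]
    names = {name.strip() for name in spec.split(",") if name.strip()}
    return (target not in names) if negate else (target in names)

print(block_applies("html,xhtml,xml", "html"))
print(block_applies("!html,xhtml,xml", "html"))
print(block_applies("!html,xhtml,xml", "text"))
```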
As you can see, what I proposed IS a simple syntactic transformation of
> I quite like <> as the bracketing characters. They are
> visually distinctive, they connect well with their adjacent C/X/L/etc
> without visually merging into it (compare L<foo> with L[foo]), and in
> the circumstance that you want to bracket an unbalanced bracket, you
> just double (triple, whatever) up and add some space:
>
> C<< $x > $y >>
>
> Looks pretty clear to me.
You are confusing aesthetics with usability. Yes, the above looks clear,
but then I have to type "C<< " and " >>" just to tell the POD parser
that there might be unbalanced < or > characters in my string. You're
failing to apply Larry's rules of Perl 6. Huffman and the "easy things
easy, while hard things are possible" principles demand that a common
case not require copious extra gunk, and nothing could be simpler than:
C[$x > $y] is about as B[easy] as it gets in [Perl]
vs:
C<< $x > $y >> is about as B<easy> as it gets in L[Perl|perl]
without going full Wikish:
[=$x > $y] is about as *easy* as it gets in [Perl]
However, saving a couple of keystrokes and cleaning up the above text is
inconsequential compared to the massive savings in terms of taking
advantage of the legions of people who are learning Wiki syntax these
days. Making POD *more* Wiki-like without sacrificing useful features of
POD is invaluable in terms of tech writers and other
non-Perl-programmers writing useful docs in POD!
I think consistency with "goals" is fine, but consistency with the "data
model" is more important. POD has a very nice data model that maps well
to other formats. It is really well done. So logic stands that any
format which can map cleanly to POD and yet offer advantages to the
author is a benefit to the author without needing to retool the
extensions. Further any dialect can be converted to any other without
information or structure loss.
POD's syntax is certainly good enough, but doesn't match the way that
people commonly write structured prose; especially in this era of text
based formatting such as in wikis. POD's syntax /elegance/ also tends to
break down into workarounds too fast in edge cases.
NOTE: POD does have some minor warts in the data model, but they can be
fixed later on.
> Kwid breaks this model of simplicity by introducing unique boundary
> characters for many types of operations, and by making the overall
> presentation more complex.
Let's look at an example and explore the rationale:
Pod uses this syntax B<bold stuff> which is fine, but it is a very
common idiom to use *bold stuff* outside of Pod. Usenet posts, email,
irc, etc. People use this, they grok it, and lots of tools grok it as
well. (I'm thinking irc clients as an example).
So how do we get *there* formally and avoid making a mess where * is
just supposed to be an asterisk?
Kwid does this by formally changing
X<...>
into
{X...X}
Where `X` is any Pod code like `B`, `I` or `C`. Since there are only 3
codes in common use (ignore `L` for a second), Kwid thus uses {*bold*}
{/italic/} and {`code`}.
This has a subtle but significant advantage over X<...>. The difference
is that instead of the ending marker being '>' it is 'X}', which is
orders of magnitude less likely to show up in the content being escaped.
This means you avoid the X<< ... >> mess almost entirely.
But the better part of this is that the Kwid forms can be relaxed to
drop the curlies in most cases. This technique uses the principle of
"hugging". So you can say *$a = $b * $c* and get the bolded equation
since the middle * isn't hugging anything. Hugging is a nontrivial
heuristic, but let's just say it Does The Right Thing. And if you aren't
sure just say {*$a = $b * $c*}. ie, just add some curlies to what you
already have.
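
The exact hugging rule is not spelled out here, so the following Python sketch is only one plausible approximation, stated as an assumption: an opening * must follow whitespace (or the start of the text) and touch a non-space character, and a closing * must touch a non-space character on its left and not be followed by a word character. The name find_bold is made up for the example.

```python
import re

# Assumed approximation of "hugging": the opener hugs the text to its
# right, the closer hugs the text to its left.
_BOLD = re.compile(r"(?<!\S)\*(?=\S)(.+?)(?<=\S)\*(?!\w)")

def find_bold(text: str) -> list:
    """Return the spans that would be rendered bold under the assumed rule."""
    return _BOLD.findall(text)

print(find_bold("*$a = $b * $c*"))
print(find_bold("2 * 3 * 4"))
```

Under this rule the middle * in the equation is skipped because it has a space on both sides, and a plain multiplication such as 2 * 3 * 4 produces no bold span at all.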
*Bold* and /italics/ are rather obvious, intuitive and commonly used.
Backticks for `code` was chosen because backticks are rarely used in
code. Except of course when writing about Kwid itself. But to get
C<`code`> you just go to the curlies: {``code``}. Simple. Backticks also
seem to be right visually, but that's just my opinion.
For L<...>, I decided to use the very common wiki idiom of [...] for a
link. Everything in the `...` is the same as Pod.
It is purely subjective whether Kwid's overall presentation is more or
less complex than Pod's. Kwid attempts to elegantly move towards the
modern internet era of social software, with the hope that those
participating in those arenas might feel more at home.
> While I appreciate several features of Kwid, I feel that it should not
> replace POD without first adopting a POD-like simplicity.
>
> Some people have argued that Kwid is only a syntactic change to
> Pod. I would argue that they are correct.
>
> They are demonstrably wrong. You cannot Parse Kwid correctly by changing
> the syntax of a POD parser.
>
> For example, the behavior of
>
> * foo
> bar
The behaviour of this is completely consistent. You may need to reread the
perlkwid document for it has recently changed.
ie
* foo
bar
* baz
boom
matches
* foo bar
* baz boom
matches
.list
* foo bar
* baz boom
.list.
The explicit `.list` is only needed when the parser cannot guess from
the context.
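
The equivalence of the three forms can be sketched as a small normalizer in Python. This is a simplified illustration, not the perlkwid rules: it assumes any non-blank line that does not start with "* " continues the previous item, and it simply skips the explicit .list wrapper lines. The name list_items is hypothetical.

```python
def list_items(lines):
    """Collapse a bullet list into one string per item."""
    items = []
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped in (".list", ".list."):
            continue                      # blank line or explicit wrapper
        if stripped.startswith("* "):
            items.append(stripped[2:])    # a new item starts here
        elif items:
            items[-1] += " " + stripped   # continuation of the last item
    return items

wrapped = ["* foo", "bar", "* baz", "boom"]
flat = ["* foo bar", "* baz boom"]
explicit = [".list", "* foo bar", "* baz boom", ".list."]
print(list_items(wrapped) == list_items(flat) == list_items(explicit))
```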
...
I would encourage those interested in further fleshing out Kwid to join
irc://irc.freenode.net/#kwid where all of this is actively being
discussed.
Cheers, Brian
vs Kwid:
`$x > $y` is about as *easy* as it gets in [Perl]
Did you really read `perlkwid.kwid`? There is simply no mention
of `[=...]` as a markup option, which makes me wonder where you
got it from?
> However, saving a couple of keystrokes and cleaning up the above text is
> inconsequential compared to the massive savings in terms of taking
> advantage of the legions of people who are learning Wiki syntax these
> days. Making POD *more* Wiki-like without sacrificing useful features of
> POD is invaluable in terms of tech writers and other
> non-Perl-programmers writing useful docs in POD!
Well said!
Cheers, Brian
> Kwid does this by formally changing
>
> X<...>
>
> into
>
> {X...X}
Ok, where is THAT proposal?! I'm reading the doc that's in
doc/perlkwid.kwid in the pugs source tree. Hmmm... odd, I just did an
update and it's GONE now... was I looking at some phantom doc that had
an old spec for Kwid?!
> Where `X` is any Pod code like `B`, `I` or `C`. Since there are only 3
> codes in common use (ignore `L` for a second), Kwid thus uses {*bold*}
> {/italic/} and {`code`}.
Well, I'm personally not fond of the bare-bracketing with {}, but as
long as it's not a stand-alone /italic/ like it was in the original doc,
that sounds fine. Why {/foo/} is more readable than I[foo], I'm not
sure... but I'll try to take your word for it.
> For L<...>, I decided to use the very common wiki idiom of [...] for a
> link. Everything in the `...` is the same as Pod.
There, I think you're making a small mistake, but not a huge one. I'd
separate out magical wiki-like [foo] from pedantic, pod-like L[foo] so
that you can get either one. Wiki's [foo] is like a URN, where POD's
L[foo] is more in tune with a relative URL.
> > While I appreciate several features of Kwid, I feel that it should not
> > replace POD without first adopting a POD-like simplicity.
> >
> > Some people have argued that Kwid is only a syntactic change to
> > Pod. I would argue that they are correct.
> >
> > They are demonstrably wrong. You cannot Parse Kwid correctly by changing
> > the syntax of a POD parser.
> The behaviour of this is completely consistent. You may need to reread the
> perlkwid document for it has recently changed.
Apparently.
> ie
>
> * foo
> bar
> * baz
> boom
>
> matches
>
> * foo bar
> * baz boom
>
> matches
>
> .list
> * foo bar
> * baz boom
> .list.
Hrm...
How, then do you differentiate:
* Bullet list.
1. Numbered list.
Other
Term/definition lists
? In POD, that would be:
=item *
Bullet list
=item 1
Numbered list.
=item Other
Term/definition lists
> I would encourage those interested in further fleshing out Kwid to join
> irc://irc.freenode.net/#kwid where all of this is actively being
> discussed.
Sorry, no access to IRC at work. If the specification of core pieces of
P6 is being done off-list, why is there a list?
> vs Kwid:
>
> `$x > $y` is about as *easy* as it gets in [Perl]
>
> Did you really read `perlkwid.kwid`?
Yes, and can you please stop asking that question? I read it several
times, and you're starting to get just this side of insulting. If I got
something wrong, fine, say so. Stop trying to dismiss everything else
I've said by suggesting that I'm completely uninformed.
> There is simply no mention
> of `[=...]` as a markup option, which makes me wonder where you
> got it from?
I got it from that document.... or so I thought. Since it's now deleted,
I'm no longer sure. Having a reference again would be nice. It's hard to
have a conversation about a document that does not exist.
Ok, that said PLEASE DO NOT USE UNBALANCED CHARACTERS TO DELIMIT!
Please, for the love of all that is valid input to any scanner / parser
anywhere, do not re-introduce quoting hell. Really. Please. Don't. I'll
buy you a beer. I swear, just put the unbalanced operator down and step
back.
Sorry, but I use POD specifically because it makes my life simple.
Introducing unbalanced quotes into it would remove that functionality. A
few examples:
"And then I says, `Mabel,' I says, `shut up.'"
The ``` character is no longer used.
And of course, TONS of GNU documentation which uses the TeX-friendly:
This is the way you ``quote'' things.
which means cutting-and-pasting such docs is now much more
labor-intensive.
Sorry, it has been moved around the pugs source tree a bit. It is
currently swinging from the documentation branch:
ext/Pugs-Documentation/perlkwid.kwid
> > Where `X` is any Pod code like `B`, `I` or `C`. Since there are only 3
> > codes in common use (ignore `L` for a second), Kwid thus uses {*bold*}
> > {/italic/} and {`code`}.
FYI, it turns out that at least one modern format,
[Markdown|http://daringfireball.net/projects/markdown/syntax], uses
backticks for code. Markdown doesn't really map to the Pod space very
well, but it has a few gems...
> Well, I'm personally not fond of the bare-bracketting with {}, but as
> long as it's not a stand-alone /italic/ like it was in the original doc,
> that sounds fine. Why {/foo/} is more readable than I[foo], I'm not
> sure... but I'll try to take your word for it.
In short you don't need to worry about I[[ $foo[3] ]]. Since the ending
marker is '/}', you only ever need to worry about escaping anything but
'/}' itself. I might as well show how that would be done:
{/foo \/} bar/}
{{/foo /} bar/}}
{/foo { /} } bar/}
Those are 3 possible ways to make I<foo /} bar>. Note that '{ ' and ' }'
are the "asis" or "leave me alone" indicators. But the real point is
that '/}' is rather unlikely to ever show up in italics outside this
discussion.
> > For L<...>, I decided to use the very common wiki idiom of [...] for a
> > link. Everything in the `...` is the same as Pod.
>
> There, I think you're making a small mistake, but not a huge one. I'd
> separate out magical wiki-like [foo] from pedantic, pod-like L[foo] so
> that you can get either one. Wiki's [foo] is like a URN, where POD's
> L[foo] is more in tune with a relative URL.
So I will give a little extra info on this...
The idea is to DWIM and there is a lot you can do with the `[...|.../...]`
syntax. Pod's strict syntax is:
L<text|resource/"section">
/Text/ is obviously the text that should render. /Resource/ can be a
local manpage (ie another Pod document) and then /section/ is a section
in that doc. If /resource/ is empty, the current document is assumed.
/Resource/ can also be a fully qualified url and in that case section
does not apply.
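
The three parts described above can be pulled apart with a short Python sketch. It is an illustration only: the Link tuple and parse_link are names invented for the example, the URL test is a simple "scheme://" check, and the section name "Lists" in the usage lines is hypothetical.

```python
import re
from typing import NamedTuple, Optional

class Link(NamedTuple):
    text: Optional[str]      # text to render, if given
    resource: Optional[str]  # manpage or URL; None means the current document
    section: Optional[str]   # section within the resource, if any

_URL = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*://")

def parse_link(body: str) -> Link:
    """Split the inside of a link, i.e. text|resource/"section"."""
    text, sep, target = body.partition("|")
    if not sep:                       # no explicit text part
        text, target = None, body
    if _URL.match(target):            # a URL has no section
        return Link(text, target, None)
    resource, _, section = target.partition("/")
    return Link(text, resource or None, section.strip('"') or None)

print(parse_link('perlkwid/"Lists"'))
print(parse_link("The Pugs Source|http://svn.openfoundry.org/pugs"))
```

The second call shows the named hyperlink form (text plus URL) that Kwid allows.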
For some reason Pod does not allow L<text|url> but there seems to be no
obvious reason. (This is the only major thing where Kwid strays from
Pod's info model).
I am not certain what use case `L[...]` could get you that isn't already
covered by `[...]`.
This makes no sense in html and perlpod says:
* And perhaps most importantly, keep the items consistent: either
use "=item *" for all of them, to produce bullets; or use
"=item 1.", "=item 2.", etc., to produce numbered lists; or use
"=item foo", "=item bar", etc. -- namely, things that look
nothing like bullets or numbers.
In Kwid, therefore, this:
* Bullet list.
+ Numbered list.
- Other
Term/definition lists
would produce 3 single item lists. You can obviously switch types in sublists:
* Bullet list.
++ Numbered list.
* another bullet
-- Other
Term/definition lists
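
Read that way, the list type and nesting depth can be recovered from the marker alone. A minimal Python sketch, assuming only what the examples above show (* for bullets, + for numbered, - for term/definition, and a repeated marker for a sublist); the name classify is made up:

```python
import re

_MARKER = re.compile(r"^(\*+|\++|-+) (.*)$")
_KIND = {"*": "bullet", "+": "numbered", "-": "definition"}

def classify(line):
    """Return (kind, depth, text) for a list line, or None for other lines."""
    m = _MARKER.match(line)
    if m is None:
        return None
    marker, text = m.groups()
    return _KIND[marker[0]], len(marker), text

for line in ["* Bullet list.", "++ Numbered list.", "-- Other",
             "Term/definition lists"]:
    print(classify(line))
```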
> > I would encourage those interested in further fleshing out Kwid to join
> > irc://irc.freenode.net/#kwid where all of this is actively being
> > discussed.
>
> Sorry, no access to IRC at work. If the specification of core pieces of
> P6 are being done off-list, why is there a list?
Honestly this project was started as an /experiment/ and was not
intended to distract p6l. Kwid requires no extra input from the language
side as long as:
=kwid
...
=cut
is ignored by the interpreter. This turns out to be the case with both
`perl` (Perl 5) and `pugs`. I am fine with some mailing list discussion
but I would rather spend the cycles on a reference implementation that
could be easily modified later.
Cheers, Brian
Aaron, /please/ take no offense. I just don't understand where you
picked `[=...]` up other than that is the (hated) syntax artifact of the
original `CGI::Kwiki`. I don't recall ever using it in regards to Kwid.
I'm sure there's a reasonable explanation. :)
> > There is simply no mention
> > of `[=...]` as a markup option, which makes me wonder where you
> > got it from?
>
>
> I got it from that document.... or so I thought. Since it's now deleted,
> I'm no longer sure. Having a reference again would be nice. It's hard to
> have a conversation about a document that does not exist.
In my first mail of the thread I pointed to it:
http://svn.openfoundry.org/pugs/ext/Pugs-Documentation/perlkwid.kwid
As I said in my last mail, it has moved around a bit, so pardon our dust.
> Ok, that said PLEASE DO NOT USE UNBALANCED CHARACTERS TO DELIMIT!
> Please, for the love of all that is valid input to any scanner / parser
> anywhere, do not re-introduce quoting hell. Really. Please. Don't. I'll
> buy you a beer. I swear, just put the unbalanced operator down and step
> back.
A beer is tempting...
> Sorry, but I use POD specifically because it makes my life simple.
> Introducing unbalanced quotes into it would remove that functionality. A
> few examples:
>
> "And then I says, `Mabel,' I says, `shut up.'"
>
> The ``` character is no longer used.
>
> And of course, TONS of Gnu documentation which uses the TeX-friendly:
>
> This is the way you ``quote'' things.
No problem with this example. `` doesn't /hug/ anything so it shows up
asis. Cutting and pasting LaTeX doesn't mess anything up in this regard.
But really, you'll likely refactor it to:
This is the way you "quote" things.
Since `` and '' don't do anything for you in either Pod or Kwid.
Cheers, Brian
No, I am relating simplicity and consistency to usability. If it
costs two extra keystrokes, I'm cool with that.
> and nothing could be simpler than:
>
> C[$x > $y] is about as B[easy] as it gets in [Perl]
C[$x[0] > $y] # hmmm...parser ok with that?
C[$x[0] > $] # hmmm...error, but what was intended: $y] or $]]?
C<< $x[0] > $y >> # parser's ok (so's the human)
C<< $x[0] > $ >> # oh, obviously $y was intended
> However, saving a couple of keystrokes and cleaning up the above text is
> inconsequential compared to
"...the power of the Force." Sorry, had to say it.
> the massive savings in terms of taking
> advantage of the legions of people who are learning Wiki syntax these
> days. Making POD *more* Wiki-like without sacrificing useful features of
> POD is invaluable in terms of tech writers and other
> non-Perl-programmers writing useful docs in POD!
Here's the real crux of your argument, and the real crux of my problem
with this approach. I don't like Wiki syntax; to me, it seems
arbitrary and non-unified. I use Wikis, I run one, I recognize their
usefulness. I just don't like them.
Here are some of the formatting rules for TWiki (the Wiki version I
use):
1) Elements of a bulleted list must match /^ {3}\* /
2) Elements of a numbered list must match /^ {3}1 /
3) Headings must match /^----*\++/. Number of +s determines level
4) *bold*
5) /italic/
6) =fixed font=
7) <verbatim> put text to be rendered as-is here </verbatim>
What is the organizing principle? What similarities do they have?
Quick, what level heading is this: +++++ ? And this is just the
beginning...I didn't even get into the weird cases like ==bold fixed
font== and __bold italic__, which have no perceptible relation to
their component pieces (I would have expected */bold italics/*). Yes,
it's powerful and it can do useful things, but as soon as I stray from
the most basic stuff I find myself going back to the docs to look up
how it's done.
Contrast this to POD (I'm not trying for point-to-point equivalence):
1) All formatting starts with = in the first column.
2) Every POD command must have a blank line above and below it.
3) A list of any type starts with =over N and finishes with =back
4) List items are denoted with =item X where X is either * (bullets),
an int (numbered), or word/phrase. Use only one type per list.
5) Headings are denoted by =head1, =head2, etc
6) Formatting effects are done with X<text> where X is one of:
B (bold), C (code), I (italics). You may also use
X<< text >> or X<<< text >>> if you have < or > in your text.
7) Text that is indented will be rendered as-is in fixed width font.
Aside from links, that's pretty much the entire perlpodtut boiled down
into 7 bullets; a little experimentation to get the hang of it and it
all holds together nicely, easy to remember.
I freely admit that the link syntax in POD is difficult to manage and
not as powerful as it could be.
--Dks
PS I'm subscribed to the list so feel free to just reply there; I
don't need a personal copy as well.
First off, thanks for your kind responses. I'm sure I just got confused
by some web page I was looking at, and overwrote part of my stack that
I'd just populated from the Kwid doc. And thanks also for pointing me to
the Kwid docs where they live now.
> In short you don't need to worry about I[[ $foo[3] ]]. Since the ending
> marker is '/}', you only ever need to worry about escaping anything but
> '/}' itself. I might as well show how that would be done:
We're suffering a major disconnect over the nature of bracketing.
I see no reason to I[[ $foo[3] ]] at all.
That would simply be I[$foo[3]] ... we are using a real parser here, no?
I can't imagine basing this on some pile of regexps, and we all have
"matched the balanced brackets" tools at our disposal, regardless of
what parser / parser-generator we're using these days.
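
To show what balanced-bracket matching means here, independent of any particular parser generator, a depth-counting scan can be sketched in Python. This is only an illustration of the idea and not the attached grammar; match_bracket and code_body are hypothetical names.

```python
def match_bracket(text: str, start: int) -> int:
    """Return the index of the ']' that closes the '[' at text[start]."""
    if text[start] != "[":
        raise ValueError("start must point at '['")
    depth = 0
    for i in range(start, len(text)):
        if text[i] == "[":
            depth += 1
        elif text[i] == "]":
            depth -= 1
            if depth == 0:
                return i
    raise ValueError("unbalanced '[' at index %d" % start)

def code_body(markup: str) -> str:
    """Extract the body of a single X[...] code by bracket counting."""
    open_at = markup.index("[")
    return markup[open_at + 1 : match_bracket(markup, open_at)]

print(code_body("I[$foo[3]]"))
print(code_body("C[$x[0] > $y]"))
```

Nested, balanced brackets need no doubling under this scheme; only content with a genuinely unbalanced bracket would need the alternate {} delimiter.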
Here's a simple Parse::RecDescent grammar for my proposal, so that we
can talk about it in more reasonable terms. Please note that I'm
TERRIBLE with P::RD, so I'm sure someone can figure out why I keep
ending up with the string "text_chunk" in my resulting syntax tree ;-)
See attached program and sample input. Just run parseajskwid.pl on
ajskwid.kwid.
> For some reason Pod does not allow L<text|url> but there seems to be no
> obvious reason. (This is the only major thing where Kwid strays from
> Pod's info model).
That's not POD's info model, that's POD's implementation limitation.
> I am not certain what use case `L[...]` could get you that isn't already
> covered by `[...]`.
I'm very happy with the modern Wiki convention (keep in mind, when we
talk about Wiki, we're talking about something that's either nearly as
old as or older than the Web, depending on what you count as its birth)
of using [...] as a sort of magical indexer. Like I said elsewhere, you
might have:
[Kwid] in your document.
This is a hint that you expect there to be a thing named "Kwid"
somewhere and you wish that somewhere to be applied thusly:
L[Kwid|somewhere] in your document.
Where L[Kwid] would simply fail because it is as strict as POD, and it
won't find L[Kwid|perlkwid].
Other examples of this DWIMery:
[http://www.perl.org/] => L[http://www.perl.org/|http://www.perl.org]
[;-)] =>
=for html <img src="winksmily.png" alt=";-)" />
=for !html ;-)
> > How, then do you differentiate:
> >
> > * Bullet list.
> > 1. Numbered list.
> > Other
> > Term/definition lists
[...]
> In Kwid, therefore, this:
>
> * Bullet list.
> + Numbered list.
> - Other
> Term/definition lists
That was the answer I was looking for, thanks.
I'm not thrilled with it (again, too many special characters that people
might have thought they could get away with using in their
documentation), but it's not too bad at all.
> > Sorry, no access to IRC at work. If the specification of core pieces of
> > P6 are being done off-list, why is there a list?
>
> Honestly this project was started as an /experiment/ and was not
> intended to distract p6l. Kwid requires no extra input from the language
> side as long as:
>
> =kwid
> ...
> =cut
Well, look over AJS Kwid, and see what you think. The bullet syntax you
give could work fine as a replacement for what I demonstrate, but I
think everything else is pretty much 1:1. Now it's just a matter of: do
you make it Wikiish or PODish?
Yes, yes, yes.
Pod is one of the things Perl 5 did almost exactly right. It's
simple, intuitive, and stays out of your way. It gives you most of
the formatting primitives you actually *need*, and nicely balances the
need for easy-to-remember and easy-to-type formatting codes with the
need to avoid using them on accident. It's a very clean,
low-punctuation format, which makes it visually distinctive from the
surrounding code.
Specifically, I like the use of angle brackets in Pod. Angle brackets
are simple, distinctive shapes; they remain wide in variable-width
fonts; they're associated with formatting codes in my
(HTML-influenced) mind. The most common use of them in Perl 5--method
call/dereference--is going away in Perl 6, which makes them even more
usable. (I never have a problem correctly marking up C<< $foo > $bar
>>, but occasionally I carelessly type C<$foo->bar>.)
Pod needs incremental improvements--tables, (maybe) footnotes, simpler
links, tweaks to =begin/=end, etc. Pod does *not* need to be ripped
out and replaced with something very different, especially something
that involves adding "line noise" to documents intended for human
consumption.
In my mind at least, Pod has five goals:
1. Simple.
2. Adequate.
3. Easy to write.
4. Easy to convert.
5. Readable without a formatter.
#5 may be last on the list, but it's not least.
--
Brent 'Dax' Royal-Gordon <br...@brentdax.com>
Perl and Parrot hacker
"I used to have a life, but I liked mail-reading so much better."
Absolutely, and that's why I'd like to see more POD details preserved.
> It's simple, intuitive, and stays out of your way. It gives you most of
> the formatting primitives you actually *need*, and nicely balances the
> need for easy-to-remember and easy-to-type formatting codes with the
> need to avoid using them on accident. It's a very clean,
> low-punctuation format, which makes it visually distinctive from the
> surrounding code.
This is the spirit in which I've absorbed some of Kwid into my proposal
only where it supports those goals. I've removed some extra formatting
characters because I thought that they added too many chances for
overlap with real documentation. I've also searched my local PODs to see
where AJS Kwid would overlap POD and cause problems (6.4% of my local
PODs, for example, start some lines with "*", which would be a minor,
but notable problem, but 13.1% start a line with "-" which is a larger
problem, thus my "*1" which could easily be "*-" if we prefer that).
> Specifically, I like the use of angle brackets in Pod. Angle brackets
> are simple, distinctive shapes; they remain wide in variable-width
This is aesthetic preference. I could cite the reasons that I have an
aesthetic preference for the other syntax, but the reality is that angle
brackets aren't angle brackets; they are less-than (E<lt>) and greater-
than signs (E<gt>). We ignore this fact at our peril, and the hacks in
pod syntax (e.g. C<< < >>) to get around this are glaring anti-
huffmanisms.
> The most common use of them in Perl 5--method call/dereference--is
> going away in Perl 6
Hmm, I remain unconvinced of that as the most common use, especially
with the copious use of =>. Still, in my local source tree you're right,
though by < a factor of 2.
Perl 6 also adds new uses of E<gt> and E<lt> for pipelining, and further
expands the usefulness of the => operator as a pair constructor. Rules
also add new uses of these characters, but those are balanced, so
improving POD with a real grammar specification would solve for that.
> Pod needs incremental improvements--tables
Oops, forgot that one. I'll add it tonight, when I get home from work.
> (maybe) footnotes
Good point, and I'd add that to X[...] rather than introducing something
new, personally.
> simpler links, tweaks to =begin/=end, etc.
I think everything you list above is EXACTLY AJS Kwid, with one
exception, which is the dreaded paradigm shift of using [] instead of <>
Much as it may be an EMOTIONAL sticking point, it's a very minor thing.
If we can agree on everything else, and I suspect we can, then let's
come back to that.
> Pod does *not* need to be ripped
> out and replaced with something very different,
yes, yes, yes!
> especially something
> that involves adding "line noise" to documents intended for human
> consumption.
yes, yes, yes!
Thanks Brent, I'm not sure if you intended your mail as an endorsement,
but other than one sticking point, you and I appear to be on the same
page. Thank you for your message.
> C[$x[0] > $y] # hmmm...parser ok with that?
> C[$x[0] > $] # hmmm...error, but what was intended: $y] or $]]?
In the former case, it's fine. See the grammar I sent last night.
In the latter case, you would get balanced-[] matching, and given how
hard it is for PERL to do the right thing, there, I think it's fair to
fall back on "only perl can parse perl", and just do what the eye
suggests is correct. Remember that POD (and thus Kwid) are not intended
to be Perl-specific, just Perl-friendly. You can always:
C{$x[0] > $]}
If you need unbalanced []s
That said, I'd be ok with something like Z[name|text] and Z{name|...}
for doing external grammars, but it'd still have to live inside one of
the balanced operators correctly.
Please also note that as a side-effect of using extract_bracketed in my
grammar,
C[\]]
works. This introduces escaping which may or may not be good. It was not
a conscious decision, just a side-effect. I PREFER:
C{]}
Which also works.
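To make the behaviour above concrete, here is a rough Python sketch of the matching rule. It is not the Parse::RecDescent grammar from the attachment; the name `read_code` is made up for illustration, and dropping the backslash from an escaped character is an assumption, not something the grammar is known to do.

```python
# Hypothetical sketch: read one formatting code such as C[...] or C{...}.
# Only the delimiter pair that opened the code is counted, so C{...} can
# hold unbalanced square brackets and vice versa.
PAIRS = {"[": "]", "{": "}"}

def read_code(text):
    """Return (letter, contents) for a formatting code at the start of text."""
    letter, open_ch = text[0], text[1]
    close_ch = PAIRS[open_ch]
    depth = 1
    out = []
    i = 2
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            # Assumption: a backslash protects the next character.
            out.append(text[i + 1])
            i += 2
            continue
        if ch == open_ch:
            depth += 1
        elif ch == close_ch:
            depth -= 1
            if depth == 0:
                return letter, "".join(out)
        out.append(ch)
        i += 1
    raise ValueError("unbalanced formatting code")

print(read_code("C[$x[0] > $y]"))
print(read_code("C{$x[0] > $]}"))
print(read_code("C[\\]]"))
```

Under these assumptions C[$x[0] > $] also parses without complaint, as `$x[0] > $`, which is the "parser's ok, human might care" case.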
> C<< $x[0] > $y >> # parser's ok (so's the human)
> C<< $x[0] > $ >> # oh, obviously $y was intended
By the same token:
C[ $x[0] > $y ] # Parser's ok
C[ $x[0] > $ ] # Parser's ok... human might care
> > the massive savings in terms of taking
> > advantage of the legions of people who are learning Wiki syntax these
> > days. Making POD *more* Wiki-like without sacrificing useful features of
> > POD is invaluable in terms of tech writers and other
> > non-Perl-programmers writing useful docs in POD!
>
> Here's the real crux of your argument, and the real crux of my problem
> with this approach. I don't like Wiki syntax; to me, it seems
> arbitrary and non-unified.
Agreed.
> I use Wikis, I run one, I recognize their usefulness. I just don't like them.
Fair enough. I don't like them either, for many reasons. I like them in
equal measure. It's a bit like sendmail ;-)
> Here are some of the formatting rules for TWiki (the Wiki version I
> use):
>
> 1) Elements of a bulleted list must match /^ {3}\* /
> 2) Elements of a numbered list must match /^ {3}1 /
> 3) Headings must match /^----*\++/. Number of +s determines level
> 4) *bold*
> 5) /italic/
> 6) =fixed font=
> 7) <verbatim> put text to be rendered as-is here </verbatim>
>
> What is the organizing principle? What similarities do they have?
Yes, yes, yes. I agree. See AJS Kwid. Please.
> Contrast this to POD (I'm not trying for point-to-point equivalence):
>
> 1) All formatting starts with = in the first column.
Add ONE new item to that list: *
> 2) Every POD command must have a blank line above and below it.
Why do you want TWO ways of determining commands? I can see requiring a
blank line above OR below, just for readability, but not both. I'm a POD
lover. I've used POD since the mid-90s and I'm THRILLED with it.
However, there's that nagging problem with having to type:
\n\n=head1 Foo\n\n=head2 Introduction\n\nTopics:\n\n=over 5\n
\n=item *\n\nFoo vs. Bar\n\n=item *\n\nC<\n>\n
AJS Kwid:
\n= Foo\n\n== Introduction\n\nTopics:\n* Foo vs Bar\n* C[\n]\n
> 3) A list of any type starts with =over N and finishes with =back
Again, you're asking for a second way to determine what point 1 already
told you. If this were code, I might buy that, but it's documentation.
We like documentation. We want to encourage documentation and =over
ain't the way to encourage nothing ;-)
If you prefer a requirement for a leading blank line for readability, I
could see that, and we could talk about it. I do see the value in
"=begin list/=end list", but only in cases like this:
=begin list
* a
* b
=end list
=begin list
* Apples
* Grapes
=end list
Where you wish to make it clear that you are ending the first list.
Which do you find more readable:
=over 5
=item C<--help>
Help is given
=item C<--verbose>
Verbosity is given
=item C<--debug>
Debugging is done
=back
vs:
=begin list
*= C[--help]
Help is given
*= C[--verbose]
Verbosity is given
*= C[--debug]
Debugging is done
=end list
Replace *= with + and =begin list with .list and =end list with .list.
for the original Kwid proposal.
> 4) List items are denoted with =item X where X is either * (bullets),
> an int (numbered), or word/phrase. Use only one type per list.
> 5) Headings are denoted by =head1, =head2, etc
> 6) Formatting effects are done with X<text> where X is one of:
> B (bold), C (code), I (italics). You may also use
> X<< text >> or X<<< text >>> if you have < or > in your text.
> 7) Text that is indented will be rendered as-is in fixed width font.
There's a 1:1 mapping with all of the above except for the <> vs []/{}
and "=head1 heading" vs "= heading", which are simply syntactic
transformations. If you want "=head1", then you can add a pre-processor,
or we could even have a "=begin head1 / =end head1" for the formatting
bondage crowd.
> PS I'm subscribed to the list so feel free to just reply there; I
> don't need a personal copy as well.
Ok, will do. Some people filter mail such that a reply to them CCed is
seen as "personal" (I do this, for example), some don't. Glad to do what
you're comfortable with.
Other than awareness, this really doesn't have a point to it.
In ASCII, ' was meant as an apostrophe, but we use it as a quote.
Yen was never meant to have anything to do with zipping.
Guillemets originally had nothing to do with parallelization.
The hacks for square brackets are exactly the same. Think of how
C[@foo[0]] would be parsed without nesting. And if you say nesting fixes
all, then consider C["]"] as a counter-example. Are you willing to parse
code in a "simple" documentation format?
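The counter-example is easy to reproduce. Below is a minimal Python sketch of a matcher that only counts brackets and knows nothing about Perl quoting; the function name `code_contents` is hypothetical and this is not any existing Pod or Kwid parser.

```python
# Hypothetical sketch: find the contents of C[...] by bracket counting alone.
def code_contents(text):
    """Return the text between C[ and its matching ]."""
    if not text.startswith("C["):
        raise ValueError("not a C[...] code")
    depth = 0
    for i, ch in enumerate(text[1:], start=1):
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
            if depth == 0:
                return text[2:i]
    raise ValueError("unbalanced brackets")

print(code_contents("C[@foo[0]]"))   # nesting keeps the subscript intact
print(code_contents('C["]"]'))       # stops at the ] inside the string
```

Nesting rescues the first case, but in the second the code ends at the bracket inside the quotes, leaving `"]` as stray text unless the formatter learns to parse string literals.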
> > Pod needs incremental improvements--tables
> Oops, forgot that one. I'll add it tonight, when I get home from work.
See PodTables in the Pugs wiki.
> > Pod does *not* need to be ripped out and replaced with something
> > very different,
> yes, yes, yes!
Agreed.
> > especially something that involves adding "line noise" to documents
> > intended for human consumption.
> yes, yes, yes!
Agreed, though I like *bold*, /italic/, _underline_ and `code` very
much. I'd like an option to enable it.
Aaron,
I think AJS Kwid is fine if it fits your brain. It doesn't fit mine in
much the same way that Kwid doesn't fit yours, in much the same way that
Pod doesn't quite fit either of ours.
The interesting thing to me is that all 3 syntaxes map over the same
data model and thus are easily interchangeable. The other interesting
thing is that all three could be supported without affecting the Perl5
or Perl6 syntax proper.
Sam "mugwump" Vilain refers to each of these syntaxes as /Pod dialects/.
He is working on more formally defining the common model or "AST" that
these dialects map to.
Given that, I am going to continue working on the Kwid dialect and
developing the Kwid tools I have started. The ideas in AJS Kwid seem too
different to incorporate without muddying the Kwid vision. I encourage
you to work on the AJS Kwid dialect too. I would guess that the Kwid tools
I'm working on, when completed, could be easily adapted to AJS Kwid.
Cheers, Brian
PS All my work thus far is available in the pugs repository.
Since when is anything in Perl 6, except its name, set in stone?
PodTables is a more detailed and more consistent approach to a
suggestion I made a long time ago.
It also states what I found important when inventing the syntax: things
to consider when inventing your own syntax for the same feature.
> > > Pod needs incremental improvements--tables
> > Oops, forgot that one. I'll add it tonight, when I get home from work.
>
> See PodTables in the Pugs wiki.
Or see the archive of this list, where we hammered it out previously.
YMMV. I'll have the second revision available sometime tonight, I think,
hopefully with an AJS Kwid to KwidData to POD translator (lossy, but
still mostly workable).
> The interesting thing to me is that all 3 syntaxes map over the same
> data model and thus are easily interchangeable. The other interesting
> thing is that all three could be supported without affecting the Perl5
> or Perl6 syntax proper.
If any of the above was news to you, then I suggest you take another
look at why POD (and more generally, any abstract markup language)
exists. If any of the above were NOT true, it would be contrary to the
entire point of an abstract, layout-neutral markup language.
It is, however, contrary to the spirit of POD for you or me to continue
much further down this road (see below).
> Sam "mugwump" Vilain refers to each of these syntaxes as /Pod dialects/.
> He is working on more formally defining the common model or "AST" that
> these dialects map to.
Why? Seriously, why on earth do you want to encourage the proliferation
of variant markup languages?! There aren't enough?
My effort here was to try to PREVENT the proliferation (e.g. by Kwid and
POD butting heads and ending up in a stalemate). The only problem is
that, presented with a compromise, the Kwid folks seem to be content to
ADD it to the list of variants rather than, in fact, compromise and
collapse the list.
I'll continue only as far as is needed to propose this in full as an
example parser / converter, and then I'm going to stop. My goal is not
to proliferate the number of markups further, and I'd MUCH rather see
Perl 6 rely on POD than fragment the SINGLE MOST IMPORTANT TASK in
creating code to share with the world: documentation.
If I'm left on a desert island with POD, then the only part I'll lament
is the desert island.
Only in name. Years of HTML and Perl have trained me to treat these
as bracketing constructs, and Perl 6 is set to increase that use.
> and the hacks in
> pod syntax (e.g. C<< < >>) to get around this are glaring anti-
> huffmanisms.
Whatever bracketing character we decide to use, there will always be
occasions where we need to use it in an unbalanced way within a
formatting code. (Though I do admit that angle brackets are more
likely to be unbalanced than other characters.)
The problem I have with square brackets specifically is that they get
lost really easily, especially in variable-width fonts. Gmail, for
example, displays e-mail in a sans-serif font, and virtually all such
fonts have narrow square brackets. The square brackets in your
examples were visually lost in the surrounding text--without spaces,
square brackets are invisible. That's okay when you're subscripting,
because your brain doesn't really need them to understand what's going
on, but it's not when you're applying and reading formatting codes.
Further, although Perl 6 is the time to make such a change, I'm not
convinced the change is really necessary. We might be able to avoid a
few uses of C<< >>, but is that a big enough win to change *yet
another* aspect of Perl? Especially an aspect programmers can--and
traditionally did--ignore?
> > The most common use of them in Perl 5--method call/dereference--is
> > going away in Perl 6
>
> Hmm, I remain unconvinced of that as the most common use, especially
> with the copious use of =>. Still, in my local source tree you're right,
> though by < a factor of 2.
Are you looking at your entire source tree, or just the Pod in it?
The code in Pod--and especially the short snippets of code typically
included in a C<> construct--is very different from arbitrary Perl
code.
> Perl 6 also adds new uses of E<gt> and E<lt> for pipelining, and further
> expands the usefulness of the => operator as a pair constructor. Rules
> also add new uses of these characters, but those are balanced, so
> improving POD with a real grammar specification would solve for that.
I definitely support intelligently defining the way Pod handles angle
brackets which aren't part of a formatting code. I also think writing
a reference grammar would be an excellent idea.
> Thanks Brent, I'm not sure if you intended your mail as an endorsement,
> but other than one sticking point, you and I appear to be on the same
> page. Thank you for your message.
I intended my e-mail to be an endorsement of Pod as it exists, with
extensions rather than a redesign. I think you have mostly the right
idea, but I really don't think switching to square brackets is
necessary.
By the way, I think I've seen a few people suggest some sort of
syntax-switching mechanism for "Pod6". The day people have to think
about what dialect of Pod they're using is the day Pod dies as a
useful documentation language.
I think you're thinking of something else. I'm talking about Luke's
proposal from this very list, back in Aug, which was a followup on
Larry's comments about my proposal, and which I agreed was far better
than what I had suggested in the first place.
> > and the hacks in
> > pod syntax (e.g. C<< < >>) to get around this are glaring anti-
> > huffmanisms.
>
> Whatever bracketing character we decide to use, there will always be
> occasions where we need to use it in an unbalanced way within a
> formatting code.
Absolutely yes. See C[...] vs C{...} vs C[\]]
I'm pointing out that the ROUTINE case is to need < and > in unbalanced
ways. Hence we have names for them that do not involve the words "open",
"close", "left" or "right". We could name "[", "fobit", but then people
would automatically call "]" a "right-fobit" or a "close-fobit". We look
at a piece of code or documentation that contains an un-balanced fobit
and we get a little chill because it's WRONG somehow. THAT is the kind
of balanced operator you want to tie your documentation to.
> > > The most common use of them in Perl 5--method call/dereference--is
> > > going away in Perl 6
> >
> > Hmm, I remain unconvinced of that as the most common use, especially
> > with the copious use of =>. Still, in my local source tree you're right,
> > though by < a factor of 2.
>
> Are you looking at your entire source tree, or just the Pod in it?
> The code in Pod--and especially the short snippets of code typically
> included in a C<> construct--is very different from arbitrary Perl
> code.
I EXPLICITLY ignored POD. I don't have the find+perl that I used handy
any more, but I was counting state every time I saw an "^=[hoib]" and
decrementing state every time I saw an "^=cut" so that I only counted
code... I probably caught some strange gunk after __END__ and __DATA__
tags, but not enough to throw the stats that far off.
It was 23k vs. 44k.
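Since the original find+perl one-liner is gone, here is a rough Python reconstruction of the counting approach as described. The regexes follow the "^=[hoib]" and "^=cut" description; the function name `count_arrows` and the choice to compare `->` against `=>` are assumptions, and text after __END__ or __DATA__ is still counted, just as it was in the original.

```python
import re

# Hypothetical reconstruction: count arrows in code only, skipping POD.
POD_START = re.compile(r"^=[hoib]")   # =head, =over, =item, =back, =begin
POD_END = re.compile(r"^=cut")

def count_arrows(source):
    """Return counts of '->' and '=>' on lines that are outside POD."""
    in_pod = False
    counts = {"->": 0, "=>": 0}
    for line in source.splitlines():
        if POD_END.match(line):
            in_pod = False
            continue
        if POD_START.match(line):
            in_pod = True
        if in_pod:
            continue
        counts["->"] += line.count("->")
        counts["=>"] += line.count("=>")
    return counts

sample = "\n".join([
    "my %h = (a => 1);",
    "$obj->run;",
    "=head1 NAME",
    "foo->bar => baz",
    "=cut",
    "print $h->{a};",
])
print(count_arrows(sample))
```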
> > Perl 6 also adds new uses of E<gt> and E<lt> for pipelining, and further
> > expands the usefulness of the => operator as a pair constructor. Rules
> > also add new uses of these characters, but those are balanced, so
> > improving POD with a real grammar specification would solve for that.
>
> I definitely support intelligently defining the way Pod handles angle
> brackets which aren't part of a formatting code. I also think writing
> a reference grammar would be an excellent idea.
Already done, see my previous message. It's still brain-dead in some
places, and I need to re-work it to correctly handle nested operations
(e.g. B[I[...]]), but I'll have that soon. It will also, I've decided,
parse POD. Just as Perl 6 parses Perl 5 by recognizing the first few
statements, AJS Kwid will recognize "=head.*" as an indication that POD
is being used, and parse it into KwidData internally so that the same
tools could be used.
I'm stopping once it works. I'm not looking to fork POD. If people don't
like the proposal, once they can play around with it, then I'll drop it.
Well, I don't think anyone wants to see as many POD dialects as there
are wiki text formats (BBCode, anyone?). Maybe there will be something
very close to the original POD, but with a verbose way of making tables,
and an enhanced linking syntax. But otherwise identical to the original
Perl 5 POD.
Note that POD dialects, and differing POD conventions already exist in
Perl 5 and are in common use. They were designed in the original POD
with the =for tag. At the moment, tools like `pod2html' have to be
heavily aware of the POD dialect, which I think is sub-optimal when it
comes to some of the really interesting things people have achieved
with POD. Look at MarkOv's OODoc, or Test::Inline, for instance.
All I'm trying to do is give these beasts a name, and define a
mechanism by which they can be used by tools that only know how to deal
with "standard" documents - thus giving users the freedom to define a
local convention if one of them doesn't quite fit their needs.
Using a local Subversion repository, and Request Tracker, and want to
be able to put hyperlinks in POD to refer to these entities? No
problem, just extend the dialect and add a link style. Then select
from a dozen output tools or variants to see which one works for you.
Sam.
If Brian is correct about the fundamental interchangeability of these
dialects (and I have no reason to think he isn't), may I suggest that
the simple answer is to have a program which can translate from one
dialect to another--just like we distribute pod2man, pod2html, and
pod2text, we would now distribute pod2kwid and ajskwid2pod.
--Dks
--
dst...@dstorrs.com
Even before Brian announced Kwid, I was privately suggesting to Larry that
Markdown (http://daringfireball.net/projects/markdown/) was an excellent
evolution of mark-up notations and might be well suited to Perl 6. At
least...as a second allowable syntax.
And, in my view, Kwid kicks Markdown's butt in terms of its suitability for
Perl documentation. POD itself is brilliant and we should certainly not
abandon it, but it's critical to remember that POD is just an *interface* (or
B<interface>, if you prefer ;-) to Perl's built-in documentation systems. I
strongly believe that Kwid is, for many purposes, a cleaner and less-intrusive
interface, and I for one will be using it (even if I have to build a kwid2pod
translator).
But frankly, I'd rather just be able to write:
=kwid
in place of
=pod
within standard Perl 6.
As for the larger issue of redoing pod, I've appended my notes on where the
Design Team left their discussions when last we discussed it. This might spark
some ideas (but note that I will not be able to respond to them any time soon
-- alas, bread-winning must, for the moment, take precedence over most of my
public activities).
Damian
-----cut----------cut----------cut----------cut----------cut-----
There would be a single consistent rule that says that every POD block
(except raw text blocks) has one of the following three equivalent
syntactic forms:
=begin TYPE OPTIONAL_MULTIWORD_LABEL_TO_END_OF_LINE
BLOCK_CONTENTS_START_HERE_AND_CONTINUE_OVER_MULTIPLE_LINES_UNTIL...
=end TYPE OPTIONAL_SAME_MULTIWORD_LABEL
or:
=for TYPE OPTIONAL_MULTIWORD_LABEL_TO_END_OF_LINE
BLOCK_CONTENTS_START_HERE_AND_CONTINUE_OVER_MULTIPLE_LINES_UNTIL...
<first whitespace-only line or next pod directive>
or:
=TYPE BLOCK_CONTENTS_START_HERE_AND_CONTINUE_OVER_MULTIPLE_LINES_UNTIL...
<first whitespace-only line or pod directive>
For example:
=begin table Table of Contents
Constants 1
Variables 10
Subroutines 33
Everything else 57
=end table
=begin list
=begin item *
Doh
=end item
=begin item *
Ray
=end item
=begin item *
Me
=end item
=end list
=begin comment
This is the most verbose way to write all this
=end comment
Or equivalently:
=for table Table of Contents
Constants 1
Variables 10
Subroutines 33
Everything else 57
=begin list
=for item *
Doh
=for item *
Ray
=for item *
Me
=end list
=for comment
This is a less verbose way to write all this
Or also equivalently:
=for table Table of Contents
Constants 1
Variables 10
Subroutines 33
Everything else 57
=for list
=item * Doh
=item * Ray
=item * Me
=comment This is the least verbose way to write all this
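As a rough illustration of how the three forms collapse to the same (type, label, contents) triple, here is a minimal Python sketch. The name `extract_blocks` is hypothetical, and nested blocks (such as items inside a list) are deliberately left out to keep it short, so it covers only flat blocks.

```python
import re

# Hypothetical sketch: normalise =begin/=end, =for and =TYPE blocks.
DIRECTIVE = re.compile(r"^=(\w+)[ \t]*(.*)$")

def extract_blocks(text):
    """Return a list of (type, label, contents) for flat, un-nested blocks."""
    lines = text.splitlines()
    blocks = []
    i = 0
    while i < len(lines):
        m = DIRECTIVE.match(lines[i])
        i += 1
        if not m or m.group(1) == "end":
            continue
        word, rest = m.group(1), m.group(2).strip()
        body = []
        if word == "begin":
            btype, _, label = rest.partition(" ")
            end = re.compile(r"^=end\s+" + re.escape(btype) + r"\b")
            while i < len(lines) and not end.match(lines[i]):
                body.append(lines[i])
                i += 1
            i += 1  # step over the =end line
        else:
            if word == "for":
                btype, _, label = rest.partition(" ")
            else:
                btype, label = word, ""
                body.append(rest)
            # Contents run to the first blank line or the next directive.
            while i < len(lines) and lines[i].strip() and not DIRECTIVE.match(lines[i]):
                body.append(lines[i])
                i += 1
        blocks.append((btype, label.strip(), "\n".join(body).strip()))
    return blocks

print(extract_blocks("=for table Table of Contents\nConstants 1\nVariables 10\n"))
```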
POD formatters could then be simply and consistently implemented by
inheriting from a standard Pod::Base class, which would provide a
C<.parse_pod> method that sequentially extracts each block construct (from
whichever of the three syntaxes), including raw text blocks (which are
actually just unlabelled C<=for body> blocks), and raw code blocks
(which are actually just unlabelled C<=for verbatim> blocks).
C<.parse_pod> would be something like:
multi method parse_pod ($self: Str $from_str) {
    # Get sequence of POD blocks to be parsed
    # Using standard rules...
    my @blocks = $self.extract_pod($from_str);
    # Dispatch each block to be processed by the
    # appropriate method...
    for @blocks -> $block {
        my ($type, $label, $contents) = $block<type label contents>;
        $self.$type($label, $contents);
    }
}
When each C<.$type()> method is called, both the label and contents would
be passed as simple strings (either of which might, of course, be empty if
the corresponding component had been omitted from the block). The
(multi)method thus selected would then be responsible for
formatting/processing/whatevering the label and contents passed to it:
method head1 ($label, $contents) {...}
method head2 ($label, $contents) {...}
method list ($label, $contents) {...}
method item ($label, $contents) {...}
# etc.
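The same dispatch-by-block-type idea can be sketched in Python for anyone who wants to experiment today. This is only an analogue of the proposed Pod::Base, not its design: the class names `PodBase` and `TextFormatter` and the dictionary layout of a block are assumptions made for the example.

```python
# Hypothetical analogue of the proposed Pod::Base dispatch scheme.
class PodBase:
    def parse_pod(self, blocks):
        """Call the method named after each block's type, in order."""
        results = []
        for block in blocks:
            handler = getattr(self, block["type"], None)
            if handler is None:
                raise NotImplementedError("no handler for =" + block["type"])
            results.append(handler(block["label"], block["contents"]))
        return results

class TextFormatter(PodBase):
    """A toy formatter that overrides only the block types it cares about."""
    def head1(self, label, contents):
        return contents.upper()

    def item(self, label, contents):
        return "%s %s" % (label, contents)

blocks = [
    {"type": "head1", "label": "", "contents": "Title here"},
    {"type": "item", "label": "*", "contents": "Doh"},
]
print(TextFormatter().parse_pod(blocks))
```

A formatter that wants different output subclasses the base and overrides individual block methods, which is the extension point the scheme relies on.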
Note that under this scheme the Perl5 syntax for:
=head1 Title here
=head2 Subtitle here
=head3 Subsubtitle here
=head4 Subsubsubsubtitle here
=item Bullet Item text
=cut
=pod
would mostly all continue to work (though, of course, C<=cut> and
C<=pod> would actually be dealt with directly within C<.extract_pod>).
The most noticeable change would be that something like:
=item Bullet
Text of item here
would now have to be written either as:
=item Bullet Text of item here
(an improvement, I suspect), or as:
=item Bullet
Text of item here
(assuming the .item() method was clever enough to remove leading
whitespace from the contents), or as:
=for item Bullet
Text of item here
or:
=begin item Bullet
Text of item here
=end item
Of course:
=over 4
...
=back
would no longer work; they would have to be written something like:
=begin indent 4
...
=end indent
Or better still, removed entirely and replaced with:
=begin list
...
=end list
At the moment they're odd-fish: not a mark-up block, but a layout block.
And hence intrinsically evil. ;-)
And if you wanted to *change* how POD is processed by perl6, you'd just
use a C<=use> directive to install your own class:
=use Pod::Quibble
as the POD handler. That class would probably be derived from Pod::Base
with some polymorphic or multimorphic adjustments to one or more of
C<.extract_pod>, C<.parse_pod>, or the various C<.head1>, C<.head2>,
C<.list>, C<.item>, C<.table>, C<.data>, etc. methods.
We also intend to unify __DATA__ and POD, and make both accessible (at
compile time and run time) to the program.
The single Perl 5 __DATA__ section would become:
=begin data
...
=end data
and you could define multiple separate data sections (a la Inline::Files)
with:
=begin data LABEL1
...
=end data
=begin data LABEL2
...
=end data
# etc.
Of course, under the syntactic equivalences described above,
you could also write those as:
=for data LABEL1
...
=for data LABEL2
...
# etc.
or:
=data LABEL1 ...
=data LABEL2 ...
# etc.
These would simply be parsed by the standard Pod::Inline class (or whatever
it's eventually called), running as part of the perl6 parser.
Perl 6 would provide two standard file-scoped variables named
C<%=POD> and C<%=DATA>, which would provide access to all the file-
related metadata:
%=POD --> structured POD object
%=DATA --> structured DATA object (part of %=POD)
The "structured POD object" is an object that provides both sequential
and named access (lazily, of course!) to the overall POD structure of the
current file (including any =data sections):
%=POD<head1> --> Array of POD objects representing C<=head1>
chunks
%=POD<head1>[$n] --> structured POD object representing Nth
C<=head1> chunk
%=POD[$n] --> structured POD object representing Nth
C<=head1> chunk (shorthand)
%=POD[$n].text --> Text of Nth C<=head1> directive
%=POD[$n].loc --> Line range of the Nth C<=head1> directive
%=POD<head1>[$n]<head2>[$m]
--> structured POD object representing the
Mth C<=head2> chunk within Nth C<=head1>
section
%=POD[$n]<head2>[$m] --> structured POD object representing the
Mth C<=head2> chunk within Nth C<=head1>
section (shorthand)
%=POD[$n][$m] --> structured POD object representing the
Mth C<=head2> chunk within Nth C<=head1>
section (evenshorterhand)
%=POD<head2> --> Array of POD objects representing C<=head2>
chunks (from all C<=head1> sections)
%=POD<table>[$t] --> POD object representing the Tth C<=table>
%=POD<table>[$t].text --> Caption of the Tth C<=table>
%=POD<table>[$t].loc --> Line range of the Tth C<=table>
%=POD<table>[$t][$r] --> The Rth row of the Tth C<=table>
%=POD<html>[$h] --> POD object representing the Hth C<=begin html>
section
etc.
Meanwhile, the "DATA hash" would contain the (lazily extracted!) text of
just the C<=data> sections, with the keys of the hash being the names of
the sections. The value of each entry would be an object with stringific
and arrayific overloadings:
%=DATA --> Hash of objects representing C<=data>
sections, keyed by name
%=DATA<LABEL1> --> Data object representing all C<=data LABEL1>
sections
~ %=DATA<LABEL1> --> Concatenated text from all C<=data LABEL1>
sections
%=DATA<LABEL1>[$n] --> Text from only the Nth C<=data LABEL1>
section
Of course, in-line data is accessed from within the program far more
frequently than POD is likely to be, so there might also be convenience
bindings of entries in the data hash to named C<$=NAME> variables (much
as $1, $2, etc. are convenience bindings into components of the $/ match
variable):
$=LABEL2 --> Data object representing all C<=data LABEL2>
sections
~ $=LABEL2 --> Concatenated text from all C<=data LABEL2>
sections
$=LABEL2[$n] --> Text from only the Nth C<=data LABEL2>
section
"Data objects" would also have an iterator overloading, so that:
for = $=DATA {...}
would work as expected.
In the contents of any block, any line with '=' in column zero and a
whitespace character in column 1, has those two characters removed when the
contents are extracted. So you can write:
=begin data POSSIBLE_POD_DIRECTIVES
=
= =doh -- Oh, dear! Oh frikking dear!
= =ray -- A ravening beam of destruction
= =me -- A name I call my invocant
= =far -- A long, long way to Australia
= =sew -- What I do with contention
= =LA -- A place to follow trends
= =tee -- I pipe to double streams
=
=end data
To create the inline data:
=doh -- Oh, dear! Oh frikking dear!
=ray -- A ravening beam of destruction
=me -- A name I call my invocant
=far -- A long, long way to Australia
=sew -- What I do with contention
=LA -- A place to follow trends
=tee -- I pipe to double streams
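The unquoting rule is small enough to sketch directly. Here is a minimal Python version; the function name `strip_equals_quoting` is hypothetical, and treating a bare "=" line as an empty line is an assumption (it reads the end of the line as the whitespace in column 1).

```python
# Hypothetical sketch of the '= ' unquoting rule for block contents.
def strip_equals_quoting(contents):
    """Drop a leading '=' plus one whitespace character from each line."""
    out = []
    for line in contents.splitlines():
        if line.startswith("=") and (len(line) == 1 or line[1] in " \t"):
            out.append(line[2:])
        else:
            out.append(line)
    return "\n".join(out)

quoted = "=\n= =doh -- Oh, dear! Oh frikking dear!\n= =ray -- A ravening beam of destruction\n="
print(strip_equals_quoting(quoted))
```

Only one level of quoting is removed per pass, so a line such as `= = =head1` comes out as `= =head1`.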
Damian
Thanks! This message has lots of useful information that I would have
otherwise probably missed.
It seems that the basic premise of the POD document object model gels
well with that early design document, so I look forward to being able to
flesh out the details.
Using ^=\s to delimit a line starting with a = will interfere with the
Kwid method of:
= Heading
foo
Which I was imagining would be converted to a DOM tree that when
represented in the "Normative XML" would look like:
<sect1>
<title>Heading</title>
<para>foo</para>
</sect1>
That's sort of DocBook style, and in fact I was thinking that for the
internal representation, DocBook node names could be used where there is
no other better alternative. Of course, non-documentation things like
Test fragments or inclusions of external entities, like UML diagrams
won't have a representation in DocBook :-).
The uses of a leading = in a paragraph are fairly uncommon. For
instance, when quoting POD you would simply indent it a bit to make it
verbatim and there is no issue.
I see a middle ground; that is, `=` quoting is only allowed if it
directly follows the initial POD marker:
=head1 Foo
=
= =head1
= =
= = =head1 That's just getting ridiculous
Which I see as represented by:
<sect1>
<title>Foo</title>
<para>=head1
=
= =head1 That's just getting ridiculous</para>
</sect1>
Which of course would lose the ='s. But that's OK, because if you
wanted verbatim you could have just indented the block.
If you wanted to lead a normal paragraph with it, you'd just use the
normally implicit =para (equivalent to =pod):
=para
=
= = This is what a Kwid =head1 looks like
As for going with =kwid to denote the starting of kwid, I have so far
been pessimistically assuming that something like `=dialect kwid`, or
`=use kwid` (as described in the design doc you attached) would be
required. However, we could allow `=unknown`, where `unknown` is an
unknown keyword, to try to load Pod::Dialect::unknown, and hope like
hell it provides the Role of Pod::Dialect.
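
A rough sketch of that fallback, in Python for brevity. Everything here is a stand-in: REGISTRY plays the part of loading Pod::Dialect::<name>, the PodDialect base class plays the part of the Pod::Dialect role, and the CORE set is only an illustrative subset of built-in directives.

```python
class PodDialect:
    """Stand-in for the Pod::Dialect role."""

    def parse(self, chunk):
        raise NotImplementedError


class Kwid(PodDialect):
    def parse(self, chunk):
        return ("kwid", chunk)


# Hypothetical: stands in for trying to load Pod::Dialect::<name>.
REGISTRY = {"kwid": Kwid}

# Illustrative subset of directives that are not dialects.
CORE = {"head1", "para", "pod", "begin", "end"}


def resolve_directive(name):
    """Return a dialect instance for =name, or None for a core directive."""
    if name in CORE:
        return None
    cls = REGISTRY.get(name)
    if cls is None or not issubclass(cls, PodDialect):
        raise LookupError(f"no Pod dialect provides ={name}")
    return cls()


dialect = resolve_directive("kwid")
```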
While the `^=` escaping is “active”, the presence or absence of
whitespace following the initial `=` will delimit breaks in paragraphs.
This has to be so, otherwise the previous example would have been:
<sect1>
<title>Foo
=head1
=
= =head1 That's just getting ridiculous
</title>
</sect1>
Which is just plain silly. This follows what people are used to with
POD: blank lines must be truly empty, not merely free of non-whitespace
characters (an increasingly vague concept these days).
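
Here is one reading of that rule as a Python sketch: a bare `=` ends the current paragraph, while `=` followed by whitespace is a quoted content line. The function name split_quoted and the decision to stop at the first unquoted line are assumptions.

```python
def split_quoted(lines):
    """Split the '='-quoted lines that follow a POD marker into paragraphs."""
    paragraphs, current = [], []
    for line in lines:
        if len(line) >= 2 and line[0] == "=" and line[1].isspace():
            current.append(line[2:])          # quoted content line
        elif line == "=":
            if current:                       # bare '=' closes the paragraph
                paragraphs.append("\n".join(current))
                current = []
        else:
            break                             # quoting is no longer active
    if current:
        paragraphs.append("\n".join(current))
    return paragraphs


block = [
    "=head1 Foo",
    "=",
    "= =head1",
    "= =",
    "= = =head1 That's just getting ridiculous",
]
title = block[0].split(None, 1)[1]
paras = split_quoted(block[1:])
```

With the earlier `=head1 Foo` example as input, this yields the title "Foo" and a single paragraph, rather than one oversized title.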
So, the POD processing happens in 3 levels (note: the first isn't really
mentioned in perlpodspec.kwid, which is a bug):
=list
- chunkification from the original source, into POD paragraphs, which
may or may not include an initial `^=foo` marker. At *this* level, the
only escaping that happens is the `^=` escaping.
That's all that needs to happen while the code is being read, and for
most code that is how the POD will remain, in memory, somewhere
intermingled with the Parse Tree for the code, so that the code can
still be spat back out by the P6 equivalent of `B::Deparse`.
- parsing of these raw chunks into a real POD DOM. Please, tired XML
veterans, please don't get upset by the use of the term "DOM", I think
the last thing anyone wants is to have studlyCaps functions like
`getElementById` and `createTextNode`. It is the tree concept itself
which is important, and this pre-dates XML anyway.
Strictly speaking, this step actually converts POD paragraph chunk
events into POD DOM events. These can be used to build a real DOM, for
instance if you need to do an XPath style query for a link (I was amazed
that someone's actually gone and built Pod::XPath!), or they might
simply be passed onto the next stage by an output processor with no
intermediate tree being built.
So, at this point, dialects get hooks to perform custom mutation of POD
paragraph events into DOM events, and the arbitrator of this process
ensures that the output events are well "balanced" by spitting out
closing tags where it has to. They can store state in their parser
object, but none of this state will be preserved past the parsing stage.
However, the nodes that they "spit out" after this point may still not
be "core" POD, such as for includes or out-of-band objects. These hooks
will be sufficient to allow them to hijack subsequent chunks that would
otherwise be served to other dialects, i.e., they can choose to
"arbitrate" subsequent chunks.
I'm aiming to make it so that it is possible for dialects to be "round
trip safe", by being able to go back from this DOM state to the original
POD paragraph chunks. This would require dialects to "play nice" of
course, but it is a potential way to let things like smart text
editors automatically syntax-highlight POD dialects :).
Linking will be in terms of this intermediate tree, so you won't be able
to link to included portions of manual pages :). I'm not sure whether
that matters.
- "output ready" form may also either be a stream of events or a DOM
tree. In this mode, all of the events from the first stage are simply
fed through a loopback preprocessor, which asks Dialects to convert
their non-core nodes to core nodes, or drop them, or whatever. At this
point, the structure can have handles to out of band objects like
images, etc - that can't be converted to XML. Again, dialects are
capable of arbitrating the loopback process for any events that *follow*
theirs.
Of course, documents that are not in a dialect (and do not have nodes
that `=include` and suchlike) will not need any pre-processing to be
ready for “output”.
=end list
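
The three levels can be sketched as a chain of Python generators, so that no intermediate tree has to be built. This is only an illustration under simplifying assumptions: the function names are made up, only `=head1` and plain paragraphs are recognised, `^=` escaping is left out, and the dialect hooks and loopback preprocessing are omitted entirely.

```python
from xml.sax.saxutils import escape


def chunkify(source):
    """Level 1: split raw text into paragraph chunks on truly empty lines."""
    chunk = []
    for line in source.splitlines():
        if line == "":
            if chunk:
                yield "\n".join(chunk)
                chunk = []
        else:
            chunk.append(line)
    if chunk:
        yield "\n".join(chunk)


def to_events(chunks):
    """Level 2: turn paragraph chunks into balanced DOM-style events."""
    open_section = False
    for chunk in chunks:
        if chunk.startswith("=head1 "):
            if open_section:
                yield ("end", "sect1")        # close the previous section
            yield ("start", "sect1")
            yield ("start", "title")
            yield ("text", chunk[len("=head1 "):])
            yield ("end", "title")
            open_section = True
        else:
            yield ("start", "para")
            yield ("text", chunk)
            yield ("end", "para")
    if open_section:
        yield ("end", "sect1")                # balance the final section


def to_xml(events):
    """Level 3: serialise the event stream directly."""
    parts = []
    for kind, value in events:
        if kind == "start":
            parts.append(f"<{value}>")
        elif kind == "end":
            parts.append(f"</{value}>")
        else:
            parts.append(escape(value))
    return "".join(parts)


xml = to_xml(to_events(chunkify("=head1 Heading\n\nfoo\n")))
```

Because each level consumes the previous one lazily, an output processor can pass events straight through, while a tool that needs XPath-style queries could collect the same events into a tree instead.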
If there is anything that you think is ghastly wrong with the above
picture, let me know of course, but I don't think it's actually all that
much different from what has to go on under the hood in a Pod parser or
markup tool, anyway. In particular, MarkOv - as the author of the most
comprehensive POD markup system there is, this means you! :-)
There is a big question about inline styles still open, and how
converting paragraph bodies to a series of POD events works (clearly,
this is essential for single-paragraph Kwid list blocks, etc) - but I'm
hoping the answer will just smack me in the face as I start to work with
ingy on the prototype implementation, and specifying the details of what
node types the POD DOM and/or DTD allows.
I've done plenty of planning for this now, and it's even looking
hopeful! So time for me to keep quiet until I've built something :-).
Sam.