How does the site decide what is documentation and what is not?
I've found a couple of problems playing with it:
I've not been able to find pgxn_utils via search: the reason is that
its page at <http://pgxn.org/dist/pgxn_utils/> is empty. This is
strange, as the extension has a readme, which is not rendered on its page.
In another test I've looked for "italian" trying to come up with the
italian_fts, and again no result. Going to the extension page
<http://pgxn.org/dist/italian_fts/> I see:
1. the readme is rendered, yet the search found nothing: this means that
the readme is not indexed for search. I think it should be.
2. the extension actually has two documentation files: README.rst and
its Italian translation LEGGIMI.rst. The second is not rendered in any
page, and of course not indexed. Is there any way to have it listed in
the documentation? I think the only place where it is referenced in
the distro is the Makefile DOCS variable
(http://api.pgxn.org/src/italian_fts/italian_fts-1.2.1/Makefile),
which is probably not enough - and a pain in the neck to parse I
guess. Is there any way to have it included as docs and indexed?
3. reST rendering... ok, I've bothered you enough about it :P
What do you think?
-- Daniele
That will be fixed in the next pgxn_utils version, because the extension was
packaged using a root dir '/' and not a dir like 'pgxn_utils-0.1.3'.
[]s
--
Dickson S. Guedes
mail/xmpp: gue...@guedesoft.net - skype: guediz
http://guedesoft.net - http://www.postgresql.org.br
http://github.net/guedes - twitter: @guediz
> How does the site decides what is documentation and what not?
Whatever files are recognized by Text::Markup are considered documentation.
> I've found a couple of problems playing with it:
>
> I've not been able to find pgxn_utils via search: the reason is that
> its page is <http://pgxn.org/dist/pgxn_utils/> is empty. This is
> strange as the extension has a readme, which is not rendered its page.
Yeah, that distribution looks a bit messed up.
> In another test I've looked for "italian" trying to come up with the
> italian_fts, and again no result. Going to the extension page
> <http://pgxn.org/dist/italian_fts/> I see:
>
> 1. the readme is rendered: this mean that the readme is not indexed
> for search: I think it should.
It's indexed if you search for distributions:
http://pgxn.org/search?q=italian&in=dists
The README is indexed for the whole distribution, not for individual extensions or documentation. I could see perhaps adding an exception for that, though: If a distribution has only one extension and no documentation for that extension, assume the readme is documentation and index accordingly.
> 2. the extension actually has two documentation files: README.rst and
> its Italian translation LEGGIMI.rst. The second is not rendered in any
> page, and of course not indexed. Is there any way to have it listed in
> the documentation? I think the only place where it is referenced in
> the distro it is in the Makefile DOCS
> (http://api.pgxn.org/src/italian_fts/italian_fts-1.2.1/Makefile),
> which is probably not enough - and a pain in the neck to parse I
> guess. Is there any way to have it included as docs and indexed?
> 3. reST rendering... ok, I've bothered you enough about it :P
I think that's your answer, too: If reST support were added to Text::Markup, then those docs would be parsed and indexed. In the meantime, one can build the HTML versions and put them in the distribution, and *those* will be indexed. IIRC, you did that for a distribution, no?
Best,
David
>> In another test I've looked for "italian" trying to come up with the
>> italian_fts, and again no result. Going to the extension page
>> <http://pgxn.org/dist/italian_fts/> I see:
>>
>> 1. the readme is rendered: this mean that the readme is not indexed
>> for search: I think it should.
>
> It's indexed if you search for distributions:
>
> http://pgxn.org/search?q=italian&in=dists
Yes, so the point is that the readme is not considered documentation
because of its format.
The high level issue is "search doesn't work: I type "italian" in the
search box and nothing comes out". I don't think the user should
fiddle with the "search where": I think the website should just give
you an answer. If just adding the reST parser is a solution for this
problem, then fine :)
Would it work for a readme in plain text format, with no extension or
.txt? I had assumed that if a format is not recognised (or when there
is no format at all) the text would have been indexed just as plain
text...
> The README is indexed for the whole distribution, not for individual extensions or documentation. I could see perhaps adding an exception for that, though: If a distribution has only one extension and no documentation for that extension, assume the readme is documentation and index accordingly.
I wasn't thinking about special-casing anything: I was just thinking
that the readme should be taken into account when searching.
>> 2. the extension actually has two documentation files: README.rst and
>> its Italian translation LEGGIMI.rst. [...]
>
> I think that's you're answer, too: If reST support was added to Text::Markup, then those docs would be parsed and indexed. In the meantime, one can build the HTML versions and put them I'm the distribution, and *those* will be indexed. IIRC, you did that for a distribution, no?
Yes, I did it for pgmp, but that's because rendering it takes an
environment more complex than just running rst2html < in > out. For a
simpler distribution, such as one with just the README or little more,
that file should be everything required IMO.
Thinking about pgmp, I see a future problem: once .rst is recognised,
the documentation will list both the .rst and the .html files... there
should be some heuristic to omit either version, such as looking at
the name of the file without the extension...
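To make the idea concrete, here is a minimal Python sketch of such a heuristic. It is only an illustration: the function name, the preference table, and the choice to prefer the pre-rendered .html over the .rst source are all assumptions, not anything PGXN implements.

```python
from pathlib import PurePosixPath

# Hypothetical preference order: when two files share directory and stem,
# the suffix with the lower rank is kept. Preferring .html is an assumption.
PREFERENCE = {".html": 0, ".rst": 1}


def dedupe_docs(paths):
    """Keep a single file for each (directory, name-without-extension) pair."""
    best = {}
    for p in paths:
        path = PurePosixPath(p)
        key = (str(path.parent), path.stem)
        # Unknown suffixes rank after every known one.
        rank = PREFERENCE.get(path.suffix.lower(), len(PREFERENCE))
        if key not in best or rank < best[key][0]:
            best[key] = (rank, p)
    return sorted(p for _, p in best.values())
```

Swapping the ranks in the table would keep the .rst source instead, which may be preferable once the indexer can render reST by itself.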
-- Daniele
> Yes, so the point is that the readme is not considered documentation
> because of its format.
No, it's not considered extension documentation because it's a README for the entire distribution, which might have many extensions in it.
> The high level issue is "search doesn't work: I type "italian" in the
> search box and nothing comes out". I don't think the user should
> fiddle with the "search where": I think the website should just give
> you an answer. If just adding the reST parser is a solution for this
> problem, then fine :)
Yes, I've been thinking about this, too, since there are so many distributions with a README and no other docs. I have two ideas:
1. If there is a README but no docs, and only one key under "provides" in META.json, then treat the README as the documentation for the provided extension and index it accordingly. This would be pretty easy to change.
2. Collapse all of the search code into a single search interface, with no "in" select list. Just one search field and results are returned for the various types in a single list. This is the approach taken by metacpan.org:
https://metacpan.org/search?q=theory
Note that users, distributions, and extensions are all listed in one set of search results. I think that this is much simpler for users of the site, but would take quite a bit of effort to modify in the source code. I do think it should probably be done in the long-term, though, as the current method is insufficient. I'm picking the metacpan.org guys' brains about it right now (multiple parallel searches of different objects, combining results, etc.).
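As a rough illustration of the first idea, the rule could look something like the following Python sketch. The function name and its inputs are hypothetical, and this is not the indexer's actual code; it assumes the parsed META.json is available as a dict and that the indexer already has a list of the files it recognized as documentation. The check for a "docfile" key assumes the optional key of that name described in the PGXN Meta Spec.

```python
from pathlib import PurePosixPath


def readme_fallback(meta, doc_files):
    """Return (extension, readme_path) when the README should stand in as
    the documentation of the only provided extension, otherwise None."""
    provides = meta.get("provides", {})
    if len(provides) != 1:
        return None  # several extensions: the README stays distribution-level
    (extension, info), = provides.items()
    if info.get("docfile"):
        return None  # the extension already declares its own documentation
    readmes = [f for f in doc_files if PurePosixPath(f).stem.upper() == "README"]
    others = [f for f in doc_files if f not in readmes]
    if not readmes or others:
        return None  # no README at all, or other docs are present
    return extension, readmes[0]
```

With a single entry under "provides" and only a README.rst among the recognized files, the function returns that extension name paired with the README path; any additional doc file makes it return None.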
OTOH, I sure would like to see everyone include proper docs in their distributions. People can help themselves quite a lot right now, which is what I tried to spell out in the HOWTO.
> Would it work for readme in plain text format, with no extension or
> .txt? I had assumed that if a format is not recognised (or when there
> is no format at all) the text would have been indexed just as plain
> text…
I think so, yes. It should then parse it as plain text. But again, for right now, the README goes only into the distribution index, not the extension or documentation indexes.
>> The README is indexed for the whole distribution, not for individual extensions or documentation. I could see perhaps adding an exception for that, though: If a distribution has only one extension and no documentation for that extension, assume the readme is documentation and index accordingly.
>
> I wasn't thinking about special-casing anything: I was just thinking
> that the readme should be kept into account when searching.
It is, if you search distributions. The README is not indexed for individual extensions, however. Or documentation, IIRC.
> Yes, I did it for pgmp, but that's because rendering it takes an
> environment more complex than just running rst2html < in > out. For a
> simpler distribution, such as one with just the README or little more,
> that file should be everything required IMO.
Which file? The README?
> Thinking about pgmp, I see a future problem: once .rst is recognised,
> the documentation will list both the .rst and the .html files... there
> should be some heuristic to omit either version, such as looking at
> the name of the file without the extension…
I would assume you wouldn't ship the .html files once .rst was recognized, no?
Best,
David
Well, the distribution is a good page to land on. The extensions
otoh don't even seem to have a page: if you search in extensions, the
hits you get point to distributions anyway.
I kinda see your model: you say docs are linked to the ext. But this
model isn't entirely working:
- docs are *inside* the zip, so they physically belong to the dist. The
only thing linking the docs with the ext seems to be the "docpath" in the
extension api. This however raises more questions: how do I document
an extension that is not stable? How do I link more than one file to an
extension?
- the "dist" api happily returns the docs (e.g. curl
http://api.pgxn.org/dist/pgmp.json). This is not documented in
<https://github.com/pgxn/pgxn-api/wiki/dist-api>, but is in the meta
api. Also note that your very example in
<https://github.com/pgxn/pgxn-api/wiki/meta-api> shows README in the
docs.
>> The high level issue is "search doesn't work: [...]
> Yes, I've been thinking about this, too, since there are so many distributions with a README and no other docs. I have two ideas:
>
> 1. If there is a README but no docs, and only one key under "provides" in META.json, then tread the README as the documentation for the provided extension and index it accordingly. This would be pretty easy to change.
Yes, it sounds like a good start: easy and useful.
> 2. Collapse all of the search code into a single search interface, with no "in" select list. Just one search field and results are returned for the various types in a single list. This is the approach taken by metacpan.org:
>
> https://metacpan.org/search?q=theory
>
> Note that users, distributions, and extensions are all listed in one set of search results. I think that this is much simpler for users of the site, but would take quite a bit of effort to modify in the source code. I do think it should probably be done in the long-term, though, as the current method is insufficient. I'm picking the metacpan.org guys' brains about it right now (multiple parallel searches of different objects, combining results, etc.).
Rather than stitching the results of different search operations
together, there could be a single table of "searchable text" (such as
distributions' abstracts and descriptions, which may also have a
higher fts score, docs full text...), each item pointing to a
different url (the dist title to the dist, the docs to the html
rendering, where there would probably be a link to the dist/ext they
belong to, the readme to the dist again).
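A toy Python sketch of what is meant by a single table follows. It is purely illustrative: the Entry fields, the weights, and the naive word-count scoring are assumptions standing in for a real full-text index, and none of it reflects how PGXN search is built.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    kind: str      # e.g. "dist", "doc" or "readme" (hypothetical labels)
    text: str      # the searchable text itself
    url: str       # where a hit on this entry should lead
    weight: float  # boost, e.g. higher for abstracts than for full docs


def search(entries, query):
    """Return the urls of matching entries, best score first."""
    terms = query.lower().split()
    hits = []
    for entry in entries:
        words = entry.text.lower().split()
        count = sum(words.count(term) for term in terms)
        if count:
            hits.append((count * entry.weight, entry.url))
    # Highest score first; ties are broken by url for a stable order.
    hits.sort(key=lambda hit: (-hit[0], hit[1]))
    return [url for _, url in hits]
```

Because every row carries its own url and weight, one query over one list covers abstracts, docs and readmes without asking the user where to search.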
>> Thinking about pgmp, I see a future problem: once .rst is recognised,
>> the documentation will list both the .rst and the .html files... there
>> should be some heuristic to omit either version, such as looking at
>> the name of the file without the extension…
>
> I would assume you wouldn't ship the .html files once .rst was recognized, now?
Hard to say what would be best. pgmp docs are complex to render, not
only for pgxn but mostly for the final user: when I wrote them it took
the trunk version of two different packages to have them the way I
wanted (now the stable versions of these tools should be ok though). I
don't like html in a source distribution though, and it would be ok for
me just to use the docs for searching in pgxn, as the correct rendering
is on the website. So I'd probably drop the html.
-- Daniele
> Well, the distribution is a good page where to land. The extensions
> otoh don't even seem to have a page: if you search in extension, the
> hits you get point to distributions anyway.
Only if there is no documentation for the extension. If there is, then that page will be linked to. Example:
http://pgxn.org/search?q=pgtap&in=extensions
> I kinda see your model: you say docs are linked to the ext. But this
> model isn't entirely working:
>
> - docs are *into* the zip, so they physically belong to the dist. the
> only thing linking the docs with the ext seems the "docpath" in the
> extension api. This however raises more questions: how do I document
> an extension not stable?
I don't understand the question. Extensions are not stable or unstable or testing; distributions are.
> How do I link more than one file to an
> extension?
You can have however many doc files you want, but only one will be linked from your extension. The indexer will list any doc files it finds in addition to extension-specific docs in a "Documentation" section of the distribution page. Example:
http://pgxn.org/dist/omnipitr/
> - the "dist" api happily returns the docs (e.g. curl
> http://api.pgxn.org/dist/pgmp.json). This is not documented in
> <https://github.com/pgxn/pgxn-api/wiki/dist-api>, but is in the meta
> api.
The “API Server Structure” section of https://github.com/pgxn/pgxn-api/wiki/dist-api describes it. Note that the API server is special, serving a superset of the data found on all other mirrors. Compare http://api.pgxn.org/dist/pgmp.json to http://master.pgxn.org/dist/pgmp.json.
> Also note that your very example in
> <https://github.com/pgxn/pgxn-api/wiki/meta-api> shows README in the
> docs.
Yes. But it's *distribution* documentation, not *extension* documentation.
>> 1. If there is a README but no docs, and only one key under "provides" in META.json, then tread the README as the documentation for the provided extension and index it accordingly. This would be pretty easy to change.
>
> Yes, it sounds a good start: easy and useful.
Yeah, I can make this happen soonish.
>> 2. Collapse all of the search code into a single search interface, with no "in" select list. Just one search field and results are returned for the various types in a single list. This is the approach taken by metacpan.org:
>>
>> https://metacpan.org/search?q=theory
>>
>> Note that users, distributions, and extensions are all listed in one set of search results. I think that this is much simpler for users of the site, but would take quite a bit of effort to modify in the source code. I do think it should probably be done in the long-term, though, as the current method is insufficient. I'm picking the metacpan.org guys' brains about it right now (multiple parallel searches of different objects, combining results, etc.).
>
> More than stitching the results of different search operations
> together there could be a single table of "searchable text" (such as
> distributions' abstracts and description, which may also have an
> higher fts score, docs full text...), each item pointing to a
> different url (the dist title to the dist, the docs to the html
> rendering, where there would be probably a link to the dist/ext they
> belong, the readme to the dist again).
Yeah, but then the format for all the results would have to be the same, and the lowest common denominator is less useful.
I chatted with the metacpan folks; they don't have a separate distribution index at all, only modules (extensions) and authors. So they just show authors first in results, then modules.
> Hard to say what would be best. pgmp docs are complex to render, not
> only for pgxn but mostly for the final user: when I wrote them it took
> the trunk version of two different packages to have them the way I
> wanted (now the stable of these tools should be ok though). I don't
> like html in a source distribution though, and It would be ok for me
> just to use the docs for searching into pgxn, as the correct rendering
> is on the website. So I'd probably drop the html.
Makes sense.
Best,
David