Dynamic page lists and semantic Inline queries are not interpreted


Tag Team

Oct 8, 2009, 3:02:27 PM
to mwlib
This is a mediawiki collection extension question. I hope that this is
the right mailing list. The archive has
one discussion about graphviz where the following is also mentioned
but not solved.

For writing reports based on wiki information, inline queries as
obtained with the
Extension:Semantic_MediaWiki and dynamic page lists as obtained by the
Extension:DynamicPageList_(third-party) extension are very useful.
Unfortunately, however,
the results of these queries are not shown in the pdf obtained with the
Collection extension. Is that a known
missing feature, and would there be any workaround? Thanks for any
help!

Tomy

Andreas

Oct 26, 2009, 7:04:48 PM
to mwlib
Will there be a solution to display the result of inline queries
produced by SemanticMediawiki or DynamicPageList with the collection
extension in the near future?

Thanks, Andreas.

Ralf Schmitt

Oct 26, 2009, 7:26:17 PM
to mw...@googlegroups.com
Andreas <andreas...@uni-klu.ac.at> writes:

> Will there be a solution to display the result of inline queries
> produced by SemanticMediawiki or DynamicPageList with the collection
> extension in the near future?

we do not have anything planned. so most probably the answer is no.

patches are welcome.

AJim

Nov 14, 2009, 2:58:46 PM
to mwlib
I just discovered this problem for myself. I was hoping to use both
the Dynamic Page List and Collection extensions; I would not like to
have to choose one or the other.

The dpl maintainers appear ready to cooperate with your suggestion
that "patches are welcome", http://semeb.com/dpldemo/index.php?title=Issue_talk:Export_to_pdf
, but they would need help to add this support. I assume, from your
answer, that it would be possible.

To get started, could you provide a pointer to a discussion of what is
required to support a new tag, such as dpl, in the mwlib parser?

Thanks for your help.


Ralf Schmitt

Nov 16, 2009, 3:40:38 AM
to mw...@googlegroups.com
AJim <mwlibtry....@dfgh.net> writes:

> I just discovered this problem for myself. I was hoping to use both
> the Dynamic Page List and Collection extensions; I would not like to
> have to choose one or the other.
>
> The dpl maintainers appear ready to cooperate with your suggestion
> that "patches are welcome", http://semeb.com/dpldemo/index.php?title=Issue_talk:Export_to_pdf
> , but they would need help to add this support. I assume, from your
> answer, that it would be possible.
>
> To get started, could you provide a pointer to a discussion of what is
> required to support a new tag, such as dpl, in the mwlib parser?

Parsing a new tag is easy:

diff --git a/mwlib/refine/core.py b/mwlib/refine/core.py
index fa561a3..338ebf2 100755
--- a/mwlib/refine/core.py
+++ b/mwlib/refine/core.py
@@ -790,6 +790,9 @@ class parse_uniq(object):
         txt = util.replace_html_entities(txt)
         return T(type=T.t_text, text=txt)
 
+    def create_dpl(self, name, vlist, inner, xopts):
+        print "DPL:", (name, vlist, inner)
+
 class XBunch(object):
     def __init__(self, **kw):
         self.__dict__.update(kw)
diff --git a/mwlib/uniq.py b/mwlib/uniq.py
index 5e73b6e..d5def22 100644
--- a/mwlib/uniq.py
+++ b/mwlib/uniq.py
@@ -60,7 +60,7 @@ class Uniquifier(object):
         self.txt = txt
         rx = self.rx
         if rx is None:
-            tags = set("nowiki math imagemap gallery source pre ref timeline".split())
+            tags = set("nowiki math imagemap gallery source pre ref timeline dpl".split())
             from mwlib import tagext
             tags.update(tagext.default_registry.names())

This lets me do:
>>> from mwlib.refine import core
>>> core.parse_txt("<dpl val=1>blabla</dpl>")
DPL: ('dpl', {'val': 1}, 'blabla')
[]


create_dpl would need to return a meaningful parse tree for the dpl tag
however.

Regards,
- Ralf
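
For readers without an mwlib checkout, the dispatch idea in the patch can be sketched in a few self-contained lines. This is only an illustration under stated assumptions: `TagParser`, `TAG_RX`, `ATTR_RX` and the dict-shaped nodes are hypothetical stand-ins, not mwlib's real classes or node types. It shows a tag being recognised, its attributes collected, and a `create_<name>` method returning a node instead of printing.

```python
import re

# Hypothetical stand-ins; mwlib's real Uniquifier and parse_uniq differ.
TAG_RX = re.compile(r"<(?P<name>dpl)(?P<attrs>[^>]*)>(?P<inner>.*?)</(?P=name)>", re.DOTALL)
ATTR_RX = re.compile(r'(\w+)\s*=\s*"?([^"\s>]*)"?')


class TagParser:
    def create_dpl(self, name, attrs, inner):
        # A real handler would build parse-tree nodes; a dict keeps the sketch small.
        return {"type": "dpl", "attrs": attrs, "query": inner.strip()}

    def parse(self, text):
        nodes = []
        pos = 0
        for m in TAG_RX.finditer(text):
            if m.start() > pos:
                # Plain text between tags is kept as a text node.
                nodes.append({"type": "text", "text": text[pos:m.start()]})
            # Dispatch on the tag name, mirroring the create_<name> convention.
            handler = getattr(self, "create_" + m.group("name"))
            attrs = dict(ATTR_RX.findall(m.group("attrs")))
            nodes.append(handler(m.group("name"), attrs, m.group("inner")))
            pos = m.end()
        if pos < len(text):
            nodes.append({"type": "text", "text": text[pos:]})
        return nodes
```

Calling `TagParser().parse('a<dpl val=1>blabla</dpl>b')` yields a text node, a dpl node and another text node. The hard part, evaluating the query held in the dpl node, is exactly what this sketch leaves out.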

Jeremy

Nov 17, 2009, 1:11:52 PM
to mwlib
Does it really make sense to parse the dpl tag? I would think that, since
dpl is designed mostly to dynamically create a wiki page by transclusion,
it should be left to the mediawiki parser, and the output could then be
handled just like any other wiki text. Is that facility available?

On Nov 16, 3:40 am, Ralf Schmitt <r...@brainbot.com> wrote:
> AJim <mwlibtry.x.jiml...@dfgh.net> writes:
> > I just discovered this problem for myself. I was hoping to use both
> > the Dynamic Page List and Collection extensions; I would not like to
> > have to choose one or the other.
>
> > The dpl maintainers appear ready to cooperate with your suggestion
> > that "patches are welcome",http://semeb.com/dpldemo/index.php?title=Issue_talk:Export_to_pdf

Gero

Nov 17, 2009, 1:39:37 PM
to mwlib
I think that the expansion of the DPL tag (parser function and parser
tag, both are possible!)
should be handled by the php code of the DPL extension.

DPL has quite a lot of features and it would be very hard work to re-
implement all or even part of its functionality
in another language.

Also, within DPL some MW parser functions are called to process
transcluded content.

So I think Jeremy is absolutely right: The collection extension should
process a page after
DPL has done its job. If the mwlib can live with that sequence of
processing it should
be possible to enable DPL with reasonable effort. Otherwise I see no
chance.

DPL's output is in principle standard WIKI syntax; in some cases (~
10%) HTML tags will be found in the
output. These text portions are wrapped within HTML .. /HTML tags,
however, so they are also wiki-compatible.

Gero

Ralf Schmitt

Nov 18, 2009, 4:25:02 AM
to mw...@googlegroups.com
Gero <gero....@t-online.de> writes:

> I think that the expansion of the DPL tag (parser function and parser
> tag, both are possible!)
> should be handled by the php code of the DPL extension.
>
> DPL has quite a lot of features and it would be very hard work to re-
> implement all or even part of its functionality
> in another language.
>
> Also, within DPL some MW parser functions are called to process
> transcluded content.
>
> So I think Jeremy is absolutely right: The collection extension should
> process a page after
> DPL has done its job. If the mwlib can live with that sequence of
> processing it should
> be possible to enable DPL with reasonable effort. Otherwise I see no
> chance.
>
> DPL's output is in principle standard WIKI syntax; in some cases (~
> 10%) HTML tags will be found in the
> output. These text portions are wrapped within HTML .. /HTML tags,
> however, so they are also wiki-compatible.

Only doing the DPL processing and not running the template expansion
is not possible.

We currently only handle the unprocessed wiki source and can't work with
mediawiki's html output. We do have plans to parse mediawiki's
output. However, nothing has been coded so far.

If I had to implement this now, I would use lxml's html parser and
convert its parse tree to our internal format as generated by
mwlib.refine.

- Ralf
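
To make the "parse the HTML and convert the tree" idea concrete, here is a minimal sketch. It deliberately uses the standard library's `html.parser` rather than lxml so that it runs without extra packages, and the dict-based node shape is an assumption for illustration only; it is not the format produced by mwlib.refine.

```python
from html.parser import HTMLParser


class TreeBuilder(HTMLParser):
    """Build a nested tree of {"tag", "attrs", "children"} dicts from HTML."""

    VOID = {"br", "hr", "img"}  # elements that never have a closing tag

    def __init__(self):
        super().__init__()
        self.root = {"tag": "root", "attrs": {}, "children": []}
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = {"tag": tag, "attrs": dict(attrs), "children": []}
        self.stack[-1]["children"].append(node)
        if tag not in self.VOID:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Pop back to the matching open element; stray close tags are ignored.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i]["tag"] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        if data.strip():
            self.stack[-1]["children"].append(data)


def html_to_tree(fragment):
    builder = TreeBuilder()
    builder.feed(fragment)
    builder.close()
    return builder.root
```

A converter for a PDF writer would then walk this tree and map `ul`, `li`, `a` and so on to its own node types; that mapping is where most of the real work would be.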

jeremy....@gmail.com

Nov 18, 2009, 9:36:49 AM
to mw...@googlegroups.com
What about having mwlib use the MW API's parse action to retrieve a parse tree from a page?

Why would this project insist on being its own parser? You will struggle to keep up with the features offered by the native MW parser, and it will cause more headaches when it comes to supporting extensions that already work with the native parser. It seems to me that, from a features standpoint, you could simply parse the printable version of a page or the plain HTML output that comes from the MW API.

Ralf Schmitt

Nov 18, 2009, 10:06:46 AM
to mw...@googlegroups.com
jeremy....@gmail.com writes:

> What about having mwlib use the MW API's parse action to retrieve a parse tree from a page?
>
> Why would this project insist on being its own parser? You will struggle to keep up with the features offered by
> the native MW parser, and it will cause more headaches when it comes to supporting extensions that already
> work with the native parser. It seems to me that, from a features standpoint, you could simply parse the
> printable version of a page or the plain HTML output that comes from the MW API.

This project doesn't insist on being its own parser. It did make sense
to write our own parser when we started.

>
> On Nov 18, 2009 4:25am, Ralf Schmitt <ra...@brainbot.com> wrote:
>> Gero gero....@t-online.de> writes:
>>
>>
>>
>>
>>
>> > I think that the expansion of the DPL tag (parser function and parser
>>
>>

Something is very wrong with your email client's quoting.

And while I'm here:

A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing in e-mail?

Heiko Hees

Nov 18, 2009, 10:25:52 AM
to mwlib
On Nov 18, 3:36 pm, jeremy.breid...@gmail.com wrote:
> What about having mwlib use the MW API's parse action to retrieve a
> parse tree from a page?

The parse tree as offered by the API is not really usable.

IMHO the only solution to implement this feature with the current
system would be to extend the API with an option that would return the
wikitext of a page with the DPL-tag already expanded to markup.

Heiko

Ralf Schmitt

Nov 18, 2009, 11:07:11 AM
to mw...@googlegroups.com
Heiko Hees <he...@pediapress.com> writes:

> IMHO the only solution to implement this feature with the current
> system would be to extend the API with an option that would return the
> wikitext of a page with the DPL-tag already expanded to markup.
>

This won't work (as I have already written). One would have to do
template expansion in order to expand the dpl parser function. There is
no obvious representation for the result of such an expansion. mediawiki
markup may look like it is such a representation, but it is *not*.

Jeremy

Nov 18, 2009, 12:43:55 PM
to mwlib
I myself would like to see some resolution to this, as my
implementation is heavily dependent upon dpl. I would really have no
need for mwlib if management wasn't so intent on having some way to
export the documents. I hope that sometime in the future, when this
implementation becomes widely adopted, that need will dissipate.

Jan Schoonderbeek

Nov 18, 2009, 4:17:03 PM
to mw...@googlegroups.com
Ralf, if you've already declared the following not possible, then my apologies for not understanding this stuff properly.

It looks to me like a possible solution to the problem of mwlib not interpreting things like DPL and SMW might be the upcoming MW function "expand templates", found e.g. at http://www.mediawiki.org/wiki/Special:ExpandTemplates
I could imagine the following construction (while knowing zilch about how mwlib/Collection actually operate):
* I tell Collection that I want to render page Somepage
* Collection fetches the wikitext at Somepage
* Collection visits Special:ExpandTemplates on the originating wiki, feeds the location Somepage and the fetched page content, and virtually presses OK
* Collection harvests the resulting expanded text, which now contains the answers to all queries, expanded templates etc., and feeds ''that'' into mwlib
* mwlib does not find any awkward {{#ask: or {{#dpl: code, but clean freshly-generated wikitext where the queries originally were. Rendering goes like usual.
Of course this also depends on people using MW1.16, as 1.15.x do not have Expand Templates...
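
The "visit the wiki and harvest the expanded text" step could also go through the MediaWiki web API, which has an `expandtemplates` action taking `title` and `text` parameters. The sketch below only builds the request URL and does not contact any wiki; the endpoint `http://wiki.example.org/w/api.php` and the helper name `expandtemplates_url` are assumptions for illustration. Whether the expanded text is something mwlib can parse safely is a separate question.

```python
from urllib.parse import urlencode


def expandtemplates_url(api_url, title, wikitext):
    """Return a GET URL asking MediaWiki to expand templates in `wikitext`."""
    params = {
        "action": "expandtemplates",
        "title": title,    # page context, used e.g. by {{PAGENAME}}
        "text": wikitext,  # the raw wikitext to expand
        "format": "json",
    }
    return api_url + "?" + urlencode(params)


# Hypothetical endpoint; substitute the api.php of the originating wiki.
url = expandtemplates_url(
    "http://wiki.example.org/w/api.php", "Somepage", "{{#dpl:category=Foo}}"
)
```

For long page texts a POST request would be the safer choice, since the whole wikitext ends up in the query string here.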

--
Jan S.
"I'm a stream of noughts and crosses in your R.A.M."

Jeremy

Nov 18, 2009, 5:01:33 PM
to mwlib


On Nov 18, 4:17 pm, Jan Schoonderbeek <saru...@gmail.com> wrote:
> On Wed, Nov 18, 2009 at 6:43 PM, Jeremy <jeremy.breid...@gmail.com> wrote:
>
> > On Nov 18, 11:07 am, Ralf Schmitt <r...@brainbot.com> wrote:
> > > Heiko Hees <he...@pediapress.com> writes:
> > > > IMHO the only solution to implement this feature with the current
> > > > system would be to extend the API with an option that would return the
> > > > wikitext of a page with the DPL-tag already expanded to markup.
>
> > > This won't work (as I have already written). One would have to do
> > > template expansion in order to expand the dpl parser function. There is
> > > no obvious representation for the result of such an expansion. mediawiki
> > > markup may look like it is such a representation, but it is *not*.
>
> > I myself would like to see some resolution to this, as my
> > implementation is heavily dependent upon dpl. I would really have no
> > need for mwlib if management wasn't so intent on having some way to
> > export the documents. I hope that sometime in the future, when this
> > implementation becomes widely adopted, that need will dissipate.
>
> Ralf, if you've already declared the following not possible, then my
> apologies for not understanding this stuff properly.
>
> It looks to me like a possible solution to the problem of mwlib not
> interpreting things like DPL and SMW might be the upcoming MW function
> "expand templates", found e.g. at http://www.mediawiki.org/wiki/Special:ExpandTemplates
> I could imagine the following construction (while knowing zilch about how
> mwlib/Collection actually operate):
> * I tell Collection that I want to render page Somepage
> * Collection fetches the wikitext at Somepage
> * Collection visits Special:ExpandTemplates on the originating wiki, feeds
> the location Somepage and the fetched page content, and virtually presses OK
> * Collection harvests the resulting expanded text, which now contains the
> answers to all queries, expanded templates etc., and feeds ''that'' into
> mwlib
> * mwlib does not find any awkward {{#ask: or {{#dpl: code, but clean
> freshly-generated wikitext where the queries originally were. Rendering goes
> like usual.
> Of course this also depends on people using MW1.16, as 1.15.x do not have
> Expand Templates...
>
> --
> Jan S.
> "I'm a stream of noughts and crosses in your R.A.M."

Actually, ExpandTemplates is an extension available for versions prior
to 1.16; I currently have it installed. How would mwlib access it,
through the MW API?

Jan Schoonderbeek

Nov 18, 2009, 5:12:01 PM
to mw...@googlegroups.com
On Wed, Nov 18, 2009 at 11:01 PM, Jeremy <jeremy....@gmail.com> wrote:


On Nov 18, 4:17 pm, Jan Schoonderbeek <saru...@gmail.com> wrote:
> On Wed, Nov 18, 2009 at 6:43 PM, Jeremy <jeremy.breid...@gmail.com> wrote:
>
> > On Nov 18, 11:07 am, Ralf Schmitt <r...@brainbot.com> wrote:
[snip]

--

Hmmm I didn't notice that expand template was a separate extension; thanks Jeremy, I'll be doing some experimenting this weekend to see if it correctly expands Semantic MediaWiki syntax! :-)

I don't know how mwlib would access the extension; I only recognize technical possibilities - my programming skillz are currently limited to bash scripting (I don't even know how to spell PHP).

I could see someone modifying the ExpandTemplates extension so as to be machine-addressable, or adding code to Collection to be able to recognize and use the HTML pages from ExpandTemplates, and fetching the processed output from it. I'd be happy to help testing and documenting, but programming for MW or its extensions is way out of my league :-(

Ralf Schmitt

Nov 18, 2009, 6:26:52 PM
to mw...@googlegroups.com
Jan Schoonderbeek <sar...@gmail.com> writes:

> > This won't work (as I have already written). One would have to do
> > template expansion in order to expand the dpl parser function. There is
> > no obvious representation for the result of such an expansion. mediawiki
> > markup may look like it is such a representation, but it is *not*.
>
> Ralf, if you've already declared the following not possible, then my apologies for not understanding this stuff
> properly.

no problem, I wasn't very explicit.

>
> It looks to me like a possible solution to the problem of mwlib not interpreting things like DPL and SMW might be
> the upcoming MW function "expand templates", found e.g. at http://www.mediawiki.org/wiki/Special:ExpandTemplates
> I could imagine the following construction (while knowing zilch about how mwlib/Collection actually operate):
> * I tell Collection that I want to render page Somepage
> * Collection fetches the wikitext at Somepage
> * Collection visits Special:ExpandTemplates on the originating wiki, feeds the location Somepage and the fetched
> page content, and virtually presses OK
> * Collection harvests the resulting expanded text, which now contains the answers to all queries, expanded
> templates etc., and feeds ''that'' into mwlib

say, after expanding those templates, you end up with the following:

,----
| <ref>bar</ref>
|
| <references />
`----

So, this looks like a ref tag? It's not. I've expanded the following:

,----
| <{{#if:1|}}ref>bar</ref>
|
| <references />
`----

Rendering that with mediawiki results in:

,----
| <ref>bar</ref>
`----

not quite what you expected after seeing the expanded text only.

The root cause of this problem is that mediawiki substitutes certain
tags with UNIQ strings (these are just strings of the form
%7FUNIQ5381658174da799b-nowiki-00000001-QINU%7F) when expanding
templates and does *not* substitute them back right after template
expansion has finished. (Try expanding {{urlencode:
<nowiki>bla</nowiki>}} to see what I mean). It's probably a good thing
however, since it allows them to place <nowiki> tags inside other
tags. At least it's a feature extensively used on wikipedia.

This is what I mean by "no obvious representation" for the result
of template expansion.
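
The ambiguity can be reproduced with a toy model. The sketch below is not MediaWiki's or mwlib's code: `strip_tags`, `expand_templates` and the marker layout are simplified assumptions. It replaces real tags with markers before a toy expansion step runs, the way the UNIQ strings work in principle.

```python
import re

# Simplified marker layout; real UNIQ strings carry a random hex component.
MARKER = "\x7fUNIQ-{name}-{n:08d}-QINU\x7f"
TAG_RX = re.compile(r"<(ref|nowiki)>(.*?)</\1>", re.DOTALL)


def strip_tags(text):
    """Replace each real tag with a unique marker and remember what it stood for."""
    mapping = {}

    def repl(m):
        key = MARKER.format(name=m.group(1), n=len(mapping) + 1)
        mapping[key] = (m.group(1), m.group(2))
        return key

    return TAG_RX.sub(repl, text), mapping


def expand_templates(text):
    """Toy expansion: only knows that {{#if:1|}} expands to the empty string."""
    return text.replace("{{#if:1|}}", "")


real, real_map = strip_tags("<ref>bar</ref>")
fake, fake_map = strip_tags("<{{#if:1|}}ref>bar</ref>")
```

After `expand_templates`, `real` is still a marker that maps back to a ref tag, while `fake` has become the literal text `<ref>bar</ref>` with an empty mapping. Substituting the markers back into wikitext would make the two indistinguishable.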

> * mwlib does not find any awkward {{#ask: or {{#dpl: code, but clean freshly-generated wikitext where the queries
> originally were. Rendering goes like usual.

Looking for template calls isn't that easy. At least you have to expand
those templates, since the template names used may be generated
dynamically (and those names literally may depend on the phase of the
moon, e.g. {{LOCALTIME}}). But that is not my point anyway.

AJim

Nov 19, 2009, 12:44:22 PM
to mwlib


On Nov 18, 4:25 am, Ralf Schmitt <r...@brainbot.com> wrote:
Ralf, in reply to Gero, you wrote:

> Only doing the DPL processing and not running the template expansion
> is not possible.

It seems clear to me that one reason the collection extension needs to
be first in line for processing the wikitext is that you need to
filter out templates that should not be included in the printed
document. Is that the only reason?

For instance, would it be possible to process the wikitext in multiple
passes? In the first pass remove the templates that are not to be
printed but leave the result as wikitext. In the second pass give this
result to dpl and allow it to add the dynamic content, again resulting
in legal wikitext. Finally, in a third pass, allow mwlib to parse and
process the wikitext resulting from the dpl processing.

Ralf Schmitt

Nov 19, 2009, 3:39:36 PM
to mw...@googlegroups.com
AJim <mwlibtry....@dfgh.net> writes:

>
> Ralf, in reply to Gero, you wrote:
>
>> Only doing the DPL processing and not running the template expansion
>> is not possible.
>
> It seems clear to me that one reason the collection extension needs to
> be first in line for processing the wikitext is that you need to
> filter out templates that should not be included in the printed
> document. Is that the only reason?
>

mwlib can only handle the unprocessed wikitext. that's the main reason.
filtering out templates could sure be done in php. as I've written in a
prior mail, there is no obvious representation to use for the result of
template expansion. One could define such a format (i.e. wikitext still
containing those UNIQ strings, plus a mapping of all UNIQ strings to their
replacements, which is what mwlib uses internally).
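
Such an exchange format could be as simple as a JSON document holding the text and the mapping. The following is a sketch of that idea only; the field names `text` and `uniq` and both helper functions are hypothetical, and nothing like this exists in MediaWiki or mwlib as far as this thread shows.

```python
import json


def dump_expanded(text, uniq_map):
    """Serialise expanded wikitext together with its marker-to-tag mapping."""
    return json.dumps({"text": text, "uniq": uniq_map}, sort_keys=True)


def load_expanded(payload):
    """Return (text, uniq_map); fail if a mapped marker is missing from the text."""
    data = json.loads(payload)
    missing = [key for key in data["uniq"] if key not in data["text"]]
    if missing:
        raise ValueError("markers not found in text: %r" % missing)
    return data["text"], data["uniq"]
```

The consumer would still need a wikitext parser for everything around the markers.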

But why all this hassle? In the end we'd still have to parse wikitext.
I think the right thing to do here would be to work with mediawiki's
html output. It's a bit sad, as that would most likely annihilate
mwlib's mediawiki parser.

> For instance, would it be possible to process the wikitext in multiple
> passes? In the first pass remove the templates that are not to be
> printed but leave the result as wikitext. In the second pass give this
> result to dpl and allow it to add the dynamic content, again resulting
> in legal wikitext. Finally, in a third pass, allow mwlib to parse and
> process the wikitext resulting from the dpl processing.
>

I think this is a rather hard problem.