Adding support for extensions

Ryan Lane

Feb 13, 2009, 10:00:18 AM

to mwlib

I'd like to add support for graphviz, and can't figure out how to do
it.

I'm assuming I need to add a graphviz class to tagext.py, and return
something inside of __call__. I can't figure out what I need to return
though. The example says "this function builds wikimarkup and returns
a parse tree for this". I'm not exactly sure what this means.

What I need to do is take the source, put it into a file, run graphviz
on the file (depending on the attributes), and include the output
file (png, svg, etc.) in whatever the output is.
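
The steps described here (write the source to a file, run Graphviz on it, collect the output file) could be sketched roughly as follows. This is only an illustration in Python 3: the names `build_command`, `render` and `ALLOWED_RENDERERS` are made up for this example, it assumes the Graphviz binaries are on the PATH, and it says nothing about how mwlib would then pick up the resulting file.

```python
import os
import subprocess
import tempfile

# Hypothetical whitelist of Graphviz layout programs; a real extension may allow others.
ALLOWED_RENDERERS = {"dot", "neato", "twopi", "circo", "fdp"}


def build_command(renderer, fmt, src_path, out_path):
    """Return the argv list for one Graphviz run, e.g. dot -Tpng -o out.png in.dot."""
    if renderer not in ALLOWED_RENDERERS:
        raise ValueError("unsupported renderer: %r" % renderer)
    if fmt not in ("png", "svg"):
        raise ValueError("unsupported format: %r" % fmt)
    return [renderer, "-T" + fmt, "-o", out_path, src_path]


def render(source, renderer="dot", fmt="png", workdir=None):
    """Write the source to a file, run Graphviz on it, return the output path."""
    workdir = workdir or tempfile.mkdtemp()
    src_path = os.path.join(workdir, "graph.dot")
    out_path = os.path.join(workdir, "graph." + fmt)
    with open(src_path, "w") as f:
        f.write(source)
    # Raises CalledProcessError if Graphviz rejects the input.
    subprocess.check_call(build_command(renderer, fmt, src_path, out_path))
    return out_path
```

Checking the renderer against a fixed list matters because the value comes from a tag attribute that any wiki editor can set.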

If you guys can give me some help, I'll gladly document it.

V/r,

Ryan Lane

Heiko Hees

Feb 13, 2009, 10:18:11 AM

to mw...@googlegroups.com

Hi,

On Feb 13, 2009, at 4:00 PM, Ryan Lane wrote:

>
> I'd like to add support for graphviz, and can't figure out how to do
> it.
>
> I'm assuming I need to add a graphviz class to tagext.py, and return
> something inside of __call__. I can't figure out what I need to return
> though. The example says "this function builds wikimarkup and returns
> a parse tree for this". I'm not exactly sure what this means.
>
> What I need to do is take the source, put it into a file, run graphviz
> on the file (depending on the attributes), and include the output
> file (png, svg, etc.) in whatever the output is.

There are two possibilities:
1) Support the tag (see other examples like <source/>) and
implement a method in the PDF writer that triggers graphviz and
includes the image.

2) Extend the extension to serve images via HTTP GET requests. You
could then return [[Image:]] markup with a name that encodes the
location and the parameters for the graphviz server. We'd probably need
to add support for arbitrary URLs in images, but this shouldn't be too
hard. I'd prefer this option because it can be used as a web service
by other applications as well. This approach would also work for
other extensions such as <hiero/> or <timeline/>.

I assume both options will be some work, but we'd assist you with that if
possible.

Heiko

Ryan Lane

Feb 13, 2009, 11:19:19 AM

to mwlib

On Feb 13, 9:18 am, Heiko Hees <he...@pediapress.com> wrote:
> Hi,

>
> there are two possibilities:
> 1) Either support the tag (see other examples like <source/>) and
> implement a method in the PDF writer, which triggers graphviz and
> includes the image.
>
> 2) Extend the extension to serve images via HTTP-GET requests. You
> then could return [[Image:]] markup with a name that encodes the
> location and the parameter for the graphviz server. We'd probably need
> to add support for arbitrary URLs in images, but this shouldn't be too
> hard. I'd prefer this option because it can be used as a webservice
> by other applications as well. This approach would also work for
> other extensions such as <hiero/> or <timeline/>.
>
> I assume both options to be some work but we'd assist you in that if
> possible.
>

I think I also like #2. I'd imagine this could also work for dynamic
page list or semantic mediawiki queries... Both of which I'm
interested in.

Should this be a GET, or a POST? The graphviz extension can take
arbitrarily long content between the opening and closing tags, and
would likely be too long for a GET request.

I'm thinking I could make a special page that takes the content, and
attributes, and returns the location of the image. Should I be posting
and returning XML? API-like work is fairly new to me.

V/r,

Ryan Lane

Heiko Hees

Feb 13, 2009, 11:48:22 AM

to mw...@googlegroups.com

On Feb 13, 2009, at 5:19 PM, Ryan Lane wrote:
> I think I also like #2. I'd imagine this could also work for dynamic
> page list or semantic mediawiki queries... Both of which I'm
> interested in.
>
> Should this be a GET, or a POST? The graphviz extension can take
> arbitrarily long content between the opening and closing tags, and
> would likely be too long for a GET request.

It needs to be a GET request, and these are limited to 50K in the
client library we are using (this could probably be patched) and to 15K
in MW, PHP, or the web server (I didn't check which one generated the
413 response).

>
> I'm thinking I could make a special page that takes the content, and
> attributes, and returns the location of the image. Should I be posting
> and returning XML? API-like work is fairly new to me.

I assume that your extension (I did not find a working demo) uses some
hashing to store and reference generated images. If this is the case,
you could use this hash to identify these images for the PDFs.

Heiko

Ryan Lane

Feb 13, 2009, 1:26:53 PM

to mwlib

This isn't my extension. There are about 6-7 of them all over the web.
I'm going to take the best one, clean it up, add this functionality to
it, and commit it to MediaWiki's SVN.

Like EasyTimeline, the earlier Graphviz plugin uses a hash of the
source for the filename (in fact, the plugin is a derivative of
EasyTimeline). Later versions allow different hashes, and a filename
based off of the title. I'm going to limit it to an MD5 hash (like
EasyTimeline).
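
A minimal sketch of the MD5-based naming, in Python 3. The function name `hashed_filename` is made up for this example, and exactly which bytes get hashed (only the graph source, or also the attributes; which encoding; how whitespace is treated) is an assumption that the client and the extension would have to agree on.

```python
import hashlib


def hashed_filename(source, ext="png"):
    """Name the image after the MD5 hex digest of the graph source."""
    digest = hashlib.md5(source.encode("utf-8")).hexdigest()
    return "%s.%s" % (digest, ext)
```

Identical source always yields the same filename, so an image that was already rendered can be found again without re-running Graphviz.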

I think I'm confused about what we are trying to accomplish here.

Do we want to:

1. Be able to take a tag like:

<graphviz renderer="dot">
digraph G {
node1->node2->node3
}
</graphviz>

send it as:

http://<servername>/w/index.php?title=Special%3AGraphviz&source=%3Cgraphviz%20renderer%3D%22dot%22%3Edigraph%20G%20%7Bnode1-%3Enode2-%3Enode3%7D%3C/graphviz%3E&attr=dot

and have a filename returned as:

http://<servername>/w/images/graphviz/233b20134c14f65a896fbfe01f0021d2.png

or:

2. On the client side, take md5(source).png
(233b20134c14f65a896fbfe01f0021d2.png), return this to the parse tree
as:

[[Image:<<Special:Graphviz>>/233b20134c14f65a896fbfe01f0021d2.png]]

make the following request:

http://<servername>/w/index.php?title=Special%3AGraphviz&filename=233b20134c14f65a896fbfe01f0021d2.png

and have the server return an image?
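
For comparison, the two kinds of request URL above could be built like this in Python 3. This is a sketch only: `Special:Graphviz` and the `source`, `attr` and `filename` parameters are the ones proposed in this message and do not exist yet, and the function names and the base URL passed in are placeholders.

```python
import hashlib
from urllib.parse import urlencode


def source_url(base, source, renderer):
    """Option 1: ship the whole graph description in the query string."""
    query = urlencode({"title": "Special:Graphviz",
                       "source": source,
                       "attr": renderer})
    return "%s?%s" % (base, query)


def hash_url(base, source, ext="png"):
    """Option 2: send only the MD5-based filename."""
    name = hashlib.md5(source.encode("utf-8")).hexdigest() + "." + ext
    query = urlencode({"title": "Special:Graphviz", "filename": name})
    return "%s?%s" % (base, query)
```

The first URL grows with the size of the graph, while the second has a fixed length but only works if the server can already map the hash to a source or a cached image.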

V/r,

Ryan Lane

Heiko Hees

Feb 13, 2009, 2:34:17 PM

to mw...@googlegroups.com

First of all, we will probably have to hack the handling of the
[[Image:]] construct to support external links:
http://meta.wikimedia.org/wiki/Help:Images_and_other_uploaded_files#Link
We currently don't, if I read the source right:
http://code.pediapress.com/hg/mwlib/file/71009b0f6bc2/mwlib/mwapidb.py#l392
(Note to self: what about the attribution issues involved?)

It then depends on whether 15K is enough for most graphviz
descriptions and can be handled by the majority of servers, PHP and
MediaWiki deployments.

If so, construct a URL like the 1st one in your example 1), put it
into an image construct [[Image:Pic.jpg|link=<your url>]], return this
from your function in tagext.py and patch the extension to directly
return an image from this URL's location based on the parameterized data.

If there are problems with the 15K limit, hash the graphviz
description, construct a URL like the second one in your example 1),
put it into an image construct [[Image:Pic.jpg|link=<your url>]],
return this from your function in tagext.py and patch the extension to
directly return an image from this URL's location based on the
parameterized data. In this case you'll probably depend on a cached
version of this image. Also, there obviously needs to be a consistent
mapping between the graphviz description and the md5 hash. You could
implement this approach as a fallback if the former exceeds the GET
size limit.
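
The fallback logic could look roughly like this in Python 3. It is a sketch under assumptions: `image_request_url` and `GET_LIMIT` are made-up names, the 15K figure is simply the number mentioned earlier in this thread rather than a verified limit, and the query parameters are the proposed ones, not an existing interface.

```python
import hashlib
from urllib.parse import quote

GET_LIMIT = 15 * 1024  # the 15K figure from this thread; not a verified constant


def image_request_url(base, source, limit=GET_LIMIT):
    """Prefer sending the full source; fall back to the MD5 hash if the URL is too long."""
    full = "%s?title=Special:Graphviz&source=%s" % (base, quote(source, safe=""))
    if len(full) <= limit:
        return full
    # Too long for a GET request: reference the image by hash instead.
    # This only works if the server already has a cached image for that hash.
    digest = hashlib.md5(source.encode("utf-8")).hexdigest()
    return "%s?title=Special:Graphviz&filename=%s.png" % (base, digest)
```

The length check is done on the encoded URL, since percent-encoding can make the request considerably longer than the raw graph description.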

Heiko

Ryan Lane

Feb 13, 2009, 3:06:46 PM

to mwlib

On Feb 13, 1:34 pm, Heiko Hees <he...@pediapress.com> wrote:
> First of all we probably will have to hack the handling of the
> [[Image:]] construct to support external links:
> http://meta.wikimedia.org/wiki/Help:Images_and_other_uploaded_files#Link
> we currently don't if I got the source right:
> http://code.pediapress.com/hg/mwlib/file/71009b0f6bc2/mwlib/mwapidb.p...
> ( note to self: what about the involved attribution issues?)
>
> It then depends, whether 15K is enough for most of the graphviz
> descriptions and can be handled by the majority of servers, PHP and
> MediaWiki deployments.
>
> If so, construct an URL like the 1st one in your example 1), put it
> into an image construct [[Image:Pic.jpg|link=<your url>]], return this
> from your function in tagext.py and patch the extension to directly
> return an image from this URLs location based on the parametrized data.
>
> If there are problems with the 15K limit, hash the graphviz-
> description, construct an URL like the second one in your example 1)
> put it into an image construct [[Image:Pic.jpg|link=<your url>]],
> return this from your function in tagext.py and patch the extension to
> directly return an image from this URLs location based on the
> parametrized data. In this case you'll probably depend on a cached
> version of this image. Also there obviously needs to be a consistent
> mapping of the graphviz description and the md5 hash. You could
> implement this approach as a fallback if the former exceeds the GET
> size limit.
>

Ok. That sounds great. I think I'll try the hash+fallback approach.

Thanks for the info, I think this is enough to get me started!

V/r,

Ryan Lane

Heiko Hees

Feb 13, 2009, 3:21:11 PM

to mw...@googlegroups.com

On Feb 13, 2009, at 9:06 PM, Ryan Lane wrote:
> Ok. That sounds great. I think I'll try the hash+fallback approach.
>
> Thanks for the info, I think this is enough to get me started!

Please consider opening an issue for this on
http://code.pediapress.com to remind us to implement the
[[Image:|link=]] thing.

Heiko

Ryan Lane

Feb 13, 2009, 3:34:10 PM

to mwlib

Reading over the meta page on image links, it looks like
[[Image:|link=]] is meant for pulling in an uploaded image, but making
the <a href=> link point to something else.

What we are looking at doing here is pulling an image from the link=
portion. Will this adversely affect the original usage of link=?

V/r,

Ryan Lane

Heiko Hees

Feb 14, 2009, 1:10:13 PM

to mw...@googlegroups.com

On Feb 13, 2009, at 9:34 PM, Ryan Lane wrote:
>
> Reading over the meta page on image links, it looks like
> [[Image:|link=]] is meant for pulling in an uploaded image, but making
> the <a href=> link point to something else.
>
> What we are looking at doing here is pulling an image from the link=
> portion. Will this adversely affect the original usage of link=?

Sorry, I did not look at this construct long enough.

So we'll need to find a different solution to fetch these images. I'll
discuss this with my colleagues on Monday. Apart from this, I think we
can stick to the old plan of serving images via HTTP.

Heiko
