Wiki markup Parser HELP.....

hitesh thakkar

Mar 24, 2008, 10:40:58 AM

to mw...@googlegroups.com

Hello,

We are doing a knowledge harvesting project
in which we parse Wikipedia XML pages,
and those pages contain wiki markup.

So we are supposed to parse that wiki markup too,
and we are stuck here: how do we parse wiki markup?

Do we need to write a wiki parser, or is one already available?

Please reply if anyone knows about this.

We also tried to install mwlib, but we get the following error:

error: can't copy 'mwlib/_expander.cc': doesn't exist or not a regular file

Thanking you,

Hitesh Thakkar.

Ralf Schmitt

Mar 24, 2008, 12:56:23 PM

to mw...@googlegroups.com, hitesht...@gmail.com

On Mon, Mar 24, 2008 at 3:40 PM, hitesh thakkar <hitesht...@gmail.com> wrote:

> Hello,
>
> We are doing a knowledge harvesting project
> in which we parse Wikipedia XML pages,
> and those pages contain wiki markup.
>
> So we are supposed to parse that wiki markup too,
> and we are stuck here: how do we parse wiki markup?

>>> from mwlib import uparser
>>> uparser.simpleparse("* hello\n* world")
parser.info >> Parsing "'unknown'"
Article 'unknown': 1 children
     Paragraph '': 1 children
         Node '': 1 children
             ItemList '': 2 children
                 Item '': 1 children
                     ' hello'
                 Item '': 1 children
                     ' world'
Article 'unknown': 1 children

Have a look at the source for advanced usage...

> Do we need to write a wiki parser, or is one already available?
>
> Please reply if anyone knows about this.
>
> We also tried to install mwlib, but we get the following error:
>
> error: can't copy 'mwlib/_expander.cc': doesn't exist or not a regular file

You are using the development version. You need to install re2c and run make first.
Alternatively, you can install from the source tarball on PyPI by running easy_install mwlib.

- Ralf
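
For the XML side of the question (getting the wiki markup out of the Wikipedia XML pages before parsing it), here is a minimal sketch using only the Python standard library. It is not part of mwlib. The sample document, the namespace URI in it, and the helper names local_name and iter_wikitext are assumptions made for illustration; a real export may declare a different schema version, which is why the namespace is stripped rather than matched.

```python
import io
import xml.etree.ElementTree as ET

# Tiny stand-in for a MediaWiki XML export (assumed layout and namespace);
# real dumps are far larger and may use a different schema version.
SAMPLE_DUMP = """<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.3/">
  <page>
    <title>Example</title>
    <revision>
      <text>* hello
* world</text>
    </revision>
  </page>
</mediawiki>"""


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def iter_wikitext(source):
    """Yield (title, wikitext) pairs from a MediaWiki XML export stream."""
    title = None
    # iterparse streams the file, so a large dump is not loaded all at once.
    for _event, elem in ET.iterparse(source, events=("end",)):
        name = local_name(elem.tag)
        if name == "title":
            title = elem.text
        elif name == "text":
            yield title, elem.text or ""
        elif name == "page":
            elem.clear()  # drop the finished page's subtree to save memory
            title = None


pages = list(iter_wikitext(io.BytesIO(SAMPLE_DUMP.encode("utf-8"))))
for title, text in pages:
    print(title, repr(text))
```

Each wikitext string produced this way could then be handed to a wiki markup parser, for example the uparser.simpleparse call shown in the reply above.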
