On Apr 25, 12:57 pm, Olivier Ricordeau <olivier.ricord...@gmail.com> wrote:
> Hi group,
>
> I discovered Bliki yesterday and it looks great.
> I need some help in order to be able to do what I need. I'm currently
> importing the latest Wikipedia dump into MySQL (thanks to xml2sql). So
> what I need now is to build a Java parser that does the following:
> * For each internal link, store the link (source and destination) in
> MySQL
> * For each external link, same thing (but in a different table)
> * Output the article as raw text (no HTML markup, etc.)
At the moment, the only way to do this is to derive a class from
info.bliki.wiki.model.WikiModel or
info.bliki.wiki.model.AbstractWikiModel
and override the append*() methods, i.e.
appendExternalLink(), appendInternalLink(),
appendInterWikiLink(), ...
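The override pattern described above can be sketched without the Bliki jar on the classpath. In the sketch below, RenderHooks is a hypothetical stand-in for the model class, and the hook signatures (target/label and url/label) are assumptions made for illustration, so check the real append*() signatures in your Bliki version before overriding them. The toy render() only recognises [[target|label]] and [http://url label]; it is not Bliki's parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LinkCollectorSketch {

    /** Hypothetical stand-in for a wiki model; NOT the Bliki API. */
    public static class RenderHooks {
        // Matches [[target]] / [[target|label]] or [http://url] / [http://url label].
        private static final Pattern LINK = Pattern.compile(
            "\\[\\[([^\\]|]+)(?:\\|([^\\]]*))?\\]\\]"
            + "|\\[(https?://[^\\s\\]]+)(?:\\s+([^\\]]*))?\\]");

        /** Toy renderer: calls one hook per link found, produces no HTML. */
        public void render(String wikiText) {
            Matcher m = LINK.matcher(wikiText);
            while (m.find()) {
                if (m.group(1) != null) {
                    appendInternalLink(m.group(1).trim(), m.group(2));
                } else {
                    appendExternalLink(m.group(3), m.group(4));
                }
            }
        }

        // Assumed hook names modelled on the append*() methods;
        // the parameter lists are illustrative only.
        protected void appendInternalLink(String target, String label) { }
        protected void appendExternalLink(String url, String label) { }
    }

    /** Subclass that records (source, destination) pairs for each link. */
    public static class LinkCollector extends RenderHooks {
        public final String sourceTitle;
        public final List<String[]> internalLinks = new ArrayList<>();
        public final List<String[]> externalLinks = new ArrayList<>();

        public LinkCollector(String sourceTitle) {
            this.sourceTitle = sourceTitle;
        }

        @Override
        protected void appendInternalLink(String target, String label) {
            internalLinks.add(new String[] { sourceTitle, target });
        }

        @Override
        protected void appendExternalLink(String url, String label) {
            externalLinks.add(new String[] { sourceTitle, url });
        }
    }

    public static void main(String[] args) {
        LinkCollector c = new LinkCollector("Paris");
        c.render("[[France|the country]] and [http://example.org a site] and [[Seine]]");
        for (String[] l : c.internalLinks) {
            System.out.println("internal: " + l[0] + " -> " + l[1]);
        }
        for (String[] l : c.externalLinks) {
            System.out.println("external: " + l[0] + " -> " + l[1]);
        }
    }
}
```

The two lists map directly onto the two MySQL tables from the question; writing them out with JDBC batch inserts after each article would be one way to store them.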
Note that if you don't need the HTML output, you can speed things up
by passing a null converter argument to the
AbstractWikiModel#render() method,
provided you apply these changes:
http://plog4u.svn.sourceforge.net/viewvc/plog4u?view=rev&revision=358
> For the third point, maybe the simplest way to do this is to first
> convert to HTML using Bliki, and then remove tags (using htmlcleaner
> for instance).
Yes, that's possible. You can implement the
info.bliki.html.IHTMLToWiki
interface. For example, copy the
info.bliki.html.wikipedia.ToWikipedia
class and modify it for your needs.
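For the "convert to HTML, then remove tags" route from the question, here is a minimal sketch using only the JDK. It is a naive regex stripper, not htmlcleaner and not an IHTMLToWiki implementation: it ignores script/style content and comments, and decodes only a handful of entities. The names HtmlToTextSketch and toPlainText are hypothetical.

```java
import java.util.regex.Pattern;

public class HtmlToTextSketch {
    // Block-level tags become a space so adjacent paragraphs/items don't fuse.
    private static final Pattern BLOCK_TAG = Pattern.compile(
        "</?(?:p|br|div|li|ul|ol|table|tr|td|th|h[1-6])\\b[^>]*>",
        Pattern.CASE_INSENSITIVE);
    // Any remaining (inline) tag is simply dropped.
    private static final Pattern ANY_TAG = Pattern.compile("<[^>]+>");
    private static final Pattern SPACES = Pattern.compile("\\s+");

    public static String toPlainText(String html) {
        String text = BLOCK_TAG.matcher(html).replaceAll(" ");
        text = ANY_TAG.matcher(text).replaceAll("");
        // Decode a few common entities; &amp; goes last so that
        // "&amp;lt;" ends up as the literal text "&lt;".
        text = text.replace("&lt;", "<")
                   .replace("&gt;", ">")
                   .replace("&quot;", "\"")
                   .replace("&nbsp;", " ")
                   .replace("&amp;", "&");
        return SPACES.matcher(text).replaceAll(" ").trim();
    }

    public static void main(String[] args) {
        String html = "<p>Paris is the <b>capital</b> of "
            + "<a href=\"/wiki/France\">France</a>.</p>";
        System.out.println(toPlainText(html));
    }
}
```

For real Wikipedia articles a proper HTML parser will be more robust than regular expressions; the sketch only shows the shape of the post-processing step.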