How to add new template types.

applegrew

Feb 18, 2008, 9:22:29 AM
to Bliki - Java/Eclipse Wikipedia API
I am very new to Bliki. I have just run Bliki over some wiki markup
from wiktionary.org. I must say it is one of the best renderers for
wiki markup. The only problem I am facing currently is that
templates such as {{wikipedia}}, {{ML.}}, {{SAMPA}}, etc. are not
being expanded.

Is there a way to make Bliki do the substitution itself? Or is
there a way to have my custom function called every time a template
is encountered, with the arguments passed on to my function? I have
been poring over the javadoc (http://matheclipse.org/doc/index.html)
for more than an hour, but I have yet to find a solution. I am
confused. Please help.

Axel Kramer

Feb 19, 2008, 3:59:36 PM
to Bliki - Java/Eclipse Wikipedia API
You can create your own derived WikiModel and override the
getRawWikiContent() method for this purpose.

See for example the JUnit
info.bliki.wiki.test.filter.WikiTestModel#getRawWikiContent() method:
http://plog4u.svn.sourceforge.net/viewvc/plog4u/info.bliki.wiki.test/src/info/bliki/wiki/test/filter/WikiTestModel.java?view=markup

Unfortunately there's currently no WikiModel implemented which can
directly read the raw content from a Wikipedia MySQL database.
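
A minimal sketch of the lookup that such an overridden method could delegate to. It deliberately does not extend WikiModel, because the exact signature of getRawWikiContent() depends on the Bliki version and is not reproduced here. TemplateStore, put() and rawContent() are hypothetical names chosen for illustration only.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the lookup a derived WikiModel would perform. */
class TemplateStore {
    private final Map<String, String> templates = new HashMap<String, String>();

    /** Registers the raw wiki text of one template under its name. */
    void put(String name, String rawWikiText) {
        templates.put(name.trim(), rawWikiText);
    }

    /**
     * Returns the raw wiki text for a page in the "Template" namespace,
     * or null if the namespace is different or the template is unknown.
     */
    String rawContent(String namespace, String name) {
        if (!"Template".equalsIgnoreCase(namespace)) {
            return null;
        }
        return templates.get(name.trim());
    }
}
```

With a store filled from your own data, rawContent("Template", "SAMPA") returns whatever text was registered for SAMPA, and an unknown name returns null so the caller can fall back to its default behaviour.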

applegrew

Feb 20, 2008, 7:52:34 AM
to Bliki - Java/Eclipse Wikipedia API


On Feb 20, 1:59 am, Axel Kramer <axel...@gmail.com> wrote:
> On Feb 18, 3:22 pm, applegrew <appleg...@gmail.com> wrote:
> > I am very new to Bliki. I have just run Bliki over some wiki markup
> > from wiktionary.org. I must say it is one of the best renderers for
> > wiki markup. The only problem I am facing currently is that
> > templates such as {{wikipedia}}, {{ML.}}, {{SAMPA}}, etc. are not
> > being expanded.
>
> > Is there a way to make Bliki do the substitution itself? Or is
> > there a way to have my custom function called every time a template
> > is encountered, with the arguments passed on to my function? I have
> > been poring over the javadoc (http://matheclipse.org/doc/index.html)
> > for more than an hour, but I have yet to find a solution. I am
> > confused. Please help.
>
> You can create your own derived WikiModel and override the
> getRawWikiContent() method for this purpose.
>
> See for example the JUnit
> info.bliki.wiki.test.filter.WikiTestModel#getRawWikiContent() method:
> http://plog4u.svn.sourceforge.net/viewvc/plog4u/info.bliki.wiki.test/...
>
> Unfortunately there's currently no WikiModel implemented which can
> directly read the raw content from a Wikipedia MySQL database.

Thank you for the reply. I have one more concern. What will happen in
the case of recursive templates? In the example you gave, CITE_WEB_TEXT
seems to be a recursive template. How does Bliki handle that? Will
getRawWikiContent() be called again to parse the inner templates?

Axel Kramer

Feb 20, 2008, 2:25:48 PM
to Bliki - Java/Eclipse Wikipedia API
On Feb 20, 1:52 pm, applegrew <appleg...@gmail.com> wrote:
...
> Thank you for the reply. I have one more concern. What will happen in
> the case of recursive templates? In the example you gave, CITE_WEB_TEXT
> seems to be a recursive template.
In CITE_WEB_TEXT, "url" and "title" are template parameters; their
values are inserted when you use the template with additional
parameters.
So this is not a good example of recursive templates.

But you are right that the engine in general supports nested (or
recursive) templates.
The WikipediaParser.RECURSION_LIMIT constant (= 25 at the moment)
should ensure that the recursion doesn't nest too deep.

> How does Bliki handle that? Will
> getRawWikiContent() be called again to parse the inner templates?
While testing your question in the JUnit environment I found a small
problem with the recursive rendering and created two new JUnit test
methods,
testNestedTemplate() and testEndlessRecursion(), in TemplateParserTest:
http://plog4u.svn.sourceforge.net/viewvc/plog4u?view=rev&revision=304

I had to make these changes in the library to make them work:
http://plog4u.svn.sourceforge.net/viewvc/plog4u?view=rev&revision=303

So the getRawWikiContent() method should now be called again and again
for inner templates, assuming that the RECURSION_LIMIT isn't exceeded.
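
To illustrate the idea of nested expansion with a depth limit, here is a small self-contained sketch. It is not Bliki's implementation: only the limit of 25 comes from this thread, while NestedExpander and its methods are hypothetical, and the sketch ignores template parameters entirely (it only handles plain {{name}} calls).

```java
import java.util.Map;

/** Hypothetical expander: replaces {{name}} with the template text, recursively. */
class NestedExpander {
    static final int RECURSION_LIMIT = 25; // value quoted in the thread

    private final Map<String, String> templates;

    NestedExpander(Map<String, String> templates) {
        this.templates = templates;
    }

    String expand(String text) {
        return expand(text, 0);
    }

    private String expand(String text, int depth) {
        if (depth >= RECURSION_LIMIT) {
            return text; // too deep: leave the remaining calls unexpanded
        }
        StringBuilder out = new StringBuilder();
        int pos = 0;
        while (true) {
            int open = text.indexOf("{{", pos);
            int close = open < 0 ? -1 : text.indexOf("}}", open + 2);
            if (open < 0 || close < 0) {
                out.append(text.substring(pos)); // no further complete call
                break;
            }
            out.append(text, pos, open);
            String name = text.substring(open + 2, close).trim();
            String body = templates.get(name);
            if (body == null) {
                out.append(text, open, close + 2); // unknown template: keep as-is
            } else {
                out.append(expand(body, depth + 1)); // the body may contain more calls
            }
            pos = close + 2;
        }
        return out.toString();
    }
}
```

Each template body is looked up again while it is being expanded, which mirrors the "called again and again for inner templates" behaviour described above. A template that calls itself stops expanding once the depth reaches the limit instead of looping forever.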

applegrew

Feb 20, 2008, 3:16:57 PM
to Bliki - Java/Eclipse Wikipedia API
Hey, thanks for taking all the trouble. I am trying to use Bliki to
create an offline Wikipedia dump reader (not only for Wikipedia, but for
Wiktionary, Wikibooks, or MediaWiki in general), and of course it
too would be open source. All current applications for that are
far behind MediaWiki, so they no longer give great results.
Hopefully, using Bliki, I can make one. :)

Axel Kramer

Feb 20, 2008, 3:55:37 PM
to Bliki - Java/Eclipse Wikipedia API


On Feb 20, 9:16 pm, applegrew <appleg...@gmail.com> wrote:
> Hey, thanks for taking all the trouble. I am trying to use Bliki to
> create an offline Wikipedia dump reader (not only for Wikipedia, but for
> Wiktionary, Wikibooks, or MediaWiki in general), and of course it
> too would be open source.
Are you using info.bliki.wiki.dump.WikiXMLParser to read the XML from
the dumps?

> All current applications for that are
> far behind MediaWiki, so they no longer give great results.
> Hopefully, using Bliki, I can make one. :)
I would like to be notified if you publish your project.
Can you add me to your mailing list? :-)

Axel

applegrew

Feb 20, 2008, 4:17:16 PM
to Bliki - Java/Eclipse Wikipedia API


On Feb 21, 1:55 am, Axel Kramer <axel...@gmail.com> wrote:
> On Feb 20, 9:16 pm, applegrew <appleg...@gmail.com> wrote:
> > Hey, thanks for taking all the trouble. I am trying to use Bliki to
> > create an offline Wikipedia dump reader (not only for Wikipedia, but for
> > Wiktionary, Wikibooks, or MediaWiki in general), and of course it
> > too would be open source.
>
> Are you using info.bliki.wiki.dump.WikiXMLParser to read the XML from
> the dumps?

I am undecided about that. Maybe not, because I plan to read directly
from the compressed dump. I am planning to make this for normal
users with normal computers, hence hard disk space is a big
constraint. For that I plan to index the bz2 file: for each page I
would record the page title (or other unique and necessary metadata)
together with the block number and the byte offset inside the block,
in a normal uncompressed file. Or, better, I could extend
RandomAccessFile to implement my own class that would provide
transparent access to the bz2 file. Well, my plans are quite
elaborate; it's only time which is my enemy.

>
> > All current applications for that are
> > far behind MediaWiki, so they no longer give great results.
> > Hopefully, using Bliki, I can make one. :)
>
> I would like to be notified if you publish your project.
> Can you add me to your mailing list? :-)
>
> Axel

Sure. :-D
BTW, I don't know your email address. Google only shows a teaser of the
email address. Or you can subscribe to the https://lists.wikimedia.org/mailman/listinfo/wikitech-l
mailing list. I will definitely announce its availability there. My
exams are only two weeks away, so I will have to halt progress on
this for now, but I will resume when my exams are finished.

Regards,
Apple
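
The index described in this post (page title mapped to a block number and a byte offset inside that block) could be stored roughly as in the sketch below. This is only an illustration under assumptions: DumpIndex, Location and the record layout are hypothetical, and the bz2 handling itself is left out because it needs a decompression library that the thread does not name.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical title index: entry count, then (title, block, offset) records. */
class DumpIndex {
    /** Where a page starts: block number plus byte offset inside the decompressed block. */
    static final class Location {
        final int block;
        final long offset;

        Location(int block, long offset) {
            this.block = block;
            this.offset = offset;
        }
    }

    /** Serializes the index to bytes that can be written to a plain uncompressed file. */
    static byte[] write(Map<String, Location> index) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(index.size());
        for (Map.Entry<String, Location> e : index.entrySet()) {
            out.writeUTF(e.getKey()); // length-prefixed, so titles need no delimiter
            out.writeInt(e.getValue().block);
            out.writeLong(e.getValue().offset);
        }
        out.flush();
        return bytes.toByteArray();
    }

    /** Reads back an index produced by write(). */
    static Map<String, Location> read(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        int count = in.readInt();
        Map<String, Location> index = new LinkedHashMap<String, Location>();
        for (int i = 0; i < count; i++) {
            String title = in.readUTF();
            int block = in.readInt();
            long offset = in.readLong();
            index.put(title, new Location(block, offset));
        }
        return index;
    }
}
```

Because writeUTF() stores the length of each title, Unicode titles need no escaping. Its limit of 65535 encoded bytes per string should be ample for page titles, but it is an assumption worth checking against the real data.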

applegrew

Feb 21, 2008, 4:00:09 AM
to Bliki - Java/Eclipse Wikipedia API
Have you fixed the problem you mentioned in your 2nd post in this
thread? Will I have to get the updated code from SVN? Can I get a
JAR of the updated code? (I don't know how to create a JAR.)

Axel Kramer

Feb 21, 2008, 2:54:10 PM
to Bliki - Java/Eclipse Wikipedia API
On Feb 21, 10:00 am, applegrew <appleg...@gmail.com> wrote:
> Have you fixed the problem you mentioned in your 2nd post in this
> thread? Will I have to get the updated code from SVN? Can I get a
> JAR of the updated code? (I don't know how to create a JAR.)
>
I've just released the 3.0.2 version here:
http://sourceforge.net/project/showfiles.php?group_id=128886&package_id=206292&release_id=578430

At least under Eclipse you can use an export wizard to create a JAR
file.
"Right click" on the project in the package explorer and select menu
"Export...".
In the first dialog choose the "Java->JAR file" node and press the
"Next" button.
In the second dialog select only the "src" and "addon-src"
subdirectory nodes, define your JAR name, and press the "Next" button.
In the third dialog you can select "Save the description of the JAR in
the workspace" if you like, to create a *.jardesc file which
simplifies the creation of JAR files, and press the "Next" button.
Press the "Finish" button.

The distribution's ZIP file already contains a bliki.jardesc,
which you can modify for your workspace settings.
After that, "Right click" on the bliki.jardesc file in the package
explorer and select menu "Create JAR".

applegrew

Feb 21, 2008, 3:25:22 PM
to Bliki - Java/Eclipse Wikipedia API
Thanks, I will do that.

This is unrelated to this issue, but since yesterday I have been stuck
at one point. This is getting very frustrating now. The people at the
Sun Java forum have not been helpful either. All I want is to read a
UTF-8 encoded file, but my sin is that I want to read it randomly. Java
seems to have no way of doing that, or only some very obscure way of
doing it. I have toyed with ByteBuffer and CharsetDecoder and it
seems to work (though there are some rough edges), but the code looks
ugly and unmaintainable, and I really don't fully understand what I am
doing there. All in all, I will just say: Help! And I feel so tired. :(

Otherwise, I will have to use MySQL and let it worry about all the
encoding issues, which I don't want, as this will unnecessarily
increase the dependencies and we will need to decompress the dump file.
Any suggestions?
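
One property of UTF-8 that helps with random access: every continuation byte of a multi-byte character has the bit pattern 10xxxxxx, so after jumping to an arbitrary byte offset you can skip forward to the next byte that starts a character and decode from there. The sketch below shows this on a byte array; Utf8Seek and its methods are hypothetical names, and the same check would apply to bytes read after RandomAccessFile.seek().

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for decoding UTF-8 from an arbitrary byte offset. */
class Utf8Seek {
    /** Returns the first index >= pos whose byte is not a continuation byte (10xxxxxx). */
    static int alignToCharStart(byte[] data, int pos) {
        while (pos < data.length && (data[pos] & 0xC0) == 0x80) {
            pos++;
        }
        return pos;
    }

    /** Decodes everything from the first character boundary at or after pos. */
    static String decodeFrom(byte[] data, int pos) {
        int start = alignToCharStart(data, pos);
        return new String(data, start, data.length - start, StandardCharsets.UTF_8);
    }
}
```

If the offsets are recorded while scanning the file, so that they always point at the first byte of a character, no alignment is needed and the bytes between two offsets can be decoded in one call.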

On Feb 22, 12:54 am, Axel Kramer <axel...@gmail.com> wrote:
> On Feb 21, 10:00 am, applegrew <appleg...@gmail.com> wrote:
> > Have you fixed the problem you mentioned in your 2nd post in this
> > thread? Will I have to get the updated code from SVN? Can I get a
> > JAR of the updated code? (I don't know how to create a JAR.)
>
> I've just released the 3.0.2 version here:
> http://sourceforge.net/project/showfiles.php?group_id=128886&package_...

Axel

Feb 21, 2008, 4:01:03 PM
to bl...@googlegroups.com
On Thu, Feb 21, 2008 at 9:25 PM, applegrew <appl...@gmail.com> wrote:
> This is unrelated to this issue, but since yesterday I have been stuck
> at one point. This is getting very frustrating now. The people at the
> Sun Java forum have not been helpful either. All I want is to read a
> UTF-8 encoded file, but my sin is that I want to read it randomly. Java
> seems to have no way of doing that, or only some very obscure way of
> doing it. I have toyed with ByteBuffer and CharsetDecoder and it
> seems to work (though there are some rough edges), but the code looks
> ugly and unmaintainable, and I really don't fully understand what I am
> doing there. All in all, I will just say: Help! And I feel so tired. :(
>
> Otherwise, I will have to use MySQL and let it worry about all the
> encoding issues, which I don't want, as this will unnecessarily
> increase the dependencies and we will need to decompress the dump file.
> Any suggestions?
Do you have some example code?

--
Axel Kramer
WikiBlog: http://www.groovy-news.org/e/page/axelclk

applegrew

Feb 21, 2008, 4:02:19 PM
to Bliki - Java/Eclipse Wikipedia API
Never mind. I think I should go for SQLite. This way I will be able
to put my energy into more important features.

applegrew

Feb 21, 2008, 4:09:58 PM
to Bliki - Java/Eclipse Wikipedia API
Ah, yes. I am posting the code of my class below, but as I mentioned
above, I will go with SQLite first. Once the project is up and
running, I will come back to this approach if there is a need.
Anyway, if you have time, please go through the code and point out
any mistakes I may be making. I am still a learner, and sorry for
the sparse comments; it is mostly a prototype.

------
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Parses and reads specially created template dumps.
 * <u>File format of dump:</u>
 * <code>template1 name &lt;template1 value&gt;template2 name
 * &lt;template2 value&gt;...</code>
 * In this prototype a backtick (`) marks both the start and the end of
 * each value. New-lines and spaces don't matter, but they are read and stored.<br>
 * <b>Note:</b> The dump file must specify a template only once; a later
 * definition replaces the earlier one in the index.
 */
public class TemplateFetcher {
    // The delimiter is ASCII, so it never occurs inside a multi-byte UTF-8 sequence.
    private static final int DELIMITER = '`';
    private static final int SIZEOF_CHAR = 2; // bytes per Java char

    private RandomAccessFile rafTemIn = null;
    private final Map<String, TemVal> temMap = new HashMap<String, TemVal>();
    // Least recently used templates queue, oldest entry first.
    private final List<String> lruQ = new ArrayList<String>();
    private int cacheLimit = 10240; // in bytes (default: 10 kB)
    private int currCacheSize = 0;  // in bytes

    public TemplateFetcher(String templateDat) {
        boolean parsingTemName = true;
        ByteArrayOutputStream nameBytes = new ByteArrayOutputStream();
        long bloc = -1;
        try {
            rafTemIn = new RandomAccessFile(new File(templateDat), "r");
            int c;
            // read() returns -1 at end of file; it does not throw EOFException.
            while ((c = rafTemIn.read()) != -1) {
                if (c != DELIMITER) {
                    // Collect raw name bytes; value bytes are only located, not stored.
                    if (parsingTemName) {
                        nameBytes.write(c);
                    }
                } else if (parsingTemName) {
                    // Opening delimiter: the value starts right after it.
                    parsingTemName = false;
                    bloc = rafTemIn.getFilePointer();
                } else {
                    // Closing delimiter: the value ends just before it (exclusive end).
                    parsingTemName = true;
                    long eloc = rafTemIn.getFilePointer() - 1;

                    // Decode the whole name at once so multi-byte characters stay intact.
                    String temName = new String(nameBytes.toByteArray(), "UTF-8").trim();
                    temMap.put(temName, new TemVal(bloc, eloc));
                    System.err.println(temName + "-->" + bloc + "," + eloc);

                    nameBytes.reset();
                    bloc = -1;
                }
            }
            System.err.println("Template dump read.");
        } catch (IOException e) {
            System.err.println("Template dump file input error");
        }
    }

    /**
     * Releases any resources that need an explicit call.
     */
    public void putToRest() {
        try {
            if (rafTemIn != null) {
                rafTemIn.close();
            }
        } catch (IOException e) {
            System.err.println("Error while closing template dump file.");
        }
    }

    /**
     * Sets max cache size in kB to cache the templates.
     */
    public void setCacheLimit(int cacheSize) {
        cacheLimit = 1024 * cacheSize;
    }

    public String getTemplateVal(String templateName) {
        TemVal temVal = temMap.get(templateName);
        if (temVal == null) {
            return null;
        }
        if (temVal.content != null) {
            // Cache hit: move the name to the most recently used end of the queue.
            lruQ.remove(templateName);
            lruQ.add(templateName);
            return temVal.content;
        }

        String stemval;
        try {
            byte[] b = new byte[(int) (temVal.eloc - temVal.bloc)];
            rafTemIn.seek(temVal.bloc);
            rafTemIn.readFully(b); // plain read() may return fewer bytes than requested
            stemval = new String(b, "UTF-8");
        } catch (IOException eio) {
            System.err.println("Template dump file IO error.");
            return null;
        }

        currCacheSize += stemval.length() * SIZEOF_CHAR;
        temVal.content = stemval;

        // Evict the least recently used values until the cache fits again.
        while (currCacheSize > cacheLimit && !lruQ.isEmpty()) {
            TemVal firstTem = temMap.get(lruQ.remove(0));
            // Subtract the size while the content is still available, then drop it.
            currCacheSize -= firstTem.content.length() * SIZEOF_CHAR;
            firstTem.content = null;
        }
        lruQ.add(templateName);

        return temVal.content;
    }

    private static class TemVal {
        final long bloc;       // Beginning location of the value block (inclusive).
        final long eloc;       // End location of the template value (exclusive).
        String content = null; // The value of the template, if cached.

        TemVal(long bloc, long eloc) {
            this.bloc = bloc;
            this.eloc = eloc;
        }
    }
}
------
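
As a side note on the cache in the class above: the JDK's LinkedHashMap can keep entries in access order and drop the eldest one on insertion, which covers the queue bookkeeping. The sketch below is a simplification under assumptions: LruTemplateCache is a hypothetical name, and it limits the number of cached entries rather than their size in bytes as the posted class does.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: keeps at most maxEntries values, evicting the least recently used. */
class LruTemplateCache extends LinkedHashMap<String, String> {
    private static final long serialVersionUID = 1L;
    private final int maxEntries;

    LruTemplateCache(int maxEntries) {
        super(16, 0.75f, true); // true = order entries by access, eldest first
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
        // Called by put(); returning true drops the least recently used entry.
        return size() > maxEntries;
    }
}
```

Both get() and put() count as an access, so a value that was just read is the last one to be evicted.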

On Feb 22, 2:01 am, Axel <axel...@gmail.com> wrote: