Does anyone here know what the "natural concepts"
defined in Wikipedia are?
Is there any manual tagging of phrases with a concept name such as
COMPUTER_SCIENCE, INDIA, or maybe LANGUAGE? I came
across the phrase "natural concepts defined in Wikipedia" in a research
paper, and I cannot work out where inside a Wikipedia article
such naturally defined concepts are mentioned. Is there any such
tagging for a particular Wikipedia article?
Any help regarding this would be highly appreciated.
Thanks
With Regards,
Abhishek S
There is one additional feature that Wiki has. After the <title>...</title>
element there is a list of the keywords translated into a large number
of languages. Thus, if you want to translate from one language to
another, it is possible to derive an ontology of keywords and phrases.
If you then find a keyword (in any language), the translation will be
there.
total=double_between(buf,total,"<title","</title","[[ar:","]]");
This gives the Arabic translations of the headers. If you do manage to
download Wikipedia, I have a simple C++ program I could let you have.
The Arabic, and I suspect this goes for other non-Roman scripts too, is
NOT in Unicode but is in some rather odd binary/hex format which I am
in the process of working out. A section of Wiki is held in "buf",
whose elements are of type unsigned char.
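
As a rough illustration of the idea, here is a minimal sketch of pairing each title with its translation for one language. It is not the double_between routine above: next_between and title_translations are hypothetical names, the buffer is assumed to be a std::string rather than an unsigned char array, and the article text is assumed to contain inline links of the form [[ar:...]] after each <title>...</title>. The bytes of the translation are copied through unchanged. If the download is UTF-8 encoded (an assumption worth checking against the XML declaration at the top of the file), Arabic text will show up as multi-byte sequences that can look like an odd binary format when inspected one unsigned char at a time.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: finds the text between `open` and `close`, searching
// from `pos`. On success stores it in `out`, moves `pos` past `close`, and
// returns true. Returns false if either delimiter is missing.
bool next_between(const std::string& buf, std::size_t& pos,
                  const std::string& open, const std::string& close,
                  std::string& out) {
    std::size_t start = buf.find(open, pos);
    if (start == std::string::npos) return false;
    start += open.size();
    std::size_t end = buf.find(close, start);
    if (end == std::string::npos) return false;
    out = buf.substr(start, end - start);
    pos = end + close.size();
    return true;
}

// Collects (title, translation) pairs for one language prefix such as "ar".
// Articles without a link for that language are skipped.
std::vector<std::pair<std::string, std::string>>
title_translations(const std::string& buf, const std::string& lang) {
    std::vector<std::pair<std::string, std::string>> pairs;
    const std::string link_open = "[[" + lang + ":";
    std::size_t pos = 0;
    std::string title;
    while (next_between(buf, pos, "<title>", "</title>", title)) {
        // Restrict the link search to this article: stop at the next <title>.
        std::size_t next_title = buf.find("<title>", pos);
        std::string article =
            (next_title == std::string::npos)
                ? buf.substr(pos)
                : buf.substr(pos, next_title - pos);
        std::size_t link_pos = 0;
        std::string translation;
        if (next_between(article, link_pos, link_open, "]]", translation)) {
            pairs.emplace_back(title, translation);
        }
    }
    return pairs;
}
```

Calling title_translations(buf, "ar") on a section of the download would return one pair per article that carries an Arabic link. Running it once per language prefix gives the raw material for a keyword table across languages.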
- Ian Parker
Thanks Ian,
I figured out that the title of a Wikipedia article is itself used
as the name of the concept, and the concept is simply the body of
the article with that title.
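
In code, that reading amounts to a lookup table from title to article body. The following is only a sketch of how I picture it, not anything taken from the paper: concepts_from_dump is a hypothetical name, and it assumes records where a <title> element is followed by a <text ...> element holding the article body.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical sketch: maps each article title (the concept name) to the
// article body (the concept itself). Stops at the first malformed record.
std::unordered_map<std::string, std::string>
concepts_from_dump(const std::string& buf) {
    std::unordered_map<std::string, std::string> by_title;
    const std::string title_open = "<title>";
    const std::string title_close = "</title>";
    const std::string text_open = "<text";
    const std::string text_close = "</text>";
    std::size_t pos = 0;
    while (true) {
        std::size_t t0 = buf.find(title_open, pos);
        if (t0 == std::string::npos) break;
        t0 += title_open.size();
        std::size_t t1 = buf.find(title_close, t0);
        if (t1 == std::string::npos) break;
        // The <text> tag may carry attributes, so skip to its closing '>'.
        std::size_t b0 = buf.find(text_open, t1);
        if (b0 == std::string::npos) break;
        b0 = buf.find('>', b0);
        if (b0 == std::string::npos) break;
        ++b0;
        std::size_t b1 = buf.find(text_close, b0);
        if (b1 == std::string::npos) break;
        by_title[buf.substr(t0, t1 - t0)] = buf.substr(b0, b1 - b0);
        pos = b1 + text_close.size();
    }
    return by_title;
}
```

With this, looking up "India" returns the body of the India article, which is what the paper seems to mean by the concept INDIA.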
Thank you for your insights into the possibility of using the keywords
to derive an ontology for translation :)
With Regards,
Abhishek S