Discussion on the flow model of project

Amanpreet Singh

Apr 10, 2014, 9:24:08 AM
to annotation...@googlegroups.com
Hello everyone,
Based on the comments on the proposal at http://www.google-melange.com/gsoc/proposal/review/student/google/gsoc2014/apsdehal/5629499534213120 , I think we need to discuss in more detail how the project is going to work.
As far as I understand, Pundit works as follows:
It uses its triple composer to compose a statement, taking the highlighted sentence as the subject, offering predicates from a preloaded list, and offering objects from vocabularies. The vocabularies are retrieved from various servers such as DBpedia. I think we don't need to change the predicate part. I have attached a screenshot of the triple composer.
Now let's come to Wikidata.
The glossary is as follows:
Item
A page in the Wikidata main namespace representing a real-life topic, concept, or subject. Items are identified by a prefixed id, by a sitelink to an external page, or by a unique combination of multilingual label and description.
Property
A descriptor of a value for a particular item. In other words, it is an attribute of an item, e.g. country.
Statement
A piece of data about an item, recorded on the item's page. A statement consists of a claim (a property-value pair such as "Season: Winter" about an item, together with optional qualifiers), supported by optional references (giving the source for the claim).
Value
A piece of information about an item that describes one of its properties, e.g. United States of America for the property country.

For example, to see how statements are saved on a Wikidata item's page, see this. In this item you can see how the property country is linked with the value United States of America, along with the option to add references.
So we need to discuss how we can turn the Pundit annotation model (i.e. subject, predicate, object) into the Wikidata model (i.e. item, property, value).
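
To make the question concrete, here is a minimal Python sketch of one possible mapping. It is only an illustration under assumptions: the class names, the field names, the `triple_to_statement` helper, the item name "Example City", and the example.org URL are all hypothetical and are not part of Pundit or of the Wikidata API. It simply maps subject to item, predicate to property, and object to value, and keeps the annotated page as an optional reference.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PunditTriple:
    """Hypothetical stand-in for a triple built with the triple composer."""
    subject: str
    predicate: str
    obj: str


@dataclass
class WikidataStatement:
    """Hypothetical stand-in for a Wikidata statement (claim plus references)."""
    item: str
    prop: str
    value: str
    references: List[str] = field(default_factory=list)


def triple_to_statement(triple: PunditTriple,
                        source_url: Optional[str] = None) -> WikidataStatement:
    # Assumed one-to-one mapping: subject -> item, predicate -> property,
    # object -> value. The annotated page, if given, becomes a reference.
    references = [source_url] if source_url else []
    return WikidataStatement(
        item=triple.subject,
        prop=triple.predicate,
        value=triple.obj,
        references=references,
    )


# "Example City" and the URL below are made-up illustration values.
triple = PunditTriple("Example City", "country", "United States of America")
stmt = triple_to_statement(triple, "http://example.org/annotated-page")
print(stmt)
```

In a real integration the labels would still have to be resolved to Wikidata identifiers, which this sketch does not attempt.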

Please share your ideas on this.

Note: Pundit also contains an entity extractor, which is based on external entity recognition services (like DataTXT and DBpedia Spotlight). Here the user selects a sentence, and entities are suggested as tags associated with the sentence.

Thanks
Amanpreet

Attachments: pundit1.jpg, pundit2.jpg

Christian Morbidoni

Apr 10, 2014, 10:30:31 AM
to Amanpreet Singh, annotation...@googlegroups.com
Dear Aman,

The two models are both based on RDF triples.
The Wikidata example you show consists of: 

- one triple: 
London(Subject) - population(Predicate) - "8173456"(Object)

- and a reference (which you do not mention in the description), which I think is the link to the actual annotated text that justifies the validity of the triple itself.

In Pundit:
the triple:
London(Subject) - population(Predicate) - "8173456"(Object)

can be built with the triple composer.
Then you need to create an extra triple to link the annotation to the annotated text.
In my public reply I suggested that this triple could be something like:
   text-fragment - isA - ex:Evidence
You can find more info on the RDF data model used in Pundit on this page:
Annotations are also represented in JSON (this is the standard format provided by the Pundit Server API)
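
As a rough illustration of the two triples described above, the following Python sketch builds the content triple and the extra evidence triple, then serializes them to JSON. This is a sketch under assumptions: the `build_annotation` helper and the JSON key names are hypothetical, and the output is not the actual format returned by the Pundit API.

```python
import json
from typing import List, Tuple

Triple = Tuple[str, str, str]


def build_annotation(subject: str, predicate: str, obj: str,
                     fragment_id: str) -> List[Triple]:
    # First triple: the statement composed in the triple composer.
    content_triple = (subject, predicate, obj)
    # Second triple: marks the annotated text fragment as evidence,
    # following the "text-fragment - isA - ex:Evidence" suggestion.
    evidence_triple = (fragment_id, "isA", "ex:Evidence")
    return [content_triple, evidence_triple]


triples = build_annotation("London", "population", "8173456", "text-fragment")

# Hypothetical JSON layout, chosen only for readability.
as_json = json.dumps(
    [{"subject": s, "predicate": p, "object": o} for s, p, o in triples],
    indent=2,
)
print(as_json)
```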

I think mapping to the Wikidata model should not be complicated, but I don't know the details.

best,

Christian



Christian Morbidoni

Apr 14, 2014, 4:14:15 AM
to Amanpreet Singh, annotation...@googlegroups.com

Dear Aman,
Did you receive my public message on the GSoC web site? Could you reply and tell me what you think?
Best
Christian

David Cuenca

Apr 14, 2014, 4:44:59 AM
to Christian Morbidoni, Amanpreet Singh, annotation-tool-gsoc
Dear Amanpreet,

I think Christian's comments make the most sense. Regarding point 2 of his comment (a Wikidata tool to periodically import Pundit annotations), that can be done with a bot running on Wikimedia Tool Labs (https://tools.wmflabs.org/).
If you want to get some info about how to create a Wikidata bot, you can find it here:
http://www.wikidata.org/wiki/Wikidata:Creating_a_bot
http://www.mediawiki.org/wiki/Manual:Pywikipediabot/Wikidata

I hope this helps you improve your application. Please reply on the GSoC web site as soon as possible, and modify your project description as needed.

Thanks,
David -- User:Micru


--
Etiamsi omnes, ego non

Amanpreet Singh

Apr 14, 2014, 6:59:05 AM
to David Cuenca, Christian Morbidoni, annotation-tool-gsoc
Hello Christian, David,
Thanks for your valuable replies.
Sorry for the delay on my side.
I have replied to the proposal comments and will update the proposal by tonight.

Thanks a lot.

Amanpreet

Christian Morbidoni

Apr 14, 2014, 7:53:42 AM
to Amanpreet Singh, David Cuenca, annotation-tool-gsoc
Thank you Aman,
I replied on Google Melange; please answer there.

best

Christian

Amanpreet Singh

Apr 15, 2014, 8:16:11 AM
to Christian Morbidoni, Amanpreet Singh, David Cuenca, annotation-tool-gsoc
Thanks to everyone for their valuable replies.

I have edited the proposal and I think it includes all information now.

Thanks again.
Amanpreet Singh,
IIT Roorkee

Christian Morbidoni

Apr 16, 2014, 9:55:23 AM
to Amanpreet Singh, Amanpreet Singh, David Cuenca, annotation-tool-gsoc, Simone Fonda
Dear Amanpreet,

I think the project is very interesting, and if it is successful the output would be very useful for the Wikidata community (and beyond).
I think you have submitted more than one proposal, and as far as I understand you can only work on one.
The choice is all yours. Of course I'll be glad to supervise this work (I think this also holds for the other mentors), but if that turns out not to be the case, then... it has been useful design work for the future anyway.

best,

Christian

Amanpreet Singh

Apr 16, 2014, 9:59:24 AM
to Christian Morbidoni, Amanpreet Singh, David Cuenca, annotation-tool-gsoc, Simone Fonda
Hi Christian,

As far as I know, a clash arises only when a proposal is selected by both organizations.
I also submitted a proposal to Moodle, since I contribute to it as well. I am not sure whether Moodle has selected my proposal.
Even if they have selected it, I would still prefer to go with the Pundit and MediaWiki community, since they have replied every time I have asked.
Can you clarify my doubts regarding the proposals?

Thanks

Christian Morbidoni

Apr 16, 2014, 11:17:59 AM
to Amanpreet Singh, Amanpreet Singh, David Cuenca, annotation-tool-gsoc, Simone Fonda
Hi,
I don't actually know whether the other proposal was selected.
They tell me nothing is confirmed at this point, but the mentors of both organizations (including me, of course) would need to know your preference. They also said not to worry: both orgs will collaborate to make sure you don't miss a chance.

best,

Christian

Amanpreet Singh

Apr 16, 2014, 12:24:45 PM
to Christian Morbidoni, Amanpreet Singh, David Cuenca, annotation-tool-gsoc, Simone Fonda
Dear Christian,
I have received an email from the other organization saying that they have accepted the project. They also told me that MediaWiki has accepted our project as well, so I have a question.
Can you tell me who the final two mentors for my project at MediaWiki are?

Thanks a lot.

Christian Morbidoni

Apr 16, 2014, 12:57:21 PM
to Amanpreet Singh, Amanpreet Singh, David Cuenca, annotation-tool-gsoc, Simone Fonda
Yes, all the mentors agree on approving the project. I think you should just tell us what your preference is; there is no form to fill in, as far as I understand.
I'm signed up as the "primary" mentor now, as Simone is moving to another job... but he will mentor anyway and provide support. Then there are the Wikimedia guys.
The list is on Melange: CristianCantoro, Luca Martinelli, Andrea Zanni, Simone Fonda, Christian Morbidoni
So it seems you have a lot of mentors :-)

best,
Christian

Amanpreet Singh

Apr 16, 2014, 1:46:34 PM
to Christian Morbidoni, Amanpreet Singh, David Cuenca, annotation-tool-gsoc, Simone Fonda
Thanks for your interest, Christian and everyone else.
So, finally, I am going to go with MediaWiki, which means you guys.
Thanks a lot, guys.

We will start working on this soon.

Regards

Christian Morbidoni

Apr 16, 2014, 2:34:42 PM
to Amanpreet Singh, Amanpreet Singh, David Cuenca, annotation-tool-gsoc, Simone Fonda
Hi Aman,

Good news indeed :-)
Looking forward to being of help...
Unfortunately I will not be available from April 18 to May 6...
I know it is not a very good start for a mentor :-/ But we will have time.
In the meantime, you can keep in contact with the others.

best,

Christian