Test Site

Amanpreet Singh

May 27, 2014, 4:57:33 AM

to annotation-tool-gsoc

Dear all,

While the project goes on, I have set up a test page here.

It may hang a bit due to the large number of vocabs.

Views?

Thanks

--

Amanpreet Singh,

IIT Roorkee

Christian Morbidoni

May 27, 2014, 5:09:13 AM

to Amanpreet Singh, annotation-tool-gsoc

Good. I see it is a bit slow. So this is due to the predicates, right?

One thing: it would be good to have proper previews when searching Wikidata.


Amanpreet Singh

May 27, 2014, 5:50:28 AM

to Christian Morbidoni, annotation-tool-gsoc

Yes, it is slow due to the number of predicates.

On previews: can you explain what else can be done? I am already showing the descriptions I get from Wikidata.

Christian Morbidoni

May 27, 2014, 5:55:21 AM

to Amanpreet Singh, annotation-tool-gsoc

I tried a search for "Pippo Barzizza" and there was no description or picture. Maybe they are not always present in Wikidata?

David Cuenca

May 27, 2014, 8:20:05 AM

to Christian Morbidoni, Amanpreet Singh, annotation-tool-gsoc

Why is it necessary to load all the predicates? Can't they just be queried/retrieved live from WD?

I couldn't try the test website now because I'm behind a proxy and it remains endlessly loading...

"Pippo Barzizza" -> just added the description and image on wd ;)

Etiamsi omnes, ego non

Amanpreet Singh

May 28, 2014, 3:56:50 AM

to David Cuenca, Christian Morbidoni, annotation-tool-gsoc

Dear David,

As per Pundit, objects are loaded through a custom Dojo module called a selector. Selectors load the objects dynamically from Wikidata, which is why objects are loaded as the user types.

Predicates, on the other hand, are loaded from a fixed JSON file called a vocabulary, which is why the whole file is loaded.

In fact, no vocab in the Pundit world has so far been as big as Wikidata's, so the current Pundit setup simply loads all of them.

I think predicates can also be loaded dynamically if we create a selector for them. That's as far as I understand; maybe Christian and Simone can shed more light on what the side effects would be.

Thanks.
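
[Editor's sketch] To illustrate the "query live as the user types" idea for predicates, here is a minimal Python sketch, not the Pundit selector itself. It assumes the public Wikidata `wbsearchentities` API action; the helper names `build_search_url` and `parse_search_results` are hypothetical.

```python
from urllib.parse import urlencode

WIKIDATA_API = "https://www.wikidata.org/w/api.php"


def build_search_url(text, entity_type="property", limit=20, language="en"):
    """Build a wbsearchentities URL for what the user has typed so far."""
    params = {
        "action": "wbsearchentities",
        "search": text,
        "type": entity_type,  # "property" for predicates, "item" for objects
        "language": language,
        "limit": limit,
        "format": "json",
    }
    return WIKIDATA_API + "?" + urlencode(params)


def parse_search_results(payload):
    """Keep only the fields a selector preview would need (hypothetical shape)."""
    return [
        {
            "id": hit["id"],
            "label": hit.get("label", ""),
            "description": hit.get("description", ""),
        }
        for hit in payload.get("search", [])
    ]
```

With this approach only the handful of matches for the typed text travel over the network, instead of the whole vocabulary file.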

Simone Fonda

May 28, 2014, 4:34:54 AM

to Amanpreet Singh, David Cuenca, Christian Morbidoni, annotation-tool-gsoc

On Wed, May 28, 2014 at 9:56 AM, Amanpreet Singh
<amanpreet...@gmail.com> wrote:

> I think predicates can also be dynamically loaded if we create a selector
> for them, thats as far as I understand, Maybe Christian and Simone can put
> more light on what would be its side effects

I'll quote myself from an email of May 11th:

Building a selector for predicates is certainly possible and it might
even be very easy to accomplish. A thing I haven't seen in that JSON,
though, is the crucial information about range and domain. Are they
present in the Wikimedia predicates or are they missing completely?

Closely tied to these two pieces of information is the fact that Pundit
suggests only the predicates that fit any subject and/or object already
present in the statement you are composing. If you decide to go for a
selector, this will not happen anymore and, basically, you are forcing
the user to type something each time they need to build a statement.

Simone
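
[Editor's sketch] The domain/range suggestion behaviour described above could look roughly like this in Python. This is not Pundit's code: the dictionary shape, the type labels and the rule that an empty domain or range accepts anything are all assumptions made for illustration.

```python
def fits(constraint, types):
    """True if any of the given types satisfies the constraint.

    Assumption: an empty constraint list means "no restriction".
    """
    return not constraint or any(t in constraint for t in types)


def suggest_predicates(predicates, subject_types=None, object_types=None):
    """Return labels of predicates compatible with the statement so far.

    subject_types / object_types are None while that slot is still empty.
    """
    suggestions = []
    for predicate in predicates:
        if subject_types is not None and not fits(predicate.get("domain", []), subject_types):
            continue
        if object_types is not None and not fits(predicate.get("range", []), object_types):
            continue
        suggestions.append(predicate["label"])
    return suggestions


# Hypothetical vocabulary entries, for illustration only.
VOCAB = [
    {"label": "place of birth", "domain": ["Person"], "range": ["Place"]},
    {"label": "depicts", "domain": ["Image"], "range": []},
    {"label": "related to", "domain": [], "range": []},
]
```

For example, `suggest_predicates(VOCAB, subject_types=["Person"])` keeps "place of birth" and "related to" but drops "depicts". A typed selector gives up this filtering unless the service returns domain and range data.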

David Cuenca

May 28, 2014, 5:06:23 AM

to Simone Fonda, Amanpreet Singh, Christian Morbidoni, annotation-tool-gsoc

On Wed, May 28, 2014 at 10:34 AM, Simone Fonda <fo...@netseven.it> wrote:

> If you decide to go for a
> selector, this will not happen anymore, and, basically, you are forcing
> the user to type something each time they need to build a statement.

Yes, that would be perfect! :-)

The last predicates/properties used could be saved, but other than that it is great to allow the user to type the property he/she wants to use and to not constrain the input values other than the type of data that the property/predicate admits (number, string, etc).

Cheers,

Micru

Amanpreet Singh

May 29, 2014, 1:52:58 PM

to David Cuenca, Simone Fonda, Christian Morbidoni, annotation-tool-gsoc

So finally selectors or vocabs?

Christian Morbidoni

May 30, 2014, 4:10:21 AM

to Amanpreet Singh, David Cuenca, Simone Fonda, annotation-tool-gsoc

At this point, if implementing a selector is not so complicated, I would go for it.

The possibility of loading the properties as a single big vocab will remain, but I understand that using a selector for properties could improve usability for Wikidata and possibly others.

One thing: if you are going to implement a "property selector", it should be possible to add a number of them by configuration, the same way it happens for normal selectors ("object selectors"?).

We could probably use the very same configuration that we use for object selectors and add a field like "selector-type: objects/properties"...

Maybe Simone can suggest the best way to tune the conf.

best,

Christian
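
[Editor's sketch] One way the "selector-type" field could be used, written in Python for illustration only. The real Pundit configuration format is not shown in this thread, so the entry names, the `limit` key and the default of "objects" for entries without a type are all assumptions.

```python
# Hypothetical selector configuration; only "selector-type" comes from the thread.
SELECTORS = [
    {"name": "wikidata-items", "selector-type": "objects", "limit": 20},
    {"name": "wikidata-properties", "selector-type": "properties", "limit": 20},
    {"name": "legacy-selector", "limit": 10},  # no type given: treated as "objects"
]


def selectors_for(slot, selectors=SELECTORS):
    """Pick the selectors to query for a triple slot.

    slot is "subject", "predicate" or "object"; only the predicate slot
    uses property selectors.
    """
    wanted = "properties" if slot == "predicate" else "objects"
    return [s["name"] for s in selectors if s.get("selector-type", "objects") == wanted]
```

Defaulting to "objects" would keep existing selector configurations working unchanged, which seems in line with reusing the same configuration for both kinds.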

Amanpreet Singh

Jun 2, 2014, 4:33:24 AM

to Christian Morbidoni, David Cuenca, Simone Fonda, annotation-tool-gsoc

I have updated the test site with selectors for Wikidata properties and items, and I have added a workaround so that only Wikidata properties are offered for predicates and only Wikidata items otherwise. I haven't merged this code into my project's master yet; I wanted to get everyone's view on this first. Do you feel this is right?

David Cuenca

Jun 4, 2014, 1:29:31 PM

to Amanpreet Singh, Christian Morbidoni, Simone Fonda, annotation-tool-gsoc

It looks nice, however when selecting a text fragment the contextual menu shows:

- "Annotate text fragment" (text as subject)

- "Connect this text to" (text as subject)

- "Add comment or tags" (text as subject)

But I don't see any option to use the annotation as source for a triple. And same for the image selection.

One question to Christian and Simone, do you support this kind of structure? If not, would it be easy to implement?

Thanks

Micru

Christian Morbidoni

Jun 4, 2014, 1:56:20 PM

to David Cuenca, Amanpreet Singh, Simone Fonda, annotation-tool-gsoc

Hi all,

one question: why is it still so slow? Are you still loading all the properties? Or is it a problem with my connection?

I have a problem with the demo:

I select a text and "Annotate it", then in the triple composer I click on the subject and search for Napoleon. I choose one from the Wikidata items.

Then I choose a property (e.g. place of birth). At this point, when I click on the object, nothing happens. It seems broken.

Reply to David:

we do not have specific support for this kind of annotation. One quick (from Aman's viewpoint :-)) solution would be to use a two-triple annotation:

"...Napoleon was an emperor..." ex:is-a ex:Source

wd:Napoleon wd:occupation wd:Emperor

This would mean that the user wants to add the triple "Napoleon occupation Emperor" and cites as its source the text they are annotating ("...Napoleon was an emperor...").

In theory it would be possible even to omit the first triple, as the annotated text is saved anyway in the RDF annotation. But here there might be some technical details I do not know (Simone?).

best,

Christian
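
[Editor's sketch] A consumer of such a two-triple annotation (for instance an import bot) would need to tell the source marker apart from the actual statement. A minimal Python sketch, assuming triples are plain (subject, predicate, object) tuples and reusing the `ex:is-a ex:Source` marker from the example above; `split_annotation` is a hypothetical helper.

```python
SOURCE_MARKER = ("ex:is-a", "ex:Source")


def split_annotation(triples):
    """Separate the quoted source texts from the statements they support."""
    sources = [s for (s, p, o) in triples if (p, o) == SOURCE_MARKER]
    statements = [t for t in triples if (t[1], t[2]) != SOURCE_MARKER]
    return sources, statements


annotation = [
    ("...Napoleon was an emperor...", "ex:is-a", "ex:Source"),
    ("wd:Napoleon", "wd:occupation", "wd:Emperor"),
]
sources, statements = split_annotation(annotation)
```

Here `sources` holds the annotated text and `statements` holds the single Wikidata-style triple that cites it.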

Amanpreet Singh

Jun 4, 2014, 2:01:52 PM

to Christian Morbidoni, David Cuenca, Simone Fonda, annotation-tool-gsoc

Christian,
I ran into the same problem: after selecting a predicate I can't choose an object, and I don't understand why.
If I select the object first and then the predicate, it works. Could it be related to range or domain, since the selector doesn't give domain and range values to predicates?

And no, the site isn't loading the predicates now, so I don't know why it's still slow.

Amanpreet Singh

Jun 5, 2014, 4:01:35 AM

to Christian Morbidoni, David Cuenca, Simone Fonda, annotation-tool-gsoc

I fixed the problem by adding range and domain attributes to the properties returned by the properties selector. I am leaving them empty for now, but this way it can be extended if Wikidata someday starts providing ranges and domains. Now it works.

David Cuenca

Jun 5, 2014, 3:52:27 PM

to Amanpreet Singh, Christian Morbidoni, Simone Fonda, annotation-tool-gsoc

Works for me too. A possible solution for the sources would be to add another option to the contextual menu: "feed information to Wikidata".

After selecting it, the user would create a normal triple. On saving, the tool would first save it to the Pundit server, then initiate the OAuth procedure to log in to Wikidata and save the information there too.
When creating the triple in Wikidata, it would be important to add the permalink to the annotation, so that when a user follows the link, the annotation is displayed in its context.

And yes, it is very slow....

Micru

Simone Fonda

Jun 8, 2014, 10:13:39 AM

to David Cuenca, Amanpreet Singh, Christian Morbidoni, annotation-tool-gsoc

On Thu, Jun 5, 2014 at 9:52 PM, David Cuenca <dac...@gmail.com> wrote:

> And yes, it is very slow....

Hey guys,

sorry for not being present at all. Things are settling down and in a
week or two I should be able to get back up to speed.

I have some time today and tomorrow to look at the tests Aman did.. but
if I try to load the page
(http://tools.wmflabs.org/wikidata-annotation-tool/WikidataAnnotationFeeder/tests/simple_tests/pundit.html)
nothing loads.

Is there something I'm missing??!

Best,
Simone

Amanpreet Singh

Jun 8, 2014, 11:14:35 AM

to Simone Fonda, David Cuenca, Christian Morbidoni, annotation-tool-gsoc

Sorry Simone,

I am working on the Wikimedia login system, so I removed that route. I will fix it soon.
I hope you get back up to speed soon.
Thanks.

Amanpreet Singh

Jun 8, 2014, 11:47:02 AM

to Simone Fonda, David Cuenca, Christian Morbidoni, annotation-tool-gsoc

Simone,
I fixed it.
Go ahead and try the link.

Sorry for the inconvenience.

Simone Fonda

Jun 9, 2014, 4:41:42 AM

to Amanpreet Singh, David Cuenca, Christian Morbidoni, annotation-tool-gsoc

On Sun, Jun 8, 2014 at 5:47 PM, Amanpreet Singh
<amanpreet...@gmail.com> wrote:

> Simone,
> I fixed it.
> Go ahead on the link

Good stuff Aman!! Congratulations! :)

So pretty much the basics are there:
- predicates get queried along with descriptions; they get shown
correctly in the preview
- same goes for items

What you could focus on:
- it would be cool to have (even a very small) pic for each item
returned, but I don't know if the API (or Wikidata itself!) has them
- raise the limit for queried predicates and items to 20 or 30; the
service looks very fast so there shouldn't be any problem with that
(add a limit=N parameter to the ajax calls)
- it looks a bit weird to me that you can use predicates as
subject/object but.. hey.. welcome to the open world assumption

How is it going with the auth-related stuff you were thinking about?

About what David said, be aware that if we duplicate the information
on the Pundit and Wikidata servers, then you need to pay attention to what
happens on edit and deletion. Keeping the info in sync could become
cumbersome.

About the slowness, I see you are using demo.as.thepund.it; you could
try to switch to demo-cloud.as.thepund.it:8080, a cloud-based
instance (on DigitalOcean) which should be more performant and more
reliable.

Moreover, at the moment the sources are still in their dev version (a
lot of files, a lot of comments, ...). If Aman compiles a bookmarklet,
at least this slowness should be solved. He can also compile all
of the sources into a single file (as the bookmarklet does) and get
better first-load performance in the test pages too.

Best,
Simone

Amanpreet Singh

Jun 11, 2014, 11:49:29 AM

to Simone Fonda, David Cuenca, Christian Morbidoni, annotation-tool-gsoc

On Mon, Jun 9, 2014 at 2:11 PM, Simone Fonda <fo...@netseven.it> wrote:

> What you could focus on:
> - it would be cool to have (even a very small) pic for each item
> returned, but I don't know if the API (or Wikidata itself!) has them

Wikidata itself doesn't provide them, but there is an alternative version of Wikidata called Reasonator which grabs images from various sources. I could do that too, but I don't think it makes sense; instead, we can grab images for a particular item from other selectors, if present.

> - raise the limit for queried predicates and items to 20 or 30; the
> service looks very fast so there shouldn't be any problem with that
> (add a limit=N parameter to the ajax calls)

I have done that :)

> How is it going with the auth-related stuff you were thinking about?

I am working on it currently and I am halfway through it.

> About what David said, be aware that if we duplicate the information
> on the Pundit and Wikidata servers, then you need to pay attention to what
> happens on edit and deletion. Keeping the info in sync could become
> cumbersome.

That's our goal for the later part of GSoC; we will work mostly on the bot then.

> About the slowness, I see you are using demo.as.thepund.it; you could
> try to switch to demo-cloud.as.thepund.it:8080, a cloud-based
> instance (on DigitalOcean) which should be more performant and more
> reliable.

I did this also, thanks.

> Moreover, at the moment the sources are still in their dev version (a
> lot of files, a lot of comments, ...). If Aman compiles a bookmarklet,
> at least this slowness should be solved. He can also compile all
> of the sources into a single file (as the bookmarklet does) and get
> better first-load performance in the test pages too.

I will try this as well.

Thanks

Amanpreet Singh

Jun 11, 2014, 2:39:55 PM

to Simone Fonda, David Cuenca, Christian Morbidoni, annotation-tool-gsoc

The login system is fully ready and I am currently retrieving data from Wikimedia, but for reasons I could not identify despite many attempts, I am unable to get the user's email.
I also don't know what id to assign to the user, since we need to interact with the Pundit server. Would any id work with the Pundit server, or does it generate its own unique one? Wikimedia returns an id to me, so I thought I might use that.

And what should I do about the email? Should I leave it blank?

Amanpreet Singh

Jun 11, 2014, 3:29:29 PM

to Simone Fonda, David Cuenca, Christian Morbidoni, annotation-tool-gsoc

I discussed the problem of not getting the email and real name with Chris Steipp (maintainer of OAuth), and we came to the conclusion that this is a bug in the OAuth system. I have reported it so that it gets addressed:
https://bugzilla.wikimedia.org/show_bug.cgi?id=66493

David Cuenca

Jun 12, 2014, 2:49:22 AM

to Amanpreet Singh, Simone Fonda, Christian Morbidoni, annotation-tool-gsoc

On Wed, Jun 11, 2014 at 5:49 PM, Amanpreet Singh <amanpreet...@gmail.com> wrote:

> Wikidata itself doesn't provide them, but there is an alternative version of Wikidata called Reasonator which grabs images from various sources. I could do that too, but I don't think it makes sense; instead, we can grab images for a particular item from other selectors, if present.

You can also read the value of the property P18 (if available) and retrieve the image from Wikimedia Commons; that is what Reasonator does.

https://www.wikidata.org/wiki/Property:P18

Example:

https://www.wikidata.org/wiki/Q153 -->> Ethanol-3D-vdW.png

https://commons.wikimedia.org/wiki/File:Ethanol-3D-vdW.png

Cheers

Micru
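
[Editor's sketch] The P18 lookup described above can be sketched in Python as follows. It assumes the entity has already been fetched as JSON (for example with the `wbgetentities` API action and `props=claims`) and that the file is served through Commons' `Special:FilePath` redirect; the function names are hypothetical.

```python
from urllib.parse import quote


def image_filename(entity, prop="P18"):
    """Return the first file name stored under P18, or None if there is none."""
    for claim in entity.get("claims", {}).get(prop, []):
        snak = claim.get("mainsnak", {})
        if snak.get("snaktype") == "value":
            return snak["datavalue"]["value"]
    return None


def commons_image_url(filename, width=None):
    """Build a Commons URL for the file; width (pixels) asks for a thumbnail."""
    url = "https://commons.wikimedia.org/wiki/Special:FilePath/" + quote(
        filename.replace(" ", "_")
    )
    if width is not None:
        url += "?width=%d" % width
    return url


# Trimmed-down entity JSON for Q153, containing only what the sketch needs.
ethanol = {
    "claims": {
        "P18": [
            {"mainsnak": {"snaktype": "value",
                          "datavalue": {"value": "Ethanol-3D-vdW.png", "type": "string"}}}
        ]
    }
}
```

Items without a P18 claim simply yield `None`, so the preview can fall back to showing only the label and description.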

Amanpreet Singh

Jun 12, 2014, 5:52:18 PM

to David Cuenca, Simone Fonda, Christian Morbidoni, annotation-tool-gsoc

After many attempts over the past week, I am unable to integrate the Wikimedia API with the Pundit server and Pundit itself: the Pundit server checks at every step whether the user is logged in. So I am going to go with David's idea of creating a separate login option and giving people the option to feed to Wikidata.

Thanks

Christian Morbidoni

Jun 13, 2014, 7:42:25 AM

to Amanpreet Singh, David Cuenca, Simone Fonda, annotation-tool-gsoc

Not sure if you already tried this possibility:

If the annotations that users make are public, then you can maybe use the "open" API of the Pundit server.

It does not require authentication, so you could use it from the Wikidata side to import annotations...

I honestly would prefer this solution... as Simone said, if we duplicate the data at "save annotation" time, sync problems could arise...

David Cuenca

Jun 13, 2014, 8:16:30 AM

to Christian Morbidoni, Amanpreet Singh, Simone Fonda, annotation-tool-gsoc

Before rushing into anything let's take a look into the options:

1) save simultaneously in Pundit+WD: as per Christian and Simone, not desirable because synchronization is hard to manage

2) use Pundit as master server, several options

2.1) with a tool that automatically transfers the annotations from a Pundit user account to a Wikidata user account

2.2) with a user initiated action from Pundit to move the annotations to WD

3) use WD as master server: requires replicating Pundit's data structure with WD properties, and perhaps retrieving them whenever needed (or not)

4) mixed approach: using WD as master server for some annotations and Pundit for others, the user decides

If we discard (1) as too problematic, then (2.1) seems the easiest but it also poses some challenges, like what to do when an annotation is deleted, and how to make the workflow easy.

What about (3) and (4)? Are they even possible? Or would they take too much work?

Amanpreet Singh

Jun 13, 2014, 1:20:11 PM

to David Cuenca, Christian Morbidoni, Simone Fonda, annotation-tool-gsoc

Following the suggestions of David and Christian, and after some thinking, I think we should take the following steps:

1. Completely remove the openid login particularly for this plugin.

2. Give an option of Wikimedia Login to the user.

3. If the user is logged in through Wikimedia, attach something (this is the main point) to the triples of the annotation that identifies the user who made it. I don't know how this can be done; maybe add an extra reference or something similar that identifies the user. Christian, Simone??

4. Create a bot on Wikimedia Tool Labs with a cron job that pulls these annotations from the Pundit server, fixes their structure, retrieves the Wikimedia user's identity (possibly through whatever we attached in step 3), and then feeds these annotations to the corresponding item's page.

5. Add info about who annotated to the above fetched annotations.

If this suits I want to ask:

What to attach to these annotations so as to identify user?

Thanks

Simone Fonda

Jun 15, 2014, 12:24:18 PM

to Amanpreet Singh, David Cuenca, Christian Morbidoni, annotation-tool-gsoc

On Fri, Jun 13, 2014 at 7:20 PM, Amanpreet Singh
<amanpreet...@gmail.com> wrote:

> Following the suggestions of David and Christian, and after some thinking, I
> think we should take the following steps:
> 1. Completely remove the openid login particularly for this plugin.

Mh, if you remove the OpenID login completely, you are going to have
trouble with My Items: the server will not be able to know who
you are, and therefore which items are yours.

Moreover, without authentication you will be subject to merciless
spam. Don't underestimate the power of spammers!

> 2. Give an option of Wikimedia Login to the user.
> 3. If the user is logged in through Wikimedia, attach something (this is the
> main point) to the triples of the annotation that identifies the user who
> made it. I don't know how this can be done; maybe add an extra reference or
> something similar that identifies the user. Christian, Simone??

This kind of metadata is added automatically by the Pundit server,
since it knows who you are. If you remove the auth step, you will need
to augment the POST APIs to include an author parameter, which could
be easily faked anyway. And, again, it can be easily spammed.

> 4. Create a bot on Wikimedia Tool Labs with a cron job that pulls these
> annotations from the Pundit server, fixes their structure, retrieves the
> Wikimedia user's identity (possibly through whatever we attached in step 3),
> and then feeds these annotations to the corresponding item's page.

I can easily write a curl script to DoS this bot: it just posts an
endless amount of crap to your Pundit server instance, which will then
get pulled by the bot.

Are you sure you want to follow this path?

The _absolute_ minimum workaround could be to add server-to-server
communication between the Pundit server and the WD one, just to ask whether
the user is logged in and who they are. Sorry but, as you know, I
never dealt with the Pundit server internals, so I don't know what else
to suggest here.

Simone

Christian Morbidoni

Jun 16, 2014, 2:39:52 AM

to Simone Fonda, annotation-tool-gsoc, David Cuenca, Amanpreet Singh

Dear all,
I finally have the time to read...
I agree with Simone's concerns about removing OpenID. It would also require a change in the Pundit server, thus more work for Aman and the need to then deploy a dedicated server. I cannot estimate it, but it seems to me a lot of extra work...
The fact is that you cannot write to the Pundit server if you are not authenticated (the open APIs are read-only!).
The idea of keeping a triple in the data to connect a Wikidata user to a Pundit annotation (if I understand it correctly) is a little weird...

I would propose:
1. The user logs in to Pundit in the normal way.
2. The user creates their annotations.
3. They then find a "push to Wikidata" button.
4. When clicked, it initiates the OAuth flow and communicates to Wikidata the ID of the user's current notebook (which has to be public).
5. Wikidata can now use the non-authenticated API to fetch the annotations from the Pundit server.
6. Wikidata could do a one-time import, or even poll periodically to sync.
7. Additionally, the user could later decide to "disconnect their notebook" from Wikidata.
What do you think?

P.S. Not sure where the "connect notebook to Wikidata" button should live. It could also make sense inside Ask...
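
[Editor's sketch] The "import once, or poll periodically to sync" step could be reduced to a comparison like the one below. This is only a Python sketch: fetching the notebook through Pundit's open API is left out because its endpoints are not described in this thread, and the annotation shape (a dict with an "id" key) and the name `plan_sync` are assumptions.

```python
def plan_sync(notebook_annotations, imported_ids):
    """Decide what a polling importer has to do on this run.

    notebook_annotations: annotations currently in the public notebook.
    imported_ids: set of annotation ids already pushed to Wikidata.
    Returns (annotations to import, ids whose imports should be reviewed
    because the annotation no longer exists in the notebook).
    """
    current_ids = {a["id"] for a in notebook_annotations}
    to_import = [a for a in notebook_annotations if a["id"] not in imported_ids]
    to_retract = sorted(imported_ids - current_ids)
    return to_import, to_retract
```

Keeping Pundit as the only place where annotations are written means each poll can rebuild this plan from scratch, which avoids the double-write sync problem raised earlier in the thread.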

Amanpreet Singh

Jun 17, 2014, 4:48:28 AM

to Christian Morbidoni, Simone Fonda, annotation-tool-gsoc, David Cuenca

I am happy with what Christian suggested; in fact, it's the best solution available IMO, as it would keep Wikidata in sync with Pundit. I will make sure the "Push to Wikidata" button is placed somewhere easily recognizable. But first I would like to hear David's opinion; as soon as it is available, I will start working on this.

Thanks

David Cuenca

Jun 17, 2014, 4:55:56 AM

to Amanpreet Singh, Christian Morbidoni, Simone Fonda, annotation-tool-gsoc

Sure, it seems a good compromise. And given the existing constraints there are not that many options.

I assume that you would need a bot running in labs that would perform the import on behalf of the user.

Bear in mind, too, that not all annotations can be imported, just the ones that have a Wikidata correspondence (and hopefully source data).

If needed you can propose a new property to link with the pundit annotation in the reference section of the triple in Wikidata.
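
[Editor's sketch] Selecting only the annotations with a Wikidata correspondence might start with a filter like this Python sketch. It assumes triples use Wikidata entity URIs of the form `http://www.wikidata.org/entity/Q…` and `…/P…`, and it only checks subject and predicate; how object values are validated is left open.

```python
import re

ITEM_URI = re.compile(r"^https?://www\.wikidata\.org/entity/Q\d+$")
PROPERTY_URI = re.compile(r"^https?://www\.wikidata\.org/entity/P\d+$")


def importable(triples):
    """Keep triples whose subject is a Wikidata item and predicate a Wikidata property."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if ITEM_URI.match(s) and PROPERTY_URI.match(p)
    ]


triples = [
    ("http://www.wikidata.org/entity/Q153", "http://www.wikidata.org/entity/P18", "Ethanol-3D-vdW.png"),
    ("http://example.org/page#fragment", "http://example.org/comment", "nice paragraph"),
]
```

Only the first triple survives the filter; a comment on a text fragment has no Wikidata counterpart and stays in Pundit.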