Question about the format an automatic annotation web service should return


Luka Bradesko

Aug 1, 2014, 7:06:55 PM
to brat-...@googlegroups.com
Hi,

I am unable to find the format that a web service must return so that it can be used for automatic brat annotation.

I have a Named Entity Disambiguation web service and would like to set it up to work with Brat.

Thank you for the help, 
Luka 

Goran Topic

Aug 1, 2014, 11:30:58 PM
to brat-...@googlegroups.com
Hello, Luka.

There are several sample web services included with brat that document
what the service is supposed to look like: tools/*taggerservice.py

Off the top of my head, it should accept the text to be tagged in the
body of an HTTP POST request. It should return an `application/json`
response containing a JSON object whose keys are annotation
names (typically in the format of `T#`, with `#` being a number), and
whose values are objects with three keys:

* `type`, a string that specifies the found type
* `offsets`, a list of text spans, each span being a two-element list
of `[start_offset, end_offset]`
* `texts`, a list of the original text's substrings matching the offsets.

For example, a POST request might have `to boldly go where no man has
gone before` in its body. A valid response might be:

{
  "T1": {
    "type": "Infinitive",
    "offsets": [[0, 2], [10, 12]],
    "texts": ["to", "go"]
  }
}
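
The response shape described above can be sanity-checked before pointing brat at a service. The following is only a sketch: `validate_tagger_response` is a hypothetical helper name, not part of brat, and it assumes the offsets are character indices into the posted text, as in the example.

```python
def validate_tagger_response(text, response):
    """Return a list of problems; an empty list means the response looks well-formed.

    Hypothetical helper, not part of brat. Assumes offsets are character
    indices into `text` and that each offset pair has a matching entry in `texts`.
    """
    problems = []
    for ann_id, ann in response.items():
        if not isinstance(ann, dict):
            problems.append(f"{ann_id}: value is not an object")
            continue
        missing = {"type", "offsets", "texts"} - set(ann)
        if missing:
            problems.append(f"{ann_id}: missing keys {sorted(missing)}")
            continue
        if len(ann["offsets"]) != len(ann["texts"]):
            problems.append(f"{ann_id}: offsets and texts differ in length")
            continue
        # Each [start, end] span must slice out exactly the reported substring.
        for (start, end), expected in zip(ann["offsets"], ann["texts"]):
            if text[start:end] != expected:
                problems.append(
                    f"{ann_id}: text[{start}:{end}] is {text[start:end]!r}, "
                    f"not {expected!r}"
                )
    return problems


text = "to boldly go where no man has gone before"
response = {
    "T1": {
        "type": "Infinitive",
        "offsets": [[0, 2], [10, 12]],
        "texts": ["to", "go"],
    }
}
print(validate_tagger_response(text, response))  # prints []
```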

Please note that there is an error in the current version of
nersuitetaggerservice.py: it incorrectly accepts GET instead of POST.
Someone might want to fix that (**nudges Pontus and Sampo**)
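
For orientation, here is a minimal sketch of a service that takes the text from the POST body and answers with `application/json`, written with Python's standard library only. It is not one of the bundled tools/*taggerservice.py scripts: `tag_text`, `TaggerHandler`, `serve`, the port number, and the toy "Capitalized" type are all made up for illustration, and the offsets are assumed to be character indices.

```python
import json
import re
from http.server import BaseHTTPRequestHandler, HTTPServer


def tag_text(text):
    """Toy tagger (illustrative only): marks every capitalized word."""
    annotations = {}
    for i, match in enumerate(re.finditer(r"\b[A-Z][a-z]+\b", text), start=1):
        annotations[f"T{i}"] = {
            "type": "Capitalized",  # hypothetical type name
            "offsets": [[match.start(), match.end()]],
            "texts": [match.group()],
        }
    return annotations


class TaggerHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # The text to tag arrives as the raw request body.
        length = int(self.headers.get("Content-Length", 0))
        text = self.rfile.read(length).decode("utf-8")
        body = json.dumps(tag_text(text)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=47111):
    """Run the service until interrupted; the port is an arbitrary choice."""
    HTTPServer(("localhost", port), TaggerHandler).serve_forever()
```

Calling `serve()` starts the server. To plug in a real tagger, only `tag_text` would need to change.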

Goran

Luka Bradesko

Sep 15, 2014, 7:58:46 AM
to brat-...@googlegroups.com, ama...@mad.scientist.com
Thank you. This will help. I am almost there, but in the meantime I found some strange behaviour with norm_db_init.py. I will start another question regarding that.

L

Luka Bradesko

Sep 25, 2014, 11:01:08 AM
to brat-...@googlegroups.com, ama...@mad.scientist.com
I have the service implemented now. At first there were some glitches with my JSON, which were reported to me as an error string. After I fixed the JSON response, I started to get this:
"Unknown Error Server Crash", The server encountered a serious error, please contact the administrators at ____ and give the id #1411656370

How do I debug that? Where can I get something regarding the given ID?

Thanks for help.

Luka B

Goran Topic

Sep 25, 2014, 11:40:11 AM
to brat-...@googlegroups.com
Hi, Luka.

The "ID" is just a timestamp that admins can use to match up the time
of the request with the times logged in `/var/log/apache2/error_log`.
That's where the real error should be.
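
To line the ID up with log entries, it helps to turn it into a readable time. This sketch assumes the ID is a Unix timestamp in seconds, which is a guess based on its magnitude. `error_id_to_utc` is a made-up helper name. The result is in UTC, while the Apache log normally uses the server's local time zone, so it may need converting.

```python
from datetime import datetime, timezone


def error_id_to_utc(error_id):
    """Interpret a brat error ID as Unix epoch seconds (an assumption)."""
    return datetime.fromtimestamp(int(error_id), tz=timezone.utc)


print(error_id_to_utc(1411656370).isoformat())  # 2014-09-25T14:46:10+00:00
```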

Goran

Luka Bradesko

Sep 25, 2014, 4:18:49 PM
to brat-...@googlegroups.com, ama...@mad.scientist.com
Hi, I am almost there. I managed to fix the first error, but now I get this:
AssertionError: Tagger response has multiple offsets (discontinuous spans not supported), 

The problem is that I have some spans that start on one line and end on another. This works if I generate the .ann file directly. Does this message mean that it is not possible through the service? Your example above has multiple offsets as well.

L

Goran Topic

Sep 25, 2014, 11:01:28 PM
to brat-...@googlegroups.com
On Fri, Sep 26, 2014 at 5:18 AM, Luka Bradesko <luka.b...@gmail.com> wrote:
> AssertionError: Tagger response has multiple offsets (discontinuous spans
> not supported),

That error is no longer possible since commit bbd695b (thanks, Joern!). So
try pulling the newest brat from GitHub and give it another go.

Goran

Luka Bradesko

Jun 26, 2015, 11:00:02 AM
to brat-...@googlegroups.com, ama...@mad.scientist.com
Hi, after a while I am returning to this. The tagging works fine now (the update helped), but I am not sure how to continue.
All the tagger examples seem to return this:

"T1": { 
"type": "Infinitive", 
"offsets": [[0, 2], [10, 12]], 
"texts": ["to", "go"] 
} ,
...

but my service also disambiguates the tagged texts, in the same way that normalization does manually. Is it possible to return "N1 Reference T1 Wikipedia:534366 Barack Obama" as well as the T1 JSON? If yes, what is the syntax?
Something along these lines:

"T1": { 
"type": "Infinitive", 
"offsets": [[0, 2], [10, 12]], 
"texts": ["to", "go"] 
} ,
"N1": { 
"ref": "T1",
"type": "Wikipedia", 
"id": 534366, 
"name": "Barack Obama"
} ,
...

??

I was thinking that for this, a separate "disambiguator" service is needed. But I am unaware of any examples.

I really appreciate your help.

Thanks