Distributed UIMA as agent coordinators?


Jack Park

Apr 24, 2015, 7:32:32 PM
to qa-...@googlegroups.com
I'd like to open a conversation about distributed UIMA, e.g., splitting multiple processes across different servers in a cluster.

Any experience with that? Projects to study?

Many thanks in advance.

Jack

Xuchen Yao

Apr 24, 2015, 8:29:39 PM
to Jack Park, qa-...@googlegroups.com
Just my two cents: don't use UIMA as a distributed computing framework. So far the only UIMA success stories I've seen come from Watson, which had a dedicated IBM team (who invented UIMA!) for deployment. Few people use distributed UIMA outside IBM, so you would be quite alone there.

If I may add more: I've had great success using akka.io (lightweight, concurrent, asynchronous computing based on actors) and spray.io (web frontend). My own in-house QA engine for answering Jeopardy-style questions with web searches runs in real time on my laptop, and it can be scaled across multiple machines very easily using Akka actors. Originally I used Akka just for distributed computing, with no requirement for real-time response (the textbook answer for distributed real-time computing is Apache Storm). However, Akka's response time was satisfactory, so I stayed with it. My experience matches this SO thread quite well.

Akka supports both Scala and Java. My own project is done in Scala.

Xuchen


Jack Park

Apr 25, 2015, 2:26:53 PM
to Xuchen Yao, qa-...@googlegroups.com
Xuchen,
Thank you for that response. I've worked with Scala (Lift) on websites, and will study your thoughts further.

Meanwhile, if [1] represents your work, please say more about your project. Code appears at [2] and apparently [3]. Any plans to rescue the Google Code repository by moving it to GitHub?

Jack

Petr Baudis

Apr 26, 2015, 4:55:06 AM
to Xuchen Yao, Jack Park, qa-...@googlegroups.com
Hi!

On Fri, Apr 24, 2015 at 05:29:19PM -0700, Xuchen Yao wrote:
> My own in-house QA engine for answering Jeopardy style questions
> with web searches runs in real time on my laptop.

Does that include full-text search and parsing time? Or did you
pre-process some large corpus, or does the real-time guarantee not
cover unseen questions?

Thanks,

Petr Baudis

Xuchen Yao

Apr 26, 2015, 11:37:59 PM
to Jack Park, qa-...@googlegroups.com
I stopped maintaining the jacana question answering code after graduation. However, I completely rewrote it with the akka/spray framework for my startup. I'm inclined not to open-source it, as a QA engine is just too complicated to run as an open-source project.

Maybe some graduate students at my school will be willing to continue developing jacana. That's the rescue plan so far.

Xuchen Yao

Apr 26, 2015, 11:42:23 PM
to Petr Baudis, Jack Park, qa-...@googlegroups.com
I use Google/Bing web search and parse the front page (i.e., snippets from 10 links). My laptop has 4 cores. I use 1 core for question analysis and the other three to parse the 10 snippets and extract answers. Usually I can get an answer within 1 or 2 seconds, including network time.

Note that using Google/Bing is fine for production, but it's bad science in the sense that it doesn't provide a static corpus for comparison. Thus I think your choice of searching over a static wiki corpus is the way to go.

Xuchen
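
As a rough illustration of the core split described above (one core for question analysis, three for the ten snippets), here is a minimal Java sketch that uses only java.util.concurrent rather than Akka actors. It is not the code from the engine discussed in this thread: `SnippetQaSketch`, `analyzeQuestion`, `scoreSnippet`, and `bestSnippet` are hypothetical names, the keyword-overlap scoring is only a placeholder for real parsing and answer extraction, and the snippets are passed in directly instead of being fetched from a search engine.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

class SnippetQaSketch {

    // Hypothetical question analysis: keep lowercase words longer than 3 characters.
    static Set<String> analyzeQuestion(String question) {
        return Arrays.stream(question.toLowerCase().split("\\W+"))
                .filter(w -> w.length() > 3)
                .collect(Collectors.toSet());
    }

    // Hypothetical snippet scoring: count how many keywords occur in the snippet.
    static int scoreSnippet(String snippet, Set<String> keywords) {
        String lower = snippet.toLowerCase();
        return (int) keywords.stream().filter(lower::contains).count();
    }

    // Returns the highest-scoring snippet (the first one wins on ties).
    static String bestSnippet(String question, List<String> snippets) {
        if (snippets.isEmpty()) {
            throw new IllegalArgumentException("no snippets to score");
        }
        // One thread for question analysis, three for snippet processing,
        // mirroring the 1 + 3 core split on a 4-core laptop.
        ExecutorService analysisPool = Executors.newSingleThreadExecutor();
        ExecutorService snippetPool = Executors.newFixedThreadPool(3);
        try {
            CompletableFuture<Set<String>> keywords =
                    CompletableFuture.supplyAsync(() -> analyzeQuestion(question), analysisPool);

            // Each snippet is scored on the snippet pool once the keywords are ready.
            List<CompletableFuture<Integer>> scores = snippets.stream()
                    .map(s -> keywords.thenApplyAsync(k -> scoreSnippet(s, k), snippetPool))
                    .collect(Collectors.toList());

            int best = 0;
            for (int i = 1; i < scores.size(); i++) {
                if (scores.get(i).join() > scores.get(best).join()) {
                    best = i;
                }
            }
            return snippets.get(best);
        } finally {
            analysisPool.shutdown();
            snippetPool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> snippets = Arrays.asList(
                "Paris is the capital of France.",
                "Berlin is in Germany.",
                "France borders Spain.");
        System.out.println(bestSnippet("What is the capital of France?", snippets));
    }
}
```

With actors, the two thread pools would roughly correspond to one analysis actor and a small pool of snippet workers exchanging messages, which is what makes it possible to move the workers onto other machines later.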
 

