Ookaboo now loading "everything"

Paul Houle

Nov 10, 2010, 4:01:10 PM
to ook...@googlegroups.com
Until now, Ookaboo's image loading process has included a
taxonomic filter that has limited our topics to places and people. Last
week we removed this filter, and about 24 hours ago, Ookaboo started
loading topics without restriction. The results are delightfully
heterogeneous, and include topics like

http://ookaboo.com/o/pictures/topic/12243240/Intermodal_container
http://ookaboo.com/o/pictures/topic/12245931/Song_Thrush
http://ookaboo.com/o/pictures/topic/12245887/Eurovision_Song_Contest_1964

Right now we've got about 440,000 images visible, and 260,000
images in the queue, which will keep the image pipeline running for
another month.
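
As a rough back-of-the-envelope reading of those numbers, the sketch below estimates the implied pipeline throughput. It assumes "another month" means about 30 days; that figure and the names used are illustrative assumptions, not measured values.

```python
# Estimate the daily throughput implied by the queue size.
# ASSUMED_DAYS is an assumption: "another month" is taken as 30 days.

QUEUED_IMAGES = 260_000   # images in the queue (from the post)
ASSUMED_DAYS = 30         # assumed length of "another month"


def daily_rate(queued_images: int, days: int) -> float:
    """Return the average number of images processed per day."""
    return queued_images / days


rate = daily_rate(QUEUED_IMAGES, ASSUMED_DAYS)
print(f"roughly {round(rate)} images per day")
```

If the pipeline actually runs faster or slower than this, the queue drains proportionally sooner or later.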

We've been making changes to improve full-text search; most
significantly, we're using unique labels from dbpedia instead of
Freebase labels -- this means a search like

http://ookaboo.com/o/pictures/noindex/index.fulltext?q=manchester&submit=Search

has useful labels. Once we build up more of a taxonomy, however,
we expect to generate titles that are more regular.

At the moment we're thinking about an "ultra-wide taxonomy" which
will help organize topics into large categories such as Person, Place,
Life Form, Organization, and Activity. This will immediately improve
the usability of search and will be a prelude to the development of a
more detailed taxonomy. (We have taxonomic data from dbpedia and
Freebase, but both of these have quality problems and neither one
reflects vernacular understanding.)
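
One way to picture the "ultra-wide taxonomy" is as a lookup from fine-grained type labels to a handful of broad categories. The sketch below is only an illustration: the five category names come from this post, but the fine-grained labels, the mapping, and the function name are hypothetical and do not reflect the actual dbpedia or Freebase type systems.

```python
from typing import Iterable, Optional

# The broad categories named in the post.
BROAD_CATEGORIES = ("Person", "Place", "Life Form", "Organization", "Activity")

# Hypothetical mapping from fine-grained type labels to broad categories.
FINE_TO_BROAD = {
    "musician": "Person",
    "city": "Place",
    "bird": "Life Form",
    "company": "Organization",
    "sport": "Activity",
}


def broad_category(fine_types: Iterable[str]) -> Optional[str]:
    """Return the first broad category matching any fine-grained type.

    Returns None when no type is recognized, so that uncategorized
    topics can still be loaded and classified later.
    """
    for fine_type in fine_types:
        category = FINE_TO_BROAD.get(fine_type.lower())
        if category is not None:
            return category
    return None


print(broad_category(["Bird", "animal"]))
print(broad_category(["shipping container"]))
```

Returning None for unknown types keeps the filter from excluding topics, which matches the decision to load topics without restriction.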

Behind the scenes, we're switching to Freebase mids as our
reference to Freebase, and we're also loading associations from
sameas.org. We'll soon expose these via the Semantic API, making it
possible to search by OpenCyc, UMBEL, YAGO, Geonames and
other identifiers. (Currently, however, recall in
sameas.org looks lower than I would like.)
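
To illustrate what searching by external identifier could look like, here is a minimal in-memory sketch of an index from (scheme, identifier) pairs to topic ids. It is not the Semantic API, whose interface is not described here; the class, method names, and identifier values are all hypothetical placeholders. Only the topic id 12245931 (Song Thrush) comes from the URLs above.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Set, Tuple

Key = Tuple[str, str]  # (scheme, identifier), e.g. ("freebase-mid", "...")


class IdentifierIndex:
    """Hypothetical two-way index between topics and external identifiers."""

    def __init__(self) -> None:
        self._topic_by_key: Dict[Key, int] = {}
        self._keys_by_topic: Dict[int, Set[Key]] = defaultdict(set)

    def add(self, topic_id: int, scheme: str, identifier: str) -> None:
        """Record that an external identifier refers to a topic."""
        key = (scheme, identifier)
        self._topic_by_key[key] = topic_id
        self._keys_by_topic[topic_id].add(key)

    def find_topic(self, scheme: str, identifier: str) -> Optional[int]:
        """Return the topic id for an identifier, or None if unknown."""
        return self._topic_by_key.get((scheme, identifier))

    def identifiers_for(self, topic_id: int) -> List[Key]:
        """Return all known identifiers for a topic, sorted."""
        return sorted(self._keys_by_topic.get(topic_id, set()))


index = IdentifierIndex()
# Placeholder identifier values, not real Freebase or UMBEL identifiers.
index.add(12245931, "freebase-mid", "placeholder-mid")
index.add(12245931, "umbel", "placeholder-umbel-id")

print(index.find_topic("umbel", "placeholder-umbel-id"))
print(index.find_topic("geonames", "not-loaded"))
```

A lookup that returns None corresponds to the recall gap mentioned above: the topic exists, but no association for that scheme has been loaded.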
