Issue 35 in xappy: A cached not applied won't work

xa...@googlecode.com

Mar 4, 2011, 9:30:08 AM
to xappy-...@googlegroups.com
Status: New
Owner: ----

New issue 35 by brunovia...@gmail.com: A cached not applied won't work
http://code.google.com/p/xappy/issues/detail?id=35

If the cache manager is a cache that has not been applied to an index, then
the cached results are not returned. The attached test reproduces the
problem.

The problem seems to be in how cached items get their weights. Xapian has
something called a 'posting source', which supplies a weight for a given
document in a query. If you apply a cache to an index, the cache weights
are stored in a value slot inside the index; if you don't apply it, this
slot won't contain the weights. xappy uses a ValueWeightPostingSource,
which reads from this slot.
Maybe using a Xapian::ValueMapPostingSource would work?
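
To illustrate the failure mode described above, here is a toy model in plain Python. It does not use the Xapian or xappy APIs: `ToyIndex`, `SlotWeightSource`, and `CACHE_SLOT` are hypothetical names invented for this sketch. It only shows why a weight source that reads from a value slot yields nothing when the cache was never applied to the index.

```python
# Toy model only: none of these names exist in Xapian or xappy.
CACHE_SLOT = 0  # hypothetical slot number where applied cache weights live


class ToyIndex:
    """A minimal stand-in for an index: docid -> {slot: value}."""

    def __init__(self, docids):
        self.values = {docid: {} for docid in docids}

    def apply_cache(self, cached_weights):
        # "Applying" the cache copies each cached weight into the slot.
        for docid, weight in cached_weights.items():
            self.values[docid][CACHE_SLOT] = weight


class SlotWeightSource:
    """Mimics a posting source that reads weights from one value slot."""

    def __init__(self, slot):
        self.slot = slot

    def weights(self, index):
        # Only documents that have a value in the slot are returned.
        return {
            docid: slots[self.slot]
            for docid, slots in index.values.items()
            if self.slot in slots
        }


cached = {2: 10.0, 3: 9.0}  # docid -> cached weight for some query
source = SlotWeightSource(CACHE_SLOT)

unapplied = ToyIndex([1, 2, 3])
applied = ToyIndex([1, 2, 3])
applied.apply_cache(cached)

print(source.weights(unapplied))  # {}
print(source.weights(applied))    # {2: 10.0, 3: 9.0}
```

In this model the unapplied index has an empty slot for every document, so the source contributes no weights at all, which matches the symptom reported here.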

Attachments:
test_cache.py 3.4 KB

xa...@googlecode.com

Mar 4, 2011, 10:27:41 AM
to xappy-...@googlegroups.com

Comment #1 on issue 35 by boulton.rj: A cached not applied won't work
http://code.google.com/p/xappy/issues/detail?id=35

I don't think ValueMapPostingSource does what you need (but I'm willing to
be proved wrong by a patch).

Even if a cache isn't applied, it is still stored in a Xapian database, and
it is permissible to use a ValueWeightPostingSource from one database for a
search on another database (as long as the document IDs are compatible).
This worked at one point, so I'm certain that it's possible to make it
work. I'm not sure whether it's a bug in what you're doing, or a bug in
xappy, but I'll try and take a look later today.
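
The "compatible document IDs" condition can be pictured with a small plain-Python sketch. This is not Xapian code: the list and dict below are hypothetical stand-ins for the main index and the separate cache database, and `rank_with_cache` is an invented helper. It only shows that weights kept in another database are usable when both databases use the same docid for the same document.

```python
# Toy model only: plain Python objects stand in for two separate databases.
main_matches = [1, 2, 3, 4]        # docids matched by the base query in the main index
cache_weights = {3: 10.0, 2: 9.0}  # docid -> weight held in the separate cache database


def rank_with_cache(matches, weights):
    """Order matches by cached weight (highest first), uncached ones last."""
    # Looking a docid up in the other database is only meaningful when both
    # databases use the same docid for the same document.
    return sorted(matches, key=lambda docid: -weights.get(docid, 0.0))


print(rank_with_cache(main_matches, cache_weights))  # [3, 2, 1, 4]
```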

xa...@googlecode.com

Mar 7, 2011, 8:33:03 AM
to xappy-...@googlegroups.com

Comment #2 on issue 35 by artem.bo...@gmail.com: A cached not applied won't
work
http://code.google.com/p/xappy/issues/detail?id=35

As I found out, the cache feature for a search index isn't completely
implemented. If we request more items than the cache holds, the cached
results are skipped and a search request is executed. I'm not sure whether
the cached data and the search results are combined at the end; it looks
like only the search results are used. In the code I saw several comments
and FIXMEs about this, so I hope it will be changed/fixed soon :)

Richard, could you explain how to use a ValueWeightPostingSource and an
additional database to combine cached and searched results? Some code or
links would be cool.
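
The behaviour described in the first paragraph of this comment can be summarised as a toy function. This is a sketch of the observed behaviour as reported here, not xappy's actual code; `toy_search`, `run_query`, and the sample docids are all hypothetical.

```python
def toy_search(cached_docids, run_query, start, end):
    """Return docids for ranks [start, end), mimicking the reported behaviour."""
    if end <= len(cached_docids):
        # The cache covers the whole requested range: serve it directly.
        return cached_docids[start:end]
    # Otherwise the cached hits are ignored and only the live query is used.
    return run_query(start, end)


def live_query(start, end):
    # Hypothetical live search results, in their own ranking order.
    return [3, 1, 2, 7, 9][start:end]


cached = [7, 3, 9]

print(toy_search(cached, live_query, 0, 2))  # [7, 3]
print(toy_search(cached, live_query, 0, 5))  # [3, 1, 2, 7, 9]
```

In the second call the cached ordering (7, 3, 9) is lost entirely, which is the "just the search results are used" case.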


xa...@googlecode.com

Mar 10, 2011, 8:59:27 AM
to xappy-...@googlegroups.com

Comment #3 on issue 35 by brunovia...@gmail.com: A cached not applied won't
work
http://code.google.com/p/xappy/issues/detail?id=35

I tried some things, but none of them worked:

1. Make the cache a proper Xapian index, with an empty document for each
docid, then apply the cache to itself, create a ValueWeightPostingSource
query for this db, then '|' it with the base query and search. xappy raised
an error like "Queries are not from the same connection". I hacked
something to ignore this exception, but then the search simply ignored the
cached results.

2. I did the same as above to create a proper Xapian index in the cache and
used search_conn.add_database(cache_conn._index) to do a multi-database
search. The result then showed repeated documents in the answer (I think
this is expected...).

3. I thought about using an ExternalWeightPostingSource, just like we do
when doing search_conn.query_external_weight(source), but it is a slow
method and I gave up before trying.

At this point I think I don't have the Xapian + xappy expertise needed to
fix the issue. Richard, could you give me some guidance on how to use a
ValueWeightPostingSource from one database for a search on another database?
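
On attempt 2 above: as far as I understand Xapian's documented multi-database behaviour, document IDs from the sub-databases are interleaved, so the same local docid in the main index and in the cache index shows up as two different documents in the combined search. The sketch below is plain Python arithmetic, not a Xapian call; `combined_docid` is a hypothetical helper name.

```python
def combined_docid(local_docid, shard, n_shards):
    """Docid seen through a multi-database for a document in one sub-database.

    shard is the 0-based position of the sub-database in the order it was added.
    """
    return (local_docid - 1) * n_shards + shard + 1


# Two sub-databases: 0 = main index, 1 = cache index added afterwards.
for local in (1, 2, 3):
    print(local, combined_docid(local, 0, 2), combined_docid(local, 1, 2))
```

Under this mapping, local docid 1 becomes 1 in the main index and 2 in the cache index, so each cached document appears next to its original rather than merging with it.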
