Searching Strategies


Rob Keane

Jul 13, 2010, 12:02:29 PM
to Eureka Streams Development
There was some good discussion on different searching strategies today
to handle more advanced queries and the team is still working on
forming a consensus.

Currently the system leverages Lucene and Memcached in different
capacities depending on the scenario. Custom lists are stored in
Memcached, while searches come back from Lucene. The concepts of
Custom Lists and Saved Searches are being merged into Custom Streams
and we'd like to merge these concepts from a technical perspective as
well.

Currently, the Following list represents a special case. When the
Following list is searched, we actually search the everyone stream and
then collide (intersect) the results with the activities contained in
the Following list.

I'm proposing we adopt this method for all searches on lists, where
the search is done against the everyone list and then the results are
collided with the Memcached list of activity. Since this approach
must necessarily be taken for the Following list it requires no
significant amount of extra code, and after performance testing we can
decide if a more tiered approach is required.
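
To make the proposal concrete, here is a minimal sketch of the "search everyone, then collide" step. It is only an illustration: the function name, the use of integer activity IDs, and the sample data are assumptions, and the real Lucene and Memcached calls are replaced by plain Python collections.

```python
from typing import Iterable, List, Set


def collide(search_hits: Iterable[int], list_ids: Set[int]) -> List[int]:
    """Keep only the search hits that also appear in the cached list.

    The order of search_hits is preserved, so whatever sorting the
    search engine applied survives the collision.
    """
    return [activity_id for activity_id in search_hits if activity_id in list_ids]


# Hypothetical data: activity IDs returned by a keyword search over the
# everyone stream (already sorted by the search engine), and the activity
# IDs held in one cached list.
everyone_hits = [42, 17, 8, 99, 3]
cached_list = {3, 8, 15, 42}

matches = collide(everyone_hits, cached_list)
print(matches)
```

Holding the cached list as a set keeps each membership check cheap, so the cost of the collision grows with the number of search hits rather than with the size of the list.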

Steve T

Jul 14, 2010, 7:07:28 AM
to Eureka Streams Development
+1

Brian H. Mayo

Jul 14, 2010, 9:04:05 AM
to Eureka Streams Development
We currently have a performance challenge relating to search. I am
concerned with this approach (as well as the approach for the
following list) considering the implications it has on the system's
ability to scale.

We need to resolve the performance challenge we are facing with search
first to better understand the viability of the following list
approach.

On Jul 13, 12:02 pm, Rob Keane <rob.ke...@gmail.com> wrote:

Steve T

Jul 14, 2010, 9:44:19 AM
to Eureka Streams Development
I retract my +1, this isn't terribly clear. Are you proposing that
custom lists behave the same way as following lists? Or that a new
approach will be taken for all lists including the Following list?

On Jul 13, 12:02 pm, Rob Keane <rob.ke...@gmail.com> wrote:

Rob Keane

Jul 14, 2010, 11:05:11 AM
to Eureka Streams Development
Steve, it's a similar approach, but Following will have some special
handling in memcached since it will allow many more lists to be
searched (all the people you follow) vs the custom stream UI which
will only allow a few.
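
As a rough illustration of why Following is the heavier case, the sketch below assumes the Following list is the union of one activity list per followed person. That reading, along with every name and value in the code, is an assumption for illustration and not the actual memcached layout.

```python
from typing import Dict, Iterable, List, Set


def following_ids(followed: Iterable[str],
                  lists_by_person: Dict[str, List[int]]) -> Set[int]:
    """Union the activity IDs of every followed person into one set."""
    combined: Set[int] = set()
    for person in followed:
        # A person with no cached list contributes nothing.
        combined.update(lists_by_person.get(person, []))
    return combined


# Hypothetical cached lists, keyed by person.
cached = {
    "alice": [1, 4, 9],
    "bob": [4, 7],
    "carol": [2, 9, 11],
}

following = following_ids(["alice", "bob", "carol"], cached)

# Collide the search hits with the combined set, keeping search order.
hits = [11, 5, 4, 1, 30]
print([h for h in hits if h in following])
```

The union step grows with the number of people followed, which is one reason a single precomputed Following list may still be worth keeping.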

Brian, this solution was partially crafted to take load OFF of search,
but admittedly wasn't articulated well. It was created in response to
the idea of appending a ton of conditions to the Lucene query. Instead,
it makes a simple Lucene query, then leverages memcached to do the "hard
work." Blake and I are working on a more formal description of the
consensus we came to, which is a hybrid approach. The hybrid approach
allows us to use memcached when it makes sense to take load off Lucene,
while still leveraging Lucene's strengths in terms of keyword searching
and sorting.

Rob Keane

Jul 14, 2010, 5:56:45 PM
to Eureka Streams Development
For some reason my replies aren't showing up... Are they also
moderated?

On Jul 14, 9:44 am, Steve T <stephen.terle...@lmco.com> wrote:

Rob Keane

Jul 14, 2010, 5:55:00 PM
to Eureka Streams Development
@Brian The intent of this approach is actually to take load off of
Lucene. Rather than bombarding Lucene with tons of parameters, we give
Lucene a comparatively easy query and then collide the results with a
relevant list in memcached. We've run benchmarks and have gotten
sub-10-millisecond results for colliding large sets of data. Blake and I are
preparing some performance metrics.
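
For anyone who wants to try the collision step locally, a timing harness could look something like the sketch below. It is not the benchmark mentioned above: the sizes, the random IDs, and the function name are all assumptions, and it measures only the in-memory collision, not the Lucene query or the memcached round trip.

```python
import random
import time
from typing import Tuple


def time_collision(hit_count: int, list_size: int, id_space: int,
                   seed: int = 0) -> Tuple[int, float]:
    """Collide random search hits with a random list; return (matches, ms)."""
    rng = random.Random(seed)
    # Stand-ins for the search hits and the cached list of activity IDs.
    hits = rng.sample(range(id_space), hit_count)
    cached = set(rng.sample(range(id_space), list_size))

    start = time.perf_counter()
    matches = [activity_id for activity_id in hits if activity_id in cached]
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return len(matches), elapsed_ms


# Hypothetical sizes; adjust to whatever a realistic list looks like.
match_count, elapsed = time_collision(hit_count=10_000, list_size=50_000,
                                      id_space=1_000_000)
print(f"{match_count} matches in {elapsed:.3f} ms")
```

Timings from a harness like this depend heavily on the machine and the data shapes, so they are only a sanity check until the real metrics are available.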

@Steve We are evaluating if Following will remain a special case.
We'll have to run some benchmarks based on the max size of following
with the general approach. It's likely we'll still need to keep a
special list in memcached for Following.

On Jul 14, 9:44 am, Steve T <stephen.terle...@lmco.com> wrote: