elasticsearch vs solr : indexing speed
massi  
From: massi <mehdi.a...@gmail.com>
Date: Sat, 16 Apr 2011 18:56:21 -0700 (PDT)
Local: Sat, Apr 16 2011 9:56 pm
Subject: elasticsearch vs solr : indexing speed
Hi Guys,

What do you think of this article:
http://dmurphy747.wordpress.com/2011/04/02/solr-vs-elasticsearch-deat...
where elasticsearch and solr are compared with regard to the indexing
speed?

A quote from the article: "I ran each test 4 times, killing the JVM
and removing the data directory for both Solr and elasticsearch. The
final averaged results expressed as throughputs were 43204 docs/sec
for Solr, 44052 docs/sec for Solr direct streaming, and 9823 docs/sec
for elasticsearch."

PS: Don't get me wrong, I know that it is only one (partial) test,
and that some features in elasticsearch make it unique!


 
Clinton Gormley  
From: Clinton Gormley <clin...@iannounce.co.uk>
Date: Sun, 17 Apr 2011 13:40:41 +0200
Local: Sun, Apr 17 2011 7:40 am
Subject: Re: elasticsearch vs solr : indexing speed
Hiya

> What do you think of this article:
> http://dmurphy747.wordpress.com/2011/04/02/solr-vs-elasticsearch-deat...
> where elasticsearch and solr are compared with regard to the indexing
> speed?

I've posted a reply (currently awaiting moderation), but his benchmark is
severely flawed.  E.g., he wasn't actually indexing what he thought he was
indexing.

With a few simple changes, I got much better performance out of ES than
he was getting.

On a side note, it seems refresh_interval is not being respected in
0.15.2, which would also decrease raw indexing speed.

clint


 
K.B.  
From: "K.B." <korbinian.ba...@googlemail.com>
Date: Sun, 17 Apr 2011 12:08:09 -0700 (PDT)
Local: Sun, Apr 17 2011 3:08 pm
Subject: Re: elasticsearch vs solr : indexing speed
If you look at:

{"add":{"doc":{ "id":"1582039702", "field1_s":"1184645701" }} in the case
of Solr, compared to
{"index": {"_index":"test", "_type":"type1", "_id":"1582039702",
"field1":"1184645701" }} for ES,

he can't be serious. It's also unclear how the fields were treated
and configured, as no config options were stated.
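
For reference, here is a minimal sketch of how a bulk body is normally laid out: an action line carrying only metadata, followed by a separate source line carrying the document's fields. The thread does not spell out exactly what was wrong with the benchmark, so reading the ES line quoted above as "field1 placed in the action line" is an assumption. The helper name bulk_body is hypothetical; the index, type, id, and field values are the ones quoted above.

```python
import json

def bulk_body(docs, index="test", doc_type="type1"):
    """Build a newline-delimited bulk body: one action line, then one source line, per doc."""
    lines = []
    for doc_id, source in docs:
        # Action line: routing metadata only, no document fields.
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type, "_id": doc_id}}))
        # Source line: the fields that actually get indexed.
        lines.append(json.dumps(source))
    # A bulk body ends with a trailing newline.
    return "\n".join(lines) + "\n"

body = bulk_body([("1582039702", {"field1": "1184645701"})])
print(body)
```

With this layout, field1 travels in the second line of each pair rather than alongside _index, _type, and _id.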

From my own ES usage I know ES can index 1500 docs, each containing 45
fields (some very long language ones with up to 10,000 chars), in under
0.6 seconds. So if I just think about 2 fields here and take 1500 * 45
fields at 0.6 secs, I would expect that ES can take at least about
57,000 of his 2-field demo docs per second without any problems.
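
The back-of-envelope estimate above can be written out as follows. This is only a sketch of the arithmetic: it assumes indexing cost scales linearly with the number of fields and ignores field length and per-document overhead. The variable names are illustrative.

```python
# Figures quoted in the post above.
docs = 1500
fields_per_doc = 45
seconds = 0.6

# Field throughput observed in the poster's own setup.
fields_per_sec = docs * fields_per_doc / seconds

# Scale down to the benchmark's 2-field documents.
demo_fields_per_doc = 2
demo_docs_per_sec = fields_per_sec / demo_fields_per_doc

print(f"{fields_per_sec:,.0f} fields/sec")
print(f"{demo_docs_per_sec:,.0f} docs/sec for 2-field docs")
```

Under these assumptions the estimate comes out a little above 56,000 docs/sec, in line with the rough figure given in the post.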

Otis  
From: Otis <otis.gospodne...@gmail.com>
Date: Sun, 17 Apr 2011 19:56:26 -0700 (PDT)
Local: Sun, Apr 17 2011 10:56 pm
Subject: Re: elasticsearch vs solr : indexing speed
Hi,

I wouldn't pay much attention to that post/benchmark.  A good
benchmark needs to publish a lot more details than the above, starting
with basic stuff like -Xmx.  I'm also of the opinion that if you are
going to publish a benchmark comparing 2 pieces of software then you
had better invite experts from both sides and let them tune and optimize
things.

Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/

Shay Banon  
From: Shay Banon <shay.ba...@elasticsearch.com>
Date: Mon, 18 Apr 2011 07:09:15 +0300
Local: Mon, Apr 18 2011 12:09 am
Subject: Re: elasticsearch vs solr : indexing speed

Heya,

Here is Clinton's answer: https://gist.github.com/0382ed3913f0c3e40d62, and I'd like to add to that:

1. In order to completely compare the two in terms of indexing overhead, at least for this very simple doc, the _source and _all fields need to be disabled.
2. The type used for field1 in Solr corresponds, in ES terms, to index set to not_analyzed and omit_norms set to true. The ES field should be configured the same way.
3. Again, ES will index two additional fields, _id and _type. To really compare, they should have index set to no. When doing so, the only thing one loses is the ability to query them at search time (this is in master).

I posted a sample as a comment on Clinton's post.
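
The sample mentioned here is not reproduced in this thread. As a rough illustration only, the three points could translate into a mapping like the one below, written as a Python dict. The exact key layout is an assumption based on the setting names given above and on 2011-era mapping syntax; several of these settings were changed or removed in later elasticsearch versions, so check the documentation for the version in use.

```python
import json

# Hypothetical mapping for the benchmark's "type1" documents (assumed syntax).
mapping = {
    "type1": {
        "_source": {"enabled": False},  # point 1: do not store the original JSON
        "_all": {"enabled": False},     # point 1: no catch-all field
        "_id": {"index": "no"},         # point 3: not searchable by _id
        "_type": {"index": "no"},       # point 3: not searchable by _type
        "properties": {
            "field1": {
                "type": "string",
                "index": "not_analyzed",  # point 2: index the value as a single term
                "omit_norms": True,       # point 2: skip length normalization data
            }
        },
    }
}

print(json.dumps(mapping, indent=2))
```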

Some more aspects of how ES works differently from Solr:

1. When data is indexed, it's there. If you "kill -9" ES (even with a single server) and start it back up, all data indexed up until that point will be there with the local gateway (this is not done by committing Lucene on each change, as that would not scale). Solr, on the other hand, will lose all changes since the last commit. This does come with a (small) overhead.
2. The bulk API format for elasticsearch is optimized for distributed execution, where it needs to be sliced and diced in order to route the bulk items to the correct shards. This does come with an overhead compared to a single big JSON document that is parsed and processed in a single-shard scenario, but it proves crucial when working with several shards.
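
To make the "sliced and diced" point concrete, here is a toy sketch of splitting a bulk request into per-shard groups. It is not elasticsearch's implementation: the functions shard_for and split_bulk are hypothetical, and zlib.crc32 is only a stand-in for whatever hash the server actually uses to route documents.

```python
import zlib
from collections import defaultdict

def shard_for(doc_id, num_shards):
    """Pick a shard for a document id (stand-in hash, not the real routing function)."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards

def split_bulk(items, num_shards):
    """Group (doc_id, source) pairs by target shard so each shard gets its own sub-request."""
    per_shard = defaultdict(list)
    for doc_id, source in items:
        per_shard[shard_for(doc_id, num_shards)].append((doc_id, source))
    return dict(per_shard)

items = [(str(i), {"field1": str(i * 7)}) for i in range(10)]
groups = split_bulk(items, num_shards=3)
for shard, group in sorted(groups.items()):
    print(shard, [doc_id for doc_id, _ in group])
```

Because each item carries its own metadata, the coordinating node can decide where an item belongs without parsing the whole request as one large document.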

-shay.banon

medcl2...@gmail.com  
From: <medcl2...@gmail.com>
Date: Mon, 18 Apr 2011 13:19:50 +0800
Local: Mon, Apr 18 2011 1:19 am
Subject: Re: elasticsearch vs solr : indexing speed

great~


 