IndexedDB and performance, strange on Chrome


Rodrigo Reyes

Sep 30, 2012, 9:56:29 AM
to chromiu...@chromium.org
Hi,
I'm trying to benchmark the offline storage engines on several browsers, and I'm observing some strange results when testing with Chrome 21. Basically, I'm testing a few simple cases:
- injecting data: one object injected per transaction
- bulk-injecting data: 100 objects injected per transaction
- looking up objects.
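
For clarity, the difference between the two injection cases is only the batching; a rough sketch (chunk is an illustrative helper name, not the benchmark's actual code):

```javascript
// Illustrative sketch (not the actual benchmark code): the difference between
// the two injection modes is just how the dataset is split into transactions.
function chunk(items, batchSize) {
    var batches = [];
    for (var i = 0; i < items.length; i += batchSize) {
        batches.push(items.slice(i, i + batchSize));
    }
    return batches;
}

// Inject: batchSize = 1, so one transaction per object.
// InjectBulk: batchSize = 100, so one transaction per 100 objects.
```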

What I'm observing, on the few computers I own:
- on Linux, bulk injection (i.e., using one transaction to inject multiple objects) is no better than using a separate transaction for each injected object. This is the strangest result, because I absolutely didn't expect it.
- On Windows, bulk injection seems fine, but lookup is super slow (it's slow on Linux too, but not that slow).
When running the tests on Firefox, the numbers seem fine and consistent. I'm not giving numbers here, as they have no meaning outside the tested computer (they're just good enough to observe performance differences between browsers on a single computer).

I'm really puzzled by this. I checked my code twice, but I may be doing things wrong with respect to transaction handling (I'm using setTimeout to yield).
Are you guys observing the same strange behavior? Or is there something obviously wrong in my code?

The benchmark test can be run on a github page: http://reyesr.github.com/html5-storage-benchmark/

Any advice: welcome!

thanks
Rodrigo


Joshua Bell

Oct 1, 2012, 12:46:30 PM
to Rodrigo Reyes, chromiu...@chromium.org
On Sun, Sep 30, 2012 at 6:56 AM, Rodrigo Reyes <reye...@gmail.com> wrote:
Hi,
I'm trying to benchmark the offline storage engines on several browsers, and I'm observing some strange results when testing with Chrome 21. Basically, I'm testing a few simple cases:
- injecting data: one object injected per transaction
- bulk-injecting data: 100 objects injected per transaction
- looking up objects.

What I'm observing, on the few computers I own:
- on Linux, bulk injection (i.e., using one transaction to inject multiple objects) is no better than using a separate transaction for each injected object. This is the strangest result, because I absolutely didn't expect it.
- On Windows, bulk injection seems fine, but lookup is super slow (it's slow on Linux too, but not that slow).

Nitty gritty details: in Chromium, the bulk of the time for IndexedDB is spent doing IPC. You can see this by preparing your test, opening chrome://tracing in a second tab and starting recording, starting your test, then when complete switching back to the tracing tab to stop recording. When the analysis is complete you can see the IndexedDB calls, which occur in the renderer thread and the browser's WebKit thread.

You should give Beta, Dev, and Canary versions a try, as we're starting to do more work on performance in these scenarios. Some improvements have already landed (including completely eliminating one of those slow IPC hops), and more are in progress.
 
I'm really puzzled by this. I checked my code twice, but I may be doing things wrong with respect to transaction handling (I'm using setTimeout to yield).
Are you guys observing the same strange behavior? Or is there something obviously wrong in my code?

I don't see anything obviously wrong in your code.
 
The benchmark test can be run on a github page: http://reyesr.github.com/html5-storage-benchmark/

Cool, thanks for sharing! 

I don't have a hypothesis as to why lookups on Windows are particularly slow for you, but will look into it. Please do give Beta/Dev/Canary a try and see if those have better performance characteristics for you.

Alec Flett

Oct 1, 2012, 1:48:30 PM
to Rodrigo Reyes, chromiu...@chromium.org
On Sun, Sep 30, 2012 at 6:56 AM, Rodrigo Reyes <reye...@gmail.com> wrote:

I'm really puzzled by this. I checked my code twice, but I may be doing things wrong with respect to transaction handling (I'm using setTimeout to yield).
Are you guys observing the same strange behavior? Or is there something obviously wrong in my code?

The benchmark test can be run on a github page: http://reyesr.github.com/html5-storage-benchmark/


I was entirely not surprised by your finding, as I assumed (like Josh) that so many other things dominate that transactions might not make a noticeable difference. Then I played with the benchmark. One thing that might be helpful is to show the rate (i.e., ops/second) or per-op time, so you can compare apples-to-apples.

I'm a little confused because for example when using Chrome Canary on my machine, I see 128ms for Inject-L (implying 2.5ms per put) and 939ms for InjectBulk-L (implying 0.4ms per put) - am I reading the numbers wrong?  My interpretation (again, on Canary) implies that the "bulk" operations are 5x faster. 
 
It's entirely possible that we managed to speed things up that much for this benchmark in Canary, but to be honest I'd be surprised if we sped it up by 5x!

Alec

Rodrigo Reyes

Oct 1, 2012, 5:13:28 PM
to chromiu...@chromium.org, Rodrigo Reyes
Hi Joshua,


Nitty gritty details: in Chromium, the bulk of the time for IndexedDB is spent doing IPC. You can see this by preparing your test, opening chrome://tracing in a second tab and starting recording, starting your test, then when complete switching back to the tracing tab to stop recording. When the analysis is complete you can see the IndexedDB calls, which occur in the renderer thread and the browser's WebKit thread.

Wow, thanks for this chrome://tracing tip, I had no idea this existed!
 
I don't have a hypothesis as to why lookups on Windows are particularly slow for you, but will look into it. Please do give Beta/Dev/Canary a try and see if those have better performance characteristics for you.

I ran another test on another Windows computer, and got totally different, much more normal results. I have no idea why the lookups are so slow on the first Windows computer I tested the benchmark on. It's a computer my son uses to play video games, so it'll be hard to negotiate a re-install on it :), but I suspect it's a machine-specific issue (although Firefox gives normal performance, including for lookups, on this very same computer). I'll run more tests when I can get my hands on more Windows machines.

Rodrigo

Rodrigo Reyes

Oct 1, 2012, 6:25:43 PM
to chromiu...@chromium.org, Alec Flett
Hi Alec,

I was entirely not surprised by your finding, as I assumed (like Josh) that so many other things dominate that transactions might not make a noticeable difference. Then I played with the benchmark. One thing that might be helpful is to show the rate (i.e., ops/second) or per-op time, so you can compare apples-to-apples.

You are totally right, I should have normalized the results to a rate rather than trying to figure things out from raw numbers. I updated the code to report operations per millisecond (op/ms), and it makes much more sense (IndexedDB bulk injection provides less of a performance increase than WebSQL, but the base performance of a single inject is better; I think that's what puzzled me). I updated the page at http://reyesr.github.com/html5-storage-benchmark/ too.
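
The normalization is just dividing the operation count by the elapsed time; a one-liner (opsPerMs is my name for it, not from the benchmark):

```javascript
// Hypothetical helper: normalize a raw benchmark duration into a rate so that
// runs with different element counts can be compared apples-to-apples.
function opsPerMs(operationCount, elapsedMs) {
    return operationCount / elapsedMs;
}

// e.g. 2000 puts in 939 ms and 500 puts in 235 ms are both ~2.1 op/ms.
```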
 
I'm a little confused because for example when using Chrome Canary on my machine, I see 128ms for Inject-L (implying 2.5ms per put) and 939ms for InjectBulk-L (implying 0.4ms per put) - am I reading the numbers wrong?  My interpretation (again, on Canary) implies that the "bulk" operations are 5x faster. 

Yes, that's what I get too: bulk injection, with 100 puts per transaction (instead of 1 put per transaction for Inject-L), makes the performance ~5x faster. Note that this is not as much as the performance boost that bulk injection provides with WebSQL (which is more like 10x or 12x).
 
It's entirely possible that we managed to speed things up that much for this benchmark in Canary, but to be honest I'd be surprised if we sped it up by 5x!

Actually that's just the boost from not creating a new transaction for each put, so that's not really unexpected AFAIK. However, I ran some quick tests comparing Chrome 22, Chrome 24 (Canary), and Firefox on the same Windows laptop; this is roughly what I get:
Chrome 22:
- Inject-L: ~0.10 op/ms to 0.20 op/ms
- InjectBulk-L: around 2.1 op/ms

Chrome 24 (Canary)
- Inject-L: 0.10 to 0.20 op/ms
- InjectBulk-L: around 1.3 op/ms

From this very quick test (I only ran the benchmark 6 times, alternating between Chrome 22 and Canary and waiting a few seconds between each run) it seems there's a performance drop for bulk injection on Windows. But that should probably be confirmed with tests on more computers.

For information, on Firefox 15, on the same Windows platform, I get:
- Inject-L: 0.04 op/ms
- InjectBulk: 1.65 op/ms

Hope it helps,
Rodrigo

Parashuram Narasimhan

Oct 1, 2012, 7:23:00 PM
to chromiu...@chromium.org
Just sent a pull request for this to record the benchmark test results in Browser Scope - this way, we may have better numbers.

On a side note, is there an IndexedDB performance benchmark test suite? If not, I think starting one would be interesting.

Alec Flett

Oct 1, 2012, 7:31:39 PM
to Rodrigo Reyes, chromiu...@chromium.org
On Mon, Oct 1, 2012 at 3:25 PM, Rodrigo Reyes <reye...@gmail.com> wrote:
Yes, that's what I get too: bulk injection, with 100 puts per transaction (instead of 1 put per transaction for Inject-L), makes the performance ~5x faster. Note that this is not as much as the performance boost that bulk injection provides with WebSQL (which is more like 10x or 12x).

[FYI: below I am referring to "non-transactional requests" - what I mean is the pattern of having one or two requests per transaction, but lots of transactions, as opposed to a single transaction with lots of requests]

I don't have a feel for how much transactional vs non-transactional performance improvement should be, but our goal is to (ultimately) be significantly faster than WebSQL for most operations, regardless of transaction granularity. 

I think it's worth pointing out that the purpose of transactions is not performance, but correctness.
We want to make sure that all operations in a given transaction are consistent with each other (meaning they all commit, or none of them do) and with other transactions (meaning that two "simultaneous" transactions do not have overlapping writes: all of one transaction's writes land, and then the other's). Whatever the performance characteristics are, they just fall out of these goals.
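
A toy illustration of that all-or-nothing guarantee (this is just the semantics, not how IndexedDB is implemented; the helper name is mine):

```javascript
// Toy model: stage a batch of writes against a copy of the store, and only
// make them visible if every write succeeds. Either all commit, or none do.
function commitAtomically(store, writes) {
    var staged = Object.assign({}, store); // work on a private copy
    try {
        writes.forEach(function (w) {
            if (typeof w.key !== "string") throw new Error("bad key");
            staged[w.key] = w.value;
        });
    } catch (e) {
        return false; // abort: the original store is untouched
    }
    Object.assign(store, staged); // commit: all writes become visible together
    return true;
}
```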

 
It's entirely possible that we managed to speed things up that much for this benchmark in Canary, but to be honest I'd be surprised if we sped it up by 5x!

Actually that's just the boost from not creating a new transaction for each put, so that's not really unexpected AFAIK. However, I ran some quick tests comparing Chrome 22, Chrome 24 (Canary), and Firefox on the same Windows laptop; this is roughly what I get:
Chrome 22:
- Inject-L: ~0.10 op/ms to 0.20 op/ms
- InjectBulk-L: around 2.1 op/ms

Chrome 24 (Canary)
- Inject-L: 0.10 to 0.20 op/ms
- InjectBulk-L: around 1.3 op/ms

From this very quick test (I only ran the benchmark 6 times, alternating between Chrome 22 and Canary and waiting a few seconds between each run) it seems there's a performance drop for bulk injection on Windows. But that should probably be confirmed with tests on more computers.

Yeah, that's disappointing. Hopefully some of our upcoming changes will improve this.

Alec

Michael Nordman

Oct 1, 2012, 8:03:18 PM
to Alec Flett, Rodrigo Reyes, Chromium HTML5
Very nice apples-to-apples comparison!

My Windows system gets the slow IDB lookup behavior with m22. I haven't run it with a local build yet.

I was surprised that the batch mode didn't help more in the websql case, but the numbers made more sense after noticing the batch size was only 100. I bumped it up to 2000 to further reduce the transaction overhead.

Also, while looking in websql-storage.js, I noticed that in some cases test fixture completion was based on statement completion instead of transaction completion callbacks. That made me question the validity of the numbers, because the last statement callback is invoked before its containing transaction is committed. So I shuffled some completion callbacks around to be based on xaction callbacks.

    function bulk(self, keys, values, offset, batchSize, callback) {
        if (offset >= keys.length) {
            callback(true);
        } else {
            var end = Math.min(keys.length, offset + batchSize);
            self.db.transaction(
                function(tx) {
                    for (var i = offset; i < end; ++i)
                        tx.executeSql("INSERT OR REPLACE INTO " + self.storeName + " (key,value) VALUES (?,?)", [keys[i], values[i]]);
                },
                function() { callback(false); },   // <---- here: error callback
                function() { setTimeout(function() { bulk(self, keys, values, end, batchSize, callback); }, 1); }   // <---- and here: success callback, fired after commit
            );
        }
    }

    this.injectBulk = function(keys, values, callback) {
        bulk(this, keys, values, 0, this.batchSize, callback);
    }

    this.clear = function(callback) {
        var self = this;
        this.db.transaction(
            function(tx) {tx.executeSql("DELETE FROM " + self.storeName,[]);},
            function(){callback(false);},   // <---- and here
            function(){callback(true);});   // <---- and here
    }

    this.lookup = function(key, callback) {
//        callback(localStorage[key]);
        var self = this;
        this.db.readTransaction(function(tx) {   // <---- also switched to use readTransaction here
            tx.executeSql("SELECT * FROM " + self.storeName + " WHERE key=?", [key],
                function(tx,res) {
                    if (res.rows.length) {
                        var item = res.rows.item(0);
                        callback(item.value);
                    } else {
                        callback(undefined);
                    }
                },
                function() {
                    callback(false);
                });
        });
    };




Rodrigo Reyes

Oct 2, 2012, 2:19:00 AM
to Alec Flett, chromiu...@chromium.org
Hi Alec, 

[FYI: below I am referring to "non-transactional requests" - what I mean is the pattern of having one or two requests per transaction, but lots of transactions, as opposed to a single transaction with lots of requests]
I don't have a feel for how much transactional vs non-transactional performance improvement should be, but our goal is to (ultimately) be significantly faster than WebSQL for most operations, regardless of transaction granularity. 

On my current Linux, non-SSD laptop, IndexedDB is already faster than WebSQL, and significantly so for single-op-per-transaction injection and lookups (though it's still slower than localStorage, which seems pretty normal).
 
I think it's worth pointing out that the purpose of transactions is not performance, but correctness.
We want to make sure that all operations in a given transaction are consistent with each other (meaning they all commit, or none of them do) and with other transactions (meaning that two "simultaneous" transactions do not have overlapping writes: all of one transaction's writes land, and then the other's). Whatever the performance characteristics are, they just fall out of these goals.

You're right about the main goal being correctness, and it may be a bias we developers have regarding transactions, but anything that provides ACID ends up involving some kind of resource acquisition, be it copy-on-write or exclusive locking, that necessarily impacts performance (although I suppose there may be better strategies in a less-likely-to-be-multitasking environment like JavaScript). Most real-world applications are probably going to use multiple ops per transaction just for performance, and unfortunately in most cases little importance will be given to correctness (though I personally agree with you that correctness is the most important feature of transactions).

Rodrigo

Rodrigo Reyes

Oct 2, 2012, 4:19:03 AM
to Michael Nordman, Alec Flett, Chromium HTML5
Hi Michael,

Very nice apples-to-apples comparison!
My windows system gets the slow IDB lookup behavior with m22. I haven't run it with a local build yet.

Oh, so I'm not the only one... This is really strange, because I can't see any major difference between the two Windows systems I used for my tests: the one with the slow lookups is an i7 with no SSD, and the laptop with normal lookups is an i5 with no SSD, both running Windows 7.
 
I was surprised that the batch mode didn't help more in the websql case, but the numbers made more sense after noticing the batch size was only 100. I bumped it up to 2000 to further reduce the transaction overhead.

You're right. I actually set a small batch size because when I initially used WebSQL in an application to store a moderate amount of data, I had to tweak it after observing some kind of hangup (probably a memory limit somewhere) when the transaction was too big (meaning too many puts with too much data in them in a single transaction) on some browsers (I hope my memory does not fail me, but I think it was the native Android 3 or 4 browser). It was like: the browser silently failed when my batch size was 300 or 200, but ran fine with 150. So I set it at 100 just to be on the safe side. But you're right, this is suboptimal on desktop browsers; I should run some more tests and find a size closer to the limit for this benchmark.
 
Also, while looking in websql-storage.js, I noticed that in some cases test fixture completion was based on statement completion instead of transaction completion callbacks. That made me question the validity of the numbers, because the last statement callback is invoked before its containing transaction is committed. So I shuffled some completion callbacks around to be based on xaction callbacks.

Wow, thanks for this code review. I applied the changes you provided; there was not much difference in the numbers on my laptop, but they should be much more accurate that way.

Rodrigo

Michael Nordman

Oct 2, 2012, 6:16:51 PM
to Rodrigo Reyes, Alec Flett, Chromium HTML5
On Tue, Oct 2, 2012 at 1:19 AM, Rodrigo Reyes <reye...@gmail.com> wrote:
Hi Michael,

Very nice apples-to-apples comparison!
My Windows system gets the slow IDB lookup behavior with m22. I haven't run it with a local build yet.

Oh, so I'm not the only one... This is really strange, because I can't see any major difference between the two Windows systems I used for my tests: the one with the slow lookups is an i7 with no SSD, and the laptop with normal lookups is an i5 with no SSD, both running Windows 7.
 
I was surprised that the batch mode didn't help more in the websql case, but the numbers made more sense after noticing the batch size was only 100. I bumped it up to 2000 to further reduce the transaction overhead.

You're right. I actually set a small batch size because when I initially used WebSQL in an application to store a moderate amount of data, I had to tweak it after observing some kind of hangup (probably a memory limit somewhere) when the transaction was too big (meaning too many puts with too much data in them in a single transaction) on some browsers (I hope my memory does not fail me, but I think it was the native Android 3 or 4 browser). It was like: the browser silently failed when my batch size was 300 or 200, but ran fine with 150. So I set it at 100 just to be on the safe side. But you're right, this is suboptimal on desktop browsers; I should run some more tests and find a size closer to the limit for this benchmark.

Interesting. I can see how queueing up a lot of insert statements in the xaction callback could put pressure on memory: an object per statement gets quickly created and dumped into a queue. There could be another way to alleviate that memory pressure without incurring the full cost of a second xaction. And if your goal is to insert a large amount of stuff atomically (the real point of xactions), this would be important.

Instead of queueing up all statements in the xaction callback, queue up the first <n> statements in the xaction callback, and then in the <n>th statement callback, queue up <n> more statements. You don't have to queue up all statements at the very beginning of the transaction. The transaction will stay open until no more statements are queued up. By deferring some statements till later, the size of the statement Q can be controlled as needed. Instead of waiting till the <n>th to queue up more... you could do so in the <n-afew>th to keep the queue from running dry at any point.
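
In rough JavaScript terms, the refill pattern looks something like this (a toy model, no real WebSQL; runWithQueueLimit and the callback shape are mine):

```javascript
// Toy model of the "refill the statement queue" idea: issue the first n
// statements, and whenever one completes, enqueue another, so at most n
// statements are ever pending inside the single open transaction.
function runWithQueueLimit(statements, n, execute) {
    var next = 0, pending = 0, peak = 0;
    function enqueue() {
        while (pending < n && next < statements.length) {
            pending++;
            peak = Math.max(peak, pending);
            var stmt = statements[next++];
            execute(stmt, function done() {
                pending--;
                enqueue(); // statement callback tops the queue back up
            });
        }
    }
    enqueue();
    return peak; // highest number of statements queued at once
}
```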

I think you could issue an arbitrarily large number of statements in a single transaction this way, provided your choice of <n+afew> of them does not blow you out of memory. Statement batch size vs. transaction batch size, I guess. I think statement batch size is what's limiting you.

Michael Nordman

Oct 2, 2012, 9:39:02 PM
to Rodrigo Reyes, Alec Flett, Chromium HTML5
In addition to chrome://tracing there is also chrome://profiler to look at. IndexedDBContextImpl::QueryAvailableQuota shows up as a pretty tall pole in that data in debug builds, but I really need to look at a release build to get meaningful data.

Michael Nordman

Oct 3, 2012, 3:02:28 PM
to Rodrigo Reyes, Alec Flett, Chromium HTML5, Joshua Bell
I've been toying with this page and then looking at the profiler and tracing results. Having this instrumentation in release builds is so excellent! Here are some results from a near-tip-of-tree release build.

A sliver of chrome://tracing data captured while running the Lookup2 test (looking up 2000 entries one by one) against IDB. The total wall-time duration of the slice is ~24 seconds; the WEBKIT_THREAD was occupied for ~22 seconds in this window, almost all of which was spent in the commit() method.

The interesting thing is that since no mutations occurred in this transaction there's really nothing to commit, yet that's where all the time went. I'm not sure what all happens under commit(), but it feels like there might be a relatively straightforward optimization in here somewhere... like don't commit the empty writebatch, just delete it, or better yet have leveldb act that way when asked to commit an empty writebatch.


Slices:

    MessageLoop::RunTask                               22115.092 ms    8262 occurrences
    IDBTransactionBackendImpl::taskEventTimerFired     21789.476 ms    1102 occurrences
    IDBTransactionBackendImpl::commit                  21787.412 ms    1102 occurrences
    IDBLevelDBBackingStore::Transaction::begin             1.615 ms    1101 occurrences
    IDBTransactionBackendImpl::taskTimerFired            135.913 ms    1101 occurrences
    IDBObjectStoreBackendImpl::get                         4.68 ms     1100 occurrences
    IDBLevelDBBackingStore::Transaction::commit        21341.566 ms    1102 occurrences
    IDBObjectStoreBackendImpl::getInternal               129.083 ms    1101 occurrences
    IDBLevelDBBackingStore::openObjectStoreCursor         82.296 ms    1101 occurrences
    IDBLevelDBBackingStore::getObjectStoreRecord          22.06 ms     1101 occurrences
    *Totals                                            87409.193 ms   18173 occurrences

    Selection start:   24279.595 ms
    Selection extent:  24197.593 ms


On Tue, Oct 2, 2012 at 7:06 PM, Michael Nordman <mich...@google.com> wrote:
Hi Rodrigo,

In release builds the really tall pole in the chrome://profiler data is a call to WebKitPlatformSupportImpl::DoTimeout in the browser process, which really doesn't narrow it down very much.

Something you might want to do is give a little breathing room between different tests. Some subsystems defer work, so shortly after a flurry of activity against the API, the subsystem wakes up and finishes things up. That finishing up could bleed over and color the results for a new flurry of activity against a different subsystem API that's being timed. LocalStorage is like that: changes are lazily committed to disk on a timer sometime after the API has returned from setting values. I believe IDB does compaction in the background.

Michael Nordman

Oct 3, 2012, 3:30:22 PM
to Rodrigo Reyes, Alec Flett, Chromium HTML5, Joshua Bell
Yup, the addition of the blue text below makes a monumental difference in read-only access. Where the Lookup2 test was taking ~70 seconds, it's now taking ~2 seconds!

Really it'd be better to push a change like this all the way down into leveldb::DBImpl::Write(const WriteOptions& options, WriteBatch* my_batch).

bool LevelDBTransaction::commit()
{
    ASSERT(!m_finished);
    if (m_tree.is_empty()) {
        m_finished = true;
        return true;       
    }

    OwnPtr<LevelDBWriteBatch> writeBatch = LevelDBWriteBatch::create();

    TreeType::Iterator iterator;
    iterator.start_iter_least(m_tree);

    while (*iterator) {
        AVLTreeNode* node = *iterator;
        if (!node->deleted)
            writeBatch->put(node->key, node->value);
        else
            writeBatch->remove(node->key);
        ++iterator;
    }

    if (!m_db->write(*writeBatch))
        return false;

    clearTree();
    m_finished = true;
    return true;
}

Rodrigo Reyes

Oct 3, 2012, 5:25:50 PM
to Michael Nordman, Chromium HTML5
Hi Michael,

In release builds the really tall pole in the chrome://profiler data is a call to WebKitPlatformSupportImpl::DoTimeout in the browser process which really doesn't narrow it down very much.
Something you might want to do is give a little breathing room between different tests. Some subsystems defer work, so shortly after a flurry of activity against the API, the subsystem wakes up and finishes things up. That finishing up could bleed over and color the results for a new flurry of activity against a different subsystem API that's being timed. LocalStorage is like that: changes are lazily committed to disk on a timer sometime after the API has returned from setting values. I believe IDB does compaction in the background.

I added 500 ms between tests to give them some breathing room, and added a few options to modify the number of elements injected or the batch size, in case that helps with performance tuning. Anyway, congratulations on your recent optimizations, I hope they reach the release channel soon!
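
The pacing itself is just sequential scheduling; roughly this (a toy sketch, runPaced is my name for it; schedule is injectable so the logic can be exercised without real timers, and in the page it would simply be setTimeout):

```javascript
// Toy sketch of pacing tests: start each test only after a pause, so deferred
// work (lazy commits, background compaction) from the previous test can settle.
function runPaced(tests, pauseMs, schedule, done) {
    var i = 0;
    function next() {
        if (i >= tests.length) { done(); return; }
        var test = tests[i++];
        test(function finished() {
            schedule(next, pauseMs); // breathing room before the next test
        });
    }
    next();
}
```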

Rodrigo

Joshua Bell

Oct 3, 2012, 6:08:27 PM
to mich...@chromium.org, Rodrigo Reyes, Alec Flett, Chromium HTML5
On Wed, Oct 3, 2012 at 12:30 PM, Michael Nordman <mich...@chromium.org> wrote:
Yup, the addition of the blue text below makes a monumental difference in read-only access. Where the Lookup2 test was taking ~70 seconds, it's now taking ~2 seconds!

Really it'd be better to push a change like this all the way down into leveldb::DBImpl::Write(const WriteOptions& options, WriteBatch* my_batch).

bool LevelDBTransaction::commit()
{
    ASSERT(!m_finished);
    if (m_tree.is_empty()) {
        m_finished = true;
        return true;       
    }


That looks suspiciously like the patch I have in webkit.org/b/89239 :)
 
(although one line higher up does make more sense)

As you can see by the notes in the bug I've been waffling about landing it. I guess I'll go ahead and get it reviewed.