
SQLite and Firefox


Nicholas Nethercote

Nov 15, 2011, 8:54:36 PM
to dev-pl...@lists.mozilla.org, Richard Hipp
Hi,

(TL;DR: There are numerous opportunities to greatly improve SQLite's
performance and memory consumption, but we need to talk to the SQLite
developers to let them know what we need.)

The recent e10s/responsiveness thread on dev.planning discussed SQLite quite
a bit, starting with this:

> E) Places optimizations: Taras' team is working on optimizing; however,
> we need to make hard decisions about when and where to use an SQL
> database. We also need to consider alternatives to SQLite.
> [...]
> it was decided that the front-end IOC work (1) should be suspended in
> order to focus on places optimizations

Some people clearly don't like SQLite at all, and would simply like it to be
removed from Firefox. (It's already been removed from Fennec). I want to
understand the dislike.

I can see and understand one clear complaint: SQLite uses too much memory.
I'll discuss this below. Other than that (i.e. responsiveness), I haven't
seen any specific data or bug numbers. Taras Glek joined today's MemShrink
meeting and the topic of slow SQLite initialization time came up. Richard
Hipp (an SQLite developer) was also in the meeting; he's very keen to help,
and he and Taras talked about how to improve this, and it sounds like SQLite
has some features that Taras might be able to use to help here. They're
continuing the discussion via email.

On the memory issue, I've also been talking a lot with Richard. He's been
extremely responsive and has come up with numerous ideas and patches to
reduce SQLite's memory usage in Firefox. In particular, there's a one-line
patch that makes SQLite discard cache pages more eagerly than it currently
does. This drastically reduces the memory usage -- in limited testing I saw
a 2--3x reduction with smallish databases, but Richard said he saw 60MB+
usage drop down to 3MB(!). The slow-down associated with this was only
5--10% on SQLite performance tests, but it's unclear how those correlate to
Firefox's SQLite performance.

SQLite also has the ability to drop lots of memory on request, so long as
the requests are made per-connection.
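[Editorial note: the per-connection release hook discussed here is presumably what later shipped in SQLite as `sqlite3_db_release_memory()` and the equivalent `PRAGMA shrink_memory` (SQLite 3.7.10+). A minimal sketch of the idea, using Python's bundled SQLite and a throwaway in-memory database:]

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pages (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany("INSERT INTO pages (body) VALUES (?)",
                 [("x" * 1024,) for _ in range(1000)])
conn.execute("SELECT count(*) FROM pages").fetchone()  # warm the page cache

# Ask this connection (and only this connection) to give back as much
# cached memory as it can.
conn.execute("PRAGMA shrink_memory")

# The connection still works normally afterwards; evicted pages are
# simply re-read on demand.
rows = conn.execute("SELECT count(*) FROM pages").fetchone()[0]
print(rows)
```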

So, Richard wants to help, but he needs information from Mozilla -- what do
we want from SQLite? What are the time-space trade-offs we're interested
in?

--

Another issue relating to SQLite -- we've only ever used official SQLite
releases in Firefox, and never modified them. Richard suggested we might
like to consider taking unmodified revisions from the SQLite repository.
These obviously are uniquely identifiable. The advantage is that we could
get any improvements into Firefox without having to wait for the SQLite
release cycle. Are there any objections to doing this?

--

Yet another issue relating to SQLite -- the Storage module is owned by Shawn
Wilsher and has one peer, Andrew Sutherland. BOTH of them, AIUI, have
little time to work on Firefox. This seems sub-optimal. Richard even said
that he hadn't heard anything from Mozilla people for months and so he
assumed that Mozilla was really happy with how SQLite was performing.

--

To conclude, Shawn said this, with which I agree:

> I have yet to see any hard data saying "X would be better than SQL for this
> component based on requirements Y", which makes me hesitant to jump on the
> bandwagon yet.

Nick

Andreas Gal

Nov 15, 2011, 9:50:47 PM
to Nicholas Nethercote, Richard Hipp, dev-pl...@lists.mozilla.org

This is a combination of a couple things:

First of all, our use of SQLite is probably really bad. The Places code in particular is a total mess. It does things like using SQLite both synchronously and asynchronously, creating all sorts of opportunities for deadlocks (bug 686025). Beyond that, the Places code also uses an extremely complicated database schema for even the most trivial things, which causes complex queries and joins to be run for simple operations such as give-me-the-last-seen-favicon-for-a-page.

That having been said, SQLite seems like the wrong tool for us in some cases. It's a general-purpose database that is optimized to achieve good performance in the average case, but arbitrarily bad performance in the worst case. Bug 691509 has shown that we can get long hangs (minutes?) if we haven't run ANALYZE in a while and SQLite takes a bad query path. For many of the things we currently use SQLite for, we could build specialized implementations with predictable performance behavior and much less code involved. Favicons, for example, should be a simple circular cache on disk (we can easily fetch them again once we evict something).

On top of this, we have also seen horrible performance behavior in our IndexedDB code. Walking 5000 entries took something like 30s. I am sure we are using SQLite in a bad way (Richard might have more data on this; I think bent contacted him about it), but this points to the same problem. SQLite can have fragile performance behavior if we rub it the wrong way. That's not really SQLite's fault: it is simply optimized for max throughput in the average case, and not for max responsiveness, which we really need for many of these tasks.

Andreas
> _______________________________________________
> dev-platform mailing list
> dev-pl...@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-platform

Andrew Sutherland

Nov 15, 2011, 10:56:01 PM
to dev-pl...@lists.mozilla.org
On 11/15/2011 05:54 PM, Nicholas Nethercote wrote:
> To conclude, Shawn said this, with which I agree:
>> I have yet to see any hard data saying "X would be better than SQL for this
>> component based on requirements Y", which makes me hesitant to jump on the
>> bandwagon yet.

As my storage involvement came from Thunderbird, I can't speak to
Firefox's needs, only Thunderbird's use in its "gloda" global database.

gloda could be more performant in terms of random-access I/O under
something like LevelDB/IndexedDB-on-LevelDB, which provides
lexicographically ordered data stored on disk in the same order (using
log-structured merge with periodic compaction). It's hard to achieve
the same thing in SQLite unless you VACUUM like crazy because:
1) row ordering is by row id / integer primary key, so unless you
manually allocate integer primary keys and do some kind of clever id
allocation to allow for clustering, row accesses will end up being
random access rather than clustered. Covering indices can help address
this problem (with nontrivial overhead) by providing a btree ordered by
more complex keys, but then you run up against...
2) there is no defragmentation support outside of VACUUM. Your btree
pages can easily end up randomly shuffled throughout the database*
3) VACUUM is effectively an offline operation (and "incremental VACUUM"
is a misnomer; it's automatic file shrinking) and becomes untenable if
your database is very large, as gloda databases can end up. This can be
worked around by using SQLite's support for attaching multiple database
files to a single connection and some combination of sharding and manual
record migration between databases.
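[Editorial note: the covering-index workaround from (1) can be made concrete. A sketch, assuming a hypothetical gloda-like messages table: a covering index gives SQLite a btree ordered by the query's key that also contains every column the query needs, and EXPLAIN QUERY PLAN confirms the table itself is never touched.]

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages "
             "(id INTEGER PRIMARY KEY, folder TEXT, date INTEGER, subject TEXT)")
# Covering index: ordered by (folder, date) and carrying subject, so the
# query below can be answered entirely from the index btree.
conn.execute("CREATE INDEX ix_folder_date ON messages (folder, date, subject)")
conn.executemany("INSERT INTO messages (folder, date, subject) VALUES (?, ?, ?)",
                 [("inbox", i, "subj %d" % i) for i in range(100)])

plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT date, subject FROM messages WHERE folder = ? ORDER BY date",
    ("inbox",)).fetchall()
detail = " ".join(row[-1] for row in plan)
print(detail)  # mentions a COVERING INDEX, not a table scan
```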

It's probably also worth noting, for all the mork-haters out there, that
something LevelDB-ish would be a better replacement for Thunderbird's
mork-based .msf index files than SQLite, because of the random-access
problem too. Also of note: a number of people using naive backup
systems like Apple's Time Machine, which back up entire files if any
part of them changes, were not happy about the implications of large
SQLite database files for backup. Something like LevelDB, whose files
are immutable once written, would be friendlier, as long as the complete
compaction turnover rate is less frequent than the backup interval.

Which is to say, there are definitely use cases that would benefit from
LevelDB-in-platform, although just using it to back IndexedDB would be
sufficient. And I know there are plans along this line, but the sooner
the better...

Andrew

* The SQLite testing framework's "dbstat" virtual table can be used to
illustrate the internal fragmentation of databases. A patch that you
can apply to the "spaceanal.tcl" script to dump its representation as a
JSON structure you can process can be found at the following URL.
(I patch the tcl script because it was the easiest way to get at a
SQLite binary with the 'dbstat' table built in.)
https://github.com/asutherland/tb-test-help/blob/master/sqlite/analyzer-tcl-script-json-dbstats.patch

Shawn Wilsher

Nov 16, 2011, 1:40:15 AM
to dev-pl...@lists.mozilla.org, Richard Hipp
On 11/15/2011 5:54 PM, Nicholas Nethercote wrote:
> Some people clearly don't like SQLite at all, and would simply like it to be
> removed from Firefox. (It's already been removed from Fennec). I want to
> understand the dislike.
It can't have been completely removed; does Fennec use some other cookie
engine too?

> On the memory issue, I've also been talking a lot with Richard. He's been
> extremely responsive and has come up with numerous ideas and patches to
> reduce SQLite's memory usage in Firefox. In particular, there's a one-line
> patch that makes SQLite discard cache pages more eagerly than it currently
> does. This drastically reduces the memory usage -- in limited testing I saw
> a 2--3x reduction with smallish databases, but Richard said he saw 60MB+
> usage drop down to 3MB(!). The slow-down associated with this was only
> 5--10% on SQLite performance tests, but it's unclear how those correlate to
> Firefox's SQLite performance.
Our performance testing story has always been poor here. Our best
metric is tp, and that's just not a really good metric for "SQLite
performance" for most of our use cases.

> Another issue relating to SQLite -- we've only ever used official SQLite
> releases in Firefox, and never modified them. Richard suggested we might
> like to consider taking unmodified revisions from the SQLite repository.
> These obviously are uniquely identifiable. The advantage is that we could
> get any improvements into Firefox without having to wait for the SQLite
> release cycle. Are there any objections to doing this?
There are two reasons we've only ever used release versions of SQLite:
1) Linux distributions (at least used to) use system SQLite, and they
can't do that if we ship something that isn't released.
2) I never felt comfortable shipping a release that didn't go through
SQLite's testing. They have pretty extensive test coverage, and I'd
hate to ship a buggy version of SQLite to a few hundred million users.

With that being said, Mozilla is a member of the SQLite consortium, with
all the benefits that entails:
http://www.sqlite.org/consortium.html

> Yet another issue relating to SQLite -- the Storage module is owned by Shawn
> Wilsher and has one peer, Andrew Sutherland. BOTH of them, AIUI, have
> little time to work on Firefox. This seems sub-optimal. Richard even said
> that he hadn't heard anything from Mozilla people for months and so he
> assumed that Mozilla was really happy with how SQLite was performing.
While October was a bad month (computer issues), I should be generally
responsive to e-mails and review requests. As for the peer situation,
well, it's a complicated module and I haven't seen someone with the
mastery of the invariants in it like asuth yet (although mak is getting
there). If more people end up hacking on it, ideally more people will
become proficient with the code. However, it would be irresponsible of
me as the module owner to promote someone to peer who isn't ready.

Cheers,

Shawn

Shawn Wilsher

Nov 16, 2011, 1:59:37 AM
to dev-pl...@lists.mozilla.org
On 11/15/2011 6:50 PM, Andreas Gal wrote:
> First of all, our use of SQLite is probably really bad. The places code in particular is a total mess. It does stuff like use SQLite synchronously and asynchronously, creating all sorts of opportunities for deadlocks (bug 686025). Beyond that, the places code also uses an extremely complicated database scheme for even the most trivial things, which causes complex queries and joins to be run for simple operations such as give-me-the-last-seen-fav-icon-for-a-page.
Bug 686025 was not a deadlock; it was lock contention (or at worst
resource starvation). Exaggerating the situation only makes this
perception worse, FWIW, and doesn't help solve any problems. I'd love
to know more details on how the current Places database schema is
"extremely complicated" for the "most trivial things". Making those
claims without backing them up with data is also not helping, and is
counter to the whole purpose of this thread.

> That having said, SQLite seems a wrong tool for us in some cases. Its a general purpose database that is optimized to achieve good performance average case, but arbitrarily bad performance worst case. Bug 691509 has shown that we can get long hangs (minutes?) if we didn't run analyze in a while and SQLite takes a bad query path. Many of the things we currently use SQLite for we can get specialized implementations with predictable performance behavior with much less code involved. Favicons for example should be a simple circular cache on disk (we can easily fetch them again once we evict something).
Bug 691509 was us using SQLite in the worst possible way. We'd run
`ANALYZE` once, while all table counts were very low. Then we'd add a
bunch of data, making those counts inaccurate. Because SQLite had data
saying that there were a small number of rows, it chose to do a table
scan, which would have been more efficient if those numbers were
accurate.

I'll note that SQLite would have picked a perfectly workable query plan
for these queries if we had never chosen to run `ANALYZE`. Because we
did, and populated it with the wrong data, we shot ourselves in the foot
(think of `ANALYZE` as "trust me, I know what I'm doing").
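[Editorial note: the failure mode Shawn describes is easy to reproduce. A sketch using Python's bundled SQLite (table names hypothetical): `ANALYZE` snapshots row counts into `sqlite_stat1`, and nothing updates them as the table grows, so the planner keeps trusting the stale numbers.]

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hist (id INTEGER PRIMARY KEY, url TEXT)")
conn.execute("CREATE INDEX ix_url ON hist (url)")
conn.execute("INSERT INTO hist (url) VALUES ('http://example.com/0')")

conn.execute("ANALYZE")  # snapshots "hist has ~1 row" into sqlite_stat1
before = conn.execute(
    "SELECT stat FROM sqlite_stat1 WHERE tbl = 'hist'").fetchone()[0]

# Now grow the table 10000x; sqlite_stat1 is NOT updated automatically.
conn.executemany("INSERT INTO hist (url) VALUES (?)",
                 [("http://example.com/%d" % i,) for i in range(1, 10000)])
after = conn.execute(
    "SELECT stat FROM sqlite_stat1 WHERE tbl = 'hist'").fetchone()[0]
print(before == after)  # True: statistics stay stale until the next ANALYZE
```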

> On top of this, we have also seen horrible performance behavior in our indexeddb code. Walking 5000 entries took something like 30s. I am sure we are using SQLite in a bad way (Richard might have more data on this, I think bent contacted him about this), but this points to the same problem. SQLite can have fragile performance behavior if we rub it the wrong way. Thats not really SQLite's fault. It simply optimized for max throughput on average case, and not max responsiveness, which we really need for many of these tasks.
bent, sicking, and I all knew that SQLite wasn't ideal for IndexedDB,
but it was the only tool we had at the time that was cross platform (I
don't currently remember the databases we looked at at the time). We
didn't want to write our own database engine. Now that LevelDB exists,
it's very likely the right choice (I even filed the bug [1] tracking that).

Cheers,

Shawn

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=679267

Andreas Gal

Nov 16, 2011, 2:17:54 AM
to Shawn Wilsher, dev-pl...@lists.mozilla.org
Hi Shawn,

On Nov 15, 2011, at 10:59 PM, Shawn Wilsher wrote:

> On 11/15/2011 6:50 PM, Andreas Gal wrote:
>> First of all, our use of SQLite is probably really bad. The places code in particular is a total mess. It does stuff like use SQLite synchronously and asynchronously, creating all sorts of opportunities for deadlocks (bug 686025). Beyond that, the places code also uses an extremely complicated database scheme for even the most trivial things, which causes complex queries and joins to be run for simple operations such as give-me-the-last-seen-fav-icon-for-a-page.
> Bug 686025 was not a deadlock, it was lock contention (or at worst resource starvation). Exaggerating the situation only makes this perception worse, FWIW, and doesn't help solve any problems. I'd love to know more details on how the current places database schema is "extremely complicated" for the "most trivial things". Making those claims without backing it up with data is also not helping and counter to the whole purpose of this thread.

Pointing out that Places merely spins over contended locks for minutes instead of deadlocking doesn't really increase my confidence in the code.

As for places being overly complex, I think the database schema and the queries used to do trivial actions like favicon lookup do back that statement:

http://people.mozilla.org/~dietrich/places-erd.png

nsCOMPtr<mozIStorageStatement> stmt = mDB->GetStatement(
"SELECT f.id, f.url, length(f.data), f.expiration "
"FROM moz_places h "
"JOIN moz_favicons f ON h.favicon_id = f.id "
"WHERE h.url = :page_url "
"LIMIT 1"
);

Fetching a single favicon for a page shouldn't require running a join. Someone went a bit overboard with relational database design here, and we are paying the responsiveness price.

>
>> That having said, SQLite seems a wrong tool for us in some cases. Its a general purpose database that is optimized to achieve good performance average case, but arbitrarily bad performance worst case. Bug 691509 has shown that we can get long hangs (minutes?) if we didn't run analyze in a while and SQLite takes a bad query path. Many of the things we currently use SQLite for we can get specialized implementations with predictable performance behavior with much less code involved. Favicons for example should be a simple circular cache on disk (we can easily fetch them again once we evict something).
> Bug 691509 was us using SQLite in the worst possible way. We'd run `ANALYZE` once, so all table counts would be very low. Then, we'd add a bunch of data making those counts inaccurate. Because SQLite had data saying that there were a small number of rows, it choose to do a table scan because it would have been more efficient if those numbers were accurate.
>
> I'll note that SQLite would have picked a perfectly workable query plan for these queries if we had never chosen to run `ANALYZE`. Because we did, and populated it with the wrong data, we shot ourselves in the foot (think of `ANALYZE` as "trust me, I know what I'm doing").

You aren't really replying to the argument here. Do you agree that we should target simple problems (favicon cache) with simple solutions with linear worst case performance?

>
>> On top of this, we have also seen horrible performance behavior in our indexeddb code. Walking 5000 entries took something like 30s. I am sure we are using SQLite in a bad way (Richard might have more data on this, I think bent contacted him about this), but this points to the same problem. SQLite can have fragile performance behavior if we rub it the wrong way. Thats not really SQLite's fault. It simply optimized for max throughput on average case, and not max responsiveness, which we really need for many of these tasks.
> bent, sicking, and I all knew that SQLite wasn't ideal for IndexedDB, but it was the only tool we had at the time that was cross platform (I don't currently remember the databases we looked at at the time). We didn't want to write our own database engine. Now that LevelDB exists, it's very likely the right choice (I even filed the bug [1] tracking that).

I am glad we agree. We should see if Richard can help us fix the performance problems short term, and mid-term we should eliminate SQLite from indexeddb.

Andreas

Wolfgang Rosenauer

Nov 16, 2011, 3:11:50 AM
to
On 16.11.2011 07:40, Shawn Wilsher wrote:
> On 11/15/2011 5:54 PM, Nicholas Nethercote wrote:
>> Another issue relating to SQLite -- we've only ever used official SQLite
>> releases in Firefox, and never modified them. Richard suggested we might
>> like to consider taking unmodified revisions from the SQLite repository.
>> These obviously are uniquely identifiable. The advantage is that we
>> could
>> get any improvements into Firefox without having to wait for the SQLite
>> release cycle. Are there any objections to doing this?
> There are two reasons we've only ever used release version of SQLite:
> 1) Linux distributions (at least used to) use system SQLite, and they
> can't do that if we ship something that isn't released.

I can comment on that at least from the openSUSE perspective.

Some time ago we really tried to use system sqlite, but we ran into
several issues and stopped. I cannot remember everything, but it
included the following reasons:
- sqlite as used by Mozilla is usually newer than what we have on the
  system, and updating sqlite at the same pace as Firefox didn't make
  sense
- at some point Mozilla used an "experimental" feature of sqlite which
  we didn't want to enable system-wide
- at some point I really think there were modifications or patches in
  Mozilla's sqlite, pulled from newer upstream changes, that were not
  available in the release version (I cannot remember the details and
  it's probably not important to track down anymore)

So to sum up: yes, it would be nice if we could use system sqlite, but
I'm not sure it's feasible anyway (I haven't checked the exact
situation for at least a year or two, though). So currently I do not
care that much, but I'm not sure about the others.


Wolfgang

Mike Hommey

Nov 16, 2011, 3:37:47 AM
to Wolfgang Rosenauer, dev-pl...@lists.mozilla.org
FWIW, in Debian, we've always used system sqlite.

Mike

Wolfgang Rosenauer

Nov 16, 2011, 4:00:17 AM
to
Thanks for the feedback.
But then again, you usually do not update Firefox at "high speed" in
Debian. For example, the oldest openSUSE release I currently support
with Firefox 8 has system sqlite 3.6.4. AFAIK configure does not allow
that anyway.

Wolfgang

Mike Hommey

Nov 16, 2011, 4:21:17 AM
to Wolfgang Rosenauer, dev-pl...@lists.mozilla.org
I'm providing backports of the latest release, beta, and aurora for
Debian Squeeze, along with a backport of the system sqlite, currently 3.7.8.

Mike

Mark Finkle

Nov 16, 2011, 9:57:41 AM
to Shawn Wilsher, dev-pl...@lists.mozilla.org, Richard Hipp
On 11/16/2011 01:40 AM, Shawn Wilsher wrote:
> On 11/15/2011 5:54 PM, Nicholas Nethercote wrote:
>> Some people clearly don't like SQLite at all, and would simply like it
>> to be
>> removed from Firefox. (It's already been removed from Fennec). I want to
>> understand the dislike.
> It can't have been completely removed; does Fennec use some other cookie
> engine too?

Technically, Fennec only removed the use of Places. SQLite is used for
other features. In fact, Fennec just uses the Android OS SQLite-based
history and bookmark systems now.

However, SQLite performance and memory usage are major concerns for Fennec.

Marco Bonardo

Nov 16, 2011, 11:55:12 AM
to
On 16/11/2011 02:54, Nicholas Nethercote wrote:
>> E) Places optimizations:

I'd prefer calling this Storage optimizations; Places is just one of the
many consumers, and some of them don't even try to avoid main-thread
I/O (which Places luckily does for history and favicons, though not yet
for bookmarks). All consumers should be fixed, including the Storage
API itself (bug 699820).

> Some people clearly don't like SQLite at all, and would simply like it to be
> removed from Firefox. (It's already been removed from Fennec).

This is partly untrue. I think the Android system history and
bookmarking code being used as a Places replacement is just another
SQLite database, built on an older version of SQLite (depending on what
the system has available). I don't know whether it does main-thread
I/O, and as far as I can tell nobody has measured it being faster. It
may be, since it's likely simpler, just as we could go simpler (and we
are doing so).

> I can see and understand one clear complaint: SQLite uses too much memory.

To be clear, for a long time this was done "by design" -- probably a bad
decision in the past. The current code in nightly has removed most of
those bad decisions, and we are down to "acceptable" (but still
improvable, see below) levels.

> The slow-down associated with this was only
> 5--10% on SQLite performance tests, but it's unclear how those correlate to
> Firefox's SQLite performance.

Right, this should be measured; the issue we may hit is an increase in
the number of disk reads. Though it's possible our use case doesn't
reuse pages that much, and we'd see no tangible perf hit.
This can be measured using xperf or any other profiler to count disk
reads, maybe during a Tp5 test run.

----

There is also another possibility to reduce memory: avoiding slop
bytes in page allocations (a 10% to 50% win), but this has a cost
involved:
- 1 additional malloc per page alloc
- 1 additional page-boundaries check

The performance cost of this has been evaluated at about 3%, mostly due
to the boundaries check. The problem is that for certain database
corruptions SQLite may try to read some bytes past the end of a page,
and this is a problem when those fall outside the mapped space.
To alleviate the problem, SQLite allocates page+header together so that
the over-read bytes fall into the header and never outside mapped space.
If we could provide a jemalloc way to add some padding to certain
allocations, or to do 2 separate allocations while ensuring one follows
the other, this could be solved with no perf penalty.
Otherwise we could try to help optimize the boundaries-check code,
which you can find in sqlite.c between the SQLITE_OVERSIZE_CELL_CHECK
defines. Any help or ideas about this would be appreciated.

> SQLite also has the ability to drop lots of memory on request, so long as
> the requests are made per-connection.

And this would be really useful to solve the memory-pressure problem
(bug 411894). We can use it now; we could not in the past because
SQLITE_ENABLE_MEMORY_MANAGEMENT was particularly expensive in
threadsafe mode, but Richard said this new API would not be expensive;
it just needs to be invoked from each thread.

> Another issue relating to SQLite -- we've only ever used official SQLite
> releases in Firefox, and never modified them. Richard suggested we might
> like to consider taking unmodified revisions from the SQLite repository.
> These obviously are uniquely identifiable. The advantage is that we could
> get any improvements into Firefox without having to wait for the SQLite
> release cycle. Are there any objections to doing this?

I think the main objection in the past was that we should not make
SQLite updates harder. So we likely won't take internal patches (we
should send them to the SQLite team instead), but imo we may take
special versions from them even if those are not officially released.
sdwilsh is the module owner here, though, so this should go through
him; I'm only giving my personal opinion.

Cheers,
Marco

Marco Bonardo

Nov 16, 2011, 12:02:58 PM
to
On 16/11/2011 15:57, Mark Finkle wrote:
> Technically, Fennec only removed the use of Places. SQLite is used for
> other features. In fact, Fennec just uses the Android OS SQLite-based
> history and bookmark systems now.

Has anyone measured whether dropping SQLite per se is a perf gain?
I see a lot of reasoning about how using the system API allows sharing
the user's data across the entire system (which I disagree with for
privacy reasons, but that's just my personal opinion), but I have not
seen performance or off-main-thread I/O measurements.
Since we already plan a Places rewrite, it would be really nice to see
"how far we are" from what you consider a target.

Cheers,
Marco

Marco Bonardo

Nov 16, 2011, 12:05:41 PM
to
On 16/11/2011 07:59, Shawn Wilsher wrote:
> I'll note that SQLite would have picked a perfectly workable query plan
> for these queries if we had never chosen to run `ANALYZE`.

Yeah, I still think the solution there is to populate the ANALYZE
statistics table with known data. We build queries to run in a certain
way, so we can drive SQLite rather than being driven.
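[Editorial note: this is a documented SQLite technique: `sqlite_stat1` is an ordinary writable table, so after one `ANALYZE` you can overwrite it with numbers describing the workload you designed for, and `ANALYZE sqlite_master` asks the planner to reload them. A sketch, with hypothetical table names and made-up statistics:]

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (id INTEGER PRIMARY KEY, place_id INTEGER)")
conn.execute("CREATE INDEX ix_place ON visits (place_id)")
conn.execute("INSERT INTO visits (place_id) VALUES (1)")

conn.execute("ANALYZE")  # creates sqlite_stat1 (with tiny, soon-stale counts)

# Overwrite the stats with numbers matching how we KNOW the table will
# be used: "1000000 rows, ~10 per distinct place_id".
conn.execute("UPDATE sqlite_stat1 SET stat = '1000000 10' WHERE idx = 'ix_place'")
conn.execute("ANALYZE sqlite_master")  # reload the hand-edited statistics

stat = conn.execute(
    "SELECT stat FROM sqlite_stat1 WHERE idx = 'ix_place'").fetchone()[0]
print(stat)
```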

-m

Marco Bonardo

Nov 16, 2011, 12:16:34 PM
to
On 16/11/2011 08:17, Andreas Gal wrote:
> Pointing out that places merely spins over contented locks for minutes instead of deadlocking doesn't really increase my confidence in the code.

The fact that the code is using SQLite connections wrongly is well
known, and this can happen in a lot of components, not just Places (see
the dependencies in bug 699820). The Storage API introduced async
support later, and now a bunch of consumers are mixed sync/async API
users, and that's bad. We should just have a single async connection
API that acts on a single thread, and nothing more.
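[Editorial note: the "single async connection on a single thread" model can be sketched in a few lines. A hypothetical Python shape of the design (Storage itself is C++/XPCOM, so this is only an illustration): all statements execute on one dedicated thread, and callers never touch the connection or its locks directly.]

```python
import queue
import sqlite3
import threading

class AsyncConnection:
    """Every statement runs on a single dedicated thread; callers get
    results back through a per-call reply queue."""

    def __init__(self, path):
        self._requests = queue.Queue()
        self._thread = threading.Thread(target=self._run, args=(path,),
                                        daemon=True)
        self._thread.start()

    def _run(self, path):
        conn = sqlite3.connect(path)  # connection lives only on this thread
        while True:
            sql, params, reply = self._requests.get()
            if sql is None:  # shutdown sentinel
                conn.close()
                return
            try:
                reply.put(("ok", conn.execute(sql, params).fetchall()))
            except sqlite3.Error as e:
                reply.put(("err", e))

    def execute(self, sql, params=()):
        reply = queue.Queue()
        self._requests.put((sql, params, reply))
        status, result = reply.get()  # a real impl would use callbacks
        if status == "err":
            raise result
        return result

    def close(self):
        self._requests.put((None, None, None))
        self._thread.join()

db = AsyncConnection(":memory:")
db.execute("CREATE TABLE t (x INTEGER)")
db.execute("INSERT INTO t VALUES (?)", (42,))
rows = db.execute("SELECT x FROM t")
db.close()
print(rows)
```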

> Fetching a single favicon for a page shouldn't require running a join. Someone went a bit overboard with relational database design here, and we are paying the responsiveness price.

Honestly, the first thought I had looking at the db was that whoever
designed the schema (I assume the initial Google team) had never seen a
db before in their life. Actually, there are far worse issues in the
schema than a simple join to get the icon.
Btw, consider this is not a server database, so it is expected to have
some more joins to save on database filesize. Classic relational design
would probably have used the page URL as a key on icons, but that'd
actually be more expensive for an embedded db.

Just to put this in the right perspective, we are going to completely
rewrite the schema, and make it so that we can change it completely
whenever we want. At least, this is the plan.
So, if you have ideas to improve things, bring them out!

> You aren't really replying to the argument here. Do you agree that we should target simple problems (favicon cache) with simple solutions with linear worst case performance?

Personally, I'm open to suggestions; fwiw, I never liked having
favicons in the db.

Cheers,
Marco

Justin Lebar

Nov 16, 2011, 1:29:55 PM
to Andreas Gal, dev-pl...@lists.mozilla.org, Shawn Wilsher
Andreas wrote:
> SQLite can have fragile performance behavior if we rub it the wrong way.
> Thats not really SQLite's fault. It simply optimized for max throughput on
> average case, and not max responsiveness, which we really need for many of
> these tasks.

SQLite clearly has bad performance if we rub it the wrong way. But I think
it's wrong to characterize this as "optimized for max throughput in average
case, rather than max responsiveness." In my experience, SQLite is very
tunable, and you can optimize it for basically whatever you want. (Therein
lies the problem; if you don't tune it right, you're optimizing for slowness.)

It's fair to call out SQLite's fragility, but let's not confuse the point.

If we wanted to give SQLite a fair chance, we'd:

1. Fix the things we're doing that we *know* are causing problems, and
2. Instrument our implementation so we notice when SQLite is misbehaving.

But until we do these things, I don't think it's particularly fair to
beat up on SQLite.

Andreas wrote:
> nsCOMPtr<mozIStorageStatement> stmt = mDB->GetStatement(
> "SELECT f.id, f.url, length(f.data), f.expiration "
> "FROM moz_places h "
> "JOIN moz_favicons f ON h.favicon_id = f.id "
> "WHERE h.url = :page_url "
> "LIMIT 1"
> );
>
> Fetching a single favicon for a page shouldn't require running a join. Someone went a
> bit overboard with relational database design here, and we are paying the
> responsiveness price.

Maybe I like relational databases too much, but using a join here
seems perfectly reasonable to me. The DB should have a unique index
(primary key?) on moz_places.url, so to do the join, we look up the
one row in moz_places, then we look up the favicon corresponding to
the URL, again using an index. That's what you'd do in pretty much
any sane design, with or without a relational database.

Now, maybe SQLite executes this join poorly, either because it's not
so smart or because we're not feeding it the right statistics or
because our indexes are wrong. Perhaps it's too easy to footgun with
even simple SQLite queries like this; perhaps a different storage
back-end would have fewer opportunities for fail. Or maybe you think
we shouldn't be doing this lookup in the first place.

But I think it's wrong to look at the SQL statement, say that it's
complicated, and assert that we're paying a responsiveness price.

sdwilsh wrote:
> Bug 686025 was not a deadlock, it was lock contention (or at worst resource starvation).

Andreas is correct in that the query here took minutes to run. So
there were two bugs: That the query took forever, and also that it
blocked the UI. The latter is resource starvation and entirely
Mozilla's fail, but I think it's fair to say that the former is
related to SQLite. We wouldn't have had this problem if SQLite
automatically analyzed itself or used introspection to notice when the
assumptions baked into its query plan were way off. But such is the
cost of using a "lite" database.

-Justin

On Wed, Nov 16, 2011 at 2:17 AM, Andreas Gal <g...@mozilla.com> wrote:
> Hi Shawn,
>
> On Nov 15, 2011, at 10:59 PM, Shawn Wilsher wrote:
>
>> On 11/15/2011 6:50 PM, Andreas Gal wrote:
>>> First of all, our use of SQLite is probably really bad. The places code in particular is a total mess. It does stuff like use SQLite synchronously and asynchronously, creating all sorts of opportunities for deadlocks (bug 686025). Beyond that, the places code also uses an extremely complicated database schema for even the most trivial things, which causes complex queries and joins to be run for simple operations such as give-me-the-last-seen-fav-icon-for-a-page.
>> Bug 686025 was not a deadlock, it was lock contention (or at worst resource starvation).  Exaggerating the situation only makes this perception worse, FWIW, and doesn't help solve any problems.  I'd love to know more details on how the current places database schema is "extremely complicated" for the "most trivial things".  Making those claims without backing it up with data is also not helping and counter to the whole purpose of this thread.
>
> Pointing out that places merely spins over contended locks for minutes instead of deadlocking doesn't really increase my confidence in the code.
>
> As for places being overly complex, I think the database schema and the queries used to do trivial actions like favicon lookup do back that statement:
>
> http://people.mozilla.org/~dietrich/places-erd.png
>
>  nsCOMPtr<mozIStorageStatement> stmt = mDB->GetStatement(
>    "SELECT f.id, f.url, length(f.data), f.expiration "
>    "FROM moz_places h "
>    "JOIN moz_favicons f ON h.favicon_id = f.id "
>    "WHERE h.url = :page_url "
>    "LIMIT 1"
>  );
>
> Fetching a single favicon for a page shouldn't require running a join. Someone went a bit overboard with relational database design here, and we are paying the responsiveness price.
>
>>
>>> That having said, SQLite seems the wrong tool for us in some cases. It's a general-purpose database that is optimized to achieve good average-case performance, but arbitrarily bad worst-case performance. Bug 691509 has shown that we can get long hangs (minutes?) if we haven't run ANALYZE in a while and SQLite takes a bad query path. Many of the things we currently use SQLite for could instead be specialized implementations with predictable performance behavior and much less code involved. Favicons, for example, should be a simple circular cache on disk (we can easily fetch them again once we evict something).
>> Bug 691509 was us using SQLite in the worst possible way.  We'd run `ANALYZE` once, so all table counts would be very low.  Then, we'd add a bunch of data making those counts inaccurate.  Because SQLite had data saying that there were a small number of rows, it choose to do a table scan because it would have been more efficient if those numbers were accurate.
>>
>> I'll note that SQLite would have picked a perfectly workable query plan for these queries if we had never chosen to run `ANALYZE`.  Because we did, and populated it with the wrong data, we shot ourselves in the foot (think of `ANALYZE` as "trust me, I know what I'm doing").
>
> You aren't really replying to the argument here. Do you agree that we should target simple problems (favicon cache) with simple solutions with linear worst case performance?
>
>>

Marco Bonardo

Nov 16, 2011, 1:50:30 PM
to
On 16/11/2011 19:29, Justin Lebar wrote:
> but I think it's fair to say that the former is
> related to SQLite. We wouldn't have had this problem if SQLite
> automatically analyzed itself or used introspection to notice when the
> assumptions baked into its query plan were way off. But such is the
> cost of using a "lite" database.

Well, it's our fault for deciding to go down the ANALYZE path. We should not
have enabled ANALYZE, for a simple reason: it is intended to compensate for
poorly written queries and to give perf coverage of exotic data
distributions. But our queries (while sometimes complex due to schema
issues) are already optimized for the schema and for the most common data
distribution we have.
ANALYZE marginally helps exotic data distributions, but the drawback is that,
when it fails, the performance hit is so bad that it completely hides any
gains. That's why I filed bug 683876, to drive SQLite's choices based on
what we know.
I agree with you that, looking at the feature and the results in retrospect,
it's an error-prone feature; that was not so clear
originally.
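The footgun is easy to reproduce in a few lines (table and index names invented): ANALYZE snapshots row counts into sqlite_stat1 once, and nothing updates them as the table grows afterwards:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE moz_places (id INTEGER PRIMARY KEY, url TEXT)")
conn.execute("CREATE INDEX url_idx ON moz_places(url)")

# Run ANALYZE while the table holds only a handful of rows...
conn.executemany("INSERT INTO moz_places(url) VALUES (?)",
                 [(f"http://a.com/{i}",) for i in range(5)])
conn.execute("ANALYZE")

# ...then keep inserting. The recorded statistics are now stale.
conn.executemany("INSERT INTO moz_places(url) VALUES (?)",
                 [(f"http://b.com/{i}",) for i in range(5000)])

stat = conn.execute(
    "SELECT stat FROM sqlite_stat1 WHERE idx = 'url_idx'").fetchone()[0]
rows = conn.execute("SELECT count(*) FROM moz_places").fetchone()[0]
print(stat, rows)  # the stats still describe the 5-row table
```

The planner will keep costing queries as if the table had 5 rows until the next ANALYZE, which is exactly the bug-691509 shape.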

Cheers,
Marco

Justin Lebar

Nov 16, 2011, 2:16:13 PM
to Marco Bonardo, dev-pl...@lists.mozilla.org
> That's why I filed bug 683876, to drive SQLite choices based on what we
> know.

To be honest, hardcoding table statistics scares the hell out of me,
especially since we have no way to act on the case when our statistics
fail and we execute a bad plan, without releasing an update to the
binary.

Maybe you think that's unlikely to happen, and we'll get it right and
never break it. But there's more room for error here than if we used
ANALYZE, yes?

> We should have not
> enabled ANALYZE for a simple reason: it is something intended to address
> poorly written queries and give perf coverage of exotic data distributions.

The purpose of ANALYZE is so we don't have to hardcode table
statistics. What's an exotic data distribution to one db is
completely normal to another.

Now, maybe we can't run ANALYZE because it's too slow or whatever.
But I'd count "we have to manually add statistics for every table, and
if we get it wrong, queries may take forever" as a serious negative
against SQLite.

-Justin

Marco Bonardo

Nov 16, 2011, 3:28:15 PM
to
On 16/11/2011 20:16, Justin Lebar wrote:
> To be honest, hardcoding table statistics scares the hell out of me,
> especially since we have no way to act on the case when our statistics
> fail and we execute a bad plan, without releasing an update to the
> binary.

SQLite uses the indices based on these stats, if they exist (remember
you can avoid enabling it on a per-database basis). In some cases using
an index is slightly more expensive than a linear scan (since it
requires a btree traversal), and, with fresh stats, SQLite can know that and
avoid the index path. But those gains are on the order of a few milliseconds.
By contrast, if the stats are not up to date, wrongly doing a linear scan on a
table with thousands of entries may take _seconds_.
My point is that if we build queries on top of known stats, we know
when we write the query which path it will always take, so it's
unlikely to misbehave. We may lose some milliseconds in the cases where it
could have skipped the index, but it can't try to linearly scan a
table with thousands of entries; it will always take the path we designed it for.
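Concretely, the stats-table approach looks something like this sketch (invented names, and only the direction of bug 683876, not its actual patch). My reading of the SQLite docs is that `ANALYZE sqlite_master` reloads hand-edited stats without re-measuring anything:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE moz_places (id INTEGER PRIMARY KEY, url TEXT)")
conn.execute("CREATE INDEX url_idx ON moz_places(url)")

# Running ANALYZE once creates sqlite_stat1; then we overwrite it with
# the row counts we *expect*, rather than whatever was measured.
conn.execute("ANALYZE")
conn.execute("DELETE FROM sqlite_stat1")
conn.execute("INSERT INTO sqlite_stat1 (tbl, idx, stat) "
             "VALUES ('moz_places', 'url_idx', '100000 1')")
# Ask the planner to reload statistics from sqlite_stat1.
conn.execute("ANALYZE sqlite_master")

plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT id FROM moz_places WHERE url = ?",
    ("http://example.com/",)).fetchall()
print(plan)
```

With the stats claiming 100,000 indexed entries, a linear scan is never an attractive plan for this lookup.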

> Maybe you think that's unlikely to happen, and we'll get it right and
> never break it. But there's more room for error here than if we used
> ANALYZE, yes?

We have to pay attention to the query plan when building or changing
queries, but we already do.

The problems I'm trying to solve by forcing stats are:
1. we should know from the beginning which path a query will take.
2. ANALYZE is expensive, because SQLite has no quick path for
count(*), and ANALYZE does that for each table. We run it in the
background, but it's still more expensive than the potential
milliseconds we may save on querying.

> Now, maybe we can't run ANALYZE because it's too slow or whatever.
> But I'd count "we have to manually add statistics for every table, and
> if we get it wrong, queries may take forever" as a serious negative
> against SQLite.

IMO, ANALYZE adds the danger here. Without it you can run EXPLAIN QUERY
PLAN on your query and SQLite will tell you what the query does and what
it will always do. With it you can only guess what it may do. That's the
hidden danger in it.

Cheers,
Marco

Justin Lebar

Nov 16, 2011, 4:49:46 PM
to Marco Bonardo, dev-pl...@lists.mozilla.org
I guess my concern isn't so much with hardcoding a specific query plan
-- we're already relying on this, as you point out -- but rather that
the query plan is implied by a set of hardcoded numbers in a table
somewhere (plus, I guess, the set of indices on the table).

It seems like it'd be a lot less error-prone to explicitly call out
the query plan you want in the query itself. Then if an index ceases
to exist, presumably SQLite would complain, and you don't have to
worry about populating the statistics table with the correct values.
Maybe SQLite doesn't let you do this well enough, but if it's
important, maybe we can get this changed.

Put another way:

> The problems I'm trying to solve by forcing stats are:
> 1. we should know from the beginning which path a query will take.

I'm not convinced that the hardcoded stats solve this problem well.

How do we even know that the proposed statistics table is correct? We
don't have automated tests ensuring that we use the right plans for
all our queries, right? So it's possible sqlite will use a table scan
where we don't want it to, based on the stats we give it? And if
sqlite doesn't today, sqlite three versions from now might?

Alternatively, you could write an automated test which runs every
query in our code, then manually verify that the query plans are
right, and then test that those plans never change. And update the
test whenever you change the schema or the queries.

ANALYZE seems like a lot less work than that, if you can ensure that
the stats are relatively fresh. And if ANALYZE is too slow, perhaps
our SQLite contacts can work on this. But maybe ANALYZE isn't the
right solution. I'm just concerned about the viability of these
hardcoded stats and our ability to get them right.

-Justin

Marco Bonardo

Nov 16, 2011, 5:36:18 PM
to
On 16/11/2011 22:49, Justin Lebar wrote:
> I guess my concern isn't so much with hardcoding a specific query plan
> -- we're already relying on this, as you point out -- but rather that
> the query plan is implied by a set of hardcoded numbers in a table
> somewhere (plus, I guess, the set of indices on the table).

I explained in bug 683876 that imo the two approaches have the same
maintenance costs. I suppose you read that, anyone interested may go
looking there though and participate to the discussion.

> It seems like it'd be a lot less error-prone to explicitly call out
> the query plan you want in the query itself. Then if an index ceases
> to exist, presumably SQLite would complain

It would not; the '+' just means "please ignore any index on this
column". It's not mandatory, nor does it raise any kind of error.
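A quick sketch of the behavior Marco describes (invented table and index names). The unary '+' silently hides the index from the planner, with no diagnostic if the index later disappears:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE moz_places (id INTEGER PRIMARY KEY, url TEXT)")
conn.execute("CREATE INDEX url_idx ON moz_places(url)")

def plan(sql):
    """Return SQLite's query plan for sql as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(str(r) for r in rows)

# Plain equality on an indexed column: the planner uses the index.
p1 = plan("SELECT id FROM moz_places WHERE url = 'x'")

# Prefixing the column with unary '+' disqualifies the index, so the
# planner falls back to a full table scan -- silently.
p2 = plan("SELECT id FROM moz_places WHERE +url = 'x'")

print(p1)
print(p2)
```

The first plan mentions url_idx; the second degrades to a SCAN with no error raised, which is Marco's point about maintenance.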

> and you don't have to
> worry about populating the statistics table with the correct values.

But you have to go through each '+' to ensure everything still takes the
intended path. So both approaches require you to spend maintenance time
on schema changes.

> How do we even know that the proposed statistics table is correct?

Telemetry can give us much better stats than we need.
How would you put the '+' in the queries? Probably by guessing what you
want, plus running the query against databases you _think_ are
problematic. What happens if your problematic case is not realistic? You
have the same issue as with the stats table.

> So it's possible sqlite will use a table scan
> where we don't want it to, based on the stats we give it?

Well, once you say "I expect this index to have 1000 entries" it
won't even consider doing a linear scan. When you design the schema you
usually have some idea of the relations' sizes, and telemetry can
confirm those.

> And if
> sqlite doesn't today, sqlite three versions from now might?

Maybe, but it may just as well take a third path that you were not expecting
and bypass your index exclusion. This happened when we upgraded to 3.6.x,
for example. Index exclusion does not protect us from having to check that we
do things correctly on each optimizer update or schema change.

> Alternatively, you could write an automated test which runs every
> query in our code, then manually verify that the query plans are
> right, and then test that those plans never change. And update the
> test whenever you change the schema or the queries.

Yes, we need this, or better, we need regression tests on query
performance. Some queries are really simple, but some are not, and for
those it may be hard (just think of IndexedDB, where the query is
built dynamically, or the Places queries that animate the UI views).

> ANALYZE seems like a lot less work than that, if you can ensure that
> the stats are relatively fresh.

It's not, since you could always run it too late and end up with issues. Plus
it takes control away from you. You designed the queries to be
performant on a certain dataset; ANALYZE removes all of your control, and it
may do something you were not expecting, or that your schema is not
designed to handle. And keeping it up to date is more expensive than the
gain.
I think it's useful in those cases where you put a lot of data in a db
and then ship it with your app, like a car-parts database. A friend of
mine works on such software and indeed, in his case, it's useful, since
the db is unlikely to change between versions.
In our case the db changes so fast that ANALYZE can't cope with it.

Btw, both approaches have pros and cons. I don't see a clear winner as
of now, even if I prefer the stats-table approach for the reasons I
expressed in the bug.

-m

Justin Lebar

Nov 16, 2011, 6:15:15 PM
to Marco Bonardo, dev-pl...@lists.mozilla.org
We're basically in agreement here and in the bug. If we have solid
regression testing -- either by checking that the queries use the
right plans or that the speed of the queries on *a wide variety of
databases* doesn't regress -- then I'm happy as a clam with whatever
approach is taken to try to force the hand of the query optimizer.

Thanks, Marco.

-Justin

bent

Nov 17, 2011, 11:28:38 AM
to
On Nov 15, 11:17 pm, Andreas Gal <g...@mozilla.com> wrote:
> I am glad we agree. We should see if Richard can help us fix the performance problems short term, and mid-term we should eliminate SQLite from indexeddb.

Dr. Hipp has been a great help here, and I believe we now have a very
good handle on our previous performance problems (see bug 702889).

The short story is that our table design and our cursor queries were
structured in a way that SQLite was unable to optimize appropriately.
SQLite was, therefore, doing a slow linear scan for every iteration in
our cursors. Yuck. I've fixed up our table structure and our queries
now so that cursors are properly using indexes. I also corrected some
problems I found with cascading deletions so removing items from our
databases is now several orders of magnitude faster.
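The cascading-delete fix generalizes to a simple pattern (this is a hypothetical schema sketch, not the actual IndexedDB tables): enable foreign keys and index the child table's foreign-key column, or every cascade scans the child table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default
conn.executescript("""
    CREATE TABLE object_store (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE object_data (
        id INTEGER PRIMARY KEY,
        store_id INTEGER NOT NULL
            REFERENCES object_store(id) ON DELETE CASCADE,
        value BLOB
    );
    -- Without this index, each cascading delete linearly scans
    -- object_data looking for matching children.
    CREATE INDEX object_data_store_idx ON object_data(store_id);
""")

conn.execute("INSERT INTO object_store VALUES (1, 'books')")
conn.executemany("INSERT INTO object_data(store_id, value) VALUES (1, ?)",
                 [(b"x",)] * 100)

# Deleting the parent removes all 100 children via the indexed cascade.
conn.execute("DELETE FROM object_store WHERE id = 1")
remaining = conn.execute("SELECT count(*) FROM object_data").fetchone()[0]
print(remaining)  # 0
```

The "orders of magnitude" difference comes from turning that per-delete scan into an index lookup.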

In the short term I think these changes will make our IndexedDB
implementation very competitive with Chrome's IndexedDB implementation
(that uses LevelDB). I haven't seen any comparisons with Microsoft's
implementation yet, but that isn't expected until IE 10 anyway.

I think we can take several lessons from our experience using SQLite
for IndexedDB:

1. SQLite is quite obedient. It will do exactly what you ask it to do,
and that means that table design and query creation have to be
considered very carefully.
2. SQLite has a ton of knobs to tune performance characteristics.
There are many more things that we can tweak if performance is still
suboptimal, and we've only just now started investigating.
3. The SQLite developers are extremely responsive and helpful. They
are more than willing to help us fix our bad usage and they have been
very agreeable to adding additional features and APIs that we need.

Our goal in the beginning was to get the IndexedDB API into the wild
for feedback so that we could provide a stable alternative to WebSQL.
Performance tuning was always going to be a followup task, so it's a
little unfair to judge SQLite by IndexedDB's current usage. Now that
the API has solidified and we're turning our focus to performance we
can begin to ask whether or not SQLite is appropriate for our backend.

-Ben Turner

Robert Kaiser

Nov 17, 2011, 12:00:13 PM
to
Marco Bonardo wrote:
> Just to put this on the right perspective, we are going to completely
> rewrite the schema

From all I've heard, that's a great thing! Go places team!

Robert Kaiser


--
Note that any statements of mine - no matter how passionate - are never
meant to be offensive but very often as food for thought or possible
arguments that we as a community should think about. And most of the
time, I even appreciate irony and fun! :)

Marco Bonardo

Nov 17, 2011, 12:49:46 PM
to
On 17/11/2011 18:00, Robert Kaiser wrote:
> From all I've heard, that's a great thing! Go places team!

Does that team still exist? I think it can finally be allowed to die, after
many revivals.

The Places team briefly had 7 people 2 years ago (and that was already the
third Places team), refactoring the mess it had been, and
even though we shouted loudly about well-known architectural issues, it was
reduced to a 2-man show (actually we used to say 1.5, since
both sdwilsh and I had other responsibilities), and then a 2/3 (yes,
not even 1) man show. Luckily Dietrich, as the module owner, still
reviews my patches with short delays, so we don't slow down too much. I'm
puzzled to see people noticing only today that there may be issues and
a need for resources.
Luckily we got a lot of help from many nice people along the way, and
I'm happy about the gains so far, since I know very well the initial
situation and what could be done with those resources. And I'm
happy to see renewed will to help: people asking to rewrite stuff, and
actually doing it. It's not a simple task, due to how confusing
the original intent was (and what a mess the code was), and that's also why
we sometimes push on code quality.
I should blog more about the current status and plans (and maybe some
history of the module, for the curious), since there's a lot of
misinformation around.

Btw, I feel good part of the Firefox team nowadays, so go Firefox team!
Marco

Nicholas Nethercote

Nov 17, 2011, 6:55:22 PM
to dev-pl...@lists.mozilla.org, Richard Hipp
On Tue, Nov 15, 2011 at 5:54 PM, Nicholas Nethercote
<n.neth...@gmail.com> wrote:
>
> there's a one-line
> patch that makes SQLite discard cache pages more eagerly than it currently
> does.  This drastically reduces the memory usage -- in limited testing I saw
> a 2--3x reduction with smallish databases, but Richard said he saw 60MB+
> usage drop down to 3MB(!).  The slow-down associated with this was only
> 5--10% on SQLite performance tests, but it's unclear how those correlate to
> Firefox's SQLite performance.

I filed https://bugzilla.mozilla.org/show_bug.cgi?id=703427 about
this. It has details on the patch and some ideas on how to evaluate
the time and space effects.
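The bug-703427 patch itself isn't shown here, but the knob applications already have for bounding the page cache from their side is `PRAGMA cache_size` (a sketch; the eager-discard behavior discussed above is a separate, internal change):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A positive value is a page count; a negative value is interpreted as
# a cap in KiB. Here: at most ~2 MiB of page cache per connection.
conn.execute("PRAGMA cache_size = -2048")

size = conn.execute("PRAGMA cache_size").fetchone()[0]
print(size)
```

Capping cache_size bounds peak usage but, unlike the patch, does nothing to release pages early once they are resident.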

Nick

Robert Kaiser

Nov 18, 2011, 10:55:50 AM
to
Marco Bonardo wrote:
> On 17/11/2011 18:00, Robert Kaiser wrote:
>> From all I've heard, that's a great thing! Go places team!
>
> Does that team exist still?

OK, then it's "Go Marco!" instead. ;-)

I used "places team" not as an organizational unit but as "whoever it is
that works on places these days", didn't know if it was you alone or
more people.

Shawn Wilsher

Nov 19, 2011, 1:25:10 AM
to Andreas Gal, dev-pl...@lists.mozilla.org
On 11/15/2011 11:17 PM, Andreas Gal wrote:
> Pointing out that places merely spins over contended locks for minutes instead of deadlocking doesn't really increase my confidence in the code.
Sure, and you won't find me saying that it isn't a problem. Places
hasn't had much engineering attention, and the people working on it are
pretty much always in "fight fires" mode. I don't know how we could
have improved this situation even with hindsight (short of telling
Google we didn't want their code for places in the first place).

> Fetching a single favicon for a page shouldn't require running a join. Someone went a bit overboard with relational database design here, and we are paying the responsiveness price.
At first glance, this seems like a reasonable statement. However, it's
also rather naive. The problem is that *every* web page can specify a
favicon for itself via a meta tag. A lot of sites don't bother
differentiating on different pages, so we'd end up storing the same
favicon for all those pages. We'd lose the join, but we'd bloat our
database size storing the same image over and over again. This would
exacerbate other problems we've encountered and worked around relating
to database size.

> You aren't really replying to the argument here. Do you agree that we should target simple problems (favicon cache) with simple solutions with linear worst case performance?
Yes, in fact I didn't disagree with it above. I was merely pointing out
that your listed data didn't support your conclusion. I completely
agree that we use SQLite in areas of the codebase that we simply should not.

Cheers,

Shawn

Shawn Wilsher

Nov 19, 2011, 1:28:53 AM
to Mark Finkle, Richard Hipp, dev-pl...@lists.mozilla.org
On 11/16/2011 6:57 AM, Mark Finkle wrote:
> Technically, Fennec only removed the use of Places. SQLite is used for
> other features. In fact, Fennec just uses the Android OS SQLite-based
> history and bookmark systems now.
I still haven't seen data on why that decision was made despite seeing
more than one person ask for it.

> However, SQLite performance and memory usage are major concerns for Fennec.
Given that the Android history and bookmarking APIs use SQLite, I'm
confused why you are trading SQLite for...SQLite.

Cheers,

Shawn

Shawn Wilsher

Nov 19, 2011, 1:43:59 AM
to dev-pl...@lists.mozilla.org
On 11/16/2011 12:11 AM, Wolfgang Rosenauer wrote:
> - at some point I really think there were modifications or patches which
> were pulled from newer changes in mozilla's sqlite not available in
> the release version (cannot remember details and it's probably not
> important to track down anymore).
FWIW, as long as I've been the owner of storage (past 2-3 years) that
hasn't been the case. There was one instance recently [1] where this
was done without my knowledge, but luckily mak spotted it and reverted it.

Cheers,

Shawn

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=617115

Doug Turner

Nov 19, 2011, 1:45:35 AM
to Shawn Wilsher, Richard Hipp, Taras Glek, Mark Finkle, dev-pl...@lists.mozilla.org

On Nov 18, 2011, at 10:28 PM, Shawn Wilsher wrote:

> On 11/16/2011 6:57 AM, Mark Finkle wrote:
>> Technically, Fennec only removed the use of Places. SQLite is used for
>> other features. In fact, Fennec just uses the Android OS SQLite-based
>> history and bookmark systems now.
> I still haven't seen data on why that decision was made despite seeing more than one person ask for it.

The decision wasn't made because of the well-known perf and memory issues. Instead, we just needed the 'awesome' bar data available before Gecko is loaded. It seemed like the best approach was to steer clear of the places db (the schema frightens most people). We built something faster and simpler, something that can provide a UX similar to what the awesome bar in Firefox desktop does.

If you need memory usage data, we can provide that. about:memory has lots of info. For example, on my relatively new desktop profile I see:

├───69.46 MB (14.77%) -- storage
│   └──69.46 MB (14.77%) -- sqlite
│      ├──34.58 MB (07.35%) -- urlclassifier3.sqlite
│      │  ├──34.48 MB (07.33%) -- cache-used
│      │  └───0.10 MB (00.02%) -- (2 omitted)
│      ├──28.49 MB (06.06%) -- places.sqlite
│      │  ├──27.99 MB (05.95%) -- cache-used [4]
│      │  └───0.50 MB (00.11%) -- (2 omitted)

of course, we don't use the url classifier on mobile yet. gcp is making great progress on reducing memory for this. For Places, we have 27 MB used. This is unacceptable on mobile. It causes us to get killed way too often. By moving to a system-level database, we don't take that memory in our application.

I know you have done a ton of work to make IO a lot better. Taras may have new data on any jank caused by sqlite io. I am unaware of any that blocks us on Mobile.

Doug

Nicholas Nethercote

Nov 19, 2011, 2:17:09 AM
to Doug Turner, Richard Hipp, Taras Glek, Mark Finkle, dev-pl...@lists.mozilla.org, Shawn Wilsher
On Sat, Nov 19, 2011 at 5:45 PM, Doug Turner <do...@mozilla.com> wrote:
>
> If you need memory usage data, we can provide that.  about:memory has lots of into.  For example, on my relatively new desktop profile i see:
>
> ├───69.46 MB (14.77%) -- storage
> │   └──69.46 MB (14.77%) -- sqlite
> │      ├──34.58 MB (07.35%) -- urlclassifier3.sqlite
> │      │  ├──34.48 MB (07.33%) -- cache-used
> │      │  └───0.10 MB (00.02%) -- (2 omitted)
> │      ├──28.49 MB (06.06%) -- places.sqlite
> │      │  ├──27.99 MB (05.95%) -- cache-used [4]
> │      │  └───0.50 MB (00.11%) -- (2 omitted)

Please please please try the patch in
https://bugzilla.mozilla.org/show_bug.cgi?id=703427. For whatever
reason, my desktop profile rarely gets much over 10MB for total SQLite
usage, so it'd be great to have data on how the patch performs for a
profile like yours.

Nick

Rob Arnold

Nov 19, 2011, 3:02:43 PM
to Doug Turner, Richard Hipp, Taras Glek, Mark Finkle, dev-pl...@lists.mozilla.org, Shawn Wilsher
On Fri, Nov 18, 2011 at 10:45 PM, Doug Turner <do...@mozilla.com> wrote:

>
> On Nov 18, 2011, at 10:28 PM, Shawn Wilsher wrote:
>
> > On 11/16/2011 6:57 AM, Mark Finkle wrote:
> >> Technically, Fennec only removed the use of Places. SQLite is used for
> >> other features. In fact, Fennec just uses the Android OS SQLite-based
> >> history and bookmark systems now.
> > I still haven't seen data on why that decision was made despite seeing
> more than one person ask for it.
>
> > The decision wasn't made because of the well-known perf and memory
> > issues. Instead, we just needed the 'awesome' bar data available before
> gecko is loaded. It seemed like the best approach was to stay clear of the
> places db (the schema frightens most people). We built something faster
> and simpler, and something that can provide a similar UX that the awesome
> bar in Firefox desktop does.
>
> > If you need memory usage data, we can provide that. about:memory has lots
> > of info. For example, on my relatively new desktop profile i see:
>
> ├───69.46 MB (14.77%) -- storage
> │ └──69.46 MB (14.77%) -- sqlite
> │ ├──34.58 MB (07.35%) -- urlclassifier3.sqlite
> │ │ ├──34.48 MB (07.33%) -- cache-used
> │ │ └───0.10 MB (00.02%) -- (2 omitted)
> │ ├──28.49 MB (06.06%) -- places.sqlite
> │ │ ├──27.99 MB (05.95%) -- cache-used [4]
> │ │ └───0.50 MB (00.11%) -- (2 omitted)
>
> > of course, we don't use the url classifier on mobile yet. gcp is making
> > great progress on reducing memory for this. For places, we have 27 MBs used
> for places. This is unacceptable on mobile. It causes us to get killed
> way too often. Moving to a system level database, we don't take that
> memory in our application.
>

Without more data, I am skeptical of this "faster" claim when it comes to
using Android's sqlite. It's an outdated version of SQLite which has known
data-loss and performance bugs, and Mozilla can never update it to fix issues
like what the patch in bug 703427 addresses. The Browser app's database has
likely never been tested at the size of what Sync will throw at it - how does
it scale?

I can understand building a simpler version of bookmarks & history on top
of Mozilla's SQLite using known performance best practices, but why use the
system's SQLite, which has the same caching problem as Mozilla's current
version? Wouldn't that help cause the Browser app to be killed (which
presumably has all sorts of UX impact)? Isn't the memory used by the system
the same (or greater, since now Browser has to be running)?

-Rob

Shawn Wilsher

Nov 19, 2011, 4:40:52 PM
to Doug Turner, Richard Hipp, Taras Glek, Mark Finkle, dev-pl...@lists.mozilla.org
On 11/18/2011 10:45 PM, Doug Turner wrote:
> The decision wasn't made because of the well-known perf and memory issues. Instead, we just needed the 'awesome' bar data available before Gecko is loaded. It seemed like the best approach was to steer clear of the places db (the schema frightens most people). We built something faster and simpler, something that can provide a UX similar to what the awesome bar in Firefox desktop does.
Being afraid of the schema isn't a strong engineering argument, FWIW.
It wouldn't be terribly difficult (although it would involve a bunch of
code duplication) to write the location bar querying in native code to
work with Android's native UI. I personally believe there is value in
having the same algorithm for searching for both desktop and mobile so
the user gets the same experience.

> ├───69.46 MB (14.77%) -- storage
> │ └──69.46 MB (14.77%) -- sqlite
> │ ├──34.58 MB (07.35%) -- urlclassifier3.sqlite
> │ │ ├──34.48 MB (07.33%) -- cache-used
> │ │ └───0.10 MB (00.02%) -- (2 omitted)
> │ ├──28.49 MB (06.06%) -- places.sqlite
> │ │ ├──27.99 MB (05.95%) -- cache-used [4]
> │ │ └───0.50 MB (00.11%) -- (2 omitted)
>
> of course, we don't use the url classifier on mobile yet. gcp is making great process on reducing memory for this. For places, we have 27 MBs used for places. This is unacceptable on mobile. It causes us to get killed way too often. Moving to a system level database, we don't take that memory in our application.
That's not a good comparison since the cache size for places.sqlite
connections is set based on the size of the database, and we expire
faster on mobile than we do on the desktop.

Cheers,

Shawn

Doug Turner

Nov 19, 2011, 7:46:01 PM
to Shawn Wilsher, Richard Hipp, Taras Glek, Mark Finkle, dev-pl...@lists.mozilla.org

On Nov 19, 2011, at 1:40 PM, Shawn Wilsher wrote:

> On 11/18/2011 10:45 PM, Doug Turner wrote:
>> The decision wasn't made because of the well-known perf and memory issues. Instead, we just needed the 'awesome' bar data available before Gecko is loaded. It seemed like the best approach was to steer clear of the places db (the schema frightens most people). We built something faster and simpler, something that can provide a UX similar to what the awesome bar in Firefox desktop does.
> Being afraid of the schema isn't a strong engineering argument, FWIW. It wouldn't be terribly difficult (although it would involve a bunch of code duplication) to write the location bar querying in native code to work with Android's native UI. I personally believe there is value in having the same algorithm for searching for both desktop and mobile so the user gets the same experience.


fear is a good thing! it comes from experience: knowing that joins are usually bad, and knowing that complex systems tend to have more bugs than simpler systems. that is what I'm saying.

I do agree with you: we want the same experience across the products. However, we can't pay a 20 MB hit for that. I don't think I'd pay even a 2 MB premium on mobile for it. But there is some room there, sure.

Look, I am not an expert on our Places story -- you have more than your fair share of battle scars from it. Everyone appreciates your help making Firefox storage good.

Doug

Mike Hommey

unread,
Nov 20, 2011, 2:48:08 AM11/20/11
to Rob Arnold, Taras Glek, Mark Finkle, Richard Hipp, Doug Turner, dev-pl...@lists.mozilla.org, Shawn Wilsher
On Sat, Nov 19, 2011 at 12:02:43PM -0800, Rob Arnold wrote:
> On Fri, Nov 18, 2011 at 10:45 PM, Doug Turner <do...@mozilla.com> wrote:
>
> >
> > On Nov 18, 2011, at 10:28 PM, Shawn Wilsher wrote:
> >
> > > On 11/16/2011 6:57 AM, Mark Finkle wrote:
> > >> Technically, Fennec only removed the use of Places. SQLite is used for
> > >> other features. In fact, Fennec just uses the Android OS SQLite-based
> > >> history and bookmark systems now.
> > > I still haven't seen data on why that decision was made despite seeing
> > more than one person ask for it.
> >
> > The decision wasn't made because of the well-known perf and memory
> > issues. Instead, we just needed the 'awesome' bar data available before
> > gecko is loaded. It seemed like the best approach was to stay clear of the
> > places db (the schema frightens most people). We built something faster
> > and simpler, something that can provide a UX similar to what the awesome
> > bar in Firefox desktop does.
> >
> > If you need memory usage data, we can provide that. about:memory has lots
> > of info. For example, on my relatively new desktop profile I see:
> >
> > ├───69.46 MB (14.77%) -- storage
> > │ └──69.46 MB (14.77%) -- sqlite
> > │ ├──34.58 MB (07.35%) -- urlclassifier3.sqlite
> > │ │ ├──34.48 MB (07.33%) -- cache-used
> > │ │ └───0.10 MB (00.02%) -- (2 omitted)
> > │ ├──28.49 MB (06.06%) -- places.sqlite
> > │ │ ├──27.99 MB (05.95%) -- cache-used [4]
> > │ │ └───0.50 MB (00.11%) -- (2 omitted)
> >
> > of course, we don't use the url classifier on mobile yet. gcp is making
> > great progress on reducing memory for this. For places, we have 27 MB used
> > for places. This is unacceptable on mobile. It causes us to get killed
> > way too often. Moving to a system level database, we don't take that
> > memory in our application.
> >
>
> Without more data, I am skeptical of this faster claim when it comes to
> using Android's sqlite.

It's simple. If you want to use Gecko's storage API, you need to have
Gecko working. Currently, you don't get that before one second after
starting Firefox on fast devices. With native UI, you get that within
a few hundred milliseconds.

There is middle ground, though: we could use mozilla's sqlite without
Gecko's storage API, which would mean (re)writing a mozsqlite wrapper in
java, or a JNI glue. mozsqlite could be loaded early (before the rest of
Gecko), and shouldn't delay things too much. That would avoid problems
with the outdated system sqlite, but wouldn't make interaction with
the Android system history and bookmarks any easier.

Mike

Mike Hommey

unread,
Nov 20, 2011, 3:02:54 AM11/20/11
to Rob Arnold, Taras Glek, Mark Finkle, Richard Hipp, Doug Turner, dev-pl...@lists.mozilla.org, Shawn Wilsher
To be clearer, using system sqlite, you don't need to wait for Gecko to
access history and bookmarks.

Dao

unread,
Nov 20, 2011, 5:35:56 AM11/20/11
to
On 19.11.2011 21:02, Rob Arnold wrote:
> Without more data, I am skeptical of this faster claim when it comes to
> using Android's sqlite. It's an outdated version of sqlite which has known
> dataloss and performance bugs and Mozilla can never update it to fix issues
> like what the patch in 703427 addresses. Browser's database has likely
> never been tested at the size of what sync will throw at it - how does it
> scale?

Also, is it future-proof? Do we know whether Chrome on Android is going
to use that database and, if not, whether the database will stay around?

Dao

unread,
Nov 20, 2011, 5:37:24 AM11/20/11
to
On 20.11.2011 08:48, Mike Hommey wrote:
> It's simple. If you want to use Gecko's storage API, you need to have
> Gecko working. Currently, you don't get that before one second after
> starting Firefox on fast devices. With native UI, you get that within
> a few hundred milliseconds.
>
> There is middle ground, though: we could use mozilla's sqlite without
> Gecko's storage API, which would mean (re)writing a mozsqlite wrapper in
> java, or a JNI glue. mozsqlite could be loaded early (before the rest of
> Gecko), and shouldn't delay things too much. That would avoid problems
> with the outdated system sqlite, but wouldn't make interaction with
> the Android system history and bookmarks any easier.

Right... This assumes that we actually want to interact with the system
history and bookmarks, rather than just getting the UI up early.

Robert Kaiser

unread,
Nov 20, 2011, 10:26:45 AM11/20/11
to
Doug Turner schrieb:
> If you need memory usage data, we can provide that. about:memory has lots of info. For example, on my relatively new desktop profile I see:
>
> │ ├──28.49 MB (06.06%) -- places.sqlite
> │ │ ├──27.99 MB (05.95%) -- cache-used [4]
> │ │ └───0.50 MB (00.11%) -- (2 omitted)

Comparing desktop usage to mobile is not just inaccurate, I'd say, but
arguably unfair. You'd need to compare to a XUL UI Fennec build, which,
yes, has about:memory as well.

The very recent 11.0a1 Nightly build I have on my N9 (thanks to romaxa)
says 3.23 MB total for places.sqlite there. The one on my N900 (which
might be slightly older as it's the last Nightly we produced for that
device) says 5.37 MB for this one.

Both builds are well-synched with each other and all my tablet, laptop
and desktop installations of both Nightly and SeaMonkey.

Robert Kaiser

--
Note that any statements of mine - no matter how passionate - are never
meant to be offensive but very often meant as food for thought or as
possible arguments that we as a community need answers to.

Doug Turner

unread,
Nov 20, 2011, 2:04:58 PM11/20/11
to Mike Hommey, Taras Glek, Rob Arnold, Mark Finkle, Richard Hipp, dev-pl...@lists.mozilla.org, Shawn Wilsher

> It's simple. If you want to use Gecko's storage API, you need to have
> Gecko working. Currently, you don't get that before one second after
> starting Firefox on fast devices. With native UI, you get that within
> a few hundred milliseconds.
>
> There is middle ground, though: we could use mozilla's sqlite without
> Gecko's storage API, which would mean (re)writing a mozsqlite wrapper in
> java, or a JNI glue. mozsqlite could be loaded early (before the rest of
> Gecko), and shouldn't delay things too much. That would avoid problems
> with the outdated system sqlite, but wouldn't make interaction with
> the Android system history and bookmarks any easier.


Mike, exactly. We will probably be doing this eventually. It isn't clear that HC or ICS have the same format that we need. Either way, it is post our first release.

Thanks!

Henri Sivonen

unread,
Nov 21, 2011, 2:04:13 AM11/21/11
to dev-pl...@lists.mozilla.org
On Sun, Nov 20, 2011 at 9:04 PM, Doug Turner <do...@mozilla.com> wrote:
>> There is middle ground, though: we could use mozilla's sqlite without
>> Gecko's storage API, which would mean (re)writing a mozsqlite wrapper in
>> java, or a JNI glue. mozsqlite could be loaded early (before the rest of
>> Gecko), and shouldn't delay things too much. That would avoid problems
>> with the outdated system sqlite, but wouldn't make interaction with
>> the Android system history and bookmarks any easier.
>
> Mike, exactly.  We will probably be doing this eventually.  It isn't clear that HC or ICS have the same format that we need.  Either way, it is post our first release.

Doing this after the first release seems like a weird road map. If
this was done in the first release, there'd be no need to migrate data
or to bother users with Firefox and the stock browser contaminating
each other's history. If this is done later, there will have been two
data migrations (once from XUL Fennec's data store to the system data
store and once from the system data store to the Native Fennec data
store). Also two Awesomebar and Sync reimplementations instead of one.

--
Henri Sivonen
hsiv...@iki.fi
http://hsivonen.iki.fi/

Doug Turner

unread,
Nov 21, 2011, 2:15:24 AM11/21/11
to Henri Sivonen, dev-pl...@lists.mozilla.org
Hi Henri,
That might be true. We are on an aggressive release cycle -- hoping to have something by the first quarter of next year. Some things, like this, will have to wait. If you have time, feel free to start working on it!

Doug

Marco Bonardo

unread,
Nov 21, 2011, 5:14:30 AM11/21/11
to
On 19/11/2011 07:45, Doug Turner wrote:
> It seemed like the best approach was to stay clear of the places db (the schema frightens most people).

While working on a better schema, I looked at the schemas used by most
browsers around; they're not really much different. Chrome's is really
similar, for example, even in column names (and I can see why, since it
has been designed by the same persons).
If the Android system history schema is so much better, I'd love to see
it, since it may help improve ours. Do you know where the documentation
is?

> If you need memory usage data, we can provide that. about:memory has lots of info. For example, on my relatively new desktop profile I see:
>
> ├───69.46 MB (14.77%) -- storage
> │ └──69.46 MB (14.77%) -- sqlite

This data comes from Firefox 8 or 9. Current Aurora code has memory
limited to 4MB per connection, and there is an experimental patch that
brings memory usage down to about 3-4MB total.
Just to clarify, this memory usage was by design; it was a bad decision
made in the past, and most of it has since been reverted.
Good data would be memory usage of Nightly on mobile, and improving that
is fairly easy.
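For reference, the per-connection cap Marco mentions corresponds to SQLite's cache_size pragma. A minimal sketch using Python's bundled sqlite3 module (Gecko sets this through its storage layer rather than raw SQL like this):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# A negative cache_size is a limit in KiB rather than a page count,
# so -4096 caps this connection's page cache at roughly 4 MB.
conn.execute("PRAGMA cache_size = -4096")
limit = conn.execute("PRAGMA cache_size").fetchone()[0]
```

The cap applies per connection, which is why total usage scales with the number of open connections.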

-m

Marco Bonardo

unread,
Nov 21, 2011, 5:25:03 AM11/21/11
to
On 19/11/2011 08:17, Nicholas Nethercote wrote:
> Please please please try the patch in
> https://bugzilla.mozilla.org/show_bug.cgi?id=703427. For whatever
> reason, my desktop profile rarely gets much over 10MB for total SQLite
> usage, so it'd be great to have data on how the patch performs for a
> profile like yours.

Even without that patch (which we may well take in the mobile case,
even with the small perf regression, since mobile is much more feature
limited and memory has greater importance there), our memory usage on
code >= Aurora is much more limited.
-m

Robert Kaiser

unread,
Nov 21, 2011, 8:42:36 AM11/21/11
to
Robert Kaiser schrieb:
> The very recent 11.0a1 Nightly build I have on my N9 (thanks to romaxa)
> says 3.23 MB total for places.sqlite there. The one on my N900 (which
> might be slightly older as it's the last Nightly we produced for that
> device) says 5.37 MB for this one.
>
> Both builds are well-synched with each other and all my tablet, laptop
> and desktop installations of both Nightly and SeaMonkey.

One more data point: my (XUL) Nightly on the Android tablet shows 6.69
MB for places.sqlite after having run for quite some time. After a
Nightly update and restart, it says 0.50 MB; after using the
awesomescreen and navigating to a couple of pages, it goes to 3.89 MB.

AFAIK, Marco says we have even more room for optimizing that within the
boundaries of actually using Places. That sounds like a good idea to me,
but it also looks like numbers on mobile builds aren't nearly as bad as
what Doug found on his unspecified desktop build. :)

Robert Kaiser

--
Note that any statements of mine - no matter how passionate - are never
meant to be offensive but very often meant as food for thought or as
possible arguments that we as a community should think about.

Marco Bonardo

unread,
Nov 21, 2011, 9:39:55 AM11/21/11
to
On 21/11/2011 14:42, Robert Kaiser wrote:
> AFAIK, Marco says we have even more space for optimizing that within the
> boundaries of actually using places.

Yes, memory has already been improved on our side on Aurora and Nightly;
there is still room for optimizing it, though (both on the Places and
the SQLite side; we are investigating both, as Nicholas Nethercote
pointed out in other messages here).
I expect your measured use of about 6MB to be "acceptable", but
improvable by up to 50%, depending on the perf loss.
On desktop, my current usage on an 8GB memory system, with a 120MB
database, is 14MB; we may reduce it to something around 10MB, but
considering current size and gain it's a bit less prioritized on desktop.
Beta and Release are using the old "unlimited" cache, so they will
likely use a bunch more memory.
A good thing about our memory reporting of SQLite usage is that, since
we started using jemalloc, it is complete: every single used byte is
reported, including slop.

We fixed the longstanding issue with the WAL journal size, which is now
limited to 512KB on every platform.
We can further reduce fsyncs, which right now hit mostly on bookmark
changes. I'll file a bug about this simple task; I was discussing it
with the perf team just a couple of days ago.
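The 512KB WAL cap corresponds to SQLite's journal_size_limit pragma. A minimal sketch with Python's bundled sqlite3 module (the file path is illustrative; WAL needs an on-disk database, so :memory: won't do):

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.sqlite")
conn = sqlite3.connect(path)

# Switch to write-ahead logging; the pragma returns the resulting mode.
mode = conn.execute("PRAGMA journal_mode = WAL").fetchone()[0]

# Cap the -wal file at 512 KB: after a checkpoint, a WAL file larger
# than this limit is truncated back down. The value is in bytes.
limit = conn.execute("PRAGMA journal_size_limit = 524288").fetchone()[0]
```

Without the limit, the WAL file can stay as large as the busiest write burst that ever passed through it.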

Bookmarks performance has more room for optimization, while
optimizations to history and favicons may be more limited since they are
already async.
The largest history gain will likely come from using a monotonic frecency
algorithm, which Jesse Ruderman (and IIRC someone else before him)
suggested some time ago.
Since both history and favicons are already async, I'm currently removing
some bookmarks complication, so that we have less overhead in the most
common cases (the ones mobile cares about) and can move faster to a
redesigned schema. I have an ETA of Q1 2012 for a working prototype of a
new schema, which is similar to the Sync ETA for the async bookmarks and
icon synchronization work. Related improvements will likely land before
then, though, as soon as they are production-ready.

-m

Doug Turner

unread,
Nov 21, 2011, 10:27:59 AM11/21/11
to Marco Bonardo, dev-pl...@lists.mozilla.org
That is great news, Marco!
> _______________________________________________
> dev-platform mailing list
> dev-pl...@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-platform

Matt Brubeck

unread,
Nov 21, 2011, 11:59:59 AM11/21/11
to Dao
On 11/20/2011 02:35 AM, Dao wrote:
> On 19.11.2011 21:02, Rob Arnold wrote:
>> Without more data, I am skeptical of this faster claim when it comes to
>> using Android's sqlite.

As Doug said, the main reason for using a different datastore on Android
is not performance, but simply so that it can be accessed when Gecko is
not running. We need a standalone datastore, and Android has one
built-in so we are using it for now. There may be good reasons to
switch to a different datastore, and we can do that if necessary.

> Also, is it future safe? Do we know that Chrome on Android is going to
> use that database and if not that the database will stay around?

We're accessing the Android browser datastore through a documented and
supported ContentProvider[1] API, not by using SQLite directly. The
underlying implementation could change (and in fact, the underlying
schema *has* changed from one Android version to the next) without
directly impacting our app.

[1]:
http://developer.android.com/reference/android/content/ContentProvider.html

Marco Bonardo

unread,
Nov 21, 2011, 12:23:03 PM11/21/11
to
On 21/11/2011 17:59, Matt Brubeck wrote:
> We're accessing the Android browser datastore through a documented and
> supported ContentProvider[1] API, not by using SQLite directly. The
> underlying implementation could change (and in fact, the underlying
> schema *has* changed from one Android version to the next) without
> directly impacting our app.

I'm sure there are more reasons than performance; we should try to
reduce this fork so that those reasons can be reduced to a few
easy-solution cases.
I say this because I think that, from a code maintenance point of view,
it would be much less expensive to keep using Places and have a Sync
engine able to sync places<->system. That way you would avoid having to
rewrite and maintain a lot of code to duplicate functionality that we
already have; you would just maintain the Sync engine.
Btw, any data on what is missing on our side would help drive changes.

Do you have a link to the Android schema documentation? It would be a
useful datapoint for evaluating our future changes.

-m

Doug Turner

unread,
Nov 21, 2011, 12:29:08 PM11/21/11
to Marco Bonardo, dev-pl...@lists.mozilla.org
Marco, is the Places team signed up to support Places access outside of Gecko startup? We need the ability to access 'awesome' bar data from Java before Gecko has started. That is a must-have.

Doug

Marco Bonardo

unread,
Nov 21, 2011, 1:02:57 PM11/21/11
to Doug Turner, dev-pl...@lists.mozilla.org
On lunedì 21 novembre 2011 18:29:08, Doug Turner wrote:
> Marco, is the Places team signed up to support Places access outside of Gecko startup? We need the ability to access 'awesome' bar data from Java before Gecko has started. That is a must-have.

The "places team" as a separate entity does not exist anymore, it's
mostly me and dietrich (but we both also have other responsibilities),
Places mostly falls in the browser/toolkit bucket and at I'm the one
spending more time on its code and planning.

As a peer, I think I would not have issues with third party connections
reading the database, that is unlikely to create issues to us, each
connection in SQLite is pretty much indipendent and read-only
connections don't cause any sort of locking issues. It may run at any
time without causing us issues.

If this connection should also be able to write to the database, then
it may be a bit more complicated; it could indeed cause SQLITE_BUSY
locks, which would cause Places to fail when any of the synchronous APIs
are invoked while the system is doing updates (this is not solvable
until all synchronous APIs have been killed). Plus, it would have to be
careful about data coherence (tests can help here).
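The read-only-connection case can be sketched with Python's bundled sqlite3 module; the table and URL below are illustrative, but the mode=ro URI maps to SQLITE_OPEN_READONLY, so write attempts fail up front instead of contending for locks:

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "places-demo.sqlite")

# A writer, standing in for Gecko's own connection.
writer = sqlite3.connect(path)
writer.execute("CREATE TABLE moz_places (id INTEGER PRIMARY KEY, url TEXT)")
writer.execute("INSERT INTO moz_places (url) VALUES ('https://mozilla.org/')")
writer.commit()

# A second, read-only connection opened via a URI filename: it can query
# at any time, but any write attempt is rejected immediately.
reader = sqlite3.connect("file:%s?mode=ro" % path, uri=True)
urls = [row[0] for row in reader.execute("SELECT url FROM moz_places")]

try:
    reader.execute("INSERT INTO moz_places (url) VALUES ('https://example.com/')")
    write_allowed = True
except sqlite3.OperationalError:
    write_allowed = False
```

This is why the read-only variant is the simpler one: the reader can never hold a write lock, so it can never hand Places a SQLITE_BUSY.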

Staying with the read-only connection idea, which is the simpler one, I
can see just a couple of additional things to consider:

1. There should be team synchronization when we revise the schema, to
ensure the service can always read data from the current schema
version. This may be easy to automate, for example by adding a test in
Mobile or Sync that fails when someone revises the schema version in
places.sqlite but forgets to fix the Mobile/Sync code. Just fail loudly
and visibly.

2. If you want to use the system SQLite to read the database, it must be
new enough to be able to access it; for example, old versions (< 3.7.0,
so anything before Honeycomb) are missing WAL support, and when hitting
a WAL database they would bail out thinking it's a corrupt database. Not
using WAL, on the other hand, would mean fsyncing a lot. At that point
you may decide to scrap durability and have your system service ensure
the db is sane, or throw it away and rebuild otherwise.

-m