Asynchronous API for mozStorage

34 views
Skip to first unread message

Shawn Wilsher

unread,
May 12, 2008, 12:50:45 PM5/12/08
to
I'm looking for feedback on the proposed API (with sample code) for an
asynchronous API for mozStorage. What I've got so far can be found here:
http://developer.mozilla.org/en/docs/User_talk:Comrade693#Async_Storage_Statements

This is something we are looking at for Firefox.next, so the sooner you
give feedback, the better. :)

Cheers,

Shawn

Adam Kowalczyk

unread,
May 12, 2008, 1:24:01 PM5/12/08
to
Thanks for taking this on, Shawn!

I think we should make async API as similar to the synchronous one as
possible. What's the reason behind basing it on HTML5 SQL? Developers
are already accustomed to existing mozStorage APIs, so we shouldn't
deviate from them without a good reason. Learning curve aside, it would
create a big inconsistency. Let's take a look at the differences:

* AsyncTransaction is a completely different animal than a traditional
transaction.
* mozStorage APIs revolve around statement objects. There's no explicit
statement here.
* Parameters are bound in a different way
* Results are read in a different way

By the way, AsyncTransaction.execute() reads as if the transaction is
executed, not the statement. C.f. mozIStorageStatement.execute().

Another concern is the callback. Let's make it a single function instead
of an object implementing given methods. In the latter case you've got
to implement QI and the code that handles the statement is more
scattered. If you use many async statements, you usually make a
<code>switch</code> and forward the callbacks to separate dedicated
functions, anyway.

Here is the Shawn's example, adapted to use my idea of the API. Please
note that I have no knowledge of the internals of sqlite and mozStorage,
so I don't know how feasible this is.

// conn is a mozIStorageConnection
let sql = "SELECT * FROM moz_downloads WHERE state = ?1";
let stmt = conn.createStatement(sql);
stmt.bindInt32Parameter(0, Ci.nsIDownloadManager.DOWNLOAD_FINISHED);
stmt.executeAsync(callback);

function callback(aStatement, aResult, aError) {
if (aError)
// update UI accordingly

for (let row in aResult) {
// process data
// should be able to access data in a row by name or argument
number
}
}


Alternatively, the callback could be defined in the style of XMLHttpRequest.

stmt.onResultsReady = onResultsReady;
stmt.onError = onError;
stmt.executeAsync();

function onResultsRead(aStatement, aResults) {

}

function onError(aStatement, aError) {

}

- Adam


Sergey Yanovich

unread,
May 12, 2008, 3:21:32 PM5/12/08
to

I am extensively using mozIStorage interfaces in an accounting xul
application, that is approaching production readiness. Shawn, thanks for
this opportunity to present my point.

Async calls are a great idea!

Comments on the proposed interface:

1. Is there any technical reasons that require AsyncTransaction?

It seems to be redundant otherwise. The best practice is use sql
parameters, which are to be bound after the statement is already
prepared and its syntax is verified. The counter-proposal is to add
'executeAsync()' to 'mozIStorageStatement'.

2. Separate method to obtain ResultSet is a right thing. We can probably
also change 'mozIStorageStatement' synchronous calls API in the same
way. Currently, the first call of 'executeStep()' has a special EXECUTE
meaning, while all subsequent ones actually mean FETCH ROW. Separating
this two meanings may be a good idea.

Instead of passing ResultSet proxy to a callback, let's:
drop 'boolean executeStep();',
drop 'void reset();',
drop 'readonly attribute long state;'
add 'readonly attribute mozIStorageValueArray resultSet;'
to 'mozIStorageStatement', and stop subclassing 'mozIStorageValueArray'
there. In turn 'mozIStorageValueArray' will receive
'readonly attribute long state;',
'void fetchNext();', and
'void fetchNextAsync([optional] in Callback aCallback);'

So 'mozIStorageValueArray' (aka cursor) is separated from
'mozIStorageStatement'. The former can later be extended to provide
support for random access cursors. 'reset()' goes away entirely. It can
be invoked internally in 'execute()', 'executeAsync()' and 'finalize()'.

As a result, mozIStorage* will become more SQL-ish and friendlier to
other databases like MySQL or PostgreSQL. There is a legitimate question
"why do we care about other databases?". Here is one hypothetical
example: since we already keep a hefty part of user profile (history and
bookmarks) in a relational database, we can move the rest of valuable
data -- mainly passwords and extensions (yes! extensions) -- there as
well. So the profile will contain a database (or databases) and cached
data. This creates an opportunity for big organizations to use a central
database to store and manage extensions, web settings and browsing data
of their users. And if such a user logins simultaneously on several
workstations, no browsing data will be lost.

3. Finally, can we directly use nsIWebProgressListener2 as a callback
class in both 'executeAsync()' and 'fetchNextAsync()'?

This can be the first step of many to provide full database progress
feedback using progress bar. Long places operations (the main client of
storage) will clearly benefit from this feature.

4. And woooooohooooooo, as 3.0 RC1 has finally been tagged for building!

--
Sergey Yanovich

Shawn Wilsher

unread,
May 12, 2008, 4:33:50 PM5/12/08
to
Adam Kowalczyk wrote:
> Thanks for taking this on, Shawn!
>
> I think we should make async API as similar to the synchronous one as
> possible. What's the reason behind basing it on HTML5 SQL? Developers
> are already accustomed to existing mozStorage APIs, so we shouldn't
> deviate from them without a good reason. Learning curve aside, it would
> create a big inconsistency. Let's take a look at the differences:
I based it off the HTML SQL spec for two reasons:
(1) A lot of mozilla stuff is based off of web APIs. This lowers the
barrier to entry for new developers (who have done web work at least)
since they are generally somewhat familiar with existing APIs.
(2) I don't think we should create new APIs solely based on the existing
API. It's my opinion that the current API is rather clunky in areas,
and the proposed API was meant to fix some of that.

> * AsyncTransaction is a completely different animal than a traditional
> transaction.

Yes, and no. Each statement has an implicit transaction wrapped around
it - the whole statement either succeeds or fails. With that said, I'm
open to a better name on this.

> * mozStorage APIs revolve around statement objects. There's no explicit
> statement here.

I'm not sure this is a bad thing, but it is different.

> * Parameters are bound in a different way

It's my opinion that our current approach to binding parameters is
clunky and error-prone. I'd love to fix that, and that's what I'm
trying to do here.

> * Results are read in a different way

Same as above. Note that since this is targeting Firefox.next, which is
supposed to come out some time in the fall, we don't want to change
existing APIs. So we can add to our current API, but changing it should
be avoided if possible.

> By the way, AsyncTransaction.execute() reads as if the transaction is
> executed, not the statement. C.f. mozIStorageStatement.execute().

Implicit transaction again. Although, I do believe this could be done
without the AsyncTransaction object which would probably ease this
concern of yours (I had a reason for doing it with AsyncTransaction
because I shared your line of thinking initially. Something made me
change my mind, but I can't recall what that was currently. When I
figure it out again, I'll post).

> Another concern is the callback. Let's make it a single function instead
> of an object implementing given methods. In the latter case you've got
> to implement QI and the code that handles the statement is more
> scattered. If you use many async statements, you usually make a
> <code>switch</code> and forward the callbacks to separate dedicated
> functions, anyway.

Letting one method handle both errors and handling of an actual result
is a form of Logical Cohesion which is generally frowned upon (yeah, we
do it in the mozilla codebase in various places). Functional Cohesion
is preferred. I recommend reading Code Complete 2nd Ed., Chapter 7 for
more details (I imagine wikipedia has something on it too).

Additionally, JS consumers don't have to implement the QI code, and they
don't even need to provide an error or result handler if they don't want
to. XPConnect, as I understand it, will just throw
NS_ERROR_NOT_IMPLEMENTED if the back-end code tries to call it. The
back-end should be able to handle that gracefully.

> // conn is a mozIStorageConnection
> let sql = "SELECT * FROM moz_downloads WHERE state = ?1";
> let stmt = conn.createStatement(sql);
> stmt.bindInt32Parameter(0, Ci.nsIDownloadManager.DOWNLOAD_FINISHED);
> stmt.executeAsync(callback);

The problem with this approach is that preparing the statement (the
createStatement call) is often the most expensive operation for SELECT
statements. With the current API we can do that work on the background
thread[pool].

> Alternatively, the callback could be defined in the style of
> XMLHttpRequest.

The only reason why I don't like this approach is because it seems like
a lot more lines of code to accomplish the same thing. With that said,
I'm indifferent about it.


Thanks for the great feedback. Let me know if this addresses your concerns.

Cheers,

Shawn

Shawn Wilsher

unread,
May 12, 2008, 9:13:51 PM5/12/08
to
Sergey Yanovich wrote:
> 1. Is there any technical reasons that require AsyncTransaction?
At this point, I don't think so. I had a reason at some point, but I
cannot recall what it was. The only thing that might be useful is
leaving it in for the callback to handle, but then it should have some
properties on it, otherwise it seems useless (checking as to why the
HTML 5 spec did this would probably be useful too - I'll get on that).

> It seems to be redundant otherwise. The best practice is use sql
> parameters, which are to be bound after the statement is already
> prepared and its syntax is verified. The counter-proposal is to add
> 'executeAsync()' to 'mozIStorageStatement'.

I think that this might be the way to go, but I don't think we should
have them prepare it first (see previous post about that in reply to Adam).

> 2. Separate method to obtain ResultSet is a right thing. We can probably
> also change 'mozIStorageStatement' synchronous calls API in the same
> way. Currently, the first call of 'executeStep()' has a special EXECUTE
> meaning, while all subsequent ones actually mean FETCH ROW. Separating
> this two meanings may be a good idea.

I'd love to, but we don't want to have that much API churn for Firefox
3.1. For Firefox 4 we can look at that.

> 3. Finally, can we directly use nsIWebProgressListener2 as a callback
> class in both 'executeAsync()' and 'fetchNextAsync()'?

I'm not sure I understand how that is useful, so you'd have to elaborate
on that.

Cheers,

Shawn

Shawn Wilsher

unread,
May 12, 2008, 10:11:32 PM5/12/08
to
I've gone and incorporated some of the feedback from Adam and Sergey.
All changes are up on the wiki [1], but here's a summary:

(1) Dropped the AsyncTransaction interface, and moved its executeAsync
method onto mozIStorageConnection. There was no reason for the
interface to exist, and it's one less thing consumers have to worry about.

(2) Added a SQLError interface that will be passed back to the callback
for errors. There is, however, no definition as I'm not 100% sure what
we want to include when we notify that there was an error.

(3) Updated the sample code to go along with the new interface proposal.

Cheers,

Shawn

[1]
http://developer.mozilla.org/en/docs/User_talk:Comrade693#Async_Storage_Statements

Mook

unread,
May 13, 2008, 3:30:21 AM5/13/08
to
(I'm looking at the wiki, r71407)

Overall: It would probably be useful to explicitly state, near the top
of the wiki page, exactly which Mozilla version this is targeting -
sounds like Firefox.next / 1.9.nn?

Shawn Wilsher wrote:


> Adam Kowalczyk wrote:
>> // conn is a mozIStorageConnection
>> let sql = "SELECT * FROM moz_downloads WHERE state = ?1";
>> let stmt = conn.createStatement(sql);
>> stmt.bindInt32Parameter(0, Ci.nsIDownloadManager.DOWNLOAD_FINISHED);
>> stmt.executeAsync(callback);
> The problem with this approach is that preparing the statement (the
> createStatement call) is often the most expensive operation for SELECT
> statements. With the current API we can do that work on the background
> thread[pool].

On the other hand, this also means it's not possible to re-use statements.

I'm not quite sure how that second argument (the one with
the things to bind) can be implemented, given that XPIDL doesn't, as far
as I know, have method overloading. It seems to take either a JS Array,
Object, or a C++ Parameters thing that is to be determined. Would we
still get type-checking on the C++ side? (Hey, if you somehow manage to
get that, all the better.)

Of course, I know about nil about API design ;)

(Off-topic note-to-self: trigger + mozIStorageFunction = win)

--
Mook

Boris Zbarsky

unread,
May 13, 2008, 4:05:40 AM5/13/08
to
Mook wrote:
> I'm not quite sure how that second argument (the one with
> the things to bind) can be implemented, given that XPIDL doesn't, as far
> as I know, have method overloading.

While strictly speaking true, there are ways to get around this, esp. if all
your calls are from JS into C++. You can use nsIVariant, or declare no args in
the IDL and read them directly using XPConnect APIs.

-Boris

Sergey Yanovich

unread,
May 13, 2008, 7:54:03 AM5/13/08
to
Shawn Wilsher wrote:

> Sergey Yanovich wrote:
>> 2. Separate method to obtain ResultSet is a right thing. We can
>> probably also change 'mozIStorageStatement' synchronous calls API in
>> the same way. Currently, the first call of 'executeStep()' has a
>> special EXECUTE meaning, while all subsequent ones actually mean FETCH
>> ROW. Separating this two meanings may be a good idea.
> I'd love to, but we don't want to have that much API churn for Firefox
> 3.1. For Firefox 4 we can look at that.

Sure. I mistook Firefox.next as Firefox 4.

>> 3. Finally, can we directly use nsIWebProgressListener2 as a callback
>> class in both 'executeAsync()' and 'fetchNextAsync()'?
> I'm not sure I understand how that is useful, so you'd have to elaborate
> on that.

I had in mind whole-sale database access like when History -> Show All
History is invoked. Our current Library window doesn't contain progress
bar, because it would be useless, since the routine that populates the
main tree is synchronous. But things will change, once async call
becomes possible. We can then provide substantially better user
experience by doing async loading and showing progress in a progress bar.

However, in the context of a dot release, introducing this new
functionality to the mozIStorage* no longer seems a good idea.

Sergey Yanovich

unread,
May 13, 2008, 10:12:40 AM5/13/08
to
Shawn Wilsher wrote:
> I've gone and incorporated some of the feedback from Adam and Sergey.
> All changes are up on the wiki [1], but here's a summary:
>
> (1) Dropped the AsyncTransaction interface, and moved its executeAsync
> method onto mozIStorageConnection. There was no reason for the
> interface to exist, and it's one less thing consumers have to worry about.

A) I started to argue with the necessity of changing connection
interface, but then came around this:


> The problem with this approach is that preparing the statement (the createStatement call) is often the most expensive operation for SELECT statements.

This is a great argument in favor of creating a statement
asynchronously, but it is also an argument in favor of storing the
produced statement for re-use.

This may have been Shawn's motivation to introduce AsyncTransaction.
Nevertheless, it must be an async operation. A possible name could be
createStatementAsync(). Another important method of the statement is
finalize(), which is the only reliable way to free the database from JS.

B) In the view of createStatementAsync(), Parameters callback receives
additional task of storing the statement.

C) Finally, ResultSet behavior needs some clarification. From how it is
used in the example, it may be inferred, that getNextResult() is
synchronous. That means, all data needs to be fetched and saved locally
before handleResult() is invoked. A number of questions arise:

* When working the large amounts of data (like browsing history), this
may result is excessive memory consumption.

* Although the data is fetched row by row, nothing will be displayed,
until the last row is fetched.

* If there are errors in the end of a large list, nothing will be
displayed at all.

Proposed solution is to have API for both paths of execution:
1. mozIStorageConnection.executeSimpleAsync(in AUTF8String aSQL,
[optional] in Parameters aParameters, [optional] in Callback aCallback);

2. mozIStorageConnection.createStatementAsync(in in AUTF8String aSQL,
[optional] in Parameters aParameters), then
mozIStorageStatement.executeAsync(in Callback aCallback), then
mozIStorageValueArray.fetchNextAsync(in Callback aCallback) for each
row.

D) Callback needs to have separate methods handleAllResults/handleResult
respectively to handle 1. and 2.

--
Sergey Yanovich

Mike Shaver

unread,
May 13, 2008, 10:56:16 AM5/13/08
to Sergey Yanovich, dev-pl...@lists.mozilla.org
On Tue, May 13, 2008 at 10:12 AM, Sergey Yanovich <ynv...@gmail.com> wrote:
> > The problem with this approach is that preparing the statement (the createStatement call) is often the most expensive operation for SELECT statements.

This is true for very light-weight SELECT statements, to be clear, and
it only matters if you're running them many times.

> This is a great argument in favor of creating a statement
> asynchronously,

No -- it can't really block, and the additional complexity of using it
will be a huge developer cost for no gain.

> but it is also an argument in favor of storing the
> produced statement for re-use.

Yes.

:)

Mike

Shawn Wilsher

unread,
May 13, 2008, 11:24:08 AM5/13/08
to
Mook wrote:
> Overall: It would probably be useful to explicitly state, near the top
> of the wiki page, exactly which Mozilla version this is targeting -
> sounds like Firefox.next / 1.9.nn?
Done

>> The problem with this approach is that preparing the statement (the
>> createStatement call) is often the most expensive operation for SELECT
>> statements. With the current API we can do that work on the
>> background thread[pool].
> On the other hand, this also means it's not possible to re-use statements.

This is true, but I think I have a solution for that. I need to think
about it a bit more though, so that'll come in a later post.

> I'm not quite sure how that second argument (the one with
> the things to bind) can be implemented, given that XPIDL doesn't, as far
> as I know, have method overloading. It seems to take either a JS Array,
> Object, or a C++ Parameters thing that is to be determined. Would we
> still get type-checking on the C++ side? (Hey, if you somehow manage to
> get that, all the better.)

I've updated the wiki (revision 71419 [1]) to better detail this. I
think it's quite possible to do all this and still get type-checking.

Cheers,

Shawn

[1]
http://developer.mozilla.org/en/docs/index.php?title=User_talk:Comrade693&oldid=71419#Async_Storage_Statements

Shawn Wilsher

unread,
May 13, 2008, 12:11:03 PM5/13/08
to
Sergey Yanovich wrote:
> A) I started to argue with the necessity of changing connection
> interface, but then came around this:
>> The problem with this approach is that preparing the statement (the
>> createStatement call) is often the most expensive operation for SELECT
>> statements.
>
> This is a great argument in favor of creating a statement
> asynchronously, but it is also an argument in favor of storing the
> produced statement for re-use.
>
> This may have been Shawn's motivation to introduce AsyncTransaction.
> Nevertheless, it must be an async operation. A possible name could be
> createStatementAsync(). Another important method of the statement is
> finalize(), which is the only reliable way to free the database from JS.
I've put some thought to this, and I have what may be a big concern with
storing the statement to use it again - locks. A statement can only be
active at one time, so we would have to protect it's use with a lock to
make sure that multiple async calls were not trying to use it at the
same time. Now, we could do this, but it seems to me that the common
case would be that there wouldn't be more than one call at a time, so
introducing the locks adds overhead for the common case. We'd also have
to be calling to code outside the storage module while holding the lock,
which is a no-no.

With that said, I think my previous statement wasn't very complete.
With large, fragmented database files, stepping through the result can
also be expensive since it has to keep reading from various places on
the disk.

Additionally, while preparing the statement is work, it's not blocking
the UI, so I'm not sure we should be worrying about saving that work.
Perhaps I'm wrong on this though...

> B) In the view of createStatementAsync(), Parameters callback receives
> additional task of storing the statement.

That violates Functional Cohesion, which I'd prefer to not do.

> C) Finally, ResultSet behavior needs some clarification. From how it is
> used in the example, it may be inferred, that getNextResult() is
> synchronous. That means, all data needs to be fetched and saved locally
> before handleResult() is invoked. A number of questions arise:
>
> * When working the large amounts of data (like browsing history), this
> may result is excessive memory consumption.
>
> * Although the data is fetched row by row, nothing will be displayed,
> until the last row is fetched.
>
> * If there are errors in the end of a large list, nothing will be
> displayed at all.

I'm not sure I understand this point specifically.

> Proposed solution is to have API for both paths of execution:
> 1. mozIStorageConnection.executeSimpleAsync(in AUTF8String aSQL,
> [optional] in Parameters aParameters, [optional] in Callback aCallback);
>
> 2. mozIStorageConnection.createStatementAsync(in in AUTF8String aSQL,
> [optional] in Parameters aParameters), then
> mozIStorageStatement.executeAsync(in Callback aCallback), then
> mozIStorageValueArray.fetchNextAsync(in Callback aCallback) for each
> row.

Based on my previous comments in this reply (and the one I'm about to
make), I don't think that's needed.

> D) Callback needs to have separate methods handleAllResults/handleResult
> respectively to handle 1. and 2.

I do not believe that this is necessary. handleResult can handle one or
more rows at a time, and it can be called more than once. I've updated
the wiki page accordingly (revision 71426 [1])

This may seem odd, but at least one other API that I know of does that -
Threading Building Blocks with it's Body and Range concepts. A Body may
have its function operator called more than once on different ranges,
which is essentially what we are doing here.

Let me know if this doesn't clarify things enough.

Cheers,

Shawn


[1]
http://developer.mozilla.org/en/docs/index.php?title=User_talk:Comrade693&oldid=71426

Boris Zbarsky

unread,
May 13, 2008, 1:01:55 PM5/13/08
to
Shawn Wilsher wrote:
> I've put some thought to this, and I have what may be a big concern with
> storing the statement to use it again - locks. A statement can only be
> active at one time, so we would have to protect it's use with a lock to
> make sure that multiple async calls were not trying to use it at the
> same time.

I'm jumping into the middle of the thread here, but would the following work?
Store both the original string and a prepared statement. Have an unsigned int
counter initialized to 0.

An attempt to execute the statement does the following:

1) atomically increment the counter
2a) If the counter value is now 1, go ahead and execute
the prepared statement, then atomically decrement the counter.
2b) Otherwise, atomically decrement the counter, create a one-off
statement from the string, and execute it.

This maintains the invariant that any time we're executing the prepared
statement the counter is nonzero, so that no one else can enter the prepared
statement execution. In the common case, the only overhead over blindly
executing the prepared statement is that we have to do the increment/decrement
and one branch. If two execution attempts race, one will hit the fast path,
while the other hits the slow path. At least if the branch between 2a and 2b is
done on the return value of PR_AtomicIncrement.

> Additionally, while preparing the statement is work, it's not blocking
> the UI, so I'm not sure we should be worrying about saving that work.
> Perhaps I'm wrong on this though...

Are all statement preparations happening on the same thread? How many
statements do we expect to be in-flight at once? Depending on the answer to the
former, it seems to me that if there is the potential for lots of statement
preparation to be happening, we'll either be blocking the statement preparation
thread (causing UI lag in whatever operation is showing a progress bar to the
user while waiting for the query to complete) or spawn off lots of statement
preparation threads (and some will end up blocking on lack of execution units,
though that might become less of a problem as processors have more and more cores).

But in any case, if it's cheap enough it might be worth optimizing for something
other than just UI responsiveness here (e.g. power consumption? ;) ).

-Boris

Sergey Yanovich

unread,
May 13, 2008, 2:01:13 PM5/13/08
to
Shawn Wilsher wrote:
> Sergey Yanovich wrote:

Yeah, now I seem to have more questions, than before :)

1. What's the common use case(s) for async calls? Should the resulting
API be universal or optimized for common usage?

2. When is mozIStorageStatement.finalize() expected to be called, if at all?

3. How many times is handleResult() expected to be called, when the
query yields 10, 100, 10000 rows? Is ResultSet.getNextResult() synchronous?

The answers will let me be more specific with my comments.

Thanks,
--
Sergey Yanovich

Shawn Wilsher

unread,
May 13, 2008, 3:08:36 PM5/13/08
to
Sergey Yanovich wrote:
> 1. What's the common use case(s) for async calls? Should the resulting
> API be universal or optimized for common usage?
After discussing this at length with shaver online, I think we are going
to move in the direction suggested by Adam in his first post. As a
result, I think that makes this concern moot - correct me if I'm wrong.

> 2. When is mozIStorageStatement.finalize() expected to be called, if at
> all?

You don't have to ever call it. Internally, it's called in the
statements destructor so when your js var is garbage collected it is
taken care of. However, it was added because you couldn't move/delete
the database file if a connection was still open, and a connection stays
open as long as there is a statement that isn't finalized. Seamonkey
specifically needed this.

> 3. How many times is handleResult() expected to be called, when the
> query yields 10, 100, 10000 rows? Is ResultSet.getNextResult() synchronous?

I think defining an amount would be a poor idea. The back-end could
chunk it in time, or it could chunk it some other way. I think that
decision should be decided at implementation time and be hidden from the
interface.

As a result, I don't think ResultSet::getNextResult is synchronous in
terms of database access. I'm envisioning the back-end to dump the data
into an object that is passed back to the callback. If that's not what
you meant by synchronous, you'll have to elaborate more :)

Cheers,

Shawn

Shawn Wilsher

unread,
May 13, 2008, 4:20:41 PM5/13/08
to
OK, version 3.0 of the API is up on the wiki (revision 71435 [1]). The
changes incorporated feedback obtained so far, as well as discussion
with shaver on irc. In summary:

(1) executeAsync has been moved onto mozIStorageStatement. You can now
use a prepared statement that has parameters already bound with this API.

(2) As a result of (1), the fancy binding of parameters magic has been
removed. We might look at that after this is done for Firefox 3.1, but
it will likely make Firefox 4.

(3) Code samples updated accordingly.

Questions and feedback welcome :)

Cheers,

Shawn

[1]
http://developer.mozilla.org/en/docs/index.php?title=User_talk:Comrade693&oldid=71435#Async_Storage_Statements

Sergey Yanovich

unread,
May 14, 2008, 3:24:59 AM5/14/08
to
Shawn Wilsher wrote:
> Sergey Yanovich wrote:
>> 1. What's the common use case(s) for async calls? Should the resulting
>> API be universal or optimized for common usage?
> After discussing this at length with shaver online, I think we are going
> to move in the direction suggested by Adam in his first post. As a
> result, I think that makes this concern moot - correct me if I'm wrong.

Either way was good to me, so there was no concern :) Nevertheless, I
like the made choice more than the other.

>> 2. When is mozIStorageStatement.finalize() expected to be called, if
>> at all?
> You don't have to ever call it. Internally, it's called in the
> statements destructor so when your js var is garbage collected it is
> taken care of. However, it was added because you couldn't move/delete
> the database file if a connection was still open, and a connection stays
> open as long as there is a statement that isn't finalized. Seamonkey
> specifically needed this.

After the decision to keep creating statements old way, this concern is
void. Statement object will remain to be normally accessible from js.

But this answers brings us back to the question 1. What we are trying to
solve? Please correct me, but I've read in bugzilla, that lack of
asynchronous API in storage makes browsing experience really ugly, if
the profile resides on a NFS mount. If this case falls into the range of
issues being addressed, we could probably reconsider this decision.

>> 3. How many times is handleResult() expected to be called, when the
>> query yields 10, 100, 10000 rows? Is ResultSet.getNextResult()
>> synchronous?
> I think defining an amount would be a poor idea. The back-end could
> chunk it in time, or it could chunk it some other way. I think that
> decision should be decided at implementation time and be hidden from the
> interface.
>
> As a result, I don't think ResultSet::getNextResult is synchronous in
> terms of database access. I'm envisioning the back-end to dump the data
> into an object that is passed back to the callback. If that's not what
> you meant by synchronous, you'll have to elaborate more :)

I understand the word "synchronous" when used with a method as follows.
If the method returns after the job is done, it is synchronous. If the
job continues to run in a different thread/process, when the method
returns, it is asynchronous.

It is now crystal clear from your answer, that the result set is deemed
to be created before handelResult() is invoked, so
ResultSet::GetNextResult() is a synchronous method (which fetches
products of asynchronous database access :)).

Now I can rephrase my earlier concerns on how this approach can handle
large result sets, like browsing history:

* When working the large amounts of data (like browsing history), this

may result is excessive memory consumption and allocation overhead. If
the caller plans to store all the results in its own storage, the
storage will be allocated 2 times: in the back-end, and by the caller.

* Although the data is fetched row by row, nothing will be displayed,
until the last row is fetched.

* If there are errors in the end of a large list, nothing will be

displayed at all. F.e., if there is network failure on the row #9999 of
10000, the back-end will invoke handleError(), and all fetched data will
be lost. The data can be recovered in handleError() as a workaround, but
this will violate Cohesion, right?

With all that said, we are also going to have an accessible statement,
which is a mozIStorageValueArray instance. So ResultSet is completely
redundant in this case, because the result set can be normally iterated
with mozIStorageStatement.executeStep(). Correct?

My proposal is that we add a executeStepAsync(). When the caller knows,
that the result set may be huge, it will use this and expect to be
called once for each row.

To enforce Functional Cohesion, Callback will need 2 separate methods
for this two async calls. They may be handleResultSet (for all) and
handleResult (for one). Both should receive mozIStorageStatement as a
parameter.

Thanks,
--
Sergey Yanovich

Shawn Wilsher

unread,
May 14, 2008, 1:09:17 PM5/14/08
to
Sergey Yanovich wrote:
> But this answers brings us back to the question 1. What we are trying to
> solve? Please correct me, but I've read in bugzilla, that lack of
> asynchronous API in storage makes browsing experience really ugly, if
> the profile resides on a NFS mount. If this case falls into the range of
> issues being addressed, we could probably reconsider this decision.
The problem, especially in the NFS case, is the call to fsync, which can
take a while (much more of an issue with some file systems like ext3
[1]) and lock up the UI. Placing this call on a background thread means
it won't lock up the UI.

> Now I can rephrase my earlier concerns on how this approach can handle
> large result sets, like browsing history:
>
> * When working the large amounts of data (like browsing history), this
> may result is excessive memory consumption and allocation overhead. If
> the caller plans to store all the results in its own storage, the
> storage will be allocated 2 times: in the back-end, and by the caller.

This already happens with most other database API's, so I don't think
it's a big deal. Also, I don't expect most situations to be handling
large chunks of data. However, if this does turn out to be a problem,
we can always have the background task providing the data "slow down" as
it were, and wait for existing data to be processed before generating more.

> * Although the data is fetched row by row, nothing will be displayed,
> until the last row is fetched.

I'm not sure how you've come to that conclusion, but I don't believe it
to be true.

> * If there are errors in the end of a large list, nothing will be
> displayed at all. F.e., if there is network failure on the row #9999 of
> 10000, the back-end will invoke handleError(), and all fetched data will
> be lost. The data can be recovered in handleError() as a workaround, but
> this will violate Cohesion, right?

I don't see why we couldn't give the caller the data we have and then
invoke handleError. Also, many calls could have been made to
handleResult before we even hit that error.

> With all that said, we are also going to have an accessible statement,
> which is a mozIStorageValueArray instance. So ResultSet is completely
> redundant in this case, because the result set can be normally iterated
> with mozIStorageStatement.executeStep(). Correct?

We won't be exposing this though. Internally, in the background we'll
be calling it to generate ResultSet objects, but this shouldn't be
exposed to handleResult. The statement would already be consumed when
generating the ResultSet object. I *really* don't see the benefit of
exposing a mozIStorageStatement object to handleResult.

Cheers,

Shawn


[1] https://bugzilla.mozilla.org/show_bug.cgi?id=421482

Sergey Yanovich

unread,
May 14, 2008, 2:34:34 PM5/14/08
to
Shawn Wilsher wrote:
> Sergey Yanovich wrote:
>> * When working the large amounts of data (like browsing history), this
>> may result is excessive memory consumption and allocation overhead. If
>> the caller plans to store all the results in its own storage, the
>> storage will be allocated 2 times: in the back-end, and by the caller.
> This already happens with most other database API's, so I don't think
> it's a big deal. Also, I don't expect most situations to be handling
> large chunks of data. However, if this does turn out to be a problem,
> we can always have the background task providing the data "slow down" as
> it were, and wait for existing data to be processed before generating more.
>
>> * Although the data is fetched row by row, nothing will be displayed,
>> until the last row is fetched.
> I'm not sure how you've come to that conclusion, but I don't believe it
> to be true.

My understanding was that handleResult() is called once on success, and
handleError() is called once on failure. If this is not correct, then
there is a new problem. I need to know when everything is fetched, cause
I will eventually need to free statement/close connection.

>> * If there are errors in the end of a large list, nothing will be
>> displayed at all. F.e., if there is network failure on the row #9999
>> of 10000, the back-end will invoke handleError(), and all fetched data
>> will be lost. The data can be recovered in handleError() as a
>> workaround, but this will violate Cohesion, right?
> I don't see why we couldn't give the caller the data we have and then
> invoke handleError. Also, many calls could have been made to
> handleResult before we even hit that error.

I thought that callback is invoked just once. Otherwise it is not clear,
when the processing is complete.

>> With all that said, we are also going to have an accessible statement,
>> which is a mozIStorageValueArray instance. So ResultSet is completely
>> redundant in this case, because the result set can be normally
>> iterated with mozIStorageStatement.executeStep(). Correct?
> We won't be exposing this though. Internally, in the background we'll
> be calling it to generate ResultSet objects, but this shouldn't be
> exposed to handleResult. The statement would already be consumed when
> generating the ResultSet object. I *really* don't see the benefit of
> exposing a mozIStorageStatement object to handleResult.

Well, the statement is available to the client, whether we want it or
not. In addition, executeStep() is the current approach to enumerating
the query results. In other words, mozIStorageStatement is a complete
logical equivalent to the proposed ResultSet, and executeStep() is a
complete logical equivalent to the proposed getNextResult(). More than
that, executeStep() is synchronous just as getNextResult().

So we are back to question:


> 3. How many times is handleResult() expected to be called, when the query yields 10, 100, 10000 rows?

Now I see, that I didn't interpreted your answer correctly. My major
problem here is that I need to know when everything is fetched. What's
wrong with calling the callback for each fetched row? That's approach
for processing large result sets. For small result sets, we can have a
different function, which will return when everything is already
fetched. I propose to name them executeStepAsync() and executeAsync()
respectively, similar to how their synchronous analogs are called.

--
Sergey Yanovich

Mike Shaver

unread,
May 14, 2008, 3:35:46 PM5/14/08
to dev-pl...@lists.mozilla.org
On Wed, May 14, 2008 at 2:34 PM, Sergey Yanovich <ynv...@gmail.com> wrote:
> I thought that callback is invoked just once. Otherwise it is not clear,
> when the processing is complete.

The callback can be called multiple times, with whatever the "natural"
batching order is from the database. Early implementations might well
use 1 as the batch size.

We could send a flag with handleResult to indicate that it's the last
result set, or use .handleComplete() call. I prefer the latter,
because it also lets us signal multiple errors using a similar model,
but it's not a strong preference.

Mike

Shawn Wilsher

unread,
May 14, 2008, 3:36:21 PM5/14/08
to
Sergey Yanovich wrote:
> My understanding was that handleResult() is called once on success, and
> handleError() is called once on failure. If this is not correct, then
> there is a new problem. I need to know when everything is fetched, cause
> I will eventually need to free statement/close connection.
I thought I made this clear in an earlier post - it is stated explicetly
on the wiki page:
void handleResult(in ResultSet aResultSet);
Handles a successful execution of the statement. This function may be
called more than once with a different ResultSet each time representing
new data.

If you really need to know when it's all done, we could add another
method that looks something like void handleCompletion();

> Well, the statement is available to the client, whether we want it or
> not. In addition, executeStep() is the current approach to enumerating
> the query results. In other words, mozIStorageStatement is a complete
> logical equivalent to the proposed ResultSet, and executeStep() is a
> complete logical equivalent to the proposed getNextResult(). More than
> that, executeStep() is synchronous just as getNextResult().

How is the statement available to the client? They don't get it in the
async callback, and the api doesn't state that the stmt that
executeAsync is called on is the one that's actually used. In fact, in
terms of implementation we can't use the same statement - we'll have to
clone it and transfer the bindings.

The difference between executeStep and getNextResult is that the former
has to read from disk, while the latter reads from memory.

> So we are back to question:
>> 3. How many times is handleResult() expected to be called, when the
>> query yields 10, 100, 10000 rows?
>
> Now I see, that I didn't interpreted your answer correctly. My major
> problem here is that I need to know when everything is fetched. What's
> wrong with calling the callback for each fetched row? That's approach
> for processing large result sets. For small result sets, we can have a
> different function, which will return when everything is already
> fetched. I propose to name them executeStepAsync() and executeAsync()
> respectively, similar to how their synchronous analogs are called.

I don't see how getting a callback for each row would solve your problem
of knowing when everything is done. Getting a callback for each row
will perform much slower than just running the call synchronously since
you have lots and lots of event proxying. Not only that, we'd still
have to copy the results over into memory for each row, otherwise the
background thread is blocked waiting for the data to be consumed on the
calling thread before it can step through the results again.

Additionally, having one API for "large" data sets and another for
"small" data sets is just asking for problems. If we can make one API
that satisfies both cases, we don't leave a footgun lying around for
developers to come across and play with. I am not yet convinced that
what is in the wiki now will perform poorly with "large" data sets.

I've gone ahead and clarified the wording for executeAsync (revision
71470 [1]):
void executeAsync([optional] in Callback aCallback);
Executes a prepared mozIStorageStatement and calls the callback (if
provided) when a result is returned. This statement is then reset, and
can be reused synchronously or asynchronously again immediately.

Cheers,

Shawn


[1]
http://developer.mozilla.org/en/docs/index.php?title=User_talk:Comrade693&oldid=71470#mozIStorageStatement

joehrbek

unread,
May 14, 2008, 3:42:51 PM5/14/08
to
add python into the mix also? :)


On May 12, 11:50 am, Shawn Wilsher <sdwi...@mozilla.com> wrote:
> I'm looking for feedback on the proposed API (with sample code) for an

> asynchronous API for mozStorage.  What I've got so far can be found here:http://developer.mozilla.org/en/docs/User_talk:Comrade693#Async_Stora...

Shawn Wilsher

unread,
May 14, 2008, 3:43:52 PM5/14/08
to
Mike Shaver wrote:
> We could send a flag with handleResult to indicate that it's the last
> result set, or use .handleComplete() call. I prefer the latter,
> because it also lets us signal multiple errors using a similar model,
> but it's not a strong preference.
I agree with the latter as well, with a slight name change. Added to
the wiki (revision 71471 [1]):

void handleCompletion();
Called when there are now more results to be obtained. This is called
only once.

Cheers,

Shawn


[1]
http://developer.mozilla.org/en/docs/index.php?title=User_talk:Comrade693&oldid=71471#Callback

Shawn Wilsher

unread,
May 14, 2008, 3:59:01 PM5/14/08
to
joehrbek wrote:
> add python into the mix also? :)
Python isn't included in the platform in general (at least by default,
but I think someone has something working), so that would a bit difficult.

Cheers,

Shawn

Sergey Yanovich

unread,
May 15, 2008, 4:50:31 AM5/15/08
to
Shawn Wilsher wrote:
> If you really need to know when it's all done, we could add another
> method that looks something like void handleCompletion();

> void executeAsync([optional] in Callback aCallback);


> Executes a prepared mozIStorageStatement and calls the callback (if
> provided) when a result is returned. This statement is then reset, and
> can be reused synchronously or asynchronously again immediately.

Mike Shaver wrote:
> The callback can be called multiple times, with whatever the "natural"
> batching order is from the database. Early implementations might well
> use 1 as the batch size.

Callback looks great now. Thanks for cooperative approach to the things.
Natural batching order is exactly the way I wished to see, because it's
the fastest.

The last remaining point is asynchronous statement creation. The
database validates the statement, and it needs to read metadata from
disks for that. If the database is over NFS mount, and the metadata is
not cached in memory, it may well be a lengthy operation, blocking UI.

Synchronous API has mozIStorageConenction::executeSimple(), which
doesn't expose the statement. We may have

void executeSimpleAsync(in AUTF8Strings aSql, [optional] in Callback
aCallback);


with a similar behavior.

Cheers,
--
Sergey Yanovich

Mike Shaver

unread,
May 15, 2008, 11:16:10 AM5/15/08
to Sergey Yanovich, dev-pl...@lists.mozilla.org
On Thu, May 15, 2008 at 4:50 AM, Sergey Yanovich <ynv...@gmail.com> wrote:
> The last remaining point is asynchronous statement creation. The
> database validates the statement, and it needs to read metadata from
> disks for that. If the database is over NFS mount, and the metadata is
> not cached in memory, it may well be a lengthy operation, blocking UI.

Metadata is eagerly cached in memory, and IMO the additional API
complexity is not worth it. Do you find that your app is unreasonably
blocked in statement preparation? If an NFS mount goes away anything
can freeze, since we treat file: loading as synchronous (for
performance reasons as well as portability), so defending against it
in this one case with additional API baggage doesn't feel necessary to
me.

I'm an open-minded sort of guy, though -- do any other APIs provide an
async single-statement preparation API? Let's see which, if so, and
why.

Mike

Sergey Yanovich

unread,
May 15, 2008, 12:54:21 PM5/15/08
to
Mike Shaver wrote:
> Metadata is eagerly cached in memory, and IMO the additional API
> complexity is not worth it. Do you find that your app is unreasonably
> blocked in statement preparation?

Most of my application database access calls use parameters, so this is
not an immediate concern. However, I will be implementing mozIStorage*
binding for MySQL. Synchronous statement preparation on a remote server
will certainly be a problem, but I will handle that in my code.

> If an NFS mount goes away anything
> can freeze, since we treat file: loading as synchronous (for
> performance reasons as well as portability), so defending against it
> in this one case with additional API baggage doesn't feel necessary to
> me.

Correct, NFS normally takes care of making file operations asynchronous,
but sqlite is using fsync(), which is the cause of unresponsiveness. So
sqlite is the only place, where we need to handle this issue.

The first time, I've heard about async API was related to NFS profiles,
so I'm just trying to ensure this particular issue is addressed. If
that's considered a non-blocker, I am just fine with everything else :)

Grepping the tree for 'ecuteSimple' shows, it is heavily used to
create/drop tables/indexes. Index operations may take some time on large
tables, so switching them to async API will improve UI responsiveness,
and maybe Ts also.

Other than that, I am fine with the latest version. Thanks for taking
MHO into consideration.

Cheers,
--
Sergey Yanovich
Abstract Accounting Ltd.
http://aasii.org/

Dan Mosedale

unread,
May 15, 2008, 8:21:33 PM5/15/08
to
Shawn Wilsher wrote:
> I thought I made this clear in an earlier post - it is stated explicetly
> on the wiki page:
> void handleResult(in ResultSet aResultSet);
> Handles a successful execution of the statement. This function may be
> called more than once with a different ResultSet each time representing
> new data.
>
> If you really need to know when it's all done, we could add another
> method that looks something like void handleCompletion();
>

If you have a long-running statement that requires some sort of
UI-feedback (eg a progress meter), it'd be nice to know when to stop
that progress meter without having to poll.

Also, how does one abort a long-running statement?

Dan

Sergey Yanovich

unread,
May 16, 2008, 10:18:22 AM5/16/08
to
Dan Mosedale wrote:

> If you have a long-running statement that requires some sort of
> UI-feedback (eg a progress meter), it'd be nice to know when to stop
> that progress meter without having to poll.

Yeah, handleCompletion() should handle this case.

> Also, how does one abort a long-running statement?

I missed this point :). What about making handleResult() and
handleError() return boolean: true - go on, false (or error) - stop here?

--
Sergey Yanovich

Shawn Wilsher

unread,
May 16, 2008, 1:08:43 PM5/16/08
to
Sergey Yanovich wrote:
> I missed this point :). What about making handleResult() and
> handleError() return boolean: true - go on, false (or error) - stop here?
Sure, but consumers would have to be able to accept handleResult or
handleError calls for events that are already placed in the queue. I'm
not sure if it's possible to cancel events that have already been posted
to a thread.

Cheers,

Shawn

Sergey Yanovich

unread,
May 16, 2008, 1:19:45 PM5/16/08
to

So we have a queue, and the current call returns false. Can backend just
release the rest of the queue with callbacks?

--
Sergey Yanovich

Sergey Yanovich

unread,
May 16, 2008, 1:25:10 PM5/16/08
to
Sergey Yanovich wrote:
> So we have a queue, and the current call returns false. Can backend just
> release the rest of the queue with callbacks?

I wanted to say "without invoking callbacks". :)

Shawn Wilsher

unread,
May 16, 2008, 10:56:58 PM5/16/08
to
Scratch this. dmose and I were talking on irc about how this is a bad
API. Callers should be able to cancel at any time, with or without
getting information in the callback yet. executeAsync should return a
nsICancelable as a result that will cancel the execution immediately.

Cheers,

Shawn

Sergey Yanovich

unread,
May 17, 2008, 5:27:21 AM5/17/08
to
Shawn Wilsher wrote:
> Scratch this. dmose and I were talking on irc about how this is a bad
> API. Callers should be able to cancel at any time, with or without
> getting information in the callback yet. executeAsync should return a
> nsICancelable as a result that will cancel the execution immediately.

It was said in a earlier post, that callbacks will be called in the
natural batching order of the underlying database. AFAICT unless
nsICancelable is going to kill the background working thread with brute
force system calls, the cancellation will be happening at the same
points, where callbacks are invoked.

In other words, the proposed nsICancelable functionality may be
implemented by the client in the Callback, without complicating the API.
But either way will work.

Cheers,
--
Sergey Yanovich

Shawn Wilsher

unread,
May 17, 2008, 1:11:29 PM5/17/08
to
I've hit a bit of an implementation snafu. I was hoping that I could
use sqlite3_transfer_bindings to transfer the bindings on the prepared
statement to the one we clone for us on the background. However, I
checked with the sqlite folks, and that won't work [1].

As a result, we have two options:
(1) Keep track of what is bound internally. For statements that are
executed synchronously, this won't be much use, but we'll need it for
the async case. This could consume a lot of memory if someone is
binding a large blob too.

(2) Pass in an "object" of bound parameters like earlier proposals of
the API suggested.

I'm OK with either solution here, but I wanted to get the opinion of
other folks.

Cheers,

Shawn


[1] http://www.mail-archive.com/sqlite...@sqlite.org/msg33987.html

Shawn Wilsher

unread,
May 21, 2008, 4:12:08 PM5/21/08
to

It appears that this is no longer neccessary. One of my assumptions
about thread safety turned out to be incorrect for sqlite.

Cheers,

Shawn

Shawn Wilsher

unread,
May 28, 2008, 1:16:27 PM5/28/08
to
I've updated the API [1] once again to reflect what is currently
implemented in the bug. Feedback is welcome (I haven't heard much from
anyone in a while on this).

Cheers,

Shawn

[1]
http://developer.mozilla.org/en/docs/index.php?title=User_talk:Comrade693&oldid=71855#Async_Storage_Statements

Adam Kowalczyk

unread,
May 28, 2008, 5:59:37 PM5/28/08
to
Overall, things are looking pretty good to me. Some thoughts:

1) How much information is going to be provided to
storageIStatementCallback.handleError()? The SQL error code and the
corresponding error string, for sure. If that's all, then perhaps they
could be passed as parameters and we could get rid of storageIError. On
the other hand, having an object which you can throw may also be useful.

2) The established nomenclature in existing interfaces uses "row", not
"tuple". See mozIStorageStatementRow, mozIStorageStatementWrapper.row.
"Row" is also used in descriptions of other attributes and methods.

3) Which methods of the callback are used by non-SELECT statements?
handleError() and handleCompletion(), I assume?

4) I am not convinced of the value of separating the iterator
(storageIResultSet) and a result (storageITuple). The iterator could
easily provide the results, too. I know this approach causes horrible
overloading of the statement object in case of synchronous statements.
However, storageIResultSet doesn't share all the other functionality of
a statement, it only handles the results. Separating things further
seems like going from one extreme to the other.

The practical consequence for the consumers is a little less concise
code, because of an additional variable. Also, you have to choose
between assigning that variable inside |while|, which is considered a
bad practice by some people, and calling getNextTuple() in two different
places. [1]

Ideally, handleResult() would take a JS array of storageITuple's and
storageIResultSet wouldn't be needed.

[1] http://pastebin.mozilla.org/445406

Sergey Yanovich

unread,
May 28, 2008, 7:54:44 PM5/28/08
to
Adam Kowalczyk wrote:
> 1) How much information is going to be provided to
> storageIStatementCallback.handleError()? The SQL error code and the
> corresponding error string, for sure. If that's all, then perhaps they
> could be passed as parameters and we could get rid of storageIError. On
> the other hand, having an object which you can throw may also be useful.

I would go further and only return nsresult, at least for the beginning.
String manipulations are expensive, especially GC string manipulations
[1].

> 2) The established nomenclature in existing interfaces uses "row", not
> "tuple". See mozIStorageStatementRow, mozIStorageStatementWrapper.row.
> "Row" is also used in descriptions of other attributes and methods.

++here. And do we really need 'storageITutle'? Maybe we can just extend
mozIStorageValueArray to have those nice two methods?

> 4) I am not convinced of the value of separating the iterator
> (storageIResultSet) and a result (storageITuple). The iterator could
> easily provide the results, too. I know this approach causes horrible
> overloading of the statement object in case of synchronous statements.
> However, storageIResultSet doesn't share all the other functionality of
> a statement, it only handles the results. Separating things further
> seems like going from one extreme to the other.
>
> The practical consequence for the consumers is a little less concise
> code, because of an additional variable. Also, you have to choose
> between assigning that variable inside |while|, which is considered a
> bad practice by some people, and calling getNextTuple() in two different
> places. [1]

I had the same concerns, and was even proposing to use ExecuteStep()
here. But ExecuteStep() is begging to be replaced, and we are discussing
the new API. So we can just have the things right the first time.

> Ideally, handleResult() would take a JS array of storageITuple's and
> storageIResultSet wouldn't be needed.

We cannot communicate JS array via XPCOM to native callers, can we?

--
Sergey Yanovich

Shawn Wilsher

unread,
May 29, 2008, 11:15:13 AM5/29/08
to
Sergey Yanovich wrote:
> I would go further and only return nsresult, at least for the beginning.
> String manipulations are expensive, especially GC string manipulations
> [1].
Why not both in that case? Does it really hurt?

> ++here. And do we really need 'storageITutle'? Maybe we can just extend
> mozIStorageValueArray to have those nice two methods?

Can't - see Bug 435994 Comment 5 [1] for the detailed story as to why.

> We cannot communicate JS array via XPCOM to native callers, can we?

Not that I'm aware of.

Cheers,

Shawn

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=435994#c5

Mike Shaver

unread,
May 29, 2008, 11:22:21 AM5/29/08
to Shawn Wilsher, dev-pl...@lists.mozilla.org
On Thu, May 29, 2008 at 11:15 AM, Shawn Wilsher <sdw...@mozilla.com> wrote:
> Sergey Yanovich wrote:
>> I would go further and only return nsresult, at least for the beginning.
>> String manipulations are expensive, especially GC string manipulations
>> [1].
> Why not both in that case? Does it really hurt?

Error-case performance isn't something I think we need to be too
concerned with; if the error checking is in your hot path, you have
problems that the API isn't going to help you with.

Mike

Shawn Wilsher

unread,
May 29, 2008, 11:34:27 AM5/29/08
to
Adam Kowalczyk wrote:
> 1) How much information is going to be provided to
> storageIStatementCallback.handleError()? The SQL error code and the
> corresponding error string, for sure. If that's all, then perhaps they
> could be passed as parameters and we could get rid of storageIError. On
> the other hand, having an object which you can throw may also be useful.
Those two things for sure, but otherwise I do not know. Personally, I'd
prefer to use the object since we use objects for most other callbacks,
and the object will scale better if we decided to add more information
later.

> 2) The established nomenclature in existing interfaces uses "row", not
> "tuple". See mozIStorageStatementRow, mozIStorageStatementWrapper.row.
> "Row" is also used in descriptions of other attributes and methods.

It seems like vlad would prefer row over tuple as well. I'm fairly
indifferent about this.

> 3) Which methods of the callback are used by non-SELECT statements?
> handleError() and handleCompletion(), I assume?

That is likely the case.

> 4) I am not convinced of the value of separating the iterator
> (storageIResultSet) and a result (storageITuple). The iterator could
> easily provide the results, too. I know this approach causes horrible
> overloading of the statement object in case of synchronous statements.
> However, storageIResultSet doesn't share all the other functionality of
> a statement, it only handles the results. Separating things further
> seems like going from one extreme to the other.

I really don't like the idea of overloading an objects meaning like
that. I know we do that already with mozIStorageStatement, but it
doesn't mean we should continue to do that if it can be avoided.

> The practical consequence for the consumers is a little less concise
> code, because of an additional variable. Also, you have to choose
> between assigning that variable inside |while|, which is considered a
> bad practice by some people, and calling getNextTuple() in two different
> places. [1]

Not sure I'm buying this argument. What's wrong with:
handleResult: function(aResultSet)
{
for (let tuple = aResultSet.getNextTuple(); tuple;
tuple = aResultSet.getNextTuple()) {
// Stuff
}
}
or, even cleaner:
handleResult: function(aResultSet)
{
let tuple;
while ((tuple = aResultSet.getNextTuple()) {
// Stuff
}
}

Like any good rule, there are always exceptions to it, and in this case,
I don't see why assigning in while is bad.

> Ideally, handleResult() would take a JS array of storageITuple's and
> storageIResultSet wouldn't be needed.

That wouldn't work well for native code.

Cheers,

Shawn

Mike Shaver

unread,
May 29, 2008, 11:54:34 AM5/29/08
to Shawn Wilsher, dev-pl...@lists.mozilla.org
On Thu, May 29, 2008 at 11:34 AM, Shawn Wilsher <sdw...@mozilla.com> wrote:
>> Ideally, handleResult() would take a JS array of storageITuple's and
>> storageIResultSet wouldn't be needed.
> That wouldn't work well for native code.

We can, and in cases where the win is big enough should, specialize
for different consumer languages.

But an array of interface pointers isn't too hard to work with from
native code, IIRC, as long as it's an in-param.

Mike

Shawn Wilsher

unread,
May 29, 2008, 12:05:49 PM5/29/08
to
Mike Shaver wrote:
> We can, and in cases where the win is big enough should, specialize
> for different consumer languages.
Right, and I strongly support that. I guess the question at this point
is do we want to use an enumerator for this API (like the current
proposal), or pass an array type structure?

> But an array of interface pointers isn't too hard to work with from
> native code, IIRC, as long as it's an in-param.

That involves passing the data and the size? Does that turn into a JS
array in JS land, or no?

Cheers,

Shawn

Sergey Yanovich

unread,
May 29, 2008, 3:03:56 PM5/29/08
to
Mike Shaver wrote:
> We can, and in cases where the win is big enough should, specialize
> for different consumer languages.

In our case, specializing for JS in mozStorage will add
mozStorage-over-JS dependency, right? Is that worth it?

--
Sergey Yanovich

Mike Shaver

unread,
May 29, 2008, 3:34:22 PM5/29/08
to Sergey Yanovich, dev-pl...@lists.mozilla.org

It's already built after JS and XPConnect, and I don't think it's a
design goal at all to have parts of toolkit work without JS and
XPConnect present.

Mike

Benjamin Smedberg

unread,
May 29, 2008, 3:46:39 PM5/29/08
to

Actually no, storage is currently built as part of tier_necko, which is
before gecko or toolkit.

But I agree, we should not base design decisions on that kind of API purity,
especially because we want to move towards clean JS APIs as our primary
programming interfaces.

--BDS

Shawn Wilsher

unread,
May 30, 2008, 11:07:16 AM5/30/08
to
While trying to test cancelability, I've realized an API change might be
in call. In paticular, I think changing the signature of
handleCompletion is needed:

const unsigned short REASON_COMPLETED = 0;
const unsigned short REASON_CANCELED = 1;
const unsigned short REASON_ERROR = 2;

void handleCompletion(in unsigned short aReason);

Problem with this is that I'm still not sure that I can reliably test
this code. The issue is that you could cancel at just the right time,
and the event to notify you about completion would already be posted to
the calling thread, so you expect to get a REASON_CANCELED, but get
REASON_COMPLETED instead. This will make the test unreliable :(

Maybe I'm missing something here though, so I'm hoping the masses have
some other insight.

Cheers,

Shawn

Sergey Yanovich

unread,
May 30, 2008, 11:36:01 AM5/30/08
to
Shawn Wilsher wrote:
> While trying to test cancelability, I've realized an API change might be
> in call. In paticular, I think changing the signature of
> handleCompletion is needed:
>
> const unsigned short REASON_COMPLETED = 0;
> const unsigned short REASON_CANCELED = 1;
> const unsigned short REASON_ERROR = 2;
>
> void handleCompletion(in unsigned short aReason);

It may also be good to know which request is completed.

> Problem with this is that I'm still not sure that I can reliably test
> this code. The issue is that you could cancel at just the right time,
> and the event to notify you about completion would already be posted to
> the calling thread, so you expect to get a REASON_CANCELED, but get
> REASON_COMPLETED instead. This will make the test unreliable :(

Don't post events directly, but to a queue. Call a queue handler on the
callback thread. The handler can access the queue with some mutex
protection, and invoke the callback. The canceling routine will also use
mutex to clean up the queue. And if the handler sees an empty queue, it
silently exits.

--
Sergey Yanovich

Shawn Wilsher

unread,
May 30, 2008, 12:02:47 PM5/30/08
to
Sergey Yanovich wrote:
> It may also be good to know which request is completed.
How? We don't tell you what request you are getting information on with
any of the other callback functions either.

> Don't post events directly, but to a queue. Call a queue handler on the
> callback thread. The handler can access the queue with some mutex
> protection, and invoke the callback. The canceling routine will also use
> mutex to clean up the queue. And if the handler sees an empty queue, it
> silently exits.

I see two issues with that approach:
1) It's a no-no to call code outside your module while you hold a mutex.
2) That still doesn't cover the case when it's actually completed and
you try to cancel. Telling the callback that it's completed because
it's canceled, while it may be true, isn't completely true because it
was in fact completed because it was complete.

Cheers,

Shawn

Sergey Yanovich

unread,
May 30, 2008, 12:22:48 PM5/30/08
to
Shawn Wilsher wrote:
> Sergey Yanovich wrote:
>> It may also be good to know which request is completed.
> How? We don't tell you what request you are getting information on with
> any of the other callback functions either.

Yeah. This can be achieved by a separate Callback object for each request.

>> Don't post events directly, but to a queue. Call a queue handler on
>> the callback thread. The handler can access the queue with some mutex
>> protection, and invoke the callback. The canceling routine will also
>> use mutex to clean up the queue. And if the handler sees an empty
>> queue, it silently exits.
> I see two issues with that approach:
> 1) It's a no-no to call code outside your module while you hold a mutex.

You don't need to to hold the mutex while invoking clients callback.
Only for queue access.

> 2) That still doesn't cover the case when it's actually completed and
> you try to cancel. Telling the callback that it's completed because
> it's canceled, while it may be true, isn't completely true because it
> was in fact completed because it was complete.

It actually does. The canceling code will know for sure whether the
request is completed. So the canceling may provide this type of feedback.

--
Sergey Yanovich

Shawn Wilsher

unread,
May 30, 2008, 1:10:51 PM5/30/08
to
Sergey Yanovich wrote:
> You don't need to to hold the mutex while invoking clients callback.
> Only for queue access.
Ah, yes - got it. I don't think I understood you correctly the first time.

I'll work on doing this now.

Cheers,

Shawn

Shawn Wilsher

unread,
May 30, 2008, 3:52:11 PM5/30/08
to
Sergey Yanovich wrote:
> It actually does. The canceling code will know for sure whether the
> request is completed. So the canceling may provide this type of feedback.
Actually, if we just track what state we are in better, we don't even
need to do the other suggestion of yours. As long as we know what state
we are in, we'll know if we have already dispatched the completion event
immediately, and can indicate so if execution is canceled (by throwing,
which sorta sucks)

Cheers,

Shawn

Sergey Yanovich

unread,
May 30, 2008, 4:23:34 PM5/30/08
to

Absolutely. That's exactly what I mean. And we should avoid throwing in
the normal flow of execution whenever possible. JS exceptions are
terribly expensive. It triggers memory scan for catch blocks in JS, then
the exception goes through xpconnect handling, which is long and
sub-optimal, because noone normally cares about error codepaths.

--
Sergey Yanovich

Dan Mosedale

unread,
May 30, 2008, 4:40:36 PM5/30/08
to
Shawn Wilsher wrote:
> Sergey Yanovich wrote:
>> It may also be good to know which request is completed.

Agreed; this allows a single object to listen for multiple outstanding
requests.

> How? We don't tell you what request you are getting information on with
> any of the other callback functions either.

Have all of the handle*() methods pass back the mozIStorageStatement
that triggered them as a param?

Dan

Shawn Wilsher

unread,
May 30, 2008, 4:39:25 PM5/30/08
to
Sergey Yanovich wrote:
> Absolutely. That's exactly what I mean. And we should avoid throwing in
> the normal flow of execution whenever possible. JS exceptions are
> terribly expensive. It triggers memory scan for catch blocks in JS, then
> the exception goes through xpconnect handling, which is long and
> sub-optimal, because noone normally cares about error codepaths.
We may want to change the API from nsICancelable to our own cancelable
thing then that returns a boolean if it worked or not?

Cheers,

Shawn

Shawn Wilsher

unread,
May 30, 2008, 5:00:30 PM5/30/08
to
Dan Mosedale wrote:
>> Sergey Yanovich wrote:
>>> It may also be good to know which request is completed.
>
> Agreed; this allows a single object to listen for multiple outstanding
> requests.
That's not really a good programming practice, now is it? Overloading
an object to do more than one thing?

>> How? We don't tell you what request you are getting information on with
>> any of the other callback functions either.
>
> Have all of the handle*() methods pass back the mozIStorageStatement
> that triggered them as a param?

Right, but the problem is that we don't use the same one you prepared.
We clone it and use it on the background since statements cannot be used
on different threads at the same time. Also, you would be unable to
use the statement on the calling thread until the background thread was
done with it. This isn't so useful if you want to do a series of async
events with the same statement.

Cheers,

Shawn

Sergey Yanovich

unread,
May 30, 2008, 5:22:37 PM5/30/08
to

I think so, but it depends on whether canceling a completed request is
considered a normal or not ;) If it is, it may even be worth to change
nsICancelable.

--
Sergey Yanovich

Dan Mosedale

unread,
May 30, 2008, 5:24:08 PM5/30/08
to
Shawn Wilsher wrote:
> Dan Mosedale wrote:
>>> Sergey Yanovich wrote:
>>>> It may also be good to know which request is completed.
>>
>> Agreed; this allows a single object to listen for multiple outstanding
>> requests.
>>
> That's not really a good programming practice, now is it? Overloading an
> object to do more than one thing?

I don't have a particularly strong opinion there. My hunch is that it's
somewhat a case-by-case sort of thing, but I'd be interested in hearing
other thoughts there.

>>> How? We don't tell you what request you are getting information on with
>>> any of the other callback functions either.
>>
>> Have all of the handle*() methods pass back the mozIStorageStatement
>> that triggered them as a param?
> Right, but the problem is that we don't use the same one you prepared.
> We clone it and use it on the background since statements cannot be used
> on different threads at the same time.

You could give each statement an integer ID. Alternately, perhaps the
statement API could make slightly stronger thread-safety promises.

> Also, you would be unable to use the statement on the calling thread until
> the background thread was done with it.

Even though the background thread is using a clone?

Dan

Mike Shaver

unread,
May 30, 2008, 5:29:49 PM5/30/08
to Shawn Wilsher, dev-pl...@lists.mozilla.org
On Fri, May 30, 2008 at 5:00 PM, Shawn Wilsher <sdw...@mozilla.com> wrote:
> Dan Mosedale wrote:
>>> Sergey Yanovich wrote:
>>>> It may also be good to know which request is completed.
>>
>> Agreed; this allows a single object to listen for multiple outstanding
>> requests.
> That's not really a good programming practice, now is it? Overloading
> an object to do more than one thing?

It's a very common practice -- how else do you rendezvous multiple
async operations? Every select() or poll() loop is like this.
Objects do multiple things all over the place in codebase, which is
why we have QI and why we try to avoid name collisions between members
of related interfaces.

If we split the history and bookmarks portions of Places, f.e., the
AwesomeBar would want to listen for the results from the searches it
kicked off in parallel on each, and handle them largely the same way.

(Personal annoyance: "not a good programming practice" is almost
content-free; say what the problem is that you are trying to avoid,
and how the practice you advocate helps.)

Mike

Shawn Wilsher

unread,
May 30, 2008, 5:29:25 PM5/30/08
to
Dan Mosedale wrote:
> Even though the background thread is using a clone?
That was in reference to the case of using the same statement. When we
clone, it doesn't matter.

Cheers,

Shawn

Shawn Wilsher

unread,
May 30, 2008, 6:03:37 PM5/30/08