Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Asynchronous API for mozStorage

34 views
Skip to first unread message

Shawn Wilsher

unread,
May 12, 2008, 12:50:45 PM5/12/08
to
I'm looking for feedback on the proposed API (with sample code) for an
asynchronous API for mozStorage. What I've got so far can be found here:
http://developer.mozilla.org/en/docs/User_talk:Comrade693#Async_Storage_Statements

This is something we are looking at for Firefox.next, so the sooner you
give feedback, the better. :)

Cheers,

Shawn

Adam Kowalczyk

unread,
May 12, 2008, 1:24:01 PM5/12/08
to
Thanks for taking this on, Shawn!

I think we should make async API as similar to the synchronous one as
possible. What's the reason behind basing it on HTML5 SQL? Developers
are already accustomed to existing mozStorage APIs, so we shouldn't
deviate from them without a good reason. Learning curve aside, it would
create a big inconsistency. Let's take a look at the differences:

* AsyncTransaction is a completely different animal than a traditional
transaction.
* mozStorage APIs revolve around statement objects. There's no explicit
statement here.
* Parameters are bound in a different way
* Results are read in a different way

By the way, AsyncTransaction.execute() reads as if the transaction is
executed, not the statement. C.f. mozIStorageStatement.execute().

Another concern is the callback. Let's make it a single function instead
of an object implementing given methods. In the latter case you've got
to implement QI and the code that handles the statement is more
scattered. If you use many async statements, you usually make a
<code>switch</code> and forward the callbacks to separate dedicated
functions, anyway.

Here is the Shawn's example, adapted to use my idea of the API. Please
note that I have no knowledge of the internals of sqlite and mozStorage,
so I don't know how feasible this is.

// conn is a mozIStorageConnection
let sql = "SELECT * FROM moz_downloads WHERE state = ?1";
let stmt = conn.createStatement(sql);
stmt.bindInt32Parameter(0, Ci.nsIDownloadManager.DOWNLOAD_FINISHED);
stmt.executeAsync(callback);

function callback(aStatement, aResult, aError) {
if (aError)
// update UI accordingly

for (let row in aResult) {
// process data
// should be able to access data in a row by name or argument
number
}
}


Alternatively, the callback could be defined in the style of XMLHttpRequest.

stmt.onResultsReady = onResultsReady;
stmt.onError = onError;
stmt.executeAsync();

function onResultsRead(aStatement, aResults) {

}

function onError(aStatement, aError) {

}

- Adam


Sergey Yanovich

unread,
May 12, 2008, 3:21:32 PM5/12/08
to

I am extensively using mozIStorage interfaces in an accounting xul
application, that is approaching production readiness. Shawn, thanks for
this opportunity to present my point.

Async calls are a great idea!

Comments on the proposed interface:

1. Is there any technical reasons that require AsyncTransaction?

It seems to be redundant otherwise. The best practice is use sql
parameters, which are to be bound after the statement is already
prepared and its syntax is verified. The counter-proposal is to add
'executeAsync()' to 'mozIStorageStatement'.

2. Separate method to obtain ResultSet is a right thing. We can probably
also change 'mozIStorageStatement' synchronous calls API in the same
way. Currently, the first call of 'executeStep()' has a special EXECUTE
meaning, while all subsequent ones actually mean FETCH ROW. Separating
this two meanings may be a good idea.

Instead of passing ResultSet proxy to a callback, let's:
drop 'boolean executeStep();',
drop 'void reset();',
drop 'readonly attribute long state;'
add 'readonly attribute mozIStorageValueArray resultSet;'
to 'mozIStorageStatement', and stop subclassing 'mozIStorageValueArray'
there. In turn 'mozIStorageValueArray' will receive
'readonly attribute long state;',
'void fetchNext();', and
'void fetchNextAsync([optional] in Callback aCallback);'

So 'mozIStorageValueArray' (aka cursor) is separated from
'mozIStorageStatement'. The former can later be extended to provide
support for random access cursors. 'reset()' goes away entirely. It can
be invoked internally in 'execute()', 'executeAsync()' and 'finalize()'.

As a result, mozIStorage* will become more SQL-ish and friendlier to
other databases like MySQL or PostgreSQL. There is a legitimate question
"why do we care about other databases?". Here is one hypothetical
example: since we already keep a hefty part of user profile (history and
bookmarks) in a relational database, we can move the rest of valuable
data -- mainly passwords and extensions (yes! extensions) -- there as
well. So the profile will contain a database (or databases) and cached
data. This creates an opportunity for big organizations to use a central
database to store and manage extensions, web settings and browsing data
of their users. And if such a user logins simultaneously on several
workstations, no browsing data will be lost.

3. Finally, can we directly use nsIWebProgressListener2 as a callback
class in both 'executeAsync()' and 'fetchNextAsync()'?

This can be the first step of many to provide full database progress
feedback using progress bar. Long places operations (the main client of
storage) will clearly benefit from this feature.

4. And woooooohooooooo, as 3.0 RC1 has finally been tagged for building!

--
Sergey Yanovich

Shawn Wilsher

unread,
May 12, 2008, 4:33:50 PM5/12/08
to
Adam Kowalczyk wrote:
> Thanks for taking this on, Shawn!
>
> I think we should make async API as similar to the synchronous one as
> possible. What's the reason behind basing it on HTML5 SQL? Developers
> are already accustomed to existing mozStorage APIs, so we shouldn't
> deviate from them without a good reason. Learning curve aside, it would
> create a big inconsistency. Let's take a look at the differences:
I based it off the HTML SQL spec for two reasons:
(1) A lot of mozilla stuff is based off of web APIs. This lowers the
barrier to entry for new developers (who have done web work at least)
since they are generally somewhat familiar with existing APIs.
(2) I don't think we should create new APIs solely based on the existing
API. It's my opinion that the current API is rather clunky in areas,
and the proposed API was meant to fix some of that.

> * AsyncTransaction is a completely different animal than a traditional
> transaction.

Yes, and no. Each statement has an implicit transaction wrapped around
it - the whole statement either succeeds or fails. With that said, I'm
open to a better name on this.

> * mozStorage APIs revolve around statement objects. There's no explicit
> statement here.

I'm not sure this is a bad thing, but it is different.

> * Parameters are bound in a different way

It's my opinion that our current approach to binding parameters is
clunky and error-prone. I'd love to fix that, and that's what I'm
trying to do here.

> * Results are read in a different way

Same as above. Note that since this is targeting Firefox.next, which is
supposed to come out some time in the fall, we don't want to change
existing APIs. So we can add to our current API, but changing it should
be avoided if possible.

> By the way, AsyncTransaction.execute() reads as if the transaction is
> executed, not the statement. C.f. mozIStorageStatement.execute().

Implicit transaction again. Although, I do believe this could be done
without the AsyncTransaction object which would probably ease this
concern of yours (I had a reason for doing it with AsyncTransaction
because I shared your line of thinking initially. Something made me
change my mind, but I can't recall what that was currently. When I
figure it out again, I'll post).

> Another concern is the callback. Let's make it a single function instead
> of an object implementing given methods. In the latter case you've got
> to implement QI and the code that handles the statement is more
> scattered. If you use many async statements, you usually make a
> <code>switch</code> and forward the callbacks to separate dedicated
> functions, anyway.

Letting one method handle both errors and handling of an actual result
is a form of Logical Cohesion which is generally frowned upon (yeah, we
do it in the mozilla codebase in various places). Functional Cohesion
is preferred. I recommend reading Code Complete 2nd Ed., Chapter 7 for
more details (I imagine wikipedia has something on it too).

Additionally, JS consumers don't have to implement the QI code, and they
don't even need to provide an error or result handler if they don't want
to. XPConnect, as I understand it, will just throw
NS_ERROR_NOT_IMPLEMENTED if the back-end code tries to call it. The
back-end should be able to handle that gracefully.

> // conn is a mozIStorageConnection
> let sql = "SELECT * FROM moz_downloads WHERE state = ?1";
> let stmt = conn.createStatement(sql);
> stmt.bindInt32Parameter(0, Ci.nsIDownloadManager.DOWNLOAD_FINISHED);
> stmt.executeAsync(callback);

The problem with this approach is that preparing the statement (the
createStatement call) is often the most expensive operation for SELECT
statements. With the current API we can do that work on the background
thread[pool].

> Alternatively, the callback could be defined in the style of
> XMLHttpRequest.

The only reason why I don't like this approach is because it seems like
a lot more lines of code to accomplish the same thing. With that said,
I'm indifferent about it.


Thanks for the great feedback. Let me know if this addresses your concerns.

Cheers,

Shawn

Shawn Wilsher

unread,
May 12, 2008, 9:13:51 PM5/12/08
to
Sergey Yanovich wrote:
> 1. Is there any technical reasons that require AsyncTransaction?
At this point, I don't think so. I had a reason at some point, but I
cannot recall what it was. The only thing that might be useful is
leaving it in for the callback to handle, but then it should have some
properties on it, otherwise it seems useless (checking as to why the
HTML 5 spec did this would probably be useful too - I'll get on that).

> It seems to be redundant otherwise. The best practice is use sql
> parameters, which are to be bound after the statement is already
> prepared and its syntax is verified. The counter-proposal is to add
> 'executeAsync()' to 'mozIStorageStatement'.

I think that this might be the way to go, but I don't think we should
have them prepare it first (see previous post about that in reply to Adam).

> 2. Separate method to obtain ResultSet is a right thing. We can probably
> also change 'mozIStorageStatement' synchronous calls API in the same
> way. Currently, the first call of 'executeStep()' has a special EXECUTE
> meaning, while all subsequent ones actually mean FETCH ROW. Separating
> this two meanings may be a good idea.

I'd love to, but we don't want to have that much API churn for Firefox
3.1. For Firefox 4 we can look at that.

> 3. Finally, can we directly use nsIWebProgressListener2 as a callback
> class in both 'executeAsync()' and 'fetchNextAsync()'?

I'm not sure I understand how that is useful, so you'd have to elaborate
on that.

Cheers,

Shawn

Shawn Wilsher

unread,
May 12, 2008, 10:11:32 PM5/12/08
to
I've gone and incorporated some of the feedback from Adam and Sergey.
All changes are up on the wiki [1], but here's a summary:

(1) Dropped the AsyncTransaction interface, and moved its executeAsync
method onto mozIStorageConnection. There was no reason for the
interface to exist, and it's one less thing consumers have to worry about.

(2) Added a SQLError interface that will be passed back to the callback
for errors. There is, however, no definition as I'm not 100% sure what
we want to include when we notify that there was an error.

(3) Updated the sample code to go along with the new interface proposal.

Cheers,

Shawn

[1]
http://developer.mozilla.org/en/docs/User_talk:Comrade693#Async_Storage_Statements

Mook

unread,
May 13, 2008, 3:30:21 AM5/13/08
to
(I'm looking at the wiki, r71407)

Overall: It would probably be useful to explicitly state, near the top
of the wiki page, exactly which Mozilla version this is targeting -
sounds like Firefox.next / 1.9.nn?

Shawn Wilsher wrote:


> Adam Kowalczyk wrote:
>> // conn is a mozIStorageConnection
>> let sql = "SELECT * FROM moz_downloads WHERE state = ?1";
>> let stmt = conn.createStatement(sql);
>> stmt.bindInt32Parameter(0, Ci.nsIDownloadManager.DOWNLOAD_FINISHED);
>> stmt.executeAsync(callback);
> The problem with this approach is that preparing the statement (the
> createStatement call) is often the most expensive operation for SELECT
> statements. With the current API we can do that work on the background
> thread[pool].

On the other hand, this also means it's not possible to re-use statements.

I'm not quite sure how that second argument (the one with
the things to bind) can be implemented, given that XPIDL doesn't, as far
as I know, have method overloading. It seems to take either a JS Array,
Object, or a C++ Parameters thing that is to be determined. Would we
still get type-checking on the C++ side? (Hey, if you somehow manage to
get that, all the better.)

Of course, I know about nil about API design ;)

(Off-topic note-to-self: trigger + mozIStorageFunction = win)

--
Mook

Boris Zbarsky

unread,
May 13, 2008, 4:05:40 AM5/13/08
to
Mook wrote:
> I'm not quite sure how that second argument (the one with
> the things to bind) can be implemented, given that XPIDL doesn't, as far
> as I know, have method overloading.

While strictly speaking true, there are ways to get around this, esp. if all
your calls are from JS into C++. You can use nsIVariant, or declare no args in
the IDL and read them directly using XPConnect APIs.

-Boris

Sergey Yanovich

unread,
May 13, 2008, 7:54:03 AM5/13/08
to
Shawn Wilsher wrote:

> Sergey Yanovich wrote:
>> 2. Separate method to obtain ResultSet is a right thing. We can
>> probably also change 'mozIStorageStatement' synchronous calls API in
>> the same way. Currently, the first call of 'executeStep()' has a
>> special EXECUTE meaning, while all subsequent ones actually mean FETCH
>> ROW. Separating this two meanings may be a good idea.
> I'd love to, but we don't want to have that much API churn for Firefox
> 3.1. For Firefox 4 we can look at that.

Sure. I mistook Firefox.next as Firefox 4.

>> 3. Finally, can we directly use nsIWebProgressListener2 as a callback
>> class in both 'executeAsync()' and 'fetchNextAsync()'?
> I'm not sure I understand how that is useful, so you'd have to elaborate
> on that.

I had in mind whole-sale database access like when History -> Show All
History is invoked. Our current Library window doesn't contain progress
bar, because it would be useless, since the routine that populates the
main tree is synchronous. But things will change, once async call
becomes possible. We can then provide substantially better user
experience by doing async loading and showing progress in a progress bar.

However, in the context of a dot release, introducing this new
functionality to the mozIStorage* no longer seems a good idea.

Sergey Yanovich

unread,
May 13, 2008, 10:12:40 AM5/13/08
to
Shawn Wilsher wrote:
> I've gone and incorporated some of the feedback from Adam and Sergey.
> All changes are up on the wiki [1], but here's a summary:
>
> (1) Dropped the AsyncTransaction interface, and moved its executeAsync
> method onto mozIStorageConnection. There was no reason for the
> interface to exist, and it's one less thing consumers have to worry about.

A) I started to argue with the necessity of changing connection
interface, but then came around this:


> The problem with this approach is that preparing the statement (the createStatement call) is often the most expensive operation for SELECT statements.

This is a great argument in favor of creating a statement
asynchronously, but it is also an argument in favor of storing the
produced statement for re-use.

This may have been Shawn's motivation to introduce AsyncTransaction.
Nevertheless, it must be an async operation. A possible name could be
createStatementAsync(). Another important method of the statement is
finalize(), which is the only reliable way to free the database from JS.

B) In the view of createStatementAsync(), Parameters callback receives
additional task of storing the statement.

C) Finally, ResultSet behavior needs some clarification. From how it is
used in the example, it may be inferred, that getNextResult() is
synchronous. That means, all data needs to be fetched and saved locally
before handleResult() is invoked. A number of questions arise:

* When working the large amounts of data (like browsing history), this
may result is excessive memory consumption.

* Although the data is fetched row by row, nothing will be displayed,
until the last row is fetched.

* If there are errors in the end of a large list, nothing will be
displayed at all.

Proposed solution is to have API for both paths of execution:
1. mozIStorageConnection.executeSimpleAsync(in AUTF8String aSQL,
[optional] in Parameters aParameters, [optional] in Callback aCallback);

2. mozIStorageConnection.createStatementAsync(in in AUTF8String aSQL,
[optional] in Parameters aParameters), then
mozIStorageStatement.executeAsync(in Callback aCallback), then
mozIStorageValueArray.fetchNextAsync(in Callback aCallback) for each
row.

D) Callback needs to have separate methods handleAllResults/handleResult
respectively to handle 1. and 2.

--
Sergey Yanovich

Mike Shaver

unread,
May 13, 2008, 10:56:16 AM5/13/08
to Sergey Yanovich, dev-pl...@lists.mozilla.org
On Tue, May 13, 2008 at 10:12 AM, Sergey Yanovich <ynv...@gmail.com> wrote:
> > The problem with this approach is that preparing the statement (the createStatement call) is often the most expensive operation for SELECT statements.

This is true for very light-weight SELECT statements, to be clear, and
it only matters if you're running them many times.

> This is a great argument in favor of creating a statement
> asynchronously,

No -- it can't really block, and the additional complexity of using it
will be a huge developer cost for no gain.

> but it is also an argument in favor of storing the
> produced statement for re-use.

Yes.

:)

Mike

Shawn Wilsher

unread,
May 13, 2008, 11:24:08 AM5/13/08
to
Mook wrote:
> Overall: It would probably be useful to explicitly state, near the top
> of the wiki page, exactly which Mozilla version this is targeting -
> sounds like Firefox.next / 1.9.nn?
Done

>> The problem with this approach is that preparing the statement (the
>> createStatement call) is often the most expensive operation for SELECT
>> statements. With the current API we can do that work on the
>> background thread[pool].
> On the other hand, this also means it's not possible to re-use statements.

This is true, but I think I have a solution for that. I need to think
about it a bit more though, so that'll come in a later post.

> I'm not quite sure how that second argument (the one with
> the things to bind) can be implemented, given that XPIDL doesn't, as far
> as I know, have method overloading. It seems to take either a JS Array,
> Object, or a C++ Parameters thing that is to be determined. Would we
> still get type-checking on the C++ side? (Hey, if you somehow manage to
> get that, all the better.)

I've updated the wiki (revision 71419 [1]) to better detail this. I
think it's quite possible to do all this and still get type-checking.

Cheers,

Shawn

[1]
http://developer.mozilla.org/en/docs/index.php?title=User_talk:Comrade693&oldid=71419#Async_Storage_Statements

Shawn Wilsher

unread,
May 13, 2008, 12:11:03 PM5/13/08
to
Sergey Yanovich wrote:
> A) I started to argue with the necessity of changing connection
> interface, but then came around this:
>> The problem with this approach is that preparing the statement (the
>> createStatement call) is often the most expensive operation for SELECT
>> statements.
>
> This is a great argument in favor of creating a statement
> asynchronously, but it is also an argument in favor of storing the
> produced statement for re-use.
>
> This may have been Shawn's motivation to introduce AsyncTransaction.
> Nevertheless, it must be an async operation. A possible name could be
> createStatementAsync(). Another important method of the statement is
> finalize(), which is the only reliable way to free the database from JS.
I've put some thought to this, and I have what may be a big concern with
storing the statement to use it again - locks. A statement can only be
active at one time, so we would have to protect it's use with a lock to
make sure that multiple async calls were not trying to use it at the
same time. Now, we could do this, but it seems to me that the common
case would be that there wouldn't be more than one call at a time, so
introducing the locks adds overhead for the common case. We'd also have
to be calling to code outside the storage module while holding the lock,
which is a no-no.

With that said, I think my previous statement wasn't very complete.
With large, fragmented database files, stepping through the result can
also be expensive since it has to keep reading from various places on
the disk.

Additionally, while preparing the statement is work, it's not blocking
the UI, so I'm not sure we should be worrying about saving that work.
Perhaps I'm wrong on this though...

> B) In the view of createStatementAsync(), Parameters callback receives
> additional task of storing the statement.

That violates Functional Cohesion, which I'd prefer to not do.

> C) Finally, ResultSet behavior needs some clarification. From how it is
> used in the example, it may be inferred, that getNextResult() is
> synchronous. That means, all data needs to be fetched and saved locally
> before handleResult() is invoked. A number of questions arise:
>
> * When working the large amounts of data (like browsing history), this
> may result is excessive memory consumption.
>
> * Although the data is fetched row by row, nothing will be displayed,
> until the last row is fetched.
>
> * If there are errors in the end of a large list, nothing will be
> displayed at all.

I'm not sure I understand this point specifically.

> Proposed solution is to have API for both paths of execution:
> 1. mozIStorageConnection.executeSimpleAsync(in AUTF8String aSQL,
> [optional] in Parameters aParameters, [optional] in Callback aCallback);
>
> 2. mozIStorageConnection.createStatementAsync(in in AUTF8String aSQL,
> [optional] in Parameters aParameters), then
> mozIStorageStatement.executeAsync(in Callback aCallback), then
> mozIStorageValueArray.fetchNextAsync(in Callback aCallback) for each
> row.

Based on my previous comments in this reply (and the one I'm about to
make), I don't think that's needed.

> D) Callback needs to have separate methods handleAllResults/handleResult
> respectively to handle 1. and 2.

I do not believe that this is necessary. handleResult can handle one or
more rows at a time, and it can be called more than once. I've updated
the wiki page accordingly (revision 71426 [1])

This may seem odd, but at least one other API that I know of does that -
Threading Building Blocks with it's Body and Range concepts. A Body may
have its function operator called more than once on different ranges,
which is essentially what we are doing here.

Let me know if this doesn't clarify things enough.

Cheers,

Shawn


[1]
http://developer.mozilla.org/en/docs/index.php?title=User_talk:Comrade693&oldid=71426

Boris Zbarsky

unread,
May 13, 2008, 1:01:55 PM5/13/08
to
Shawn Wilsher wrote:
> I've put some thought to this, and I have what may be a big concern with
> storing the statement to use it again - locks. A statement can only be
> active at one time, so we would have to protect it's use with a lock to
> make sure that multiple async calls were not trying to use it at the
> same time.

I'm jumping into the middle of the thread here, but would the following work?
Store both the original string and a prepared statement. Have an unsigned int
counter initialized to 0.

An attempt to execute the statement does the following:

1) atomically increment the counter
2a) If the counter value is now 1, go ahead and execute
the prepared statement, then atomically decrement the counter.
2b) Otherwise, atomically decrement the counter, create a one-off
statement from the string, and execute it.

This maintains the invariant that any time we're executing the prepared
statement the counter is nonzero, so that no one else can enter the prepared
statement execution. In the common case, the only overhead over blindly
executing the prepared statement is that we have to do the increment/decrement
and one branch. If two execution attempts race, one will hit the fast path,
while the other hits the slow path. At least if the branch between 2a and 2b is
done on the return value of PR_AtomicIncrement.

> Additionally, while preparing the statement is work, it's not blocking
> the UI, so I'm not sure we should be worrying about saving that work.
> Perhaps I'm wrong on this though...

Are all statement preparations happening on the same thread? How many
statements do we expect to be in-flight at once? Depending on the answer to the
former, it seems to me that if there is the potential for lots of statement
preparation to be happening, we'll either be blocking the statement preparation
thread (causing UI lag in whatever operation is showing a progress bar to the
user while waiting for the query to complete) or spawn off lots of statement
preparation threads (and some will end up blocking on lack of execution units,
though that might become less of a problem as processors have more and more cores).

But in any case, if it's cheap enough it might be worth optimizing for something
other than just UI responsiveness here (e.g. power consumption? ;) ).

-Boris

Sergey Yanovich

unread,
May 13, 2008, 2:01:13 PM5/13/08
to
Shawn Wilsher wrote:
> Sergey Yanovich wrote:

Yeah, now I seem to have more questions, than before :)

1. What's the common use case(s) for async calls? Should the resulting
API be universal or optimized for common usage?

2. When is mozIStorageStatement.finalize() expected to be called, if at all?

3. How many times is handleResult() expected to be called, when the
query yields 10, 100, 10000 rows? Is ResultSet.getNextResult() synchronous?

The answers will let me be more specific with my comments.

Thanks,
--
Sergey Yanovich

Shawn Wilsher

unread,
May 13, 2008, 3:08:36 PM5/13/08
to
Sergey Yanovich wrote:
> 1. What's the common use case(s) for async calls? Should the resulting
> API be universal or optimized for common usage?
After discussing this at length with shaver online, I think we are going
to move in the direction suggested by Adam in his first post. As a
result, I think that makes this concern moot - correct me if I'm wrong.

> 2. When is mozIStorageStatement.finalize() expected to be called, if at
> all?

You don't have to ever call it. Internally, it's called in the
statements destructor so when your js var is garbage collected it is
taken care of. However, it was added because you couldn't move/delete
the database file if a connection was still open, and a connection stays
open as long as there is a statement that isn't finalized. Seamonkey
specifically needed this.

> 3. How many times is handleResult() expected to be called, when the
> query yields 10, 100, 10000 rows? Is ResultSet.getNextResult() synchronous?

I think defining an amount would be a poor idea. The back-end could
chunk it in time, or it could chunk it some other way. I think that
decision should be decided at implementation time and be hidden from the
interface.

As a result, I don't think ResultSet::getNextResult is synchronous in
terms of database access. I'm envisioning the back-end to dump the data
into an object that is passed back to the callback. If that's not what
you meant by synchronous, you'll have to elaborate more :)

Cheers,

Shawn

Shawn Wilsher

unread,
May 13, 2008, 4:20:41 PM5/13/08
to
OK, version 3.0 of the API is up on the wiki (revision 71435 [1]). The
changes incorporated feedback obtained so far, as well as discussion
with shaver on irc. In summary:

(1) executeAsync has been moved onto mozIStorageStatement. You can now
use a prepared statement that has parameters already bound with this API.

(2) As a result of (1), the fancy binding of parameters magic has been
removed. We might look at that after this is done for Firefox 3.1, but
it will likely make Firefox 4.

(3) Code samples updated accordingly.

Questions and feedback welcome :)

Cheers,

Shawn

[1]
http://developer.mozilla.org/en/docs/index.php?title=User_talk:Comrade693&oldid=71435#Async_Storage_Statements

Sergey Yanovich

unread,
May 14, 2008, 3:24:59 AM5/14/08
to
Shawn Wilsher wrote:
> Sergey Yanovich wrote:
>> 1. What's the common use case(s) for async calls? Should the resulting
>> API be universal or optimized for common usage?
> After discussing this at length with shaver online, I think we are going
> to move in the direction suggested by Adam in his first post. As a
> result, I think that makes this concern moot - correct me if I'm wrong.

Either way was good to me, so there was no concern :) Nevertheless, I
like the made choice more than the other.

>> 2. When is mozIStorageStatement.finalize() expected to be called, if
>> at all?
> You don't have to ever call it. Internally, it's called in the
> statements destructor so when your js var is garbage collected it is
> taken care of. However, it was added because you couldn't move/delete
> the database file if a connection was still open, and a connection stays
> open as long as there is a statement that isn't finalized. Seamonkey
> specifically needed this.

After the decision to keep creating statements old way, this concern is
void. Statement object will remain to be normally accessible from js.

But this answers brings us back to the question 1. What we are trying to
solve? Please correct me, but I've read in bugzilla, that lack of
asynchronous API in storage makes browsing experience really ugly, if
the profile resides on a NFS mount. If this case falls into the range of
issues being addressed, we could probably reconsider this decision.

>> 3. How many times is handleResult() expected to be called, when the
>> query yields 10, 100, 10000 rows? Is ResultSet.getNextResult()
>> synchronous?
> I think defining an amount would be a poor idea. The back-end could
> chunk it in time, or it could chunk it some other way. I think that
> decision should be decided at implementation time and be hidden from the
> interface.
>
> As a result, I don't think ResultSet::getNextResult is synchronous in
> terms of database access. I'm envisioning the back-end to dump the data
> into an object that is passed back to the callback. If that's not what
> you meant by synchronous, you'll have to elaborate more :)

I understand the word "synchronous" when used with a method as follows.
If the method returns after the job is done, it is synchronous. If the
job continues to run in a different thread/process, when the method
returns, it is asynchronous.

It is now crystal clear from your answer, that the result set is deemed
to be created before handelResult() is invoked, so
ResultSet::GetNextResult() is a synchronous method (which fetches
products of asynchronous database access :)).

Now I can rephrase my earlier concerns on how this approach can handle
large result sets, like browsing history:

* When working the large amounts of data (like browsing history), this

may result is excessive memory consumption and allocation overhead. If
the caller plans to store all the results in its own storage, the
storage will be allocated 2 times: in the back-end, and by the caller.

* Although the data is fetched row by row, nothing will be displayed,
until the last row is fetched.

* If there are errors in the end of a large list, nothing will be

displayed at all. F.e., if there is network failure on the row #9999 of
10000, the back-end will invoke handleError(), and all fetched data will
be lost. The data can be recovered in handleError() as a workaround, but
this will violate Cohesion, right?

With all that said, we are also going to have an accessible statement,
which is a mozIStorageValueArray instance. So ResultSet is completely
redundant in this case, because the result set can be normally iterated
with mozIStorageStatement.executeStep(). Correct?

My proposal is that we add a executeStepAsync(). When the caller knows,
that the result set may be huge, it will use this and expect to be
called once for each row.

To enforce Functional Cohesion, Callback will need 2 separate methods
for this two async calls. They may be handleResultSet (for all) and
handleResult (for one). Both should receive mozIStorageStatement as a
parameter.

Thanks,
--
Sergey Yanovich

Shawn Wilsher

unread,
May 14, 2008, 1:09:17 PM5/14/08
to
Sergey Yanovich wrote:
> But this answers brings us back to the question 1. What we are trying to
> solve? Please correct me, but I've read in bugzilla, that lack of
> asynchronous API in storage makes browsing experience really ugly, if
> the profile resides on a NFS mount. If this case falls into the range of
> issues being addressed, we could probably reconsider this decision.
The problem, especially in the NFS case, is the call to fsync, which can
take a while (much more of an issue with some file systems like ext3
[1]) and lock up the UI. Placing this call on a background thread means
it won't lock up the UI.

> Now I can rephrase my earlier concerns on how this approach can handle
> large result sets, like browsing history:
>
> * When working the large amounts of data (like browsing history), this
> may result is excessive memory consumption and allocation overhead. If
> the caller plans to store all the results in its own storage, the
> storage will be allocated 2 times: in the back-end, and by the caller.

This already happens with most other database API's, so I don't think
it's a big deal. Also, I don't expect most situations to be handling
large chunks of data. However, if this does turn out to be a problem,
we can always have the background task providing the data "slow down" as
it were, and wait for existing data to be processed before generating more.

> * Although the data is fetched row by row, nothing will be displayed,
> until the last row is fetched.

I'm not sure how you've come to that conclusion, but I don't believe it
to be true.

> * If there are errors in the end of a large list, nothing will be
> displayed at all. F.e., if there is network failure on the row #9999 of
> 10000, the back-end will invoke handleError(), and all fetched data will
> be lost. The data can be recovered in handleError() as a workaround, but
> this will violate Cohesion, right?

I don't see why we couldn't give the caller the data we have and then
invoke handleError. Also, many calls could have been made to
handleResult before we even hit that error.

> With all that said, we are also going to have an accessible statement,
> which is a mozIStorageValueArray instance. So ResultSet is completely
> redundant in this case, because the result set can be normally iterated
> with mozIStorageStatement.executeStep(). Correct?

We won't be exposing this though. Internally, in the background we'll
be calling it to generate ResultSet objects, but this shouldn't be
exposed to handleResult. The statement would already be consumed when
generating the ResultSet object. I *really* don't see the benefit of
exposing a mozIStorageStatement object to handleResult.

Cheers,

Shawn


[1] https://bugzilla.mozilla.org/show_bug.cgi?id=421482

Sergey Yanovich

unread,
May 14, 2008, 2:34:34 PM5/14/08
to
Shawn Wilsher wrote:
> Sergey Yanovich wrote:
>> * When working the large amounts of data (like browsing history), this
>> may result is excessive memory consumption and allocation overhead. If
>> the caller plans to store all the results in its own storage, the
>> storage will be allocated 2 times: in the back-end, and by the caller.
> This already happens with most other database API's, so I don't think
> it's a big deal. Also, I don't expect most situations to be handling
> large chunks of data. However, if this does turn out to be a problem,
> we can always have the background task providing the data "slow down" as
> it were, and wait for existing data to be processed before generating more.
>
>> * Although the data is fetched row by row, nothing will be displayed,
>> until the last row is fetched.
> I'm not sure how you've come to that conclusion, but I don't believe it
> to be true.

My understanding was that handleResult() is called once on success, and
handleError() is called once on failure. If this is not correct, then
there is a new problem. I need to know when everything is fetched, cause
I will eventually need to free statement/close connection.

>> * If there are errors in the end of a large list, nothing will be
>> displayed at all. F.e., if there is network failure on the row #9999
>> of 10000, the back-end will invoke handleError(), and all fetched data
>> will be lost. The data can be recovered in handleError() as a
>> workaround, but this will violate Cohesion, right?
> I don't see why we couldn't give the caller the data we have and then
> invoke handleError. Also, many calls could have been made to
> handleResult before we even hit that error.

I thought that callback is invoked just once. Otherwise it is not clear,
when the processing is complete.

>> With all that said, we are also going to have an accessible statement,
>> which is a mozIStorageValueArray instance. So ResultSet is completely
>> redundant in this case, because the result set can be normally
>> iterated with mozIStorageStatement.executeStep(). Correct?
> We won't be exposing this though. Internally, in the background we'll
> be calling it to generate ResultSet objects, but this shouldn't be
> exposed to handleResult. The statement would already be consumed when
> generating the ResultSet object. I *really* don't see the benefit of
> exposing a mozIStorageStatement object to handleResult.

Well, the statement is available to the client, whether we want it or
not. In addition, executeStep() is the current approach to enumerating
the query results. In other words, mozIStorageStatement is a complete
logical equivalent to the proposed ResultSet, and executeStep() is a
complete logical equivalent to the proposed getNextResult(). More than
that, executeStep() is synchronous just as getNextResult().

So we are back to question:


> 3. How many times is handleResult() expected to be called, when the query yields 10, 100, 10000 rows?

Now I see, that I didn't interpreted your answer correctly. My major
problem here is that I need to know when everything is fetched. What's
wrong with calling the callback for each fetched row? That's approach
for processing large result sets. For small result sets, we can have a
different function, which will return when everything is already
fetched. I propose to name them executeStepAsync() and executeAsync()
respectively, similar to how their synchronous analogs are called.

--
Sergey Yanovich

Mike Shaver

unread,
May 14, 2008, 3:35:46 PM5/14/08
to dev-pl...@lists.mozilla.org
On Wed, May 14, 2008 at 2:34 PM, Sergey Yanovich <ynv...@gmail.com> wrote:
> I thought that callback is invoked just once. Otherwise it is not clear,
> when the processing is complete.

The callback can be called multiple times, with whatever the "natural"
batching order is from the database. Early implementations might well
use 1 as the batch size.

We could send a flag with handleResult to indicate that it's the last
result set, or use .handleComplete() call. I prefer the latter,
because it also lets us signal multiple errors using a similar model,
but it's not a strong preference.

Mike

Shawn Wilsher

unread,
May 14, 2008, 3:36:21 PM5/14/08
to
Sergey Yanovich wrote:
> My understanding was that handleResult() is called once on success, and
> handleError() is called once on failure. If this is not correct, then
> there is a new problem. I need to know when everything is fetched, cause
> I will eventually need to free statement/close connection.
I thought I made this clear in an earlier post - it is stated explicetly
on the wiki page:
void handleResult(in ResultSet aResultSet);
Handles a successful execution of the statement. This function may be
called more than once with a different ResultSet each time representing
new data.

If you really need to know when it's all done, we could add another
method that looks something like void handleCompletion();

> Well, the statement is available to the client, whether we want it or
> not. In addition, executeStep() is the current approach to enumerating
> the query results. In other words, mozIStorageStatement is a complete
> logical equivalent to the proposed ResultSet, and executeStep() is a
> complete logical equivalent to the proposed getNextResult(). More than
> that, executeStep() is synchronous just as getNextResult().

How is the statement available to the client? They don't get it in the
async callback, and the api doesn't state that the stmt that
executeAsync is called on is the one that's actually used. In fact, in
terms of implementation we can't use the same statement - we'll have to
clone it and transfer the bindings.

The difference between executeStep and getNextResult is that the former
has to read from disk, while the latter reads from memory.

> So we are back to question:
>> 3. How many times is handleResult() expected to be called, when the
>> query yields 10, 100, 10000 rows?
>
> Now I see, that I didn't interpreted your answer correctly. My major
> problem here is that I need to know when everything is fetched. What's
> wrong with calling the callback for each fetched row? That's approach
> for processing large result sets. For small result sets, we can have a
> different function, which will return when everything is already
> fetched. I propose to name them executeStepAsync() and executeAsync()
> respectively, similar to how their synchronous analogs are called.

I don't see how getting a callback for each row would solve your problem
of knowing when everything is done. Getting a callback for each row
will perform much slower than just running the call synchronously since
you have lots and lots of event proxying. Not only that, we'd still
have to copy the results over into memory for each row, otherwise the
background thread is blocked waiting for the data to be consumed on the
calling thread before it can step through the results again.

Additionally, having one API for "large" data sets and another for
"small" data sets is just asking for problems. If we can make one API
that satisfies both cases, we don't leave a footgun lying around for
developers to come across and play with. I am not yet convinced that
what is in the wiki now will perform poorly with "large" data sets.

I've gone ahead and clarified the wording for executeAsync (revision
71470 [1]):
void executeAsync([optional] in Callback aCallback);
Executes a prepared mozIStorageStatement and calls the callback (if
provided) when a result is returned. This statement is then reset, and
can be reused synchronously or asynchronously again immediately.

Cheers,

Shawn


[1]
http://developer.mozilla.org/en/docs/index.php?title=User_talk:Comrade693&oldid=71470#mozIStorageStatement

joehrbek

unread,
May 14, 2008, 3:42:51 PM5/14/08
to
add python into the mix also? :)


On May 12, 11:50 am, Shawn Wilsher <sdwi...@mozilla.com> wrote:
> I'm looking for feedback on the proposed API (with sample code) for an

> asynchronous API for mozStorage.  What I've got so far can be found here:http://developer.mozilla.org/en/docs/User_talk:Comrade693#Async_Stora...

Shawn Wilsher

unread,
May 14, 2008, 3:43:52 PM5/14/08
to
Mike Shaver wrote:
> We could send a flag with handleResult to indicate that it's the last
> result set, or use .handleComplete() call. I prefer the latter,
> because it also lets us signal multiple errors using a similar model,
> but it's not a strong preference.
I agree with the latter as well, with a slight name change. Added to
the wiki (revision 71471 [1]):

void handleCompletion();
Called when there are now more results to be obtained. This is called
only once.

Cheers,

Shawn


[1]
http://developer.mozilla.org/en/docs/index.php?title=User_talk:Comrade693&oldid=71471#Callback

Shawn Wilsher

unread,
May 14, 2008, 3:59:01 PM5/14/08
to
joehrbek wrote:
> add python into the mix also? :)
Python isn't included in the platform in general (at least by default,
but I think someone has something working), so that would a bit difficult.

Cheers,

Shawn

Sergey Yanovich

unread,
May 15, 2008, 4:50:31 AM5/15/08
to
Shawn Wilsher wrote:
> If you really need to know when it's all done, we could add another
> method that looks something like void handleCompletion();

> void executeAsync([optional] in Callback aCallback);


> Executes a prepared mozIStorageStatement and calls the callback (if
> provided) when a result is returned. This statement is then reset, and
> can be reused synchronously or asynchronously again immediately.

Mike Shaver wrote:
> The callback can be called multiple times, with whatever the "natural"
> batching order is from the database. Early implementations might well
> use 1 as the batch size.

Callback looks great now. Thanks for cooperative approach to the things.
Natural batching order is exactly the way I wished to see, because it's
the fastest.

The last remaining point is asynchronous statement creation. The
database validates the statement, and it needs to read metadata from
disks for that. If the database is over NFS mount, and the metadata is
not cached in memory, it may well be a lengthy operation, blocking UI.

Synchronous API has mozIStorageConenction::executeSimple(), which
doesn't expose the statement. We may have

void executeSimpleAsync(in AUTF8Strings aSql, [optional] in Callback
aCallback);


with a similar behavior.

Cheers,
--
Sergey Yanovich

Mike Shaver

unread,
May 15, 2008, 11:16:10 AM5/15/08
to Sergey Yanovich, dev-pl...@lists.mozilla.org
On Thu, May 15, 2008 at 4:50 AM, Sergey Yanovich <ynv...@gmail.com> wrote:
> The last remaining point is asynchronous statement creation. The
> database validates the statement, and it needs to read metadata from
> disks for that. If the database is over NFS mount, and the metadata is
> not cached in memory, it may well be a lengthy operation, blocking UI.

Metadata is eagerly cached in memory, and IMO the additional API
complexity is not worth it. Do you find that your app is unreasonably
blocked in statement preparation? If an NFS mount goes away anything
can freeze, since we treat file: loading as synchronous (for
performance reasons as well as portability), so defending against it
in this one case with additional API baggage doesn't feel necessary to
me.

I'm an open-minded sort of guy, though -- do any other APIs provide an
async single-statement preparation API? Let's see which, if so, and
why.

Mike

Sergey Yanovich

unread,
May 15, 2008, 12:54:21 PM5/15/08
to
Mike Shaver wrote:
> Metadata is eagerly cached in memory, and IMO the additional API
> complexity is not worth it. Do you find that your app is unreasonably
> blocked in statement preparation?

Most of my application database access calls use parameters, so this is
not an immediate concern. However, I will be implementing mozIStorage*
binding for MySQL. Synchronous statement preparation on a remote server
will certainly be a problem, but I will handle that in my code.

> If an NFS mount goes away anything
> can freeze, since we treat file: loading as synchronous (for
> performance reasons as well as portability), so defending against it
> in this one case with additional API baggage doesn't feel necessary to
> me.

Correct, NFS normally takes care of making file operations asynchronous,
but sqlite is using fsync(), which is the cause of unresponsiveness. So
sqlite is the only place, where we need to handle this issue.

The first time, I've heard about async API was related to NFS profiles,
so I'm just trying to ensure this particular issue is addressed. If
that's considered a non-blocker, I am just fine with everything else :)

Grepping the tree for 'ecuteSimple' shows, it is heavily used to
create/drop tables/indexes. Index operations may take some time on large
tables, so switching them to async API will improve UI responsiveness,
and maybe Ts also.

Other than that, I am fine with the latest version. Thanks for taking
MHO into consideration.

Cheers,
--
Sergey Yanovich
Abstract Accounting Ltd.
http://aasii.org/

Dan Mosedale

unread,
May 15, 2008, 8:21:33 PM5/15/08
to
Shawn Wilsher wrote:
> I thought I made this clear in an earlier post - it is stated explicetly
> on the wiki page:
> void handleResult(in ResultSet aResultSet);
> Handles a successful execution of the statement. This function may be
> called more than once with a different ResultSet each time representing
> new data.
>
> If you really need to know when it's all done, we could add another
> method that looks something like void handleCompletion();
>

If you have a long-running statement that requires some sort of
UI-feedback (eg a progress meter), it'd be nice to know when to stop
that progress meter without having to poll.

Also, how does one abort a long-running statement?

Dan

Sergey Yanovich

unread,
May 16, 2008, 10:18:22 AM5/16/08
to
Dan Mosedale wrote:

> If you have a long-running statement that requires some sort of
> UI-feedback (eg a progress meter), it'd be nice to know when to stop
> that progress meter without having to poll.

Yeah, handleCompletion() should handle this case.

> Also, how does one abort a long-running statement?

I missed this point :). What about making handleResult() and
handleError() return boolean: true - go on, false (or error) - stop here?

--
Sergey Yanovich

Shawn Wilsher

unread,
May 16, 2008, 1:08:43 PM5/16/08
to
Sergey Yanovich wrote:
> I missed this point :). What about making handleResult() and
> handleError() return boolean: true - go on, false (or error) - stop here?
Sure, but consumers would have to be able to accept handleResult or
handleError calls for events that are already placed in the queue. I'm
not sure if it's possible to cancel events that have already been posted
to a thread.

Cheers,

Shawn

Sergey Yanovich

unread,
May 16, 2008, 1:19:45 PM5/16/08
to

So we have a queue, and the current call returns false. Can backend just
release the rest of the queue with callbacks?

--
Sergey Yanovich

Sergey Yanovich

unread,
May 16, 2008, 1:25:10 PM5/16/08
to
Sergey Yanovich wrote:
> So we have a queue, and the current call returns false. Can backend just
> release the rest of the queue with callbacks?

I wanted to say "without invoking callbacks". :)

Shawn Wilsher

unread,
May 16, 2008, 10:56:58 PM5/16/08
to
Scratch this. dmose and I were talking on irc about how this is a bad
API. Callers should be able to cancel at any time, with or without
getting information in the callback yet. executeAsync should return a
nsICancelable as a result that will cancel the execution immediately.

Cheers,

Shawn

Sergey Yanovich

unread,
May 17, 2008, 5:27:21 AM5/17/08
to
Shawn Wilsher wrote:
> Scratch this. dmose and I were talking on irc about how this is a bad
> API. Callers should be able to cancel at any time, with or without
> getting information in the callback yet. executeAsync should return a
> nsICancelable as a result that will cancel the execution immediately.

It was said in a earlier post, that callbacks will be called in the
natural batching order of the underlying database. AFAICT unless
nsICancelable is going to kill the background working thread with brute
force system calls, the cancellation will be happening at the same
points, where callbacks are invoked.

In other words, the proposed nsICancelable functionality may be
implemented by the client in the Callback, without complicating the API.
But either way will work.

Cheers,
--
Sergey Yanovich

Shawn Wilsher

unread,
May 17, 2008, 1:11:29 PM5/17/08
to
I've hit a bit of an implementation snafu. I was hoping that I could
use sqlite3_transfer_bindings to transfer the bindings on the prepared
statement to the one we clone for us on the background. However, I
checked with the sqlite folks, and that won't work [1].

As a result, we have two options:
(1) Keep track of what is bound internally. For statements that are
executed synchronously, this won't be much use, but we'll need it for
the async case. This could consume a lot of memory if someone is
binding a large blob too.

(2) Pass in an "object" of bound parameters like earlier proposals of
the API suggested.

I'm OK with either solution here, but I wanted to get the opinion of
other folks.

Cheers,

Shawn


[1] http://www.mail-archive.com/sqlite...@sqlite.org/msg33987.html

Shawn Wilsher

unread,
May 21, 2008, 4:12:08 PM5/21/08
to

It appears that this is no longer neccessary. One of my assumptions
about thread safety turned out to be incorrect for sqlite.

Cheers,

Shawn

Shawn Wilsher

unread,
May 28, 2008, 1:16:27 PM5/28/08
to
I've updated the API [1] once again to reflect what is currently
implemented in the bug. Feedback is welcome (I haven't heard much from
anyone in a while on this).

Cheers,

Shawn

[1]
http://developer.mozilla.org/en/docs/index.php?title=User_talk:Comrade693&oldid=71855#Async_Storage_Statements

Adam Kowalczyk

unread,
May 28, 2008, 5:59:37 PM5/28/08
to
Overall, things are looking pretty good to me. Some thoughts:

1) How much information is going to be provided to
storageIStatementCallback.handleError()? The SQL error code and the
corresponding error string, for sure. If that's all, then perhaps they
could be passed as parameters and we could get rid of storageIError. On
the other hand, having an object which you can throw may also be useful.

2) The established nomenclature in existing interfaces uses "row", not
"tuple". See mozIStorageStatementRow, mozIStorageStatementWrapper.row.
"Row" is also used in descriptions of other attributes and methods.

3) Which methods of the callback are used by non-SELECT statements?
handleError() and handleCompletion(), I assume?

4) I am not convinced of the value of separating the iterator
(storageIResultSet) and a result (storageITuple). The iterator could
easily provide the results, too. I know this approach causes horrible
overloading of the statement object in case of synchronous statements.
However, storageIResultSet doesn't share all the other functionality of
a statement, it only handles the results. Separating things further
seems like going from one extreme to the other.

The practical consequence for the consumers is a little less concise
code, because of an additional variable. Also, you have to choose
between assigning that variable inside |while|, which is considered a
bad practice by some people, and calling getNextTuple() in two different
places. [1]

Ideally, handleResult() would take a JS array of storageITuple's and
storageIResultSet wouldn't be needed.

[1] http://pastebin.mozilla.org/445406

Sergey Yanovich

unread,
May 28, 2008, 7:54:44 PM5/28/08
to
Adam Kowalczyk wrote:
> 1) How much information is going to be provided to
> storageIStatementCallback.handleError()? The SQL error code and the
> corresponding error string, for sure. If that's all, then perhaps they
> could be passed as parameters and we could get rid of storageIError. On
> the other hand, having an object which you can throw may also be useful.

I would go further and only return nsresult, at least for the beginning.
String manipulations are expensive, especially GC string manipulations
[1].

> 2) The established nomenclature in existing interfaces uses "row", not
> "tuple". See mozIStorageStatementRow, mozIStorageStatementWrapper.row.
> "Row" is also used in descriptions of other attributes and methods.

++here. And do we really need 'storageITutle'? Maybe we can just extend
mozIStorageValueArray to have those nice two methods?

> 4) I am not convinced of the value of separating the iterator
> (storageIResultSet) and a result (storageITuple). The iterator could
> easily provide the results, too. I know this approach causes horrible
> overloading of the statement object in case of synchronous statements.
> However, storageIResultSet doesn't share all the other functionality of
> a statement, it only handles the results. Separating things further
> seems like going from one extreme to the other.
>
> The practical consequence for the consumers is a little less concise
> code, because of an additional variable. Also, you have to choose
> between assigning that variable inside |while|, which is considered a
> bad practice by some people, and calling getNextTuple() in two different
> places. [1]

I had the same concerns, and was even proposing to use ExecuteStep()
here. But ExecuteStep() is begging to be replaced, and we are discussing
the new API. So we can just have the things right the first time.

> Ideally, handleResult() would take a JS array of storageITuple's and
> storageIResultSet wouldn't be needed.

We cannot communicate JS array via XPCOM to native callers, can we?

--
Sergey Yanovich

Shawn Wilsher

unread,
May 29, 2008, 11:15:13 AM5/29/08
to
Sergey Yanovich wrote:
> I would go further and only return nsresult, at least for the beginning.
> String manipulations are expensive, especially GC string manipulations
> [1].
Why not both in that case? Does it really hurt?

> ++here. And do we really need 'storageITutle'? Maybe we can just extend
> mozIStorageValueArray to have those nice two methods?

Can't - see Bug 435994 Comment 5 [1] for the detailed story as to why.

> We cannot communicate JS array via XPCOM to native callers, can we?

Not that I'm aware of.

Cheers,

Shawn

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=435994#c5

Mike Shaver

unread,
May 29, 2008, 11:22:21 AM5/29/08
to Shawn Wilsher, dev-pl...@lists.mozilla.org
On Thu, May 29, 2008 at 11:15 AM, Shawn Wilsher <sdw...@mozilla.com> wrote:
> Sergey Yanovich wrote:
>> I would go further and only return nsresult, at least for the beginning.
>> String manipulations are expensive, especially GC string manipulations
>> [1].
> Why not both in that case? Does it really hurt?

Error-case performance isn't something I think we need to be too
concerned with; if the error checking is in your hot path, you have
problems that the API isn't going to help you with.

Mike

Shawn Wilsher

unread,
May 29, 2008, 11:34:27 AM5/29/08
to
Adam Kowalczyk wrote:
> 1) How much information is going to be provided to
> storageIStatementCallback.handleError()? The SQL error code and the
> corresponding error string, for sure. If that's all, then perhaps they
> could be passed as parameters and we could get rid of storageIError. On
> the other hand, having an object which you can throw may also be useful.
Those two things for sure, but otherwise I do not know. Personally, I'd
prefer to use the object since we use objects for most other callbacks,
and the object will scale better if we decided to add more information
later.

> 2) The established nomenclature in existing interfaces uses "row", not
> "tuple". See mozIStorageStatementRow, mozIStorageStatementWrapper.row.
> "Row" is also used in descriptions of other attributes and methods.

It seems like vlad would prefer row over tuple as well. I'm fairly
indifferent about this.

> 3) Which methods of the callback are used by non-SELECT statements?
> handleError() and handleCompletion(), I assume?

That is likely the case.

> 4) I am not convinced of the value of separating the iterator
> (storageIResultSet) and a result (storageITuple). The iterator could
> easily provide the results, too. I know this approach causes horrible
> overloading of the statement object in case of synchronous statements.
> However, storageIResultSet doesn't share all the other functionality of
> a statement, it only handles the results. Separating things further
> seems like going from one extreme to the other.

I really don't like the idea of overloading an objects meaning like
that. I know we do that already with mozIStorageStatement, but it
doesn't mean we should continue to do that if it can be avoided.

> The practical consequence for the consumers is a little less concise
> code, because of an additional variable. Also, you have to choose
> between assigning that variable inside |while|, which is considered a
> bad practice by some people, and calling getNextTuple() in two different
> places. [1]

Not sure I'm buying this argument. What's wrong with:
handleResult: function(aResultSet)
{
for (let tuple = aResultSet.getNextTuple(); tuple;
tuple = aResultSet.getNextTuple()) {
// Stuff
}
}
or, even cleaner:
handleResult: function(aResultSet)
{
let tuple;
while ((tuple = aResultSet.getNextTuple()) {
// Stuff
}
}

Like any good rule, there are always exceptions to it, and in this case,
I don't see why assigning in while is bad.

> Ideally, handleResult() would take a JS array of storageITuple's and
> storageIResultSet wouldn't be needed.

That wouldn't work well for native code.

Cheers,

Shawn

Mike Shaver

unread,
May 29, 2008, 11:54:34 AM5/29/08
to Shawn Wilsher, dev-pl...@lists.mozilla.org
On Thu, May 29, 2008 at 11:34 AM, Shawn Wilsher <sdw...@mozilla.com> wrote:
>> Ideally, handleResult() would take a JS array of storageITuple's and
>> storageIResultSet wouldn't be needed.
> That wouldn't work well for native code.

We can, and in cases where the win is big enough should, specialize
for different consumer languages.

But an array of interface pointers isn't too hard to work with from
native code, IIRC, as long as it's an in-param.

Mike

Shawn Wilsher

unread,
May 29, 2008, 12:05:49 PM5/29/08
to
Mike Shaver wrote:
> We can, and in cases where the win is big enough should, specialize
> for different consumer languages.
Right, and I strongly support that. I guess the question at this point
is do we want to use an enumerator for this API (like the current
proposal), or pass an array type structure?

> But an array of interface pointers isn't too hard to work with from
> native code, IIRC, as long as it's an in-param.

That involves passing the data and the size? Does that turn into a JS
array in JS land, or no?

Cheers,

Shawn

Sergey Yanovich

unread,
May 29, 2008, 3:03:56 PM5/29/08
to
Mike Shaver wrote:
> We can, and in cases where the win is big enough should, specialize
> for different consumer languages.

In our case, specializing for JS in mozStorage will add
mozStorage-over-JS dependency, right? Is that worth it?

--
Sergey Yanovich

Mike Shaver

unread,
May 29, 2008, 3:34:22 PM5/29/08
to Sergey Yanovich, dev-pl...@lists.mozilla.org

It's already built after JS and XPConnect, and I don't think it's a
design goal at all to have parts of toolkit work without JS and
XPConnect present.

Mike

Benjamin Smedberg

unread,
May 29, 2008, 3:46:39 PM5/29/08
to

Actually no, storage is currently built as part of tier_necko, which is
before gecko or toolkit.

But I agree, we should not base design decisions on that kind of API purity,
especially because we want to move towards clean JS APIs as our primary
programming interfaces.

--BDS

Shawn Wilsher

unread,
May 30, 2008, 11:07:16 AM5/30/08
to
While trying to test cancelability, I've realized an API change might be
in call. In paticular, I think changing the signature of
handleCompletion is needed:

const unsigned short REASON_COMPLETED = 0;
const unsigned short REASON_CANCELED = 1;
const unsigned short REASON_ERROR = 2;

void handleCompletion(in unsigned short aReason);

Problem with this is that I'm still not sure that I can reliably test
this code. The issue is that you could cancel at just the right time,
and the event to notify you about completion would already be posted to
the calling thread, so you expect to get a REASON_CANCELED, but get
REASON_COMPLETED instead. This will make the test unreliable :(

Maybe I'm missing something here though, so I'm hoping the masses have
some other insight.

Cheers,

Shawn

Sergey Yanovich

unread,
May 30, 2008, 11:36:01 AM5/30/08
to
Shawn Wilsher wrote:
> While trying to test cancelability, I've realized an API change might be
> in call. In paticular, I think changing the signature of
> handleCompletion is needed:
>
> const unsigned short REASON_COMPLETED = 0;
> const unsigned short REASON_CANCELED = 1;
> const unsigned short REASON_ERROR = 2;
>
> void handleCompletion(in unsigned short aReason);

It may also be good to know which request is completed.

> Problem with this is that I'm still not sure that I can reliably test
> this code. The issue is that you could cancel at just the right time,
> and the event to notify you about completion would already be posted to
> the calling thread, so you expect to get a REASON_CANCELED, but get
> REASON_COMPLETED instead. This will make the test unreliable :(

Don't post events directly, but to a queue. Call a queue handler on the
callback thread. The handler can access the queue with some mutex
protection, and invoke the callback. The canceling routine will also use
mutex to clean up the queue. And if the handler sees an empty queue, it
silently exits.

--
Sergey Yanovich

Shawn Wilsher

unread,
May 30, 2008, 12:02:47 PM5/30/08
to
Sergey Yanovich wrote:
> It may also be good to know which request is completed.
How? We don't tell you what request you are getting information on with
any of the other callback functions either.

> Don't post events directly, but to a queue. Call a queue handler on the
> callback thread. The handler can access the queue with some mutex
> protection, and invoke the callback. The canceling routine will also use
> mutex to clean up the queue. And if the handler sees an empty queue, it
> silently exits.

I see two issues with that approach:
1) It's a no-no to call code outside your module while you hold a mutex.
2) That still doesn't cover the case when it's actually completed and
you try to cancel. Telling the callback that it's completed because
it's canceled, while it may be true, isn't completely true because it
was in fact completed because it was complete.

Cheers,

Shawn

Sergey Yanovich

unread,
May 30, 2008, 12:22:48 PM5/30/08
to
Shawn Wilsher wrote:
> Sergey Yanovich wrote:
>> It may also be good to know which request is completed.
> How? We don't tell you what request you are getting information on with
> any of the other callback functions either.

Yeah. This can be achieved by a separate Callback object for each request.

>> Don't post events directly, but to a queue. Call a queue handler on
>> the callback thread. The handler can access the queue with some mutex
>> protection, and invoke the callback. The canceling routine will also
>> use mutex to clean up the queue. And if the handler sees an empty
>> queue, it silently exits.
> I see two issues with that approach:
> 1) It's a no-no to call code outside your module while you hold a mutex.

You don't need to to hold the mutex while invoking clients callback.
Only for queue access.

> 2) That still doesn't cover the case when it's actually completed and
> you try to cancel. Telling the callback that it's completed because
> it's canceled, while it may be true, isn't completely true because it
> was in fact completed because it was complete.

It actually does. The canceling code will know for sure whether the
request is completed. So the canceling may provide this type of feedback.

--
Sergey Yanovich

Shawn Wilsher

unread,
May 30, 2008, 1:10:51 PM5/30/08
to
Sergey Yanovich wrote:
> You don't need to to hold the mutex while invoking clients callback.
> Only for queue access.
Ah, yes - got it. I don't think I understood you correctly the first time.

I'll work on doing this now.

Cheers,

Shawn

Shawn Wilsher

unread,
May 30, 2008, 3:52:11 PM5/30/08
to
Sergey Yanovich wrote:
> It actually does. The canceling code will know for sure whether the
> request is completed. So the canceling may provide this type of feedback.
Actually, if we just track what state we are in better, we don't even
need to do the other suggestion of yours. As long as we know what state
we are in, we'll know if we have already dispatched the completion event
immediately, and can indicate so if execution is canceled (by throwing,
which sorta sucks)

Cheers,

Shawn

Sergey Yanovich

unread,
May 30, 2008, 4:23:34 PM5/30/08
to

Absolutely. That's exactly what I mean. And we should avoid throwing in
the normal flow of execution whenever possible. JS exceptions are
terribly expensive. It triggers memory scan for catch blocks in JS, then
the exception goes through xpconnect handling, which is long and
sub-optimal, because noone normally cares about error codepaths.

--
Sergey Yanovich

Dan Mosedale

unread,
May 30, 2008, 4:40:36 PM5/30/08
to
Shawn Wilsher wrote:
> Sergey Yanovich wrote:
>> It may also be good to know which request is completed.

Agreed; this allows a single object to listen for multiple outstanding
requests.

> How? We don't tell you what request you are getting information on with
> any of the other callback functions either.

Have all of the handle*() methods pass back the mozIStorageStatement
that triggered them as a param?

Dan

Shawn Wilsher

unread,
May 30, 2008, 4:39:25 PM5/30/08
to
Sergey Yanovich wrote:
> Absolutely. That's exactly what I mean. And we should avoid throwing in
> the normal flow of execution whenever possible. JS exceptions are
> terribly expensive. It triggers memory scan for catch blocks in JS, then
> the exception goes through xpconnect handling, which is long and
> sub-optimal, because noone normally cares about error codepaths.
We may want to change the API from nsICancelable to our own cancelable
thing then that returns a boolean if it worked or not?

Cheers,

Shawn

Shawn Wilsher

unread,
May 30, 2008, 5:00:30 PM5/30/08
to
Dan Mosedale wrote:
>> Sergey Yanovich wrote:
>>> It may also be good to know which request is completed.
>
> Agreed; this allows a single object to listen for multiple outstanding
> requests.
That's not really a good programming practice, now is it? Overloading
an object to do more than one thing?

>> How? We don't tell you what request you are getting information on with
>> any of the other callback functions either.
>
> Have all of the handle*() methods pass back the mozIStorageStatement
> that triggered them as a param?

Right, but the problem is that we don't use the same one you prepared.
We clone it and use it on the background since statements cannot be used
on different threads at the same time. Also, you would be unable to
use the statement on the calling thread until the background thread was
done with it. This isn't so useful if you want to do a series of async
events with the same statement.

Cheers,

Shawn

Sergey Yanovich

unread,
May 30, 2008, 5:22:37 PM5/30/08
to

I think so, but it depends on whether canceling a completed request is
considered a normal or not ;) If it is, it may even be worth to change
nsICancelable.

--
Sergey Yanovich

Dan Mosedale

unread,
May 30, 2008, 5:24:08 PM5/30/08
to
Shawn Wilsher wrote:
> Dan Mosedale wrote:
>>> Sergey Yanovich wrote:
>>>> It may also be good to know which request is completed.
>>
>> Agreed; this allows a single object to listen for multiple outstanding
>> requests.
>>
> That's not really a good programming practice, now is it? Overloading an
> object to do more than one thing?

I don't have a particularly strong opinion there. My hunch is that it's
somewhat a case-by-case sort of thing, but I'd be interested in hearing
other thoughts there.

>>> How? We don't tell you what request you are getting information on with
>>> any of the other callback functions either.
>>
>> Have all of the handle*() methods pass back the mozIStorageStatement
>> that triggered them as a param?
> Right, but the problem is that we don't use the same one you prepared.
> We clone it and use it on the background since statements cannot be used
> on different threads at the same time.

You could give each statement an integer ID. Alternately, perhaps the
statement API could make slightly stronger thread-safety promises.

> Also, you would be unable to use the statement on the calling thread until
> the background thread was done with it.

Even though the background thread is using a clone?

Dan

Mike Shaver

unread,
May 30, 2008, 5:29:49 PM5/30/08
to Shawn Wilsher, dev-pl...@lists.mozilla.org
On Fri, May 30, 2008 at 5:00 PM, Shawn Wilsher <sdw...@mozilla.com> wrote:
> Dan Mosedale wrote:
>>> Sergey Yanovich wrote:
>>>> It may also be good to know which request is completed.
>>
>> Agreed; this allows a single object to listen for multiple outstanding
>> requests.
> That's not really a good programming practice, now is it? Overloading
> an object to do more than one thing?

It's a very common practice -- how else do you rendezvous multiple
async operations? Every select() or poll() loop is like this.
Objects do multiple things all over the place in codebase, which is
why we have QI and why we try to avoid name collisions between members
of related interfaces.

If we split the history and bookmarks portions of Places, f.e., the
AwesomeBar would want to listen for the results from the searches it
kicked off in parallel on each, and handle them largely the same way.

(Personal annoyance: "not a good programming practice" is almost
content-free; say what the problem is that you are trying to avoid,
and how the practice you advocate helps.)

Mike

Shawn Wilsher

unread,
May 30, 2008, 5:29:25 PM5/30/08
to
Dan Mosedale wrote:
> Even though the background thread is using a clone?
That was in reference to the case of using the same statement. When we
clone, it doesn't matter.

Cheers,

Shawn

Shawn Wilsher

unread,
May 30, 2008, 6:03:37 PM5/30/08
to
Mike Shaver wrote:
> It's a very common practice -- how else do you rendezvous multiple
> async operations? Every select() or poll() loop is like this.
> Objects do multiple things all over the place in codebase, which is
> why we have QI and why we try to avoid name collisions between members
> of related interfaces.
Report back to some object, or carry some shared state between the
callbacks? There are several ways to accomplish this as well.

> If we split the history and bookmarks portions of Places, f.e., the
> AwesomeBar would want to listen for the results from the searches it
> kicked off in parallel on each, and handle them largely the same way.

I'm not really buying this argument since bookmarks are a subset of
history almost always (and enough so that they can be considered as such).

> (Personal annoyance: "not a good programming practice" is almost
> content-free; say what the problem is that you are trying to avoid,
> and how the practice you advocate helps.)

In this case, I just recall reading somewhere that overloading objects
to do more than one thing can end up getting quite confusing. An object
that does one thing and only one thing can do it well much easier than
an object that does lots of things and has to do all of them really
well. With that said, it's just a nagging feeling in the back of my
head that I'm not sure if it's worth listening too, but wanted to bring
it up. It's one of those "strong opinions, weakly held" [1].

Cheers,

Shawn

[1] http://www.codinghorror.com/blog/archives/001124.html

Shawn Wilsher

unread,
May 30, 2008, 6:12:11 PM5/30/08
to
Dan Mosedale wrote:
> You could give each statement an integer ID. Alternately, perhaps the
> statement API could make slightly stronger thread-safety promises.
re: integer id
How would we communicate that to the callback initially? I'm not sure
this really gets us anywhere useful - code samples maybe?

re: thread-safety
I thought about this initially, but I had a concern:
Many consumers would still want to execute statements synchronously
(some API's are inherently sync, and can't be changed to async without
breaking API compatibility), so do we really want to add a lot of
locking overhead for something that isn't the common case (at least
initially, that may change over time making this no longer an issue).

This seems to be a big concern, since I know there are a fair number of
statements that run during startup too.


Cheers,

Shawn

Shawn Wilsher

unread,
May 30, 2008, 7:16:37 PM5/30/08
to
Sergey Yanovich wrote:
> I think so, but it depends on whether canceling a completed request is
> considered a normal or not ;) If it is, it may even be worth to change
> nsICancelable.
So, it turns out that canceling kinda sucks as it stands. Take a look
at my latest attachement for tests [1], and you'll see what I mean
(towards the bottom). I think it might be a good idea to change what
mozIStorageStatement::executeAsync returns to being a new interface:

interface storageICancelable : nsISupports
{
/**
* Attempts to canceled.
*
* @returns true if it was successfully canceled, and false if it
* could not be canceled.
*/
boolean cancel();
}

Thoughts?

Cheers,

Shawn

[1] https://bugzilla.mozilla.org/attachment.cgi?id=323165

Shawn Wilsher

unread,
May 30, 2008, 7:22:27 PM5/30/08
to
I've updated the wiki again to reflect the changes for the
handleCompletion method [1].

It now has a parameter that indicates the reason for the completion,
which can be a normal completion, cancellation, or a fatal error.

Cheers,

Shawn

[1]
http://developer.mozilla.org/en/docs/index.php?title=User_talk:Comrade693&oldid=71937#storageIStatementCallback

Dan Mosedale

unread,
Jun 2, 2008, 7:02:26 PM6/2/08
to
Shawn Wilsher wrote:
> Dan Mosedale wrote:
>> You could give each statement an integer ID. Alternately, perhaps the
>> statement API could make slightly stronger thread-safety promises.
> re: integer id
> How would we communicate that to the callback initially? I'm not sure
> this really gets us anywhere useful - code samples maybe?

Either extend nsICancelable or return an integer instead of an
nsICancellable.

> re: thread-safety
> I thought about this initially, but I had a concern:
> Many consumers would still want to execute statements synchronously
> (some API's are inherently sync, and can't be changed to async without
> breaking API compatibility), so do we really want to add a lot of
> locking overhead for something that isn't the common case (at least
> initially, that may change over time making this no longer an issue).

One could imagine making a threadsafety frob on the API that could be
adjusted at some appropriate level of granularity (per statement?) and
would default to off.

Dan

Shawn Wilsher

unread,
Jun 10, 2008, 5:58:31 PM6/10/08
to Mike Shaver
Previously we've discussed the "proper" grouping of results in this
thread. For the time being we decided to settle on dispatching just one
row per result set, but have the ability to give more if we decide to do
so in the future.

When playing with the API more in Bug 438342 (build the download list
asynchronously) [1], I've come to the conclusion that this behavior
isn't ideal for any moderately large list. When drawing the downloads
window, it would remain blank for three to four seconds, then things
would display. I should note that the browser UI was fully responsive
during this time - the downloads window just didn't draw as fast as it
does now. Now, there are a few reasons why I suspect this is happening:
1) The downloads.sqlite file I'm generating this from has 590+ entries.
This is more than most users would have, but I'm using it as a good
way to stress test our API.
2) The query that is used by the download manager has an ORDER BY
clause, which means SQLite executes the query and obtains *all* the
results, then sorts them. This makes this a great query to run on the
background, but at the same time means that results are obtained very
quickly. As a result, we have 590+ events dispatched to the main thread
in a very short amount of time.

Personally, I think this is a problem with the current implementation.
I think we would be better off grouping more than one row per result set
dispatched. This is what I feel would be a good grouping strategy to
implement. There is a disclaimer on this though - I haven't tested
anything with these changes. I'm looking for feedback on this to see if
I'm blatantly missing something obvious, etc. With that said, we should
group results by the following conditions (whichever one comes first):
1) 200 milliseconds have elapsed since we last dispatched a result set
to the callback and we have at least one row. This number isn't just
pulled out of the air! 200-250 ms is generally accepted as the "safe"
amount of time to wait before displaying some form of feedback to the
user that something is happening.
2) 10-20 results. This number would need some experimentation, but the
idea is to not have too much in one result set, so the processing
function doesn't get a ton of things to process and manage to lock up
the main thread.
3) when we've run out of results. I shouldn't have to explain this one.

I figured it's best to get these thoughts out here in the open so folks
can comment on them.

Cheers,

Shawn

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=438342

Justin Dolske

unread,
Jun 10, 2008, 6:28:30 PM6/10/08
to
Shawn Wilsher wrote:

> When playing with the API more in Bug 438342 (build the download list
> asynchronously) [1], I've come to the conclusion that this behavior
> isn't ideal for any moderately large list. When drawing the downloads
> window, it would remain blank for three to four seconds, then things
> would display. I should note that the browser UI was fully responsive
> during this time - the downloads window just didn't draw as fast as it
> does now. Now, there are a few reasons why I suspect this is happening:

How much of that is just due to the perf issues in bug 413209?

In any case, batching query results doesn't seem unreasonable.

> Personally, I think this is a problem with the current implementation. I
> think we would be better off grouping more than one row per result set
> dispatched. This is what I feel would be a good grouping strategy to
> implement.

The caller might want to have some control over this?

You might also want to consider having the batching only kick in after
the first N results, so that initial results can come in quickly.

200-250ms also seems on the high side for UI feedback. Also consider
that the caller might have yet more work to do before presenting
results. And some callers might not be driving any UI with the result
(but may way immediate results. or not.)

Justin

Shawn Wilsher

unread,
Jun 11, 2008, 12:14:00 AM6/11/08
to Justin Dolske
Justin Dolske wrote:
> How much of that is just due to the perf issues in bug 413209?
No, that bug is about a deliberate delay that we add in the download
manager currently. The bug I mentioned removes it entirely and relies
on the storage API.

> The caller might want to have some control over this?
>
> You might also want to consider having the batching only kick in after
> the first N results, so that initial results can come in quickly.

That's covered by point (2) in my post, right?

> 200-250ms also seems on the high side for UI feedback. Also consider
> that the caller might have yet more work to do before presenting
> results. And some callers might not be driving any UI with the result
> (but may way immediate results. or not.)

OK, so 150-200 ms. 200-250 ms is generally considered to be a safe
amount of time to delay feedback for a user's action.

We still don't want to dump too much back to the caller at once (iff
it's on the main thread) because even if it doesn't drive UI, it can
block the UI. We can have a much simpler heuristic for callbacks that
do not live on the main thread though.

I'm not so sure that consumers that want results immediately should be
using the async api in the first place. That seems like synchronous is
the way to go then.

Cheers,

Shawn

Mike Shaver

unread,
Jun 11, 2008, 12:27:20 AM6/11/08
to Shawn Wilsher, dev-pl...@lists.mozilla.org
On Wed, Jun 11, 2008 at 12:14 AM, Shawn Wilsher <sdw...@mozilla.com> wrote:
> I'm not so sure that consumers that want results immediately should be
> using the async api in the first place. That seems like synchronous is
> the way to go then.

Sync hurts if it can interrupt the user, such as when triggering an
awesomebar query during typing, or if you're on a slow/busy
filesystem. It should be avoided, like sync XHR.

It'd be good to get some data on what the batch size looks like at 10,
25, 50, 100, 150, 200, etc. ms in your test setup. I agree that we
want some batching, but I'm not sure why firing N 1-entry callbacks is
worse than firing N/M M-entry callbacks in terms of starvation; is the
per-callback invocation cost really so high? I don't get why it would
have different effects on event scheduling.

Edumacate me! :)

Mike

Mike Shaver

unread,
Jun 11, 2008, 12:28:45 AM6/11/08
to Shawn Wilsher, dev-pl...@lists.mozilla.org
On Wed, Jun 11, 2008 at 12:14 AM, Shawn Wilsher <sdw...@mozilla.com> wrote:
>> 200-250ms also seems on the high side for UI feedback. Also consider
>> that the caller might have yet more work to do before presenting
>> results. And some callers might not be driving any UI with the result
>> (but may way immediate results. or not.)
> OK, so 150-200 ms. 200-250 ms is generally considered to be a safe
> amount of time to delay feedback for a user's action.

There's a difference between "delay feedback" and "interrupt user".
Try typing in an unbuffered ssh session with even 100ms latency, and
you'll see why. :)

Mike

Justin Dolske

unread,
Jun 11, 2008, 12:45:07 AM6/11/08
to
Shawn Wilsher wrote:

> OK, so 150-200 ms. 200-250 ms is generally considered to be a safe
> amount of time to delay feedback for a user's action.

I'm mostly just wary of making perf decisions for a core framework based
on average overall UI response times... It's a good perf metric for the
top-level product ("no UI response delayed by > 200 ms"), but I wouldn't
want to end up in a state where multiple layers between the user and the
data is assuming it can delay things by 200ms.

> I'm not so sure that consumers that want results immediately should be
> using the async api in the first place. That seems like synchronous is
> the way to go then.

That seems orthogonal. Async shouldn't have to mean high-latency.

Justin

Shawn Wilsher

unread,
Jun 11, 2008, 3:04:10 PM6/11/08
to Mike Shaver
Mike Shaver wrote:
> It'd be good to get some data on what the batch size looks like at 10,
> 25, 50, 100, 150, 200, etc. ms in your test setup. I agree that we
> want some batching, but I'm not sure why firing N 1-entry callbacks is
> worse than firing N/M M-entry callbacks in terms of starvation; is the
> per-callback invocation cost really so high? I don't get why it would
> have different effects on event scheduling.
I can't get any consistency with my test results. However, I'd like to
change my statement about the UI being completely responsive. Last time
I was just switching between windows, which works out fine. However,
what doesn't work so well is actually trying to use the UI (like typing
a url into the location bar and pressing enter for it to load). Nothing
drew on the screen for a very very long time (at least until the
downloads window was fully loaded and fully drawn as far as I could
tell). During this time of drawing, pressing cmd + w to try and close a
window would result in this in the console too:
WARNING: GetGeckoKeyCodeFromChar has failed.: file
/Code/moz/central/mozilla/widget/src/cocoa/nsChildView.mm, line 3796

It should be noted that I'm testing this in a debug build, so it's bound
to be a bit slower.

Cheers,

Shawn

Shawn Wilsher

unread,
Jun 12, 2008, 2:50:26 PM6/12/08
to Mike Shaver
Mike Shaver wrote:
> Edumacate me! :)
Sure thing. I was playing around with shark, with this patch applied to
the download manager:
http://hg.mozilla.org/users/sdwilsh_shawnwilsher.com/storage-async/index.cgi/file/0d48c200ffb3/dm.shark

And then applied this on top of the current async storage work:
http://hg.mozilla.org/users/sdwilsh_shawnwilsher.com/storage-async/index.cgi/file/0d48c200ffb3/storage.shark-playing

I'm going to paste some of the results in this message - I saved this
shark profiles so I can get more data from them. (I apologize for any
formatting issues in advance)

Size = 1
16.6% 16.6% libSystem.B.dylib mach_msg_trap
2.7% 2.7% XUL XPCWrappedNative::CallMethod(XPCCallContext&,
XPCWrappedNative::CallMode)
2.2% 2.2% libmozjs.dylib js_Interpret
2.1% 2.1% libmozjs.dylib SearchTable(JSDHashTable*, void const*,
unsigned, JSDHashOperator)
2.1% 2.1% libSystem.B.dylib pread$UNIX2003
1.9% 1.9% XUL XPCWrappedNative::FindTearOff(XPCCallContext&,
XPCNativeInterface*, int, unsigned*)
1.7% 1.7% libmozjs.dylib js_LookupPropertyWithFlags
1.6% 1.6% libSystem.B.dylib getattrlist$UNIX2003
1.6% 1.6% libnspr4.dylib _PR_Darwin_x86_AtomicDecrement
1.5% 1.5% libnspr4.dylib _PR_Darwin_x86_AtomicIncrement
1.2% 1.2% XUL XPCConvert::NativeInterface2JSObject(XPCCallContext&,
nsIXPConnectJSObjectHolder**, nsISupports*, nsID const*, JSObject*, int,
int, unsigned*)
1.2% 1.2% XUL SearchTable
1.2% 1.2% libSystem.B.dylib pthread_mutex_unlock
1.2% 1.2% commpage [libSystem.B.dylib] __spin_lock
1.1% 1.1% libmozjs.dylib JS_GetReservedSlot

Size = 10
10.9% 10.9% libSystem.B.dylib mach_msg_trap
3.4% 3.4% libmozjs.dylib js_Interpret
2.4% 2.4% XUL XPCWrappedNative::CallMethod(XPCCallContext&,
XPCWrappedNative::CallMode)
1.9% 1.9% XUL XPCWrappedNative::FindTearOff(XPCCallContext&,
XPCNativeInterface*, int, unsigned*)
1.9% 1.9% libmozjs.dylib js_LookupPropertyWithFlags
1.7% 1.7% XUL SearchTable
1.7% 1.7% libSystem.B.dylib pread$UNIX2003
1.7% 1.7% libnspr4.dylib PR_AtomicIncrement
1.7% 1.7% commpage [libSystem.B.dylib] __memcpy
1.5% 1.5% libnspr4.dylib _PR_Darwin_x86_AtomicIncrement
1.3% 1.3% libmozjs.dylib js_Invoke
1.3% 1.3% libSystem.B.dylib getattrlist$UNIX2003
1.3% 1.3% commpage [libSystem.B.dylib] __spin_lock
1.1% 1.1% XUL XBLResolve(JSContext*, JSObject*, long, unsigned, JSObject**)
1.1% 1.1% XUL SelectorMatches(RuleProcessorData&, nsCSSSelector*, int,
nsIAtom*, int*)
1.1% 1.1% libmozjs.dylib SearchTable(JSDHashTable*, void const*,
unsigned, JSDHashOperator)
1.1% 1.1% libSystem.B.dylib pthread_mutex_unlock
1.1% 1.1% XUL PL_DHashTableOperate
1.1% 1.1% XUL
nsCSSFrameConstructor::FindFrameWithContent(nsFrameManager*, nsIFrame*,
nsIContent*, nsIContent*, nsFindFrameHint*)
1.1% 1.1% HIServices ICMapping::GetIndMapEntry(long, long*, ICMapEntry*)
1.1% 1.1% HIServices ICDataConverter::GetMappingFromPropList(void
const*, long, void*, long*)

Size = 25
17.4% 17.4% libSystem.B.dylib mach_msg_trap
4.4% 4.4% libmozjs.dylib js_Interpret
2.2% 2.2% XUL XPCWrappedNative::CallMethod(XPCCallContext&,
XPCWrappedNative::CallMode)
2.2% 2.2% libmozjs.dylib js_LookupPropertyWithFlags
2.0% 2.0% libSystem.B.dylib pread$UNIX2003
2.0% 2.0% commpage [libSystem.B.dylib] __spin_lock
1.8% 1.8% XUL XPCWrappedNative::FindTearOff(XPCCallContext&,
XPCNativeInterface*, int, unsigned*)
1.5% 1.5% XUL SearchTable
1.5% 1.5% libmozjs.dylib js_SearchScope
1.3% 1.3% libmozjs.dylib SearchTable(JSDHashTable*, void const*,
unsigned, JSDHashOperator)
1.1% 1.1% libmozjs.dylib js_Invoke
1.1% 1.1% commpage [libSystem.B.dylib] __memcpy
1.1% 1.1% commpage [libSystem.B.dylib] __bzero

Size = 50
14.8% 14.8% libSystem.B.dylib close$UNIX2003
9.5% 9.5% libSystem.B.dylib mach_msg_trap
3.2% 3.2% libSystem.B.dylib pread$UNIX2003
2.8% 2.8% libmozjs.dylib js_Interpret
2.5% 2.5% XUL XPCWrappedNative::CallMethod(XPCCallContext&,
XPCWrappedNative::CallMode)
2.5% 2.5% libmozjs.dylib SearchTable(JSDHashTable*, void const*,
unsigned, JSDHashOperator)
1.9% 1.9% commpage [libSystem.B.dylib] __memcpy
1.7% 1.7% XUL XPCWrappedNative::FindTearOff(XPCCallContext&,
XPCNativeInterface*, int, unsigned*)
1.5% 1.5% XUL SearchTable
1.5% 1.5% commpage [libSystem.B.dylib] __spin_lock
1.3% 1.3% libmozjs.dylib js_LookupPropertyWithFlags
1.1% 1.1% libSystem.B.dylib pthread_mutex_lock

Some comments on all this:
We tend to spend a lot of % of scope in mach_msg_trap - and a good chunk
of this is in CoreGraphics. The amount in OSServices seems to drop,
with the exception of size = 25 (I should probably re-run that since it
seems like an anomaly all around) as the size increases.
For size = 1, we are in CFRunLoopRunSpecific (again with mach_msg_trap),
but CoreGraphics still beats it by a long shot.

So, I think in this case our biggest hit is all the drawing. With that
said, I've never used shark before today, so I'm not sure if I got all
the useful information out of it I could have, or if there's something
else I should have gotten. Feedback is, of course, appreciated.

Cheers,

Shawn

Shawn Wilsher

unread,
Jun 12, 2008, 4:44:35 PM6/12/08
to Mike Shaver
More on this. I've uploaded the shark profiles (per bsmedberg's
request), and they can be found here:
http://files.shawnwilsher.com/2008/6/12/

I re-ran the 25 size one (size = 25(2).mshark), and get similar results,
so maybe it's not an anomaly. Note, this is adding ~590 download items
to the download manager window.

Follow-ups to dev.platform.

Cheers,

Shawn

Sergey Yanovich

unread,
Jun 13, 2008, 10:47:17 AM6/13/08
to
I've dived a bit deeper into Mozilla's transport layer. I am confident
now, that "correct' API implementation should base mozIStorageStatement
of nsIRequest. The latter not only provides all control methods
discussed in this thread, but also defines callbacks and allows seamless
integration with existing progress/notification UI framework.

I am going to work on this in my project, and will report if there are
any results.

--
Sergey Yanovich

Shawn Wilsher

unread,
Jun 13, 2008, 11:49:49 AM6/13/08
to Sergey Yanovich
Sergey Yanovich wrote:
> I've dived a bit deeper into Mozilla's transport layer. I am confident
> now, that "correct' API implementation should base mozIStorageStatement
> of nsIRequest. The latter not only provides all control methods
> discussed in this thread, but also defines callbacks and allows seamless
> integration with existing progress/notification UI framework.
Maybe in terms of the two additional methods it provides (suspend and
resume), but it also has attributes that are unneeded such as loadGroup
and loadFlags.

I'd like to here more of *why* you feel that we need to use nsIRequest
instead of what's proposed (which is pretty similar).

Cheers,

Shawn

Sergey Yanovich

unread,
Jun 13, 2008, 12:16:08 PM6/13/08
to Shawn Wilsher
Shawn Wilsher wrote:
> Maybe in terms of the two additional methods it provides (suspend and
> resume), but it also has attributes that are unneeded such as loadGroup
> and loadFlags.

IIUC, loadGroup is a way to loosely couple specific request with UI (for
both actions and notifications). I am digging the tree for this ATM.

> I'd like to here more of *why* you feel that we need to use nsIRequest
> instead of what's proposed (which is pretty similar).

Right, it is pretty similar, probably even one-for-one equivalent. And
that's the main argument in favor of using pre-existing set of interfaces.

Going down the inheritance chain, we see nsIChannel with two important
methods: open() and asyncOpen(). nsIChannel also defines data transfer
scenarios: either via async callback (nsIStreamListener) or directly
(nsIInputStream). The scenario also has predefined error messages (like
NS_ERROR_IN_PROGRESS or NS_ERROR_ALREADY_OPENED) and notification points
(onStartRequest, onStopRequest), states (range of nsresult values).

In short, there already exists a well-tested async API inside Mozilla,
which is wired into existing user interface schemes.

--
Sergey Yanovich

Shawn Wilsher

unread,
Jun 13, 2008, 1:18:39 PM6/13/08
to Sergey Yanovich
Sergey Yanovich wrote:
> IIUC, loadGroup is a way to loosely couple specific request with UI
> (for both actions and notifications). I am digging the tree for this ATM.
loadFlags don't really apply in our case, but I think I see where you
are going with loadGroup. I'd like to see an actual use case of it for
the storage case though.

> Right, it is pretty similar, probably even one-for-one equivalent. And
> that's the main argument in favor of using pre-existing set of
> interfaces.
>
> Going down the inheritance chain, we see nsIChannel with two important
> methods: open() and asyncOpen(). nsIChannel also defines data transfer
> scenarios: either via async callback (nsIStreamListener) or directly
> (nsIInputStream). The scenario also has predefined error messages
> (like NS_ERROR_IN_PROGRESS or NS_ERROR_ALREADY_OPENED) and
> notification points (onStartRequest, onStopRequest), states (range of
> nsresult values).
What would our callback API be like exactly in this case?

> In short, there already exists a well-tested async API inside Mozilla,
> which is wired into existing user interface schemes.
I'll agree with you, but at the same time, I'm disinclined to make an
API work with our use case because it already exists. I brought up the
issue about changing the nsICancelable that we currently return because
it doesn't really work out well in practice, and I see the same issues
here potentially.

Cheers,

Shawn

Sergey Yanovich

unread,
Jun 13, 2008, 1:42:20 PM6/13/08
to Shawn Wilsher
Shawn Wilsher wrote:
> loadFlags don't really apply in our case, but I think I see where you
> are going with loadGroup. I'd like to see an actual use case of it for
> the storage case though.

Progress bars in download manager and history window, each tracking list
loading.

>> Right, it is pretty similar, probably even one-for-one equivalent. And
>> that's the main argument in favor of using pre-existing set of
>> interfaces.
>>
>> Going down the inheritance chain, we see nsIChannel with two important
>> methods: open() and asyncOpen(). nsIChannel also defines data transfer
>> scenarios: either via async callback (nsIStreamListener) or directly
>> (nsIInputStream). The scenario also has predefined error messages
>> (like NS_ERROR_IN_PROGRESS or NS_ERROR_ALREADY_OPENED) and
>> notification points (onStartRequest, onStopRequest), states (range of
>> nsresult values).
> What would our callback API be like exactly in this case?

Something like:

mozIStorageStatement : nsIChannel
*ResultSet : nsIInputStream
*StatementDataCallback : nsIStreamListener
*StatementStatusCallback : nsIRequestObserver

*Tuple unchanged

no *StatementCallback
no *Error

--
Sergey Yanovich

Shawn Wilsher

unread,
Jun 13, 2008, 2:32:07 PM6/13/08
to Sergey Yanovich
Sergey Yanovich wrote:
> Shawn Wilsher wrote:
> > loadFlags don't really apply in our case, but I think I see where you
> > are going with loadGroup. I'd like to see an actual use case of it for
> > the storage case though.
>
> Progress bars in download manager and history window, each tracking
> list loading.
I don't see how that's helpful since we don't know how many times we'll
be able to call ExecuteStep.

> >> Right, it is pretty similar, probably even one-for-one equivalent. And
> >> that's the main argument in favor of using pre-existing set of
> >> interfaces.
> >>
> >> Going down the inheritance chain, we see nsIChannel with two important
> >> methods: open() and asyncOpen(). nsIChannel also defines data transfer
> >> scenarios: either via async callback (nsIStreamListener) or directly
> >> (nsIInputStream). The scenario also has predefined error messages
> >> (like NS_ERROR_IN_PROGRESS or NS_ERROR_ALREADY_OPENED) and
> >> notification points (onStartRequest, onStopRequest), states (range of
> >> nsresult values).
> > What would our callback API be like exactly in this case?
>
> Something like:
>
> mozIStorageStatement : nsIChannel
This looks like a big binary compatibility change, which means this
can't be for 1.9.1
> *ResultSet : nsIInputStream
I don't see how this mapping works at all.
> *StatementDataCallback : nsIStreamListener
aOffset and aCount don't make sense for our use in onDataAvailable
> *StatementStatusCallback : nsIRequestObserver
I don't really see the benefit of having an onStartRequest in our situation.


Overall, I don't really like this - it feels like we are trying to take
an existing API and force our use case onto it. I think that the end
result would be more developer confusion because of overloaded API
semantics. With that said, we can model our API off what works well in
other API's (I'm all for stealing good ideas).

Cheers,

Shawn

Boris Zbarsky

unread,
Jun 14, 2008, 12:27:51 AM6/14/08
to Joe Drew
Joe Drew wrote:
> We aren't sticking hard and fast to this rule, AIUI. As long as your
> interface isn't frozen, anyways (and mozIStorageStatement doesn't seem
> to be).

That said, we _are_ avoiding such changes if they would have extension fallout.

-Boris

0 new messages