This is something we are looking at for Firefox.next, so the sooner you
give feedback, the better. :)
Cheers,
Shawn
I think we should make async API as similar to the synchronous one as
possible. What's the reason behind basing it on HTML5 SQL? Developers
are already accustomed to existing mozStorage APIs, so we shouldn't
deviate from them without a good reason. Learning curve aside, it would
create a big inconsistency. Let's take a look at the differences:
* AsyncTransaction is a completely different animal than a traditional
transaction.
* mozStorage APIs revolve around statement objects. There's no explicit
statement here.
* Parameters are bound in a different way
* Results are read in a different way
By the way, AsyncTransaction.execute() reads as if the transaction is
executed, not the statement. C.f. mozIStorageStatement.execute().
Another concern is the callback. Let's make it a single function instead
of an object implementing given methods. In the latter case you've got
to implement QI and the code that handles the statement is more
scattered. If you use many async statements, you usually make a
<code>switch</code> and forward the callbacks to separate dedicated
functions, anyway.
Here is the Shawn's example, adapted to use my idea of the API. Please
note that I have no knowledge of the internals of sqlite and mozStorage,
so I don't know how feasible this is.
// conn is a mozIStorageConnection
let sql = "SELECT * FROM moz_downloads WHERE state = ?1";
let stmt = conn.createStatement(sql);
stmt.bindInt32Parameter(0, Ci.nsIDownloadManager.DOWNLOAD_FINISHED);
stmt.executeAsync(callback);
function callback(aStatement, aResult, aError) {
if (aError)
// update UI accordingly
for (let row in aResult) {
// process data
// should be able to access data in a row by name or argument
number
}
}
Alternatively, the callback could be defined in the style of XMLHttpRequest.
stmt.onResultsReady = onResultsReady;
stmt.onError = onError;
stmt.executeAsync();
function onResultsRead(aStatement, aResults) {
}
function onError(aStatement, aError) {
}
- Adam
I am extensively using mozIStorage interfaces in an accounting xul
application, that is approaching production readiness. Shawn, thanks for
this opportunity to present my point.
Async calls are a great idea!
Comments on the proposed interface:
1. Is there any technical reasons that require AsyncTransaction?
It seems to be redundant otherwise. The best practice is use sql
parameters, which are to be bound after the statement is already
prepared and its syntax is verified. The counter-proposal is to add
'executeAsync()' to 'mozIStorageStatement'.
2. Separate method to obtain ResultSet is a right thing. We can probably
also change 'mozIStorageStatement' synchronous calls API in the same
way. Currently, the first call of 'executeStep()' has a special EXECUTE
meaning, while all subsequent ones actually mean FETCH ROW. Separating
this two meanings may be a good idea.
Instead of passing ResultSet proxy to a callback, let's:
drop 'boolean executeStep();',
drop 'void reset();',
drop 'readonly attribute long state;'
add 'readonly attribute mozIStorageValueArray resultSet;'
to 'mozIStorageStatement', and stop subclassing 'mozIStorageValueArray'
there. In turn 'mozIStorageValueArray' will receive
'readonly attribute long state;',
'void fetchNext();', and
'void fetchNextAsync([optional] in Callback aCallback);'
So 'mozIStorageValueArray' (aka cursor) is separated from
'mozIStorageStatement'. The former can later be extended to provide
support for random access cursors. 'reset()' goes away entirely. It can
be invoked internally in 'execute()', 'executeAsync()' and 'finalize()'.
As a result, mozIStorage* will become more SQL-ish and friendlier to
other databases like MySQL or PostgreSQL. There is a legitimate question
"why do we care about other databases?". Here is one hypothetical
example: since we already keep a hefty part of user profile (history and
bookmarks) in a relational database, we can move the rest of valuable
data -- mainly passwords and extensions (yes! extensions) -- there as
well. So the profile will contain a database (or databases) and cached
data. This creates an opportunity for big organizations to use a central
database to store and manage extensions, web settings and browsing data
of their users. And if such a user logins simultaneously on several
workstations, no browsing data will be lost.
3. Finally, can we directly use nsIWebProgressListener2 as a callback
class in both 'executeAsync()' and 'fetchNextAsync()'?
This can be the first step of many to provide full database progress
feedback using progress bar. Long places operations (the main client of
storage) will clearly benefit from this feature.
4. And woooooohooooooo, as 3.0 RC1 has finally been tagged for building!
--
Sergey Yanovich
> * AsyncTransaction is a completely different animal than a traditional
> transaction.
Yes, and no. Each statement has an implicit transaction wrapped around
it - the whole statement either succeeds or fails. With that said, I'm
open to a better name on this.
> * mozStorage APIs revolve around statement objects. There's no explicit
> statement here.
I'm not sure this is a bad thing, but it is different.
> * Parameters are bound in a different way
It's my opinion that our current approach to binding parameters is
clunky and error-prone. I'd love to fix that, and that's what I'm
trying to do here.
> * Results are read in a different way
Same as above. Note that since this is targeting Firefox.next, which is
supposed to come out some time in the fall, we don't want to change
existing APIs. So we can add to our current API, but changing it should
be avoided if possible.
> By the way, AsyncTransaction.execute() reads as if the transaction is
> executed, not the statement. C.f. mozIStorageStatement.execute().
Implicit transaction again. Although, I do believe this could be done
without the AsyncTransaction object which would probably ease this
concern of yours (I had a reason for doing it with AsyncTransaction
because I shared your line of thinking initially. Something made me
change my mind, but I can't recall what that was currently. When I
figure it out again, I'll post).
> Another concern is the callback. Let's make it a single function instead
> of an object implementing given methods. In the latter case you've got
> to implement QI and the code that handles the statement is more
> scattered. If you use many async statements, you usually make a
> <code>switch</code> and forward the callbacks to separate dedicated
> functions, anyway.
Letting one method handle both errors and handling of an actual result
is a form of Logical Cohesion which is generally frowned upon (yeah, we
do it in the mozilla codebase in various places). Functional Cohesion
is preferred. I recommend reading Code Complete 2nd Ed., Chapter 7 for
more details (I imagine wikipedia has something on it too).
Additionally, JS consumers don't have to implement the QI code, and they
don't even need to provide an error or result handler if they don't want
to. XPConnect, as I understand it, will just throw
NS_ERROR_NOT_IMPLEMENTED if the back-end code tries to call it. The
back-end should be able to handle that gracefully.
> // conn is a mozIStorageConnection
> let sql = "SELECT * FROM moz_downloads WHERE state = ?1";
> let stmt = conn.createStatement(sql);
> stmt.bindInt32Parameter(0, Ci.nsIDownloadManager.DOWNLOAD_FINISHED);
> stmt.executeAsync(callback);
The problem with this approach is that preparing the statement (the
createStatement call) is often the most expensive operation for SELECT
statements. With the current API we can do that work on the background
thread[pool].
> Alternatively, the callback could be defined in the style of
> XMLHttpRequest.
The only reason why I don't like this approach is because it seems like
a lot more lines of code to accomplish the same thing. With that said,
I'm indifferent about it.
Thanks for the great feedback. Let me know if this addresses your concerns.
Cheers,
Shawn
> It seems to be redundant otherwise. The best practice is use sql
> parameters, which are to be bound after the statement is already
> prepared and its syntax is verified. The counter-proposal is to add
> 'executeAsync()' to 'mozIStorageStatement'.
I think that this might be the way to go, but I don't think we should
have them prepare it first (see previous post about that in reply to Adam).
> 2. Separate method to obtain ResultSet is a right thing. We can probably
> also change 'mozIStorageStatement' synchronous calls API in the same
> way. Currently, the first call of 'executeStep()' has a special EXECUTE
> meaning, while all subsequent ones actually mean FETCH ROW. Separating
> this two meanings may be a good idea.
I'd love to, but we don't want to have that much API churn for Firefox
3.1. For Firefox 4 we can look at that.
> 3. Finally, can we directly use nsIWebProgressListener2 as a callback
> class in both 'executeAsync()' and 'fetchNextAsync()'?
I'm not sure I understand how that is useful, so you'd have to elaborate
on that.
Cheers,
Shawn
(1) Dropped the AsyncTransaction interface, and moved its executeAsync
method onto mozIStorageConnection. There was no reason for the
interface to exist, and it's one less thing consumers have to worry about.
(2) Added a SQLError interface that will be passed back to the callback
for errors. There is, however, no definition as I'm not 100% sure what
we want to include when we notify that there was an error.
(3) Updated the sample code to go along with the new interface proposal.
Cheers,
Shawn
[1]
http://developer.mozilla.org/en/docs/User_talk:Comrade693#Async_Storage_Statements
Overall: It would probably be useful to explicitly state, near the top
of the wiki page, exactly which Mozilla version this is targeting -
sounds like Firefox.next / 1.9.nn?
Shawn Wilsher wrote:
> Adam Kowalczyk wrote:
>> // conn is a mozIStorageConnection
>> let sql = "SELECT * FROM moz_downloads WHERE state = ?1";
>> let stmt = conn.createStatement(sql);
>> stmt.bindInt32Parameter(0, Ci.nsIDownloadManager.DOWNLOAD_FINISHED);
>> stmt.executeAsync(callback);
> The problem with this approach is that preparing the statement (the
> createStatement call) is often the most expensive operation for SELECT
> statements. With the current API we can do that work on the background
> thread[pool].
On the other hand, this also means it's not possible to re-use statements.
I'm not quite sure how that second argument (the one with
the things to bind) can be implemented, given that XPIDL doesn't, as far
as I know, have method overloading. It seems to take either a JS Array,
Object, or a C++ Parameters thing that is to be determined. Would we
still get type-checking on the C++ side? (Hey, if you somehow manage to
get that, all the better.)
Of course, I know about nil about API design ;)
(Off-topic note-to-self: trigger + mozIStorageFunction = win)
--
Mook
While strictly speaking true, there are ways to get around this, esp. if all
your calls are from JS into C++. You can use nsIVariant, or declare no args in
the IDL and read them directly using XPConnect APIs.
-Boris
Sure. I mistook Firefox.next as Firefox 4.
>> 3. Finally, can we directly use nsIWebProgressListener2 as a callback
>> class in both 'executeAsync()' and 'fetchNextAsync()'?
> I'm not sure I understand how that is useful, so you'd have to elaborate
> on that.
I had in mind whole-sale database access like when History -> Show All
History is invoked. Our current Library window doesn't contain progress
bar, because it would be useless, since the routine that populates the
main tree is synchronous. But things will change, once async call
becomes possible. We can then provide substantially better user
experience by doing async loading and showing progress in a progress bar.
However, in the context of a dot release, introducing this new
functionality to the mozIStorage* no longer seems a good idea.
A) I started to argue with the necessity of changing connection
interface, but then came around this:
> The problem with this approach is that preparing the statement (the createStatement call) is often the most expensive operation for SELECT statements.
This is a great argument in favor of creating a statement
asynchronously, but it is also an argument in favor of storing the
produced statement for re-use.
This may have been Shawn's motivation to introduce AsyncTransaction.
Nevertheless, it must be an async operation. A possible name could be
createStatementAsync(). Another important method of the statement is
finalize(), which is the only reliable way to free the database from JS.
B) In the view of createStatementAsync(), Parameters callback receives
additional task of storing the statement.
C) Finally, ResultSet behavior needs some clarification. From how it is
used in the example, it may be inferred, that getNextResult() is
synchronous. That means, all data needs to be fetched and saved locally
before handleResult() is invoked. A number of questions arise:
* When working the large amounts of data (like browsing history), this
may result is excessive memory consumption.
* Although the data is fetched row by row, nothing will be displayed,
until the last row is fetched.
* If there are errors in the end of a large list, nothing will be
displayed at all.
Proposed solution is to have API for both paths of execution:
1. mozIStorageConnection.executeSimpleAsync(in AUTF8String aSQL,
[optional] in Parameters aParameters, [optional] in Callback aCallback);
2. mozIStorageConnection.createStatementAsync(in in AUTF8String aSQL,
[optional] in Parameters aParameters), then
mozIStorageStatement.executeAsync(in Callback aCallback), then
mozIStorageValueArray.fetchNextAsync(in Callback aCallback) for each
row.
D) Callback needs to have separate methods handleAllResults/handleResult
respectively to handle 1. and 2.
--
Sergey Yanovich
This is true for very light-weight SELECT statements, to be clear, and
it only matters if you're running them many times.
> This is a great argument in favor of creating a statement
> asynchronously,
No -- it can't really block, and the additional complexity of using it
will be a huge developer cost for no gain.
> but it is also an argument in favor of storing the
> produced statement for re-use.
Yes.
:)
Mike
>> The problem with this approach is that preparing the statement (the
>> createStatement call) is often the most expensive operation for SELECT
>> statements. With the current API we can do that work on the
>> background thread[pool].
> On the other hand, this also means it's not possible to re-use statements.
This is true, but I think I have a solution for that. I need to think
about it a bit more though, so that'll come in a later post.
> I'm not quite sure how that second argument (the one with
> the things to bind) can be implemented, given that XPIDL doesn't, as far
> as I know, have method overloading. It seems to take either a JS Array,
> Object, or a C++ Parameters thing that is to be determined. Would we
> still get type-checking on the C++ side? (Hey, if you somehow manage to
> get that, all the better.)
I've updated the wiki (revision 71419 [1]) to better detail this. I
think it's quite possible to do all this and still get type-checking.
Cheers,
Shawn
With that said, I think my previous statement wasn't very complete.
With large, fragmented database files, stepping through the result can
also be expensive since it has to keep reading from various places on
the disk.
Additionally, while preparing the statement is work, it's not blocking
the UI, so I'm not sure we should be worrying about saving that work.
Perhaps I'm wrong on this though...
> B) In the view of createStatementAsync(), Parameters callback receives
> additional task of storing the statement.
That violates Functional Cohesion, which I'd prefer to not do.
> C) Finally, ResultSet behavior needs some clarification. From how it is
> used in the example, it may be inferred, that getNextResult() is
> synchronous. That means, all data needs to be fetched and saved locally
> before handleResult() is invoked. A number of questions arise:
>
> * When working the large amounts of data (like browsing history), this
> may result is excessive memory consumption.
>
> * Although the data is fetched row by row, nothing will be displayed,
> until the last row is fetched.
>
> * If there are errors in the end of a large list, nothing will be
> displayed at all.
I'm not sure I understand this point specifically.
> Proposed solution is to have API for both paths of execution:
> 1. mozIStorageConnection.executeSimpleAsync(in AUTF8String aSQL,
> [optional] in Parameters aParameters, [optional] in Callback aCallback);
>
> 2. mozIStorageConnection.createStatementAsync(in in AUTF8String aSQL,
> [optional] in Parameters aParameters), then
> mozIStorageStatement.executeAsync(in Callback aCallback), then
> mozIStorageValueArray.fetchNextAsync(in Callback aCallback) for each
> row.
Based on my previous comments in this reply (and the one I'm about to
make), I don't think that's needed.
> D) Callback needs to have separate methods handleAllResults/handleResult
> respectively to handle 1. and 2.
I do not believe that this is necessary. handleResult can handle one or
more rows at a time, and it can be called more than once. I've updated
the wiki page accordingly (revision 71426 [1])
This may seem odd, but at least one other API that I know of does that -
Threading Building Blocks with it's Body and Range concepts. A Body may
have its function operator called more than once on different ranges,
which is essentially what we are doing here.
Let me know if this doesn't clarify things enough.
Cheers,
Shawn
[1]
http://developer.mozilla.org/en/docs/index.php?title=User_talk:Comrade693&oldid=71426
I'm jumping into the middle of the thread here, but would the following work?
Store both the original string and a prepared statement. Have an unsigned int
counter initialized to 0.
An attempt to execute the statement does the following:
1) atomically increment the counter
2a) If the counter value is now 1, go ahead and execute
the prepared statement, then atomically decrement the counter.
2b) Otherwise, atomically decrement the counter, create a one-off
statement from the string, and execute it.
This maintains the invariant that any time we're executing the prepared
statement the counter is nonzero, so that no one else can enter the prepared
statement execution. In the common case, the only overhead over blindly
executing the prepared statement is that we have to do the increment/decrement
and one branch. If two execution attempts race, one will hit the fast path,
while the other hits the slow path. At least if the branch between 2a and 2b is
done on the return value of PR_AtomicIncrement.
> Additionally, while preparing the statement is work, it's not blocking
> the UI, so I'm not sure we should be worrying about saving that work.
> Perhaps I'm wrong on this though...
Are all statement preparations happening on the same thread? How many
statements do we expect to be in-flight at once? Depending on the answer to the
former, it seems to me that if there is the potential for lots of statement
preparation to be happening, we'll either be blocking the statement preparation
thread (causing UI lag in whatever operation is showing a progress bar to the
user while waiting for the query to complete) or spawn off lots of statement
preparation threads (and some will end up blocking on lack of execution units,
though that might become less of a problem as processors have more and more cores).
But in any case, if it's cheap enough it might be worth optimizing for something
other than just UI responsiveness here (e.g. power consumption? ;) ).
-Boris
Yeah, now I seem to have more questions, than before :)
1. What's the common use case(s) for async calls? Should the resulting
API be universal or optimized for common usage?
2. When is mozIStorageStatement.finalize() expected to be called, if at all?
3. How many times is handleResult() expected to be called, when the
query yields 10, 100, 10000 rows? Is ResultSet.getNextResult() synchronous?
The answers will let me be more specific with my comments.
Thanks,
--
Sergey Yanovich
> 2. When is mozIStorageStatement.finalize() expected to be called, if at
> all?
You don't have to ever call it. Internally, it's called in the
statements destructor so when your js var is garbage collected it is
taken care of. However, it was added because you couldn't move/delete
the database file if a connection was still open, and a connection stays
open as long as there is a statement that isn't finalized. Seamonkey
specifically needed this.
> 3. How many times is handleResult() expected to be called, when the
> query yields 10, 100, 10000 rows? Is ResultSet.getNextResult() synchronous?
I think defining an amount would be a poor idea. The back-end could
chunk it in time, or it could chunk it some other way. I think that
decision should be decided at implementation time and be hidden from the
interface.
As a result, I don't think ResultSet::getNextResult is synchronous in
terms of database access. I'm envisioning the back-end to dump the data
into an object that is passed back to the callback. If that's not what
you meant by synchronous, you'll have to elaborate more :)
Cheers,
Shawn
(1) executeAsync has been moved onto mozIStorageStatement. You can now
use a prepared statement that has parameters already bound with this API.
(2) As a result of (1), the fancy binding of parameters magic has been
removed. We might look at that after this is done for Firefox 3.1, but
it will likely make Firefox 4.
(3) Code samples updated accordingly.
Questions and feedback welcome :)
Cheers,
Shawn
Either way was good to me, so there was no concern :) Nevertheless, I
like the made choice more than the other.
>> 2. When is mozIStorageStatement.finalize() expected to be called, if
>> at all?
> You don't have to ever call it. Internally, it's called in the
> statements destructor so when your js var is garbage collected it is
> taken care of. However, it was added because you couldn't move/delete
> the database file if a connection was still open, and a connection stays
> open as long as there is a statement that isn't finalized. Seamonkey
> specifically needed this.
After the decision to keep creating statements old way, this concern is
void. Statement object will remain to be normally accessible from js.
But this answers brings us back to the question 1. What we are trying to
solve? Please correct me, but I've read in bugzilla, that lack of
asynchronous API in storage makes browsing experience really ugly, if
the profile resides on a NFS mount. If this case falls into the range of
issues being addressed, we could probably reconsider this decision.
>> 3. How many times is handleResult() expected to be called, when the
>> query yields 10, 100, 10000 rows? Is ResultSet.getNextResult()
>> synchronous?
> I think defining an amount would be a poor idea. The back-end could
> chunk it in time, or it could chunk it some other way. I think that
> decision should be decided at implementation time and be hidden from the
> interface.
>
> As a result, I don't think ResultSet::getNextResult is synchronous in
> terms of database access. I'm envisioning the back-end to dump the data
> into an object that is passed back to the callback. If that's not what
> you meant by synchronous, you'll have to elaborate more :)
I understand the word "synchronous" when used with a method as follows.
If the method returns after the job is done, it is synchronous. If the
job continues to run in a different thread/process, when the method
returns, it is asynchronous.
It is now crystal clear from your answer, that the result set is deemed
to be created before handelResult() is invoked, so
ResultSet::GetNextResult() is a synchronous method (which fetches
products of asynchronous database access :)).
Now I can rephrase my earlier concerns on how this approach can handle
large result sets, like browsing history:
* When working the large amounts of data (like browsing history), this
may result is excessive memory consumption and allocation overhead. If
the caller plans to store all the results in its own storage, the
storage will be allocated 2 times: in the back-end, and by the caller.
* Although the data is fetched row by row, nothing will be displayed,
until the last row is fetched.
* If there are errors in the end of a large list, nothing will be
displayed at all. F.e., if there is network failure on the row #9999 of
10000, the back-end will invoke handleError(), and all fetched data will
be lost. The data can be recovered in handleError() as a workaround, but
this will violate Cohesion, right?
With all that said, we are also going to have an accessible statement,
which is a mozIStorageValueArray instance. So ResultSet is completely
redundant in this case, because the result set can be normally iterated
with mozIStorageStatement.executeStep(). Correct?
My proposal is that we add a executeStepAsync(). When the caller knows,
that the result set may be huge, it will use this and expect to be
called once for each row.
To enforce Functional Cohesion, Callback will need 2 separate methods
for this two async calls. They may be handleResultSet (for all) and
handleResult (for one). Both should receive mozIStorageStatement as a
parameter.
Thanks,
--
Sergey Yanovich
> Now I can rephrase my earlier concerns on how this approach can handle
> large result sets, like browsing history:
>
> * When working the large amounts of data (like browsing history), this
> may result is excessive memory consumption and allocation overhead. If
> the caller plans to store all the results in its own storage, the
> storage will be allocated 2 times: in the back-end, and by the caller.
This already happens with most other database API's, so I don't think
it's a big deal. Also, I don't expect most situations to be handling
large chunks of data. However, if this does turn out to be a problem,
we can always have the background task providing the data "slow down" as
it were, and wait for existing data to be processed before generating more.
> * Although the data is fetched row by row, nothing will be displayed,
> until the last row is fetched.
I'm not sure how you've come to that conclusion, but I don't believe it
to be true.
> * If there are errors in the end of a large list, nothing will be
> displayed at all. F.e., if there is network failure on the row #9999 of
> 10000, the back-end will invoke handleError(), and all fetched data will
> be lost. The data can be recovered in handleError() as a workaround, but
> this will violate Cohesion, right?
I don't see why we couldn't give the caller the data we have and then
invoke handleError. Also, many calls could have been made to
handleResult before we even hit that error.
> With all that said, we are also going to have an accessible statement,
> which is a mozIStorageValueArray instance. So ResultSet is completely
> redundant in this case, because the result set can be normally iterated
> with mozIStorageStatement.executeStep(). Correct?
We won't be exposing this though. Internally, in the background we'll
be calling it to generate ResultSet objects, but this shouldn't be
exposed to handleResult. The statement would already be consumed when
generating the ResultSet object. I *really* don't see the benefit of
exposing a mozIStorageStatement object to handleResult.
Cheers,
Shawn
My understanding was that handleResult() is called once on success, and
handleError() is called once on failure. If this is not correct, then
there is a new problem. I need to know when everything is fetched, cause
I will eventually need to free statement/close connection.
>> * If there are errors in the end of a large list, nothing will be
>> displayed at all. F.e., if there is network failure on the row #9999
>> of 10000, the back-end will invoke handleError(), and all fetched data
>> will be lost. The data can be recovered in handleError() as a
>> workaround, but this will violate Cohesion, right?
> I don't see why we couldn't give the caller the data we have and then
> invoke handleError. Also, many calls could have been made to
> handleResult before we even hit that error.
I thought that callback is invoked just once. Otherwise it is not clear,
when the processing is complete.
>> With all that said, we are also going to have an accessible statement,
>> which is a mozIStorageValueArray instance. So ResultSet is completely
>> redundant in this case, because the result set can be normally
>> iterated with mozIStorageStatement.executeStep(). Correct?
> We won't be exposing this though. Internally, in the background we'll
> be calling it to generate ResultSet objects, but this shouldn't be
> exposed to handleResult. The statement would already be consumed when
> generating the ResultSet object. I *really* don't see the benefit of
> exposing a mozIStorageStatement object to handleResult.
Well, the statement is available to the client, whether we want it or
not. In addition, executeStep() is the current approach to enumerating
the query results. In other words, mozIStorageStatement is a complete
logical equivalent to the proposed ResultSet, and executeStep() is a
complete logical equivalent to the proposed getNextResult(). More than
that, executeStep() is synchronous just as getNextResult().
So we are back to question:
> 3. How many times is handleResult() expected to be called, when the query yields 10, 100, 10000 rows?
Now I see, that I didn't interpreted your answer correctly. My major
problem here is that I need to know when everything is fetched. What's
wrong with calling the callback for each fetched row? That's approach
for processing large result sets. For small result sets, we can have a
different function, which will return when everything is already
fetched. I propose to name them executeStepAsync() and executeAsync()
respectively, similar to how their synchronous analogs are called.
--
Sergey Yanovich
The callback can be called multiple times, with whatever the "natural"
batching order is from the database. Early implementations might well
use 1 as the batch size.
We could send a flag with handleResult to indicate that it's the last
result set, or use .handleComplete() call. I prefer the latter,
because it also lets us signal multiple errors using a similar model,
but it's not a strong preference.
Mike
If you really need to know when it's all done, we could add another
method that looks something like void handleCompletion();
> Well, the statement is available to the client, whether we want it or
> not. In addition, executeStep() is the current approach to enumerating
> the query results. In other words, mozIStorageStatement is a complete
> logical equivalent to the proposed ResultSet, and executeStep() is a
> complete logical equivalent to the proposed getNextResult(). More than
> that, executeStep() is synchronous just as getNextResult().
How is the statement available to the client? They don't get it in the
async callback, and the api doesn't state that the stmt that
executeAsync is called on is the one that's actually used. In fact, in
terms of implementation we can't use the same statement - we'll have to
clone it and transfer the bindings.
The difference between executeStep and getNextResult is that the former
has to read from disk, while the latter reads from memory.
> So we are back to question:
>> 3. How many times is handleResult() expected to be called, when the
>> query yields 10, 100, 10000 rows?
>
> Now I see, that I didn't interpreted your answer correctly. My major
> problem here is that I need to know when everything is fetched. What's
> wrong with calling the callback for each fetched row? That's approach
> for processing large result sets. For small result sets, we can have a
> different function, which will return when everything is already
> fetched. I propose to name them executeStepAsync() and executeAsync()
> respectively, similar to how their synchronous analogs are called.
I don't see how getting a callback for each row would solve your problem
of knowing when everything is done. Getting a callback for each row
will perform much slower than just running the call synchronously since
you have lots and lots of event proxying. Not only that, we'd still
have to copy the results over into memory for each row, otherwise the
background thread is blocked waiting for the data to be consumed on the
calling thread before it can step through the results again.
Additionally, having one API for "large" data sets and another for
"small" data sets is just asking for problems. If we can make one API
that satisfies both cases, we don't leave a footgun lying around for
developers to come across and play with. I am not yet convinced that
what is in the wiki now will perform poorly with "large" data sets.
I've gone ahead and clarified the wording for executeAsync (revision
71470 [1]):
void executeAsync([optional] in Callback aCallback);
Executes a prepared mozIStorageStatement and calls the callback (if
provided) when a result is returned. This statement is then reset, and
can be reused synchronously or asynchronously again immediately.
Cheers,
Shawn
On May 12, 11:50Â am, Shawn Wilsher <sdwi...@mozilla.com> wrote:
> I'm looking for feedback on the proposed API (with sample code) for an
> asynchronous API for mozStorage. Â What I've got so far can be found here:http://developer.mozilla.org/en/docs/User_talk:Comrade693#Async_Stora...
void handleCompletion();
Called when there are now more results to be obtained. This is called
only once.
Cheers,
Shawn
[1]
http://developer.mozilla.org/en/docs/index.php?title=User_talk:Comrade693&oldid=71471#Callback
Cheers,
Shawn
> void executeAsync([optional] in Callback aCallback);
> Executes a prepared mozIStorageStatement and calls the callback (if
> provided) when a result is returned. This statement is then reset, and
> can be reused synchronously or asynchronously again immediately.
Mike Shaver wrote:
> The callback can be called multiple times, with whatever the "natural"
> batching order is from the database. Early implementations might well
> use 1 as the batch size.
Callback looks great now. Thanks for cooperative approach to the things.
Natural batching order is exactly the way I wished to see, because it's
the fastest.
The last remaining point is asynchronous statement creation. The
database validates the statement, and it needs to read metadata from
disks for that. If the database is over NFS mount, and the metadata is
not cached in memory, it may well be a lengthy operation, blocking UI.
Synchronous API has mozIStorageConenction::executeSimple(), which
doesn't expose the statement. We may have
void executeSimpleAsync(in AUTF8Strings aSql, [optional] in Callback
aCallback);
with a similar behavior.
Cheers,
--
Sergey Yanovich
Metadata is eagerly cached in memory, and IMO the additional API
complexity is not worth it. Do you find that your app is unreasonably
blocked in statement preparation? If an NFS mount goes away anything
can freeze, since we treat file: loading as synchronous (for
performance reasons as well as portability), so defending against it
in this one case with additional API baggage doesn't feel necessary to
me.
I'm an open-minded sort of guy, though -- do any other APIs provide an
async single-statement preparation API? Let's see which, if so, and
why.
Mike
Most of my application database access calls use parameters, so this is
not an immediate concern. However, I will be implementing mozIStorage*
binding for MySQL. Synchronous statement preparation on a remote server
will certainly be a problem, but I will handle that in my code.
> If an NFS mount goes away anything
> can freeze, since we treat file: loading as synchronous (for
> performance reasons as well as portability), so defending against it
> in this one case with additional API baggage doesn't feel necessary to
> me.
Correct, NFS normally takes care of making file operations asynchronous,
but sqlite is using fsync(), which is the cause of unresponsiveness. So
sqlite is the only place, where we need to handle this issue.
The first time, I've heard about async API was related to NFS profiles,
so I'm just trying to ensure this particular issue is addressed. If
that's considered a non-blocker, I am just fine with everything else :)
Grepping the tree for 'ecuteSimple' shows, it is heavily used to
create/drop tables/indexes. Index operations may take some time on large
tables, so switching them to async API will improve UI responsiveness,
and maybe Ts also.
Other than that, I am fine with the latest version. Thanks for taking
MHO into consideration.
Cheers,
--
Sergey Yanovich
Abstract Accounting Ltd.
http://aasii.org/
If you have a long-running statement that requires some sort of
UI-feedback (eg a progress meter), it'd be nice to know when to stop
that progress meter without having to poll.
Also, how does one abort a long-running statement?
Dan
> If you have a long-running statement that requires some sort of
> UI-feedback (eg a progress meter), it'd be nice to know when to stop
> that progress meter without having to poll.
Yeah, handleCompletion() should handle this case.
> Also, how does one abort a long-running statement?
I missed this point :). What about making handleResult() and
handleError() return boolean: true - go on, false (or error) - stop here?
--
Sergey Yanovich
Cheers,
Shawn
So we have a queue, and the current call returns false. Can backend just
release the rest of the queue with callbacks?
--
Sergey Yanovich
I wanted to say "without invoking callbacks". :)
Cheers,
Shawn
It was said in a earlier post, that callbacks will be called in the
natural batching order of the underlying database. AFAICT unless
nsICancelable is going to kill the background working thread with brute
force system calls, the cancellation will be happening at the same
points, where callbacks are invoked.
In other words, the proposed nsICancelable functionality may be
implemented by the client in the Callback, without complicating the API.
But either way will work.
Cheers,
--
Sergey Yanovich
As a result, we have two options:
(1) Keep track of what is bound internally. For statements that are
executed synchronously, this won't be much use, but we'll need it for
the async case. This could consume a lot of memory if someone is
binding a large blob too.
(2) Pass in an "object" of bound parameters like earlier proposals of
the API suggested.
I'm OK with either solution here, but I wanted to get the opinion of
other folks.
Cheers,
Shawn
[1] http://www.mail-archive.com/sqlite...@sqlite.org/msg33987.html
It appears that this is no longer neccessary. One of my assumptions
about thread safety turned out to be incorrect for sqlite.
Cheers,
Shawn
Cheers,
Shawn
1) How much information is going to be provided to
storageIStatementCallback.handleError()? The SQL error code and the
corresponding error string, for sure. If that's all, then perhaps they
could be passed as parameters and we could get rid of storageIError. On
the other hand, having an object which you can throw may also be useful.
2) The established nomenclature in existing interfaces uses "row", not
"tuple". See mozIStorageStatementRow, mozIStorageStatementWrapper.row.
"Row" is also used in descriptions of other attributes and methods.
3) Which methods of the callback are used by non-SELECT statements?
handleError() and handleCompletion(), I assume?
4) I am not convinced of the value of separating the iterator
(storageIResultSet) and a result (storageITuple). The iterator could
easily provide the results, too. I know this approach causes horrible
overloading of the statement object in case of synchronous statements.
However, storageIResultSet doesn't share all the other functionality of
a statement, it only handles the results. Separating things further
seems like going from one extreme to the other.
The practical consequence for the consumers is a little less concise
code, because of an additional variable. Also, you have to choose
between assigning that variable inside |while|, which is considered a
bad practice by some people, and calling getNextTuple() in two different
places. [1]
Ideally, handleResult() would take a JS array of storageITuple's and
storageIResultSet wouldn't be needed.
I would go further and only return nsresult, at least for the beginning.
String manipulations are expensive, especially GC string manipulations
[1].
> 2) The established nomenclature in existing interfaces uses "row", not
> "tuple". See mozIStorageStatementRow, mozIStorageStatementWrapper.row.
> "Row" is also used in descriptions of other attributes and methods.
++here. And do we really need 'storageITutle'? Maybe we can just extend
mozIStorageValueArray to have those nice two methods?
> 4) I am not convinced of the value of separating the iterator
> (storageIResultSet) and a result (storageITuple). The iterator could
> easily provide the results, too. I know this approach causes horrible
> overloading of the statement object in case of synchronous statements.
> However, storageIResultSet doesn't share all the other functionality of
> a statement, it only handles the results. Separating things further
> seems like going from one extreme to the other.
>
> The practical consequence for the consumers is a little less concise
> code, because of an additional variable. Also, you have to choose
> between assigning that variable inside |while|, which is considered a
> bad practice by some people, and calling getNextTuple() in two different
> places. [1]
I had the same concerns, and was even proposing to use ExecuteStep()
here. But ExecuteStep() is begging to be replaced, and we are discussing
the new API. So we can just have the things right the first time.
> Ideally, handleResult() would take a JS array of storageITuple's and
> storageIResultSet wouldn't be needed.
We cannot communicate JS array via XPCOM to native callers, can we?
--
Sergey Yanovich
> ++here. And do we really need 'storageITutle'? Maybe we can just extend
> mozIStorageValueArray to have those nice two methods?
Can't - see Bug 435994 Comment 5 [1] for the detailed story as to why.
> We cannot communicate JS array via XPCOM to native callers, can we?
Not that I'm aware of.
Cheers,
Shawn
Error-case performance isn't something I think we need to be too
concerned with; if the error checking is in your hot path, you have
problems that the API isn't going to help you with.
Mike
> 2) The established nomenclature in existing interfaces uses "row", not
> "tuple". See mozIStorageStatementRow, mozIStorageStatementWrapper.row.
> "Row" is also used in descriptions of other attributes and methods.
It seems like vlad would prefer row over tuple as well. I'm fairly
indifferent about this.
> 3) Which methods of the callback are used by non-SELECT statements?
> handleError() and handleCompletion(), I assume?
That is likely the case.
> 4) I am not convinced of the value of separating the iterator
> (storageIResultSet) and a result (storageITuple). The iterator could
> easily provide the results, too. I know this approach causes horrible
> overloading of the statement object in case of synchronous statements.
> However, storageIResultSet doesn't share all the other functionality of
> a statement, it only handles the results. Separating things further
> seems like going from one extreme to the other.
I really don't like the idea of overloading an objects meaning like
that. I know we do that already with mozIStorageStatement, but it
doesn't mean we should continue to do that if it can be avoided.
> The practical consequence for the consumers is a little less concise
> code, because of an additional variable. Also, you have to choose
> between assigning that variable inside |while|, which is considered a
> bad practice by some people, and calling getNextTuple() in two different
> places. [1]
Not sure I'm buying this argument. What's wrong with:
handleResult: function(aResultSet)
{
for (let tuple = aResultSet.getNextTuple(); tuple;
tuple = aResultSet.getNextTuple()) {
// Stuff
}
}
or, even cleaner:
handleResult: function(aResultSet)
{
let tuple;
while ((tuple = aResultSet.getNextTuple()) {
// Stuff
}
}
Like any good rule, there are always exceptions to it, and in this case,
I don't see why assigning in while is bad.
> Ideally, handleResult() would take a JS array of storageITuple's and
> storageIResultSet wouldn't be needed.
That wouldn't work well for native code.
Cheers,
Shawn
We can, and in cases where the win is big enough should, specialize
for different consumer languages.
But an array of interface pointers isn't too hard to work with from
native code, IIRC, as long as it's an in-param.
Mike
> But an array of interface pointers isn't too hard to work with from
> native code, IIRC, as long as it's an in-param.
That involves passing the data and the size? Does that turn into a JS
array in JS land, or no?
Cheers,
Shawn
In our case, specializing for JS in mozStorage will add
mozStorage-over-JS dependency, right? Is that worth it?
--
Sergey Yanovich
It's already built after JS and XPConnect, and I don't think it's a
design goal at all to have parts of toolkit work without JS and
XPConnect present.
Mike
Actually no, storage is currently built as part of tier_necko, which is
before gecko or toolkit.
But I agree, we should not base design decisions on that kind of API purity,
especially because we want to move towards clean JS APIs as our primary
programming interfaces.
--BDS
const unsigned short REASON_COMPLETED = 0;
const unsigned short REASON_CANCELED = 1;
const unsigned short REASON_ERROR = 2;
void handleCompletion(in unsigned short aReason);
Problem with this is that I'm still not sure that I can reliably test
this code. The issue is that you could cancel at just the right time,
and the event to notify you about completion would already be posted to
the calling thread, so you expect to get a REASON_CANCELED, but get
REASON_COMPLETED instead. This will make the test unreliable :(
Maybe I'm missing something here though, so I'm hoping the masses have
some other insight.
Cheers,
Shawn
It may also be good to know which request is completed.
> Problem with this is that I'm still not sure that I can reliably test
> this code. The issue is that you could cancel at just the right time,
> and the event to notify you about completion would already be posted to
> the calling thread, so you expect to get a REASON_CANCELED, but get
> REASON_COMPLETED instead. This will make the test unreliable :(
Don't post events directly, but to a queue. Call a queue handler on the
callback thread. The handler can access the queue with some mutex
protection, and invoke the callback. The canceling routine will also use
mutex to clean up the queue. And if the handler sees an empty queue, it
silently exits.
--
Sergey Yanovich
> Don't post events directly, but to a queue. Call a queue handler on the
> callback thread. The handler can access the queue with some mutex
> protection, and invoke the callback. The canceling routine will also use
> mutex to clean up the queue. And if the handler sees an empty queue, it
> silently exits.
I see two issues with that approach:
1) It's a no-no to call code outside your module while you hold a mutex.
2) That still doesn't cover the case when it's actually completed and
you try to cancel. Telling the callback that it's completed because
it's canceled, while it may be true, isn't completely true because it
was in fact completed because it was complete.
Cheers,
Shawn
Yeah. This can be achieved by a separate Callback object for each request.
>> Don't post events directly, but to a queue. Call a queue handler on
>> the callback thread. The handler can access the queue with some mutex
>> protection, and invoke the callback. The canceling routine will also
>> use mutex to clean up the queue. And if the handler sees an empty
>> queue, it silently exits.
> I see two issues with that approach:
> 1) It's a no-no to call code outside your module while you hold a mutex.
You don't need to to hold the mutex while invoking clients callback.
Only for queue access.
> 2) That still doesn't cover the case when it's actually completed and
> you try to cancel. Telling the callback that it's completed because
> it's canceled, while it may be true, isn't completely true because it
> was in fact completed because it was complete.
It actually does. The canceling code will know for sure whether the
request is completed. So the canceling may provide this type of feedback.
--
Sergey Yanovich
I'll work on doing this now.
Cheers,
Shawn
Cheers,
Shawn
Absolutely. That's exactly what I mean. And we should avoid throwing in
the normal flow of execution whenever possible. JS exceptions are
terribly expensive. It triggers memory scan for catch blocks in JS, then
the exception goes through xpconnect handling, which is long and
sub-optimal, because noone normally cares about error codepaths.
--
Sergey Yanovich
Agreed; this allows a single object to listen for multiple outstanding
requests.
> How? We don't tell you what request you are getting information on with
> any of the other callback functions either.
Have all of the handle*() methods pass back the mozIStorageStatement
that triggered them as a param?
Dan
Cheers,
Shawn
>> How? We don't tell you what request you are getting information on with
>> any of the other callback functions either.
>
> Have all of the handle*() methods pass back the mozIStorageStatement
> that triggered them as a param?
Right, but the problem is that we don't use the same one you prepared.
We clone it and use it on the background since statements cannot be used
on different threads at the same time. Also, you would be unable to
use the statement on the calling thread until the background thread was
done with it. This isn't so useful if you want to do a series of async
events with the same statement.
Cheers,
Shawn
I think so, but it depends on whether canceling a completed request is
considered a normal or not ;) If it is, it may even be worth to change
nsICancelable.
--
Sergey Yanovich
I don't have a particularly strong opinion there. My hunch is that it's
somewhat a case-by-case sort of thing, but I'd be interested in hearing
other thoughts there.
>>> How? We don't tell you what request you are getting information on with
>>> any of the other callback functions either.
>>
>> Have all of the handle*() methods pass back the mozIStorageStatement
>> that triggered them as a param?
> Right, but the problem is that we don't use the same one you prepared.
> We clone it and use it on the background since statements cannot be used
> on different threads at the same time.
You could give each statement an integer ID. Alternately, perhaps the
statement API could make slightly stronger thread-safety promises.
> Also, you would be unable to use the statement on the calling thread until
> the background thread was done with it.
Even though the background thread is using a clone?
Dan
It's a very common practice -- how else do you rendezvous multiple
async operations? Every select() or poll() loop is like this.
Objects do multiple things all over the place in codebase, which is
why we have QI and why we try to avoid name collisions between members
of related interfaces.
If we split the history and bookmarks portions of Places, f.e., the
AwesomeBar would want to listen for the results from the searches it
kicked off in parallel on each, and handle them largely the same way.
(Personal annoyance: "not a good programming practice" is almost
content-free; say what the problem is that you are trying to avoid,
and how the practice you advocate helps.)
Mike
Cheers,
Shawn
> If we split the history and bookmarks portions of Places, f.e., the
> AwesomeBar would want to listen for the results from the searches it
> kicked off in parallel on each, and handle them largely the same way.
I'm not really buying this argument since bookmarks are a subset of
history almost always (and enough so that they can be considered as such).
> (Personal annoyance: "not a good programming practice" is almost
> content-free; say what the problem is that you are trying to avoid,
> and how the practice you advocate helps.)
In this case, I just recall reading somewhere that overloading objects
to do more than one thing can end up getting quite confusing. An object
that does one thing and only one thing can do it well much easier than
an object that does lots of things and has to do all of them really
well. With that said, it's just a nagging feeling in the back of my
head that I'm not sure if it's worth listening too, but wanted to bring
it up. It's one of those "strong opinions, weakly held" [1].
Cheers,
Shawn
re: thread-safety
I thought about this initially, but I had a concern:
Many consumers would still want to execute statements synchronously
(some API's are inherently sync, and can't be changed to async without
breaking API compatibility), so do we really want to add a lot of
locking overhead for something that isn't the common case (at least
initially, that may change over time making this no longer an issue).
This seems to be a big concern, since I know there are a fair number of
statements that run during startup too.
Cheers,
Shawn
interface storageICancelable : nsISupports
{
/**
* Attempts to canceled.
*
* @returns true if it was successfully canceled, and false if it
* could not be canceled.
*/
boolean cancel();
}
Thoughts?
Cheers,
Shawn
It now has a parameter that indicates the reason for the completion,
which can be a normal completion, cancellation, or a fatal error.
Cheers,
Shawn
Either extend nsICancelable or return an integer instead of an
nsICancellable.
> re: thread-safety
> I thought about this initially, but I had a concern:
> Many consumers would still want to execute statements synchronously
> (some API's are inherently sync, and can't be changed to async without
> breaking API compatibility), so do we really want to add a lot of
> locking overhead for something that isn't the common case (at least
> initially, that may change over time making this no longer an issue).
One could imagine making a threadsafety frob on the API that could be
adjusted at some appropriate level of granularity (per statement?) and
would default to off.
Dan
When playing with the API more in Bug 438342 (build the download list
asynchronously) [1], I've come to the conclusion that this behavior
isn't ideal for any moderately large list. When drawing the downloads
window, it would remain blank for three to four seconds, then things
would display. I should note that the browser UI was fully responsive
during this time - the downloads window just didn't draw as fast as it
does now. Now, there are a few reasons why I suspect this is happening:
1) The downloads.sqlite file I'm generating this from has 590+ entries.
This is more than most users would have, but I'm using it as a good
way to stress test our API.
2) The query that is used by the download manager has an ORDER BY
clause, which means SQLite executes the query and obtains *all* the
results, then sorts them. This makes this a great query to run on the
background, but at the same time means that results are obtained very
quickly. As a result, we have 590+ events dispatched to the main thread
in a very short amount of time.
Personally, I think this is a problem with the current implementation.
I think we would be better off grouping more than one row per result set
dispatched. This is what I feel would be a good grouping strategy to
implement. There is a disclaimer on this though - I haven't tested
anything with these changes. I'm looking for feedback on this to see if
I'm blatantly missing something obvious, etc. With that said, we should
group results by the following conditions (whichever one comes first):
1) 200 milliseconds have elapsed since we last dispatched a result set
to the callback and we have at least one row. This number isn't just
pulled out of the air! 200-250 ms is generally accepted as the "safe"
amount of time to wait before displaying some form of feedback to the
user that something is happening.
2) 10-20 results. This number would need some experimentation, but the
idea is to not have too much in one result set, so the processing
function doesn't get a ton of things to process and manage to lock up
the main thread.
3) when we've run out of results. I shouldn't have to explain this one.
I figured it's best to get these thoughts out here in the open so folks
can comment on them.
Cheers,
Shawn
> When playing with the API more in Bug 438342 (build the download list
> asynchronously) [1], I've come to the conclusion that this behavior
> isn't ideal for any moderately large list. When drawing the downloads
> window, it would remain blank for three to four seconds, then things
> would display. I should note that the browser UI was fully responsive
> during this time - the downloads window just didn't draw as fast as it
> does now. Now, there are a few reasons why I suspect this is happening:
How much of that is just due to the perf issues in bug 413209?
In any case, batching query results doesn't seem unreasonable.
> Personally, I think this is a problem with the current implementation. I
> think we would be better off grouping more than one row per result set
> dispatched. This is what I feel would be a good grouping strategy to
> implement.
The caller might want to have some control over this?
You might also want to consider having the batching only kick in after
the first N results, so that initial results can come in quickly.
200-250ms also seems on the high side for UI feedback. Also consider
that the caller might have yet more work to do before presenting
results. And some callers might not be driving any UI with the result
(but may way immediate results. or not.)
Justin
> The caller might want to have some control over this?
>
> You might also want to consider having the batching only kick in after
> the first N results, so that initial results can come in quickly.
That's covered by point (2) in my post, right?
> 200-250ms also seems on the high side for UI feedback. Also consider
> that the caller might have yet more work to do before presenting
> results. And some callers might not be driving any UI with the result
> (but may way immediate results. or not.)
OK, so 150-200 ms. 200-250 ms is generally considered to be a safe
amount of time to delay feedback for a user's action.
We still don't want to dump too much back to the caller at once (iff
it's on the main thread) because even if it doesn't drive UI, it can
block the UI. We can have a much simpler heuristic for callbacks that
do not live on the main thread though.
I'm not so sure that consumers that want results immediately should be
using the async api in the first place. That seems like synchronous is
the way to go then.
Cheers,
Shawn
Sync hurts if it can interrupt the user, such as when triggering an
awesomebar query during typing, or if you're on a slow/busy
filesystem. It should be avoided, like sync XHR.
It'd be good to get some data on what the batch size looks like at 10,
25, 50, 100, 150, 200, etc. ms in your test setup. I agree that we
want some batching, but I'm not sure why firing N 1-entry callbacks is
worse than firing N/M M-entry callbacks in terms of starvation; is the
per-callback invocation cost really so high? I don't get why it would
have different effects on event scheduling.
Edumacate me! :)
Mike
There's a difference between "delay feedback" and "interrupt user".
Try typing in an unbuffered ssh session with even 100ms latency, and
you'll see why. :)
Mike
> OK, so 150-200 ms. 200-250 ms is generally considered to be a safe
> amount of time to delay feedback for a user's action.
I'm mostly just wary of making perf decisions for a core framework based
on average overall UI response times... It's a good perf metric for the
top-level product ("no UI response delayed by > 200 ms"), but I wouldn't
want to end up in a state where multiple layers between the user and the
data is assuming it can delay things by 200ms.
> I'm not so sure that consumers that want results immediately should be
> using the async api in the first place. That seems like synchronous is
> the way to go then.
That seems orthogonal. Async shouldn't have to mean high-latency.
Justin
It should be noted that I'm testing this in a debug build, so it's bound
to be a bit slower.
Cheers,
Shawn
And then applied this on top of the current async storage work:
http://hg.mozilla.org/users/sdwilsh_shawnwilsher.com/storage-async/index.cgi/file/0d48c200ffb3/storage.shark-playing
I'm going to paste some of the results in this message - I saved this
shark profiles so I can get more data from them. (I apologize for any
formatting issues in advance)
Size = 1
16.6% 16.6% libSystem.B.dylib mach_msg_trap
2.7% 2.7% XUL XPCWrappedNative::CallMethod(XPCCallContext&,
XPCWrappedNative::CallMode)
2.2% 2.2% libmozjs.dylib js_Interpret
2.1% 2.1% libmozjs.dylib SearchTable(JSDHashTable*, void const*,
unsigned, JSDHashOperator)
2.1% 2.1% libSystem.B.dylib pread$UNIX2003
1.9% 1.9% XUL XPCWrappedNative::FindTearOff(XPCCallContext&,
XPCNativeInterface*, int, unsigned*)
1.7% 1.7% libmozjs.dylib js_LookupPropertyWithFlags
1.6% 1.6% libSystem.B.dylib getattrlist$UNIX2003
1.6% 1.6% libnspr4.dylib _PR_Darwin_x86_AtomicDecrement
1.5% 1.5% libnspr4.dylib _PR_Darwin_x86_AtomicIncrement
1.2% 1.2% XUL XPCConvert::NativeInterface2JSObject(XPCCallContext&,
nsIXPConnectJSObjectHolder**, nsISupports*, nsID const*, JSObject*, int,
int, unsigned*)
1.2% 1.2% XUL SearchTable
1.2% 1.2% libSystem.B.dylib pthread_mutex_unlock
1.2% 1.2% commpage [libSystem.B.dylib] __spin_lock
1.1% 1.1% libmozjs.dylib JS_GetReservedSlot
Size = 10
10.9% 10.9% libSystem.B.dylib mach_msg_trap
3.4% 3.4% libmozjs.dylib js_Interpret
2.4% 2.4% XUL XPCWrappedNative::CallMethod(XPCCallContext&,
XPCWrappedNative::CallMode)
1.9% 1.9% XUL XPCWrappedNative::FindTearOff(XPCCallContext&,
XPCNativeInterface*, int, unsigned*)
1.9% 1.9% libmozjs.dylib js_LookupPropertyWithFlags
1.7% 1.7% XUL SearchTable
1.7% 1.7% libSystem.B.dylib pread$UNIX2003
1.7% 1.7% libnspr4.dylib PR_AtomicIncrement
1.7% 1.7% commpage [libSystem.B.dylib] __memcpy
1.5% 1.5% libnspr4.dylib _PR_Darwin_x86_AtomicIncrement
1.3% 1.3% libmozjs.dylib js_Invoke
1.3% 1.3% libSystem.B.dylib getattrlist$UNIX2003
1.3% 1.3% commpage [libSystem.B.dylib] __spin_lock
1.1% 1.1% XUL XBLResolve(JSContext*, JSObject*, long, unsigned, JSObject**)
1.1% 1.1% XUL SelectorMatches(RuleProcessorData&, nsCSSSelector*, int,
nsIAtom*, int*)
1.1% 1.1% libmozjs.dylib SearchTable(JSDHashTable*, void const*,
unsigned, JSDHashOperator)
1.1% 1.1% libSystem.B.dylib pthread_mutex_unlock
1.1% 1.1% XUL PL_DHashTableOperate
1.1% 1.1% XUL
nsCSSFrameConstructor::FindFrameWithContent(nsFrameManager*, nsIFrame*,
nsIContent*, nsIContent*, nsFindFrameHint*)
1.1% 1.1% HIServices ICMapping::GetIndMapEntry(long, long*, ICMapEntry*)
1.1% 1.1% HIServices ICDataConverter::GetMappingFromPropList(void
const*, long, void*, long*)
Size = 25
17.4% 17.4% libSystem.B.dylib mach_msg_trap
4.4% 4.4% libmozjs.dylib js_Interpret
2.2% 2.2% XUL XPCWrappedNative::CallMethod(XPCCallContext&,
XPCWrappedNative::CallMode)
2.2% 2.2% libmozjs.dylib js_LookupPropertyWithFlags
2.0% 2.0% libSystem.B.dylib pread$UNIX2003
2.0% 2.0% commpage [libSystem.B.dylib] __spin_lock
1.8% 1.8% XUL XPCWrappedNative::FindTearOff(XPCCallContext&,
XPCNativeInterface*, int, unsigned*)
1.5% 1.5% XUL SearchTable
1.5% 1.5% libmozjs.dylib js_SearchScope
1.3% 1.3% libmozjs.dylib SearchTable(JSDHashTable*, void const*,
unsigned, JSDHashOperator)
1.1% 1.1% libmozjs.dylib js_Invoke
1.1% 1.1% commpage [libSystem.B.dylib] __memcpy
1.1% 1.1% commpage [libSystem.B.dylib] __bzero
Size = 50
14.8% 14.8% libSystem.B.dylib close$UNIX2003
9.5% 9.5% libSystem.B.dylib mach_msg_trap
3.2% 3.2% libSystem.B.dylib pread$UNIX2003
2.8% 2.8% libmozjs.dylib js_Interpret
2.5% 2.5% XUL XPCWrappedNative::CallMethod(XPCCallContext&,
XPCWrappedNative::CallMode)
2.5% 2.5% libmozjs.dylib SearchTable(JSDHashTable*, void const*,
unsigned, JSDHashOperator)
1.9% 1.9% commpage [libSystem.B.dylib] __memcpy
1.7% 1.7% XUL XPCWrappedNative::FindTearOff(XPCCallContext&,
XPCNativeInterface*, int, unsigned*)
1.5% 1.5% XUL SearchTable
1.5% 1.5% commpage [libSystem.B.dylib] __spin_lock
1.3% 1.3% libmozjs.dylib js_LookupPropertyWithFlags
1.1% 1.1% libSystem.B.dylib pthread_mutex_lock
Some comments on all this:
We tend to spend a lot of % of scope in mach_msg_trap - and a good chunk
of this is in CoreGraphics. The amount in OSServices seems to drop,
with the exception of size = 25 (I should probably re-run that since it
seems like an anomaly all around) as the size increases.
For size = 1, we are in CFRunLoopRunSpecific (again with mach_msg_trap),
but CoreGraphics still beats it by a long shot.
So, I think in this case our biggest hit is all the drawing. With that
said, I've never used shark before today, so I'm not sure if I got all
the useful information out of it I could have, or if there's something
else I should have gotten. Feedback is, of course, appreciated.
Cheers,
Shawn
I re-ran the 25 size one (size = 25(2).mshark), and get similar results,
so maybe it's not an anomaly. Note, this is adding ~590 download items
to the download manager window.
Follow-ups to dev.platform.
Cheers,
Shawn
I am going to work on this in my project, and will report if there are
any results.
--
Sergey Yanovich
I'd like to here more of *why* you feel that we need to use nsIRequest
instead of what's proposed (which is pretty similar).
Cheers,
Shawn
IIUC, loadGroup is a way to loosely couple specific request with UI (for
both actions and notifications). I am digging the tree for this ATM.
> I'd like to here more of *why* you feel that we need to use nsIRequest
> instead of what's proposed (which is pretty similar).
Right, it is pretty similar, probably even one-for-one equivalent. And
that's the main argument in favor of using pre-existing set of interfaces.
Going down the inheritance chain, we see nsIChannel with two important
methods: open() and asyncOpen(). nsIChannel also defines data transfer
scenarios: either via async callback (nsIStreamListener) or directly
(nsIInputStream). The scenario also has predefined error messages (like
NS_ERROR_IN_PROGRESS or NS_ERROR_ALREADY_OPENED) and notification points
(onStartRequest, onStopRequest), states (range of nsresult values).
In short, there already exists a well-tested async API inside Mozilla,
which is wired into existing user interface schemes.
--
Sergey Yanovich
Cheers,
Shawn
Progress bars in download manager and history window, each tracking list
loading.
>> Right, it is pretty similar, probably even one-for-one equivalent. And
>> that's the main argument in favor of using pre-existing set of
>> interfaces.
>>
>> Going down the inheritance chain, we see nsIChannel with two important
>> methods: open() and asyncOpen(). nsIChannel also defines data transfer
>> scenarios: either via async callback (nsIStreamListener) or directly
>> (nsIInputStream). The scenario also has predefined error messages
>> (like NS_ERROR_IN_PROGRESS or NS_ERROR_ALREADY_OPENED) and
>> notification points (onStartRequest, onStopRequest), states (range of
>> nsresult values).
> What would our callback API be like exactly in this case?
Something like:
mozIStorageStatement : nsIChannel
*ResultSet : nsIInputStream
*StatementDataCallback : nsIStreamListener
*StatementStatusCallback : nsIRequestObserver
*Tuple unchanged
no *StatementCallback
no *Error
--
Sergey Yanovich
Overall, I don't really like this - it feels like we are trying to take
an existing API and force our use case onto it. I think that the end
result would be more developer confusion because of overloaded API
semantics. With that said, we can model our API off what works well in
other API's (I'm all for stealing good ideas).
Cheers,
Shawn
That said, we _are_ avoiding such changes if they would have extension fallout.
-Boris