To make an offline application really offline means to store the user
data locally. While we have a lot of support for many things now,
application might become relatively rich with a lot of functions letting
data handled by them become complex and large. If you were making a
desktop application or your application was a common web application
with a server code written in, for example, PHP, what would you use to
store and retrieve user data? SQL database, probably MySQL or SQLite. I
would like to have this for my offline application as well, in the browser.
There were one major argument against from Boris Zbarsky: "We just don't
want to ship something that exposes implementation details (e.g. behaves
exactly like a particular version of SQLLite, forcing us to ship that
version forevermore in order to avoid breaking website compat)."
And other argument from Shawn Wilsher: "SQL doesn't feel like a very
DOM-oriented API. We should be able to do better."
So, now comes an idea to create a wrapping API, probably object one,
that would cover access to a "structured storage". I know, personally,
one guy that tried to do this. Even he was very clever programmer, he
failed to create a really useful and performance API, because to create
such is quit impossible. When your data gets more complex and grows and
your queries come more complex while you need a scalability and
performance, what is a very important factor of success, you find out,
that you need to directly use a statement to do high performance optimal
queries. And that is the SQL language, existing for years, proved by
many use cases. That is the only full-power API to access a "structured
storage".
Simple look at wikipedia about SQL tells me that there were some
standard in the past and looks like there is still some at the present:
http://en.wikipedia.org/wiki/SQL:2008, paradoxically, you have to
purchase it, so it's probably useless for us.
It seems to me easier to give chance to some SQL statement filter
between a user (offline application developer) and the implementation
(say SQLite) that wouldn't deny some constructions or keywords we might
consider non-standard and try to be clever and limit to an area that
would make this 'filter' be relatively easily remade to an adapter to a
different SQL engine implementation. I never tried to write something
like that and never though about such thing. Maybe this is yet more
impossible than to have an API as I described above. But I'am willing to
discover this land.
-hb-
Yeah, this is a really hard problem. I wasn't suggesting this by any
means... It'd be nice if we could do it, but I'm not convinced we can.
> you need to directly use a statement to do high performance optimal
> queries. And that is the SQL language, existing for years, proved by
> many use cases.
"the" SQL language? There is no such thing. Maybe you meant "a" SQL
language? If so, which one? As far as I can tell, each database
implementation has its own SQL, many of them quite incompatible with
each other. While there are some SQL standards, every single database I
know of implements a superset at best (many implement something that
overlaps the standards in some, but not all, areas), and in many cases
you have to use non-standard features to "do high performance optimal
queries".
> Simple look at wikipedia about SQL tells me that there were some
> standard in the past and looks like there is still some at the present:
> http://en.wikipedia.org/wiki/SQL:2008, paradoxically, you have to
> purchase it, so it's probably useless for us.
Yep.
> It seems to me easier to give chance to some SQL statement filter
> between a user (offline application developer) and the implementation
> (say SQLite) that wouldn't deny some constructions or keywords we might
> consider non-standard and try to be clever and limit to an area that
> would make this 'filter' be relatively easily remade to an adapter to a
> different SQL engine implementation.
Indeed. It seems to me like what we want to start with is by exposing
some subset of SQL that is in fact commonly enough implemented that
queries written to that subset Just Work. Even then we'll run into
issues because SQLite allows various things to work that don't work with
other databases and are hard to check for (e.g. its weak typing).
Question: would such a limited SQL subset be good enough for app developers?
Take anything I say here with a grain of salt of salt, by the way; I'm
not exactly a database expert. ;)
-Boris
On the server side one pretty much forced to use a database even if
using filesystem would be easier because it is so hard to get scalable
locking semantics right with concurrent access.
On the other hand desktop applications typically assumes that only one
user can access the persistent data and do not need to deal with
locking etc. So it is not surprising that IDE's, word processors,
graphical and video editors etc. do not use databases and rely on a
file system for their persistent storage.
Since offline applications more closer in nature to their desktop
equivalents, I do not see why it is so essential to expose SQL
database interface for them.
Regards, Igor
cheers,
david
> _______________________________________________
> dev-platform mailing list
> dev-pl...@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-platform
>
Cheers,
Shawn
I'd just like to say -1 to ORM, after just having this discussion in
#developers re: Django's ORM. If we're going to provide something
that's ostensibly a relational database, then we should use SQL,
period. SQL is the language for querying relational databases. Even if
it varies between implementations, it's still a lot easier for someone
with some SQL knowledge to work in a dialect of it than to learn a
completely different ORM interface to query their data.
That being said, I also agree with Igor's point in that I'm not sure
why it's so necessary for webapps to have a SQL database. Are offline
apps really going to store so much data that a simple hashtable is
insufficient?
-Ted
It's not the amount of data but the structure. A simple hashtable is
great for simple data.
Unfortunately as it was already has been pointed out, there is no such
thing as the SQL language. There are *languages* which look similar
but that are different in details and in practice one just cannot
write a portable SQL application.
Regards, Igor
For many practical data (case in point - results of various
measurements or statistical modeling) the current relational databases
are not suitable either. So although relational databases provide a
better support for complex data structures than simple hashtables,
they do not cover all the needs. And given the lack of a 100% standard
way to access relational data, I see no point to provide API to access
them from offline applications.
Regards, Igor
For an offline app like Zimbra or gmail, which wants to allow rapid
search of all your e-mail, how is a simple hashtable possibly sufficient?
-Boris
Not necessarily the case. Desktop apps use databases for their
persistent storage too. Examples that come to mind off-hand are Kodak's
EasyShare photo-management software (stores all image metadata in an
EasyShare database; just opening the file shows some of the SQL used to
create the tables) and Firefox (stores bookmarks, history, etc, in an
SQLite database).
> Since offline applications more closer in nature to their desktop
> equivalents, I do not see why it is so essential to expose SQL
> database interface for them.
Offline app data storage is meant to replace server access, and in
particular to replace the server data store they can't get to while
offline. As you've pointed out, that server-side data store is SQL...
Of course it's probably a different SQL from whatever SQL browsers ship,
so it's not like you can use the same queries on client and server.
Nevertheless, I can see where the desire for SQL is coming from.
-Boris
I think you hit the nail on the head - the core issue here is hard is
it for web developers to adapt server-side logic to run client side.
If their core datastore is stored using fundamental similar structures
and semantics it really lowers the bar to entry.
I'd really plead with everyone to invent as little as possible here.
Yes there are SQL variants but these days they differ (unless you are
using more obscure vendor features) very little in ways that matter.
Most webapps these days use a very basic feature set of SQL. Yes
ORM's are great - but anyone who's used one to build a complex app
have found places where you have to drop into raw SQL for
functionality or performance reasons.
More than anything it would be awesome to have something shipping in
browsers *soon* - and developing a new data API that's right takes
time, effort, and hopefully feedback from the developer community.
On 9/3/09 7:13 AM, Schrep wrote:
> I think you hit the nail on the head - the core issue here is hard is
> it for web developers to adapt server-side logic to run client side.
> If their core datastore is stored using fundamental similar structures
> and semantics it really lowers the bar to entry.
>
> I'd really plead with everyone to invent as little as possible here.
> Yes there are SQL variants but these days they differ (unless you are
> using more obscure vendor features) very little in ways that matter.
Unfortunately SQLite differs in a major way from other SQL
implementations, by not imposing type constraints on columns.
Rob
You can build your own indexes using localStorage as your block storage
layer!
Of course, it would be better to provide such functionality in the
browser, especially since we already have full-text search functionality
in SQLite...
Then again, that's not perfect either and may not match all application
needs.
Rob
Mozilla's SQLite currently only ships with a whitespace-based tokenizer
and a Porter stemmer (which is English-specific) variant. This is
probably too limiting to expose until resolved.
https://bugzilla.mozilla.org/show_bug.cgi?id=414102
Thunderbird's bug/tentative strategy:
https://bugzilla.mozilla.org/show_bug.cgi?id=472764
Andrew
I think this is a pretty strong argument. In particular I think that
locking in SQLite's typeless columns as a requirement for the Web for
all time would be a mistake.
If it was possible to add a parsing and validation layer on top of
SQLite to force typechecking and to limit the exposed language to a
fixed subset, that would be a good solution IMHO. I've no idea whether
it is possible.
> And other argument from Shawn Wilsher: "SQL doesn't feel like a very
> DOM-oriented API. We should be able to do better."
I don't agree. ORM is complex and I don't think there is any obvious
"right approach" that fits all applications. On the other hand the basic
SQL model is well understood by a huge number of developers. The basic
structured data API should stick to that model; people can build ORM
libraries on top of it and if one approach emerges that would benefit
from browser integration, we can pave that cowpath.
C#'s LINQ is a wildly popular way of handling relational data in an OO
language and it isn't really ORM at all. Getting database rows back as
language objects with named fields is simple, obvious and useful --- and
provided by the WHATWG database spec.
I think we really do need to do something in this space ASAP, or people
are just going to use Webkit or Gears and SQLite and its quirks will
become the de facto standard for the Web. If we can wrap around SQLite
to enforce a sane SQL subset --- and describe that subset in the HTML5
spec --- then that would be my preference.
Rob
Right. I know a bunch of applications which would very much profit from
a fuzzy (fulltext) search, but so far I don't know an SQL database that
would build a fuzzy search index...
Robert Kaiser
If it's possible to write Django[1] support for SQLite, Postgres,
MySQL, and Oracle, those SQL dialects might be close enough to get
useful work done.
Maybe lack of type enforcement is a big problem, but I worry that
we're creating stumbling blocks for ourselves without a clear problem
statement. The WHATWG spec includes prepared statements, so it should
be possible for applications to enforce type constraints if the need
arises.
- Rob
[1] http://docs.djangoproject.com/en/dev/ref/databases/#ref-databases
I just want data represented as Objects. I have been utterly spoiled by
Django, Turbo Gears and other frameworks. The ORMs are not perfect, and
I understand that they are a little bloated, but I write SQL when
needed, and all is good.
Most ORMs within frameworks do not completely force themselves on
developers. In Django, for instance, you are never more than 1 line of
code away from a cursor.
cursor = db.connection.cursor()
You get the best of both worlds.
I have 2 apps in mind. One of which I literally have tens of thousands
of textual blobs in MySQL on a server, and would like to pull it all
down into an offline app (with the option of transferring it back).
Another is a messaging tool where I might have many hundreds or
thousands of messages. A hashtable would be messy and slow for this. Of
course these are extension and application ideas.
Personally, coming from the world of web frameworks and server-side
development, you get lazy, but good lazy. You can add features to your
app quickly - or at least iterate on ideas and get feedback fast.
I would like to see some kind of ORM for Firefox developers just for the
sake of be able to quickly try out ideas. At the very least get a
conversation started about what kind of api should we strive for - for
those 75% (or less) of instances when we should be able to just do:
var result = ComputerInventory.filter("name__startswith__Apple II");
and get back:
[{id:789,name:'Apple II e',year:1978},{id:145,name:'Apple II c',year:1979}]
And the rest of the time be able to get a cursor with one line of code.
Cheers,
David
The requirement is "the spec should allow for implementations that
reject field values of the wrong type". This might be desirable for
performance reasons, or just to allow an existing non-SQLite engine to
be plugged in. Of course, if we adopt this requirement then for the sake
of interop the spec should *require* typechecking, not just allow it.
Maybe we should discuss whether this requirement is actually wanted.
Rob
> Take anything I say here with a grain of salt of salt, by the way; I'm
> not exactly a database expert. ;)
Oh, does such a thing exist? ;-)
--
Warning: May contain traces of nuts.
How is a simple relational database possibly sufficient for full-text
indexing of email? Full-text indexing requires building a special
index, which is generally done with separate software, or add-on
modules to a database. There's no reason you couldn't implement an
inverted index using the existing globalStorage/localStorage APIs,
storing the full contents of messages in separate URIs which are
cached for offline use. You're not seriously suggesting that they
should store the entire contents of my mailbox in a database, are you?
-Ted
--> Dr E.F. Codd
Phil
--
Philip Chee <phi...@aleytys.pc.my>, <phili...@gmail.com>
http://flashblock.mozdev.org/ http://xsidebar.mozdev.org
Guard us from the she-wolf and the wolf, and guard us from the thief,
oh Night, and so be good for us to pass.
[ ]Memory is a thing we forget with.
* TagZilla 0.066.6
These performance issues are no different than current performance
issues that exist in the DOM that developers have to deal with.
Cheers,
Shawn
>On Mon, 09 Mar 2009 10:25:24 +0000, Neil wrote:
>
>
>>Boris Zbarsky wrote:
>>
>>>Take anything I say here with a grain of salt of salt, by the way; I'm not exactly a database expert. ;)
>>>
>>>
>>Oh, does such a thing exist? ;-)
>>
>>
>--> Dr E.F. Codd
>
>
He said "a" expert, not "the" ;-)
It would be interesting, albeit painful, to inspect offline GMail to see
how it actually works.
Rob
> It would be interesting, albeit painful, to inspect offline GMail to
> see how it actually works.
It may be easier to try and get in touch with the GMail team and have
them get involved in this conversation. So, too, with:
- WordPress
- Remember The Milk
- Feedly
- Google Apps team, basically
- Penny-Arcade
- phpBB?
- Flickr?
cheers,
mike
I and many others are strongly against ORM, LINQ and any other layer
over _a_ pure SQL. It has never been accepted in really wide praxis. As
I said before I have had an experience with such layers. The API was
huge and really heavy to use. To build things like INNER or OUTER JOIN,
use BETWEEN or IN operators, multiple WHERE clauses and ORDER BY, alter
a table, was impossible or very scabrous. And trust me, those are things
needed to build a reliable and highly responsible application. My former
colleague Ondrej Brablc with very deep experience of SQL databases,
product servers and client desktop applications has the same clean-cut
opinion.
My suggestion is to create a parser with a grammatic (a translator) of
our "MozSQL" language to any SQL implementation. I am about to create an
experimental extension for this as I have knowledge of compilers coding.
Our goal now is to select subset of features that are present in all
widely used SQL implementations and introduce things like type checking
and DATETIME typing. Someone probably made that work before us, Ondrej
also pointed me to http://dibiphp.com/cs/. It is, as I understand,
freely distributable PHP implementation for any underlaying database
engine, made by another Czech guy ;) It could be good inspiration for
us, and for me to rewrite this to a C++ component and expose it to JS.
Other good pointer is W3C SQL tutorial that seems to be quit general
with explanations of some differences.
Personally I would for sure include e.g.:
* ALTER TABLE
* INNER JOIN
* RIGHT or LEFT OUTER JOIN
* BETWEEN
* IN
* LIKE or GLOB
* UNION
* registering a new function
* indexing (and index altering) of one or more columns, have to
figure out how deep are differences in this area, for example
ordering requirement of SQLite to let a particular index work, and
construct some requirement for MozSQL if even limited.
Date-time type can always be internally turned to linux 8-byte time,
it's supported by all SQL engines, AFAIK. The enumeration of other types
should be limited, as I have never found useful to use integers of
different lengths as 4 or 8 bytes. This sounds like making a
mozilla-portable-sql-runtime to me :)
Andrew Sutherland wrote:
> Mozilla's SQLite currently only ships with a whitespace-based
> tokenizer and a Porter stemmer (which is English-specific) variant.
> This is probably too limiting to expose until resolved.
>
> https://bugzilla.mozilla.org/show_bug.cgi?id=414102
>
> Thunderbird's bug/tentative strategy:
> https://bugzilla.mozilla.org/show_bug.cgi?id=472764
>
> Andrew
I don't think it would be anyhow blocking to what I suggest here, but
exposing these features might be useful, I can imagine usage for it quit
a lot.
-hb-
> My suggestion is to create a parser with a grammatic (a translator) of
> our "MozSQL" language to any SQL implementation. I am about to create an
> experimental extension for this as I have knowledge of compilers coding.
WebSQL, you mean? It's important that we use a language that's compatible
with other browsers.
> Our goal now is to select subset of features that are present in all
> widely used SQL implementations and introduce things like type checking
> and DATETIME typing. Someone probably made that work before us, Ondrej
I think somebody needs to come up with a *detailed* proposal, with test
cases, in order to realistically consider this. I think the proposal you
have is very large, and we should start with something much smaller, i.e.:
* TEXT/INTEGER/DOUBLE data types, all unsized (no need for datetime, just
use an integer which the user can convert to/from a JS Date() object)
* only NOT NULL constraints. No foreign-key constraints
* strong typing with no automatic conversion
* no table or index alterations. Any schema changes should be done with
add/drop (and maybe rename?)
* indexes do the obvious thing: CREATE INDEX _name_ on _tablename_(col [,
col2...])
* only cartesian joins
* no LIKE or GLOB
* no unusual operators
* no subqueries
* no user functions
--BDS
Are you working on this now? I hope you are :-)
Rob
On Mar 8, 7:20 am, Boris Zbarsky <bzbar...@mit.edu> wrote:
> Igor Bukanov wrote:
> > On the server side one pretty much forced to use a database even if
> > using filesystem would be easier because it is so hard to get scalable
> > locking semantics right with concurrent access.
>
> > On the other hand desktop applications typically assumes that only one
> > user can access the persistent data and do not need to deal with
> > locking etc. So it is not surprising that IDE's, word processors,
> > graphical and video editors etc. do not use databases and rely on a
> > file system for their persistentstorage.
>
> Not necessarily the case. Desktop apps use databases for their
> persistentstoragetoo. Examples that come to mind off-hand are Kodak's
> EasyShare photo-management software (stores all image metadata in an
> EasyShare database; just opening the file shows some of the SQL used to
> create the tables) and Firefox (stores bookmarks, history, etc, in an
> SQLite database).
In this case, the applications are working against a definite object
model which hides the full power of SQL. In effect, whether it is SQL
or something else, the abstraction does a good job of hiding the
implementation.
>
> > Since offline applications more closer in nature to their desktop
> > equivalents, I do not see why it is so essential to expose SQL
> > database interface for them.
>
> Offline app datastorageis meant to replace server access, and in
> particular to replace the server data store they can't get to while
> offline.
This is so well put, yet many times we lose sight of this. If all we
are trying to do is provide a means of replacing server access, then
we should be trying to emulate a REST resource inside the browser and
not providing a contentious and omni-potent SQL interface to a
database embedded in the browser. I am referring to my BITSY proposal
to provide JavaScript interceptors to HTTP requests as a standard
primitive in WebStorage [1]. This way, a local response can be
generated to an HTTP request in just the same way that someone may
scour some database (relational or otherwise) on the server. Once
again, the abstraction would do a pretty good job of hiding the
implementation, which may, in fact, use SQL storage. IMHO, for offline
purposes, standardizing SQL for WebStorage before getting these
interceptors standardized is exactly the opposite of what will produce
a stable deck of standards.
> As you've pointed out, that server-side data store is SQL...
> Of course it's probably a different SQL from whatever SQL browsers ship,
> so it's not like you can use the same queries on client and server.
> Nevertheless, I can see where the desire for SQL is coming from.
[1] http://o-micron.blogspot.com/2009/04/bitsy-050-develop-seamlessly-on-lineoff.html