Message from discussion SQLite and how to ensure proper (and fast) synchronization with in-memory objects.
From: Landon Fuller <land...@plausible.coop>
Date: Mon, 31 Jan 2011 19:28:42 -0500
Local: Mon, Jan 31 2011 7:28 pm
Subject: Re: SQLite and how to ensure proper (and fast) synchronization with in-memory objects.

On Jan 30, 2011, at 12:25 AM, Brent Simmons wrote:

>> Hi guys, hi Brent :)

>> Planning to start a project that kind of behaves like an RSS reader,
>> I've come to the point where one has to ask himself "SQLite or Core
>> Data?".

>> My app is going to be doing quite a lot of (complex) SQL-style
>> querying, which can be quite painful with Core Data and certainly not
>> fast, once you hit the higher n-thousands.
>> This and other reasons led me to the conclusion I should rather use
>> SQLite than Core Data.

>> And then there also was this famous article by you, Brent, on your
>> decisions to dump Core Data in favor of SQLite & FMDB:
>> http://inessential.com/2010/02/26/on_switching_away_from_core_data

>> All four listed arguments against Core Data would totally apply to my
>> own app. And I can totally see the penalties with Core Data in these
>> four scenarios.

> Nevertheless, I *strongly* recommend using Core Data. As I pointed out in my initial post on the subject, Core Data is awesome. There are a few things that Core Data does slowly -- but devices and Macs are getting faster, and it's a good bet Core Data will keep improving. (Remember the old hockey adage: skate to where the puck is going to be.)

> I myself use FMDB when I'm working with non-object things. But for something like an article from a feed (or tweet from Twitter, or post from Facebook, or similar), I highly recommend using Core Data. Any time you're dealing with things that are more object-y than database-y, go with Core Data.

I actually argue the opposite view: I never use Core Data. Here's the rationale behind my position:

In any given application that serializes data to disk, there are really two different models in an application:
        - The in-memory application object model.
        - The on-disk serialized data model.

The in-memory application object model services the needs of the application; it is a literal representation of the state of your application, in memory. The API and data it provides need only be maintained in the context of that specific version of the application, during that single runtime. You're free to modify the application model during development without constraint, as the specific concepts it is modeling do not need to be shared across releases of the application -- if you add a bit of data or API to your application model and remove it later, there's no concern about long-term maintenance of that data, migrating it across versions, etc.

In contrast to the application model, the on-disk serialized data model is a high-level, abstract representation of your application's data that must be maintained across iterations of your application, and when optimally expressed, will likely not even map 1:1 to the application's optimal in-memory model. It requires unique consideration on a number of fronts:
        - Data longevity, specifically in ensuring that the concepts as modeled are maintainable across the lifetime of your data.
        - Data validity -- care to avoid duplicate, corrupt, or invalid data. Whereas there's often little harm to maintaining multiple read-only in-memory data records, on-disk data should be updated carefully.
        - Atomic or transactional updates. Often data changes must be implemented as an all-or-nothing transactional/atomic update.
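
To make the distinction concrete, here is a minimal sketch in Python using its built-in sqlite3 module. It is only an illustration, not anything from the thread: the Article class, the articles table, and the save_articles helper are hypothetical names. The in-memory class carries transient state that never reaches disk, while the schema is kept deliberately separate and every batch of writes goes through a single transaction.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Article:
    """In-memory application model: free to change between releases."""
    guid: str
    title: str
    is_selected: bool = False  # transient UI state, never written to disk


# On-disk model: a separate schema that must outlive individual releases.
SCHEMA = """
CREATE TABLE IF NOT EXISTS articles (
    guid  TEXT PRIMARY KEY,   -- uniqueness guards against duplicate records
    title TEXT NOT NULL
)
"""


def save_articles(conn, articles):
    """Write all articles in one transaction: every row commits or none do."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.executemany(
            "INSERT INTO articles (guid, title) VALUES (?, ?)",
            [(a.guid, a.title) for a in articles],
        )


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
save_articles(conn, [Article("a1", "First"), Article("a2", "Second")])

try:
    # The second row reuses an existing guid, so the whole batch is rolled back.
    save_articles(conn, [Article("a3", "Third"), Article("a1", "Duplicate")])
except sqlite3.IntegrityError:
    pass

stored = [row[0] for row in conn.execute("SELECT guid FROM articles ORDER BY guid")]
```

After the failed second batch, stored holds only the two guids from the first batch; the partially written "a3" row does not survive the rollback.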

Core Data attempts to tightly weld these two different abstract representations together, allowing developers to leverage their single application model as both an on-disk serialization and an in-memory representation. Unfortunately this abstraction is extremely leaky, and Core Data's requirements spread pervasively through the application's in-memory model:
        - Every managed object must be a subclass of NSManagedObject, which results in objects inheriting a large number of non-overridable methods and strict requirements that dictate object behavior as per Core Data's internal requirements, preventing the application author from expressing an API more suited to their in-memory model representation (such as overriding -hash, -isEqual:, etc).
        - Managed objects are not thread-safe -- special care must be taken when using GCD or threads directly:
                - Managed objects should not be shared across threads (as per Apple's recommendation)
                - If shared across threads, it is the application author's responsibility to implement extremely complex and difficult locking to allow sharing of instances.
                - Care must be taken to safely merge data changes made in multiple managed object contexts on multiple threads, which potentially may occur asynchronously.
        - The most lightweight transactional/atomic update mechanism is the NSManagedObjectContext, however, when using multiple managed object contexts, as noted above, care must be taken to merge data correctly between contexts.
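
For contrast, here is a rough sketch of what a plain in-memory model permits when it is not bound to a persistence framework's base class. This is a Python analogy rather than Cocoa code, and FeedItem and title_length are hypothetical names: the object defines its own equality and hashing (identity by guid only), and because it is immutable it can be read from several threads without any locking.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass(frozen=True)
class FeedItem:
    """Immutable value object; equality and hash are based on guid alone."""
    guid: str
    title: str = field(compare=False)  # excluded from == and hash()


def title_length(item):
    # Read-only access to an immutable object needs no lock.
    return len(item.title)


items = [FeedItem("a1", "First"), FeedItem("a2", "Second post")]

# Share the same instances across worker threads.
with ThreadPoolExecutor(max_workers=2) as pool:
    lengths = list(pool.map(title_length, items))

# Two items with the same guid compare equal even if the title differs.
same_item = FeedItem("a1", "First") == FeedItem("a1", "First (edited)")
```

The design choice illustrated is simply that the application decides what identity means for its own objects; nothing here claims to reproduce how Core Data itself behaves.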

The leakiness of this abstraction is extremely similar to that of distributed objects, where the idea was that complexity of network communications could be hidden behind a simple object model; your application's model could also be your network model. This led to the same problems that Core Data has today, but instead with the network protocol poorly welded into the application model.

These approaches towards unifying models fail (in my opinion) because they fail to take into account that the models are genuinely different, and actually express different data:
        - The application model represents the application's in-memory state. It is transient and thus may be iterated on freely to suit the application.
        - The on-disk serialization model is an abstract representation of the data the application operates on using the in-memory model. It is not the in-memory model, and must be maintained across versions of the application (or even potentially across different applications). The data must be updated according to transactional requirements of the application, errors must be safely handled in disk serialization, etc.
        - In the case of distributed objects, which I posit is extremely similar to Core Data, the network protocol defines not just a means of accessing the peer's state, but a dialog between independent applications with their own, potentially very different internal application model/state. The protocol must handle errors, retries, and other complexities that cannot be adequately represented through seemingly transparent method calls.
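
One way to picture "maintained across versions of the application" for the on-disk model is an explicit, ordered list of schema migrations. The sketch below is an assumption-laden illustration in Python: the MIGRATIONS list, the migrate function, and the table layout are hypothetical, while PRAGMA user_version is SQLite's standard integer slot for an application-defined schema version. Each step runs inside its own explicit transaction, so a failed step leaves the database at the previous version.

```python
import sqlite3

# Ordered schema changes; entry N-1 upgrades the database to version N.
MIGRATIONS = [
    "CREATE TABLE articles (guid TEXT PRIMARY KEY, title TEXT NOT NULL)",
    "ALTER TABLE articles ADD COLUMN published_at TEXT",
]


def migrate(conn):
    """Apply any migrations newer than the stored version; return the version."""
    current = conn.execute("PRAGMA user_version").fetchone()[0]
    for target in range(current + 1, len(MIGRATIONS) + 1):
        conn.execute("BEGIN")
        try:
            conn.execute(MIGRATIONS[target - 1])
            # PRAGMA does not accept bound parameters; target is a trusted int.
            conn.execute(f"PRAGMA user_version = {target:d}")
            conn.execute("COMMIT")
        except sqlite3.Error:
            conn.execute("ROLLBACK")
            raise
    return conn.execute("PRAGMA user_version").fetchone()[0]


# isolation_level=None disables implicit transactions so BEGIN/COMMIT are ours.
conn = sqlite3.connect(":memory:", isolation_level=None)
first_run = migrate(conn)   # applies both steps
second_run = migrate(conn)  # nothing left to apply
columns = [row[1] for row in conn.execute("PRAGMA table_info(articles)")]
```

The in-memory classes can change freely between releases; only this list has to stay stable and append-only.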

This may be an unpopular view -- I honestly don't know, since this is the first time I've taken any time to explain it -- but I think that ultimately, application implementations benefit from a separation of application and serialization models in terms of implementation time, complexity, cost, and stability. It may be possible to achieve this by maintaining distinct application and Core Data-specific object hierarchies, but such an approach would seem to discard most of the advertised value of using Core Data in the first place.

Cheers,
Landon


 