
Places Data Model


Todd Agulnick

Apr 21, 2006, 2:21:19 PM
to dev-apps...@lists.mozilla.org
Hi,

I'm an extension developer looking at modifying an extension that
currently works with the old bookmarks API to work with Places. I wanted
to start by understanding the data model (I'm old fashioned that way)
that's been implemented for Places, but I couldn't find any documentation.

Rather than complain about that lack, I thought it would be more in the
community spirit to create some doc, and by doing so also gain a better
understanding of what's going on. The paltry fruits of my efforts can
be found here: http://www.foxcloud.com/Places/ -- the original is a
Visio document (yikes) but there's also a PDF.

I was hoping that someone from the Places development team (or anyone
else who understands what's been built) could review and comment.
(Brett? Annie? Ben?) Does this look right?

I've got a slew of questions, but I want to establish an understanding
of the current state first -- I hate to look more like a rube than my
ancestry dictates.

Thanks,

-Todd

Brett Wilson

Apr 21, 2006, 11:03:01 PM
It looks correct to me. But the data model doesn't affect users of the
API, and may change at any time.

The thing to understand about bookmarks is that they are all uniquely
identified by URL, and each bookmark lives in one or more folders. You
got it right with the data model that most of the information about
bookmarks is in the history system. The bookmark system just associates
entries in the history table with bookmark folders.
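
As a rough illustration of the arrangement described above, the sketch below uses Python's sqlite3 module with a deliberately simplified schema. The table names moz_history and moz_bookmarks and the item_child column come from this thread; the url, title and parent columns, and the folder ids 10 and 20, are assumptions made for the example and are not the actual Places schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One row per URL; the title lives here, not on the bookmark (assumed columns).
    CREATE TABLE moz_history (
        id    INTEGER PRIMARY KEY,
        url   TEXT UNIQUE NOT NULL,
        title TEXT
    );
    -- A bookmark is only an association between a history row and a folder.
    CREATE TABLE moz_bookmarks (
        item_child INTEGER REFERENCES moz_history(id),
        parent     INTEGER NOT NULL
    );
""")
conn.execute(
    "INSERT INTO moz_history (id, url, title) "
    "VALUES (1, 'http://slashdot.org', 'slashdot')"
)
# The same URL filed in two hypothetical folders (ids 10 and 20).
conn.executemany(
    "INSERT INTO moz_bookmarks (item_child, parent) VALUES (?, ?)",
    [(1, 10), (1, 20)],
)

rows = conn.execute("""
    SELECT b.parent, h.url, h.title
    FROM moz_bookmarks AS b
    JOIN moz_history AS h ON h.id = b.item_child
    ORDER BY b.parent
""").fetchall()
print(rows)
```

Both rows carry the same url and title, because in this sketch those values are stored once, on the history side.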

There is a 10% chance that the URL-identity model will change, resulting
in fairly major changes to the bookmarks API.

To get data out of the bookmarks and history system, see:
http://developer.mozilla.org/en/docs/Places:Query_System

It's on my list to write more bookmarks and history system documentation
soon.

Brett

tod...@gmail.com

Apr 23, 2006, 10:54:34 PM
Brett Wilson wrote:
> It looks correct to me. But the data model doesn't affect users of the
> API, and may change at any time.

From what I can tell, the data model is the soul of the Places
architecture. It will ultimately determine what functionality can be
delivered, so it's worth getting it right. And inasmuch as the API is a
pretty thin wrapper of the DM, I can't really agree with your assertion
that the DM doesn't affect clients of the API.

> The thing to understand about bookmarks is that they are all uniquely
> identified by URL

This strikes me as a decision with some nasty implications that will
indeed reach all the way out to (and slap) the end user, so I'm glad to
hear that there's some discussion about changing it. fwiw, I hear that
there was a parallel debate years ago during the implementation of the
original FF bookmark system; url-as-id was implemented as it is now in
Places and then thrown overboard in favor of an independent id. I
strongly favor the latter, but that's a whole other discussion.

For now, I've got a few superficial comments about the DM that I wanted
to articulate here, with more substantive comments in a later post.

1) Best practice suggests that the name for a table's primary key
should be derived from the name of the table itself. Naming a primary
key "id" isn't really helpful, and leads to a proliferation of names
when that primary key is referenced elsewhere to support joins. The
primary key for table moz_history is "id", but is variously referenced
in other tables as "page", "page_id", and "item_child." Call it
moz_history_id everywhere, and you've just eliminated the need to
document a handful of relationships among tables because the name makes
the connection obvious.

2) moz_history is clearly the star of the show; it contains an entry
for every url I've ever visited. And the data model suggests that some
subset of those url's might be bookmarked in various ways. But what
about bookmarks for url's that I haven't yet visited? Presuming that I
can enter an arbitrary url as a bookmark (or edit the url of a bookmark
I've already created), I think that means that moz_history contains
some urls that I've never visited. Which suggests to me that
moz_history is the wrong name for this table of urls. How 'bout
moz_urls instead? Again, if the name points to the function, you have
less to document.

3) Is there a good reason to represent the bookmark/folder hierarchy as
three tables, moz_bookmarks_roots, moz_bookmarks_folders, and
moz_bookmarks? Whatever motivated this aspect of the design is not
obvious. If you combine all three of them into a single table
moz_bookmarks, you could, for instance, find the root folders by
selecting all records with a null parent. (If your definition of a root
needs to include, for some reason, folders that do have a parent,
create an attribute called is_root). "item_child" (as per #2 above)
becomes moz_url_id, and "folder_child" disappears entirely. Life is
good. Definitely simplifies the structure, and makes for two fewer
tables to document.
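
The three suggestions above can be sketched together as follows, using Python's sqlite3 module. This is only one possible reading of the proposal: moz_urls and moz_url_id are the names suggested in the list, while moz_bookmark_id, the name column and the sample rows are assumptions introduced for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Suggestion 2: a table of urls, visited or not.
    CREATE TABLE moz_urls (
        moz_url_id INTEGER PRIMARY KEY,   -- suggestion 1: key named after its table
        url        TEXT UNIQUE NOT NULL
    );
    -- Suggestion 3: folders and bookmarks share a single table.
    CREATE TABLE moz_bookmarks (
        moz_bookmark_id INTEGER PRIMARY KEY,
        parent          INTEGER REFERENCES moz_bookmarks(moz_bookmark_id),
        moz_url_id      INTEGER REFERENCES moz_urls(moz_url_id),  -- NULL for folders
        name            TEXT
    );
""")
conn.execute("INSERT INTO moz_urls VALUES (1, 'http://www.mozilla.org/')")
conn.executemany(
    "INSERT INTO moz_bookmarks VALUES (?, ?, ?, ?)",
    [
        (1, None, None, "Bookmarks Menu"),  # root: no parent
        (2, None, None, "Toolbar"),         # root: no parent
        (3, 1, None, "News"),               # ordinary folder
        (4, 3, 1, "Mozilla"),               # bookmark pointing at moz_urls row 1
    ],
)

# Root folders are simply the rows with a null parent.
roots = [
    name
    for (name,) in conn.execute(
        "SELECT name FROM moz_bookmarks "
        "WHERE parent IS NULL ORDER BY moz_bookmark_id"
    )
]
print(roots)
```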

I'm curious to hear your thoughts about this. I'm sitting pretty far
outside the design process, so I may be working with a substantially
different set of assumptions.

Coming up next: Places Giveth and Places Taketh Away: what got lost on
the way to the new data model.

Brett Wilson

Apr 23, 2006, 11:58:28 PM
tod...@gmail.com wrote:
>>The thing to understand about bookmarks is that they are all uniquely
>>identified by URL
>
> This strikes me as a decision with some nasty implications that will
> indeed reach all the way out to (and slap) the end user, so I'm glad to
> hear that there's some discussion about changing it. fwiw, I hear that
> there was a parallel debate years ago during the implementation of the
> original FF bookmark system; url-as-id was implemented as it is now in
> Places and then thrown overboard in favor of an independent id. I
> strongly favor the latter, but that's a whole other discussion.

The previous implementation with URL identity that was problematic
restricted you to one bookmark per URL. We support multiple locations
per URL, which is much less problematic. The only time I have heard of
people noticing this is when they import bookmarks that have more than
one title for a URL.

The design is this way because it was originally designed to be a
tagging-style feature. Whether this changes depends on what happens with
respect to tagging. There is still some desire to do this, and with the
current design it is easy. Even if we never do anything with tagging,
I'm currently skeptical it would be worth the significant effort
required to change the API, implementation, and all callers. It also
poses some problems of its own.


...some good suggestions (omitted): If we change a table, we should keep
in mind that the names should be made more consistent. Writing the
migration code is annoying, so I doubt we would do it unless it had to
be done for another reason.

> 3) Is there a good reason to represent the bookmark/folder hierarchy as
> three tables, moz_bookmarks_roots, moz_bookmarks_folders, and
> moz_bookmarks?

*_roots is just a list of which folders have predetermined meanings
(menu, toolbar, etc.) so we know which ID to associate with each
meaning. They can appear anywhere in the hierarchy.

*_folders associates a name with each folder ID and also lists all
folders that exist so we can create new unique IDs.

*_bookmarks contains the hierarchy.
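
A minimal sketch of how the three tables described above might divide the work, written with Python's sqlite3 module. Only the table names and the item_child and folder_child columns appear in this thread; every other column name, the root meanings "menu" and "toolbar" as stored strings, and the sample folders are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Every folder that exists, with its name (column names assumed).
    CREATE TABLE moz_bookmarks_folders (id INTEGER PRIMARY KEY, name TEXT);
    -- Which folder id carries each predetermined meaning (columns assumed).
    CREATE TABLE moz_bookmarks_roots (root_name TEXT PRIMARY KEY, folder_id INTEGER);
    -- The hierarchy: each row places a page or a folder inside a parent folder.
    CREATE TABLE moz_bookmarks (parent INTEGER, item_child INTEGER, folder_child INTEGER);
""")
conn.executemany(
    "INSERT INTO moz_bookmarks_folders VALUES (?, ?)",
    [(1, "Bookmarks Menu"), (2, "Personal Toolbar")],
)
conn.executemany(
    "INSERT INTO moz_bookmarks_roots VALUES (?, ?)",
    [("menu", 1), ("toolbar", 2)],
)
# A "root" can sit anywhere in the hierarchy: here the toolbar folder
# is a child of the menu folder.
conn.execute("INSERT INTO moz_bookmarks (parent, folder_child) VALUES (1, 2)")


def folder_for_root(meaning):
    """Return (folder id, folder name) for a predetermined meaning."""
    return conn.execute(
        """SELECT f.id, f.name
           FROM moz_bookmarks_roots AS r
           JOIN moz_bookmarks_folders AS f ON f.id = r.folder_id
           WHERE r.root_name = ?""",
        (meaning,),
    ).fetchone()


def next_folder_id():
    """Derive a fresh folder id from the list of existing folders."""
    (next_id,) = conn.execute(
        "SELECT COALESCE(MAX(id), 0) + 1 FROM moz_bookmarks_folders"
    ).fetchone()
    return next_id


toolbar = folder_for_root("toolbar")
toolbar_parent = conn.execute(
    "SELECT parent FROM moz_bookmarks WHERE folder_child = ?", (toolbar[0],)
).fetchone()[0]
print(toolbar, toolbar_parent, next_folder_id())
```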

Brett

Mike Shaver

Apr 24, 2006, 8:28:59 AM
to Brett Wilson, dev-apps...@lists.mozilla.org
On 4/23/06, Brett Wilson <bre...@gmail.com> wrote:
> The previous implementation with URL identity that was problematic
> restricted you to one bookmark per URL. We support multiple locations
> per URL, which is much less problematic.

If we can have multiple bookmarks associated with a single URL, how
can bookmarks be uniquely identified by URL?

> The only time I have heard of
> people noticing this is when they import bookmarks that have more than
> one title for a URL.

People also sometimes import bookmarks from multiple sources, and can
therefore end up with multiple entries that are identical, but
distinct. (I've had multiple bookmarks that were identical except for
the keyword in the past, as well.)

> The design is this way because it was originally designed to be a
> tagging-style feature. Whether this changes depends on what happens with
> respect to tagging. There is still some desire to do this, and with the
> current design it is easy. Even if we never do anything with tagging,
> I'm currently skeptical it would be worth the significant effort
> required to change the API, implementation, and all callers. It also
> poses some problems of its own.

Could you elaborate on those problems? The implications of this
system for reliable synchronization are important, and given that I
don't think anyone has yet completed the API review that Ben asked for
earlier, it seems unwise to be freezing the whole system. This API
and data model will to a large degree determine what we can support
from extensions and future enhancements to the core
bookmarking/history capabilities, and I think it behooves us to spend
some time tuning it while all the callers are still in our tree.

With assistance from people like Todd, who have built significant user
value on top of our bookmarks system in the past, I think we can
develop some good developer use cases against which to validate (and,
I hope, minimize) the APIs we expose.

> ...some good suggestions (omitted): If we change a table, we should keep
> in mind that the names should be made more consistent. Writing the
> migration code is annoying, so I doubt we would do it unless it had to
> be done for another reason.

Given that we're now slipping Places out to Firefox 3, I think we
would do well to take the time to do a more comprehensive API review,
and fix these names. I think there's less pressure to do migration
between different prerelease Places data models at this point too, and
doubly so once import/export comes online.

> > 3) Is there a good reason to represent the bookmark/folder hierarchy as
> > three tables, moz_bookmarks_roots, moz_bookmarks_folders, and
> > moz_bookmarks?
>
> *_roots is just a list of which folders have predetermined meanings
> (menu, toolbar, etc.) so we know which ID to associate with each
> meaning. They can appear anywhere in the hierarchy.

Perhaps a name that doesn't imply that the folders won't have parents?
moz_bookmarks_special_folders?

Mike

Daniel Brooks

Apr 24, 2006, 8:59:57 AM
"Mike Shaver" <mike....@gmail.com> writes:

> On 4/23/06, Brett Wilson <bre...@gmail.com> wrote:
>> The previous implementation with URL identity that was problematic
>> restricted you to one bookmark per URL. We support multiple locations
>> per URL, which is much less problematic.
>
> If we can have multiple bookmarks associated with a single URL, how
> can bookmarks be uniquely identified by URL?
>

Just because the back end uses the url as a key doesn't mean the url can't show up in multiple places in the ui and multiple folders. They're called 'folders' in the ui, but really they're more like tags shown in a tree view. Instead of imagining a directory of files called 'Foo', imagine a collection of files, some of which are tagged with the string 'Foo'.

db48x

Mike Shaver

Apr 24, 2006, 9:09:56 AM
to Daniel Brooks, dev-apps...@lists.mozilla.org

I understand the model (I think), I just don't understand how I can
UNIQUELY identify a bookmark by URL given that model, which is what I
thought Brett was saying. A bookmark isn't just a target location,
it's also metadata like title, microsummary choice, keyword,
description, folder location(s), etc.

Mike

Daniel Brooks

Apr 24, 2006, 9:32:13 AM
"Mike Shaver" <mike....@gmail.com> writes:

Oops, meant to send this to the group.

Do you need several bookmarks pointing to the same location but with different descriptions? What problem does that solve for you?

It's easy enough to make a single bookmark show up in multiple folders, or to have multiple keywords. Probably the most straightforward way to implement it would be to have a second table with two fields, url and folder (or url and keyword in the second case). The url field references the history/bookmarks data, the folder references the list of folders the user has created. Then it's a simple join.
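
The two-field table and the join can be sketched roughly like this with Python's sqlite3 module. All table and column names here (pages, folders, page_folders) and the example URL are hypothetical; the thread does not name them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pages (url TEXT PRIMARY KEY, title TEXT);
    CREATE TABLE folders (name TEXT PRIMARY KEY);
    -- The "second table with two fields": one row per (url, folder) pairing.
    CREATE TABLE page_folders (
        url    TEXT REFERENCES pages(url),
        folder TEXT REFERENCES folders(name),
        PRIMARY KEY (url, folder)
    );
""")
conn.execute("INSERT INTO pages VALUES ('http://mail.example.com/', 'Example Mail')")
conn.executemany("INSERT INTO folders VALUES (?)", [("Toolbar",), ("Webmail",)])
conn.executemany(
    "INSERT INTO page_folders VALUES (?, ?)",
    [
        ("http://mail.example.com/", "Toolbar"),
        ("http://mail.example.com/", "Webmail"),
    ],
)


def titles_in(folder):
    """List the titles of the pages filed under one folder, via a join."""
    return [
        title
        for (title,) in conn.execute(
            """SELECT p.title
               FROM page_folders AS pf
               JOIN pages AS p ON p.url = pf.url
               WHERE pf.folder = ?
               ORDER BY p.title""",
            (folder,),
        )
    ]


print(titles_in("Toolbar"), titles_in("Webmail"))
```

The single pages row appears under both folders, and its title can only have one value at a time.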

db48x

Mike Shaver

Apr 24, 2006, 9:53:05 AM
to Daniel Brooks, dev-apps...@lists.mozilla.org
On 4/24/06, Daniel Brooks <db...@yahoo.com> wrote:
> Do you need several bookmarks pointing to the same location but with different descriptions? What problem does that solve for you?

As I said in the email to which you originally replied, I have had
multiple bookmarks with the same URL and different keywords, as well
as bookmarks that were identical in all respects but their folder
location due to importing from another bookmark file.

The most common cases for that are multiple importing from IE, or
getting the "Getting Started" and "Latest Headlines" default bookmarks
pulled in repeatedly. Sync operations can result in duplicates as
well, where the algorithm can't be sure that it's safe to coalesce
them and needs to avoid losing user data.

People use these characteristics *today*, as with Foxmarks, to
preserve the integrity of synchronization and gracefully handle
conflict cases.


> It's easy enough to make a single bookmark show up in multiple folders, or to have multiple keywords. Probably the most straightforward way to implement it would be to have a second table with two fields, url and folder (or url and keyword in the second case). The url field references the history/bookmarks data, the folder references the list of folders the user has created. Then it's a simple join.

Then you need to add a column in the table for every field that
"might" be distinct in different bookmarks, and a bookmark might
differ from another only in one of _any_ of the fields as the result
of a sync or import operation. Or only by position in the hierarchy
(where there is one at all -- given the merger with history, there can
be entries here that are not in the hierarchy...yet).

But those bookmarks need to have distinct identities, so that
resolution of conflicting edits to the different 'facets' can be
performed deterministically, so I really do think we're back where we
started.

What was the goal of moving to this model? Brett alluded to some
problems with the unique-bookmark-ID model, the big tease, so I hope
he'll clue me in when he gets to this thread later.

Mike

Daniel Brooks

Apr 24, 2006, 10:14:57 AM
"Mike Shaver" <mike....@gmail.com> writes:

> On 4/24/06, Daniel Brooks <db...@yahoo.com> wrote:
>> Do you need several bookmarks pointing to the same location but with different descriptions? What problem does that solve for you?
>
> As I said in the email to which you originally replied, I have had
> multiple bookmarks with the same URL and different keywords, as well
> as bookmarks that were identical in all respects but their folder
> location due to importing from another bookmark file.
>
> The most common cases for that are multiple importing from IE, or
> getting the "Getting Started" and "Latest Headlines" default bookmarks
> pulled in repeatedly. Sync operations can result in duplicates as
> well, where the algorithm can't be sure that it's safe to coalesce
> them and needs to avoid losing user data.
>
> People use these characteristics *today*, as with Foxmarks, to
> preserve the integrity of synchronization and gracefully handle
> conflict cases.
>

Sure. People end up with several bookmarks that point to the same location but have different titles, but isn't that just a bug?


>
>> It's easy enough to make a single bookmark show up in multiple folders, or to have multiple keywords. Probably the most straightforward way to implement it would be to have a second table with two fields, url and folder (or url and keyword in the second case). The url field references the history/bookmarks data, the folder references the list of folders the user has created. Then it's a simple join.
>
> Then you need to add a column in the table for every field that
> "might" be distinct in different bookmarks, and a bookmark might
> differ from another only in one of _any_ of the fields as the result
> of a sync or import operation. Or only by position in the hierarchy
> (where there is one at all -- given the merger with history, there can
> be entries here that are not in the hierarchy...yet).
>
> But those bookmarks need to have distinct identities, so that
> resolution of conflicting edits to the different 'facets' can be
> performed deterministically, so I really do think we're back where we
> started.
>
> What was the goal of moving to this model? Brett alluded to some
> problems with the unique-bookmark-ID model, the big tease, so I hope
> he'll clue me in when he gets to this thread later.
>
> Mike

I think you're making a mountain out of a molehill here. Instead of having two bookmarks pointing to the same location but in different folders, you have one bookmark that's in both folders. It's not two separate entities inside of two folders, it's one entity with two tags applied to it. That way if you want to change the title you only have to do it once and it shows up in both spots. I also fail to see how having a bookmark that's not in any folders is a problem.

All of this feels really natural to me, but maybe that's a result of my twisted upbringing. I used to do a lot of programming on LambdaMOO, which is like a MUD. Each object on the server has both a location property and a contents property. Now, some objects represent the players themselves, some represent the rooms they wander around in, some represent the objects they find in those rooms and interact with. If a user has wandered into some room, then his location property will be equal to the id of that room. Likewise, that room's contents list will have the player's object number in it. However, because all objects have a location property, even the rooms can be located inside of things. Because all objects have a contents property, things that aren't normally considered containers can actually contain things. Players can pick things up and carry them around, rooms can contain other rooms, rooms can be contained by players or other objects, etc. Also, any object's location could potentially be #-1, which represents the Void, or the lack of an object. Most rooms get placed in the Void because it's weird to walk through a door and find a room full of other rooms.

Anyway, all that aside, I really don't think it's necessary to have one bookmark with multiple titles, or several bookmarks pointing to the same url but with different titles. Sure, conflict resolution is hard, and no matter what you have to punt to the user at times. Whether you punt by making two separate bookmarks with the same url (like we do today), leaving the user to clean up the mess, or by giving the user a list of problems to fix is an implementation detail.

Also, how does having one bookmark with two keywords function differently than two bookmarks pointing at the same url, each with a different keyword? In either case you can activate it with either keyword.

db48x

Mike Shaver

Apr 24, 2006, 10:46:55 AM
to Daniel Brooks, dev-apps...@lists.mozilla.org
On 4/24/06, Daniel Brooks <db...@yahoo.com> wrote:
> Sure. People end up with several bookmarks that point to the same location but have different titles, but isn't that just a bug?

You have a bookmark to http://slashdot.org with the title "slashdot".

I have a bookmark to http://slashdot.org with the title "./".

We share our bookmarks with each other, by export/import or a shared
Foxmarks account or some other more sophisticated mechanism. I now
have

Bookmarks ->
  Daniel's Bookmarks -> "slashdot"=http://slashdot.org
  "./"=http://slashdot.org

Where's the bug? What would be different if the bug were fixed?

(I used to admin a MOO too, but I don't see the relevance to the issue
at hand. Of course "location" and "contents" are different...)

Even if the only thing that differs about two bookmarks is their
location in the hierarchy, and we optimize by having that be two
references to the same bookmark, I maintain that we need to hide that
optimization from the API consumers. Otherwise, there's no
deterministic way to know what the model will be when the bookmarks
diverge via copy-on-write or some similar thing.

There will possibly be cases in which a single bookmark wants to
appear in multiple places in the hierarchy, and maybe that's a common
enough use case (the "ln" case) to warrant some exposure in an API for
alias manipulation. But I think the "cp" case where the "contents" of
the bookmarks/history/etc. entry are the same is a very important case
to support well, and more common than "ln". (I would, for example,
expect to be able to Edit->Copy bookmarks in whatever manager UI there
is, but I wouldn't be surprised if there was no Edit->Make Alias entry
to match.)

Mike

Daniel Brooks

Apr 24, 2006, 11:42:14 AM
"Mike Shaver" <mike....@gmail.com> writes:

> On 4/24/06, Daniel Brooks <db...@yahoo.com> wrote:
>> Sure. People end up with several bookmarks that point to the same location but have different titles, but isn't that just a bug?
>
> You have a bookmark to http://slashdot.org with the title "slashdot".
>
> I have a bookmark to http://slashdot.org with the title "./".
>
> We share our bookmarks with each other, by export/import or a shared
> Foxmarks account or some other more sophisticated mechanism. I now
> have
>
> Bookmarks ->
> Daniel's Bookmarks -> "slashdot"=http://slashdot.org
> "./"=http://slashdot.org
>
> Where's the bug? What would be different if the bug were fixed?
>

I'm just saying that I don't think that's the best way to do it. Wouldn't it be better to unify the bookmarks, asking the user to resolve the conflict? That's what my address book program does, and I love it because at any given time the address database is consistent. No need to go back later and figure out which records are actually duplicates. If I've added you to my address book but listed your name as just Mike, and I read a message from you where you specify your name to be Mike Shaver, it'll ask me if I want to update my records to use the name you specify, keeping what I specified as an alias. Similarly if it sees a message from "Mike Shaver" <mi...@somewhereelse.com>, it'll ask if both addresses refer to the same person. Something similar could easily be done when you merge my bookmarks with yours, and vice versa.

> (I used to admin a MOO too, but I don't see the relevance to the issue
> at hand. Of course "location" and "contents" are different...)
>
> Even if the only thing that differs about two bookmarks is their
> location in the hierarchy, and we optimize by having that be two
> references to the same bookmark, I maintain that we need to hide that
> optimization from the API consumers. Otherwise, there's no
> deterministic way to know what the model will be when the bookmarks
> diverge via copy-on-write or some similar thing.
>

If the API hides that fact then the back end no longer gets any benefit from it because it has to store some information to make them unique again. Still, I'm not saying I'd like this feature because it's a nice optimization of the bookmarks storage code.

> There will possibly be cases in which a single bookmark wants to
> appear in multiple places in the hierarchy, and maybe that's a common
> enough use case (the "ln" case) to warrant some exposure in an API for
> alias manipulation. But I think the "cp" case where the "contents" of
> the bookmarks/history/etc. entry are the same is a very important case
> to support well, and more common than "ln". (I would, for example,
> expect to be able to Edit->Copy bookmarks in whatever manager UI there
> is, but I wouldn't be surprised if there was no Edit->Make Alias entry
> to match.)
>
> Mike

Yea, I thought you might say that. Yes, people expect bookmarks to work a certain way. Personally, I can think of a number of ways to make my bookmarks list better for me. All of the bookmarks with keywords on them I'd move out of any folders so that they don't show up in the bookmarks menu. They're just clutter there because to use them you have to type them into the urlbar. In a couple of places I want bookmarks to show up in two folders, and therefore I currently use two separate bookmarks. If/when one of those sites changes addresses or reorganizes or whatever, I'll have to change twice as many urls as I would otherwise, etc. Just because bookmarks have always worked one particular way doesn't mean that they always have to work that way. (Of course, if too many users fail to understand the change, and think it's a bug then the feature is just doomed)

db48x

tod...@gmail.com

Apr 24, 2006, 12:48:36 PM
My poor brain. I had enough trouble internalizing the structures
involved here that I had to draw a picture
(http://www.foxcloud.com/Places/ for those who missed this upthread).
And now we're talking about end-user behavior and LambdaMOO. Hang on
while I strap on my mental whiplash collar.

Okay, here are two cases to consider. I don't think these qualify as
molehill -> mountain candidates. Do you?

I've got a gmail account, and I've got it bookmarked in two places:
once on the toolbar and once in my "Webmail" folder deep inside my
bookmark hierarchy. On the toolbar, I want to display no name, as I'm
using it as a kind of Quick Launch; in the folder, I want the full
name. The current model doesn't allow me to do this.

That's a static case. More twisted is the dynamic case: suppose I have
two urls, A & B. They're each bookmarked in a few places; let's call
those bookmarks A1, A2, A3, and B1, B2, B3.

What happens when I do something to one of these bookmarks? Does the
model support the desired/expected behavior or thwart it?

If I delete A1, presumably nothing happens to A2 and A3. As a user, I
think that's good.

If I change A1's name, A2 and A3 change, too. I think that's bad, as
the model has now forced me to understand the sense in which these
entities are linked -- and the answer is that they're linked sometimes,
but not always. Deleting one doesn't delete the other, but changing
one's name does change the other. Shudder.

But it gets worse: If I edit A1 and change the url to be
(coincidentally) that of B's url, then A1 is now, in reality, B4.
What's B4's display name? Either it's whatever name was associated with
A (in which case all of the B's acquire A's name; bad) or it acquires
B's name, but that's incomprehensible, too, as all I wanted to do was
change the url, not the name.

Either way, the user is left scratching his head trying to figure out
what just happened. With this model, you won't want to support a simple
dialog box that allows the user to edit the name & url of a bookmark
simultaneously, else you'll be sending users directly into these
shark-infested waters.
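
The A/B scenario above can be replayed against a toy model in which the title belongs to the URL rather than to the individual bookmark. This is a sketch of the behaviour being argued about, not of the Places code: the class, its method names, the keep_title switch and the example URLs are all hypothetical, and the labels A1..B3 stand in for folder placements.

```python
class UrlKeyedBookmarks:
    """Toy model: one title per URL, any number of placements per URL."""

    def __init__(self):
        self.titles = {}      # url -> title, shared by every placement of that url
        self.placements = {}  # placement label -> url

    def add(self, label, url, title):
        self.titles[url] = title
        self.placements[label] = url

    def title_of(self, label):
        return self.titles[self.placements[label]]

    def rename(self, label, new_title):
        # The title hangs off the URL, so every placement sees the change.
        self.titles[self.placements[label]] = new_title

    def delete(self, label):
        # Only the placement goes away; the URL and its title remain.
        del self.placements[label]

    def retarget(self, label, new_url, keep_title):
        # Point a placement at another URL. Either the old title is carried
        # over (overwriting the target's title) or the target's title wins.
        old_title = self.title_of(label)
        self.placements[label] = new_url
        if keep_title or new_url not in self.titles:
            self.titles[new_url] = old_title


URL_A, URL_B = "http://a.example/", "http://b.example/"
store = UrlKeyedBookmarks()
for label in ("A1", "A2", "A3"):
    store.add(label, URL_A, "Site A")
for label in ("B1", "B2", "B3"):
    store.add(label, URL_B, "Site B")

store.rename("A1", "Renamed A")
after_rename = store.title_of("A2")      # A2 was renamed along with A1

store.delete("A3")
after_delete = sorted(store.placements)  # A1 and A2 are untouched

store.retarget("A1", URL_B, keep_title=False)
a1_as_b4 = store.title_of("A1")          # A1 now shows B's title

store.retarget("A2", URL_B, keep_title=True)
b1_after = store.title_of("B1")          # every B now shows A's title

print(after_rename, after_delete, a1_as_b4, b1_after)
```

Neither value of keep_title gives the user what was asked for, which was to change one bookmark's url and nothing else.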

Peter Lairo

Apr 24, 2006, 3:10:09 PM
tod...@gmail.com said on 24.4.2006 18:48:

> Okay, here are two cases to consider. I don't think these qualify as
> molehill -> mountain candidates. Do you?
>
> <snip *excellent* reasoning why the "new" bookmarks model should die>

Bravo! Finally, a reasoned voice of concern.

I've been having a really bad feeling about this whole new model (and
the current Places UI too, BTW) and just didn't have time to articulate
it. You've done it better than I could anyhow. So, thank you!

One bookmark = one ID
(not one URL = one ID)

That's how the *user* sees it.
--
Regards,

Peter Lairo

The browser you can trust: www.GetFirefox.com
Reclaim Your Inbox: www.GetThunderbird.com

tod...@gmail.com

Apr 24, 2006, 6:00:20 PM
As promised, I'm continuing here with a list of things that used to be
part of the old bookmarks system but are now missing in Places. For
simplicity, I'm going to refer to the old system (i.e., what's
implemented in FF1.5) as the RDF system.

I'm coming at this from the perspective of synchronization, where I
have some direct experience, but the limitations discussed here
undoubtedly would impact people trying to implement any number of
extensions with different purposes.

- Working with the RDF system is kind of like stepping into the ring
with a circus bear: it's ungainly; it usually can be cajoled into doing
what you want; and you never know when you're going to get hurt.
Nonetheless, the RDF system provides a useful backstop, for if the
first-class API's don't do what you need, you still can (more or less)
have your way with the RDF datastore directly. That's gone in Places,
which means the API's need to provide really rich access to what's
underneath. Currently, that's not the case. If the API is missing
something you need, you'll have to implement it in C++, with all that
entails (not being cross-platform being the primary problem).

- Implementing synchronization cleanly depends on having strong id's
for every entity that you want to sync. By strong, I mean that the id
has the following properties:

1) it's created at the time that the entity it identifies is created,
and persists for the entire duration of the entity's lifecycle

2) it's relatively unique, which in this case means a user with a
handful of machines is unlikely to generate duplicate id's for
different entities on those machines

3) it's not visible/accessible/editable to the end user in any fashion,
but it is accessible (both read & write) programmatically

The RDF system, conveniently, provides strong id's for all entity types
except for separators (for which condition [1] above doesn't hold).
Strong id's allow you to apply the changes a user makes on one machine
to the same set of resources on another machine without constantly
asking the user to help you resolve issues of identity.

The Places data model, as currently implemented, provides a variety of
id's, depending on datatype (bookmark, folder, or separator), but
unfortunately none of them are strong id's. Folder id's violate
property [2], bookmark id's (urls) violate property [3], and the
oft-neglected separator, in fact, has no id at all. For this reason,
Places isn't well suited to support synchronization.

- The RDF system allows separators to have names. I'm not advocating
that this feature be maintained -- I'm sure it's little known and less
used -- but I wanted to point out that the Places data model no longer
supports it; it's not clear that the omission was intentional.

- The RDF system maintains both a create date and a last modified date
for each entity. These have both been dropped in Places. The last
modified date is particularly useful as it allows a client to ask the
datastore, "Who's changed since we last synchronized?" Its absence is
another substantial impediment to implementing good synchronization for
Places.
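
To make the strong-id and last-modified points concrete, here is a minimal sketch using Python's sqlite3 and uuid modules. It is not a description of either the RDF system or Places: the items table, its columns, the helper functions and the integer timestamps are all assumptions chosen to keep the example small and deterministic.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE items (
        guid          TEXT PRIMARY KEY,  -- assigned once, at creation
        kind          TEXT NOT NULL,     -- 'bookmark', 'folder' or 'separator'
        name          TEXT,
        created       INTEGER NOT NULL,
        last_modified INTEGER NOT NULL
    )
""")


def create_item(kind, name, now):
    """Property 1: the id is minted when the entity is created."""
    guid = str(uuid.uuid4())  # property 2: collisions across machines are unlikely
    conn.execute(
        "INSERT INTO items VALUES (?, ?, ?, ?, ?)", (guid, kind, name, now, now)
    )
    return guid


def rename_item(guid, name, now):
    """A user edit bumps last_modified but never touches the guid."""
    conn.execute(
        "UPDATE items SET name = ?, last_modified = ? WHERE guid = ?",
        (name, now, guid),
    )


def changed_since(last_sync):
    """Answer "who's changed since we last synchronized?"."""
    return [
        guid
        for (guid,) in conn.execute(
            "SELECT guid FROM items WHERE last_modified > ? ORDER BY last_modified",
            (last_sync,),
        )
    ]


folder = create_item("folder", "Webmail", now=100)
separator = create_item("separator", None, now=100)  # separators get ids too
bookmark = create_item("bookmark", "Gmail", now=100)
rename_item(bookmark, "Mail", now=250)

print(changed_since(200) == [bookmark])
```

Property 3 is a matter of UI rather than storage: the guid is readable and writable through code like the above, but would never be shown to the end user.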

Finally, here are a couple of things that neither the RDF system nor
Places support, but that would be nice to have:

- I mentioned that the RDF system supports the notion of a last
modified date, which is true, but there are definitely some rough edges
in that support. A Livemark, for instance, updates its last modified
date each time it updates its contents. It'd be better if the Livemark
maintained that bit of information elsewhere (say a lastUpdated field)
and only altered the last modified field when the user induced some
kind of change (like altering its name or feedurl).

- Similarly, tweaking the semantics of creation date would yield some
positive results: copying and pasting a bookmark currently results in
the creation of a new entity with a new id, but with the old creation
date. Likewise, importing bookmarks causes new id's to be created for
the imported resources, but the original creation date is preserved.
Tying creation date to the act of creating a new id would square away
some otherwise thorny issues.

- And finally, a controversial suggestion: it would be cool if the
bookmarking system implemented tombstoning, where deleted resources
were marked as such but not actually removed from the datastore. This
would provide to a synchronizer a positive assertion that a resource
should be deep-sixed, rather than relying on inference by observing
that the resource in question has gone missing.
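The tombstoning idea can be sketched in a few lines of Python with sqlite3. This is only an illustration under stated assumptions: the `items` table, the `deleted` flag, and the three helper functions are hypothetical and do not correspond to anything in the RDF system or in Places.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; the "deleted" column is the tombstone marker.
conn.execute(
    "CREATE TABLE items ("
    " id INTEGER PRIMARY KEY,"
    " title TEXT,"
    " last_modified INTEGER,"
    " deleted INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO items (id, title, last_modified) VALUES (?, ?, ?)",
    [(1, "mozilla.org", 100), (2, "Weather", 100)],
)

def tombstone(conn, item_id, now):
    """Mark an item as deleted instead of removing its row."""
    conn.execute(
        "UPDATE items SET deleted = 1, last_modified = ? WHERE id = ?",
        (now, item_id),
    )

def visible_items(conn):
    """What the UI would show: every item that is not tombstoned."""
    rows = conn.execute("SELECT id FROM items WHERE deleted = 0 ORDER BY id")
    return [row[0] for row in rows]

def deletions_since(conn, last_sync):
    """What a synchronizer would read: explicit deletions since last sync."""
    rows = conn.execute(
        "SELECT id FROM items"
        " WHERE deleted = 1 AND last_modified > ? ORDER BY id",
        (last_sync,),
    )
    return [row[0] for row in rows]

tombstone(conn, 2, now=300)
print(visible_items(conn))
print(deletions_since(conn, 200))
```

The synchronizer gets a positive record of the deletion from `deletions_since`, while the UI keeps hiding the row through `visible_items`.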

-Todd

Justin Wood (Callek)

Apr 24, 2006, 7:53:54 PM
Peter Lairo wrote:
> tod...@gmail.com said on 24.4.2006 18:48:
>> Okay, here are two cases to consider. I don't think these qualify as
>> molehill -> mountain candidates. Do you?
>>
>> <snip *excellent* reasoning why the "new" bookmarks model should die>
>
> Bravo! Finally, a reasoned voice of concern.
>
> I've been having a really bad feeling about this whole new model (and
> the current Places UI too, BTW) and just didn't have time to articulate
> it. You've done it better than I could anyhow. So, thank you!
>
> One bookmark = one ID
> (not one URL = one ID)
>
> That's how the *user* sees it.

It would be easy to add a checkbox (unchecked by default), shown when
changing a bookmark url, that reads "...and change all occurrences of
the original url to this one in my bookmarks" (or some more abridged
wording), to accommodate some of Daniel's issues.

~Justin Wood (callek)

Peter Lairo

Apr 25, 2006, 3:44:37 PM
Justin Wood (Callek) said on 25.4.2006 01:53:

1. It would be UI for a feature few would need (or want).

2. It would confuse most users ("WTF 'other' bookmarks is Firefox going
to change on me?")

Myk Melez

Apr 26, 2006, 4:30:26 AM
to tod...@gmail.com
tod...@gmail.com wrote:

> 1) Best practice suggests that the name for a table's primary key
> should be derived from the name of the table itself. Naming a primary
> key "id" isn't really helpful, and leads to a proliferation of names
> when that primary key is referenced elsewhere to support joins. The
> primary key for table moz_history is "id", but is variously referenced
> in other tables as "page", "page_id", and "item_child." Call it
> moz_history_id everywhere, and you've just eliminated the need to
> document a handful of relationships among tables because the name makes
> the connection obvious.

I agree that we should name foreign keys consistently, but I think
calling all primary keys "id" has the advantage of allowing simpler,
less redundant queries, since you only have to include the name of the
table when joining across other tables with "id" primary keys.

In other words, given these tables (in pseudo SQL DDL):

CREATE TABLE foo {
id INT PRIMARY KEY,
...
}

CREATE TABLE bar {
id INT PRIMARY KEY,
FOREIGN KEY (foo_id) REFERENCES foo,
...
}

CREATE TABLE baz {
FOREIGN KEY (foo_id) REFERENCES foo,
...
}

Your queries would look like this:

SELECT id FROM foo WHERE ...
SELECT foo.id FROM foo JOIN bar ON foo.id = bar.foo_id WHERE ...
SELECT id FROM foo JOIN baz ON foo.id = baz.foo_id WHERE ...

If you derive primary key names from their table names, on the other
hand, your queries would all bear the naming redundancy cost, i.e.:

SELECT foo_id FROM foo WHERE ...
SELECT foo.foo_id FROM foo JOIN bar ON foo.foo_id = bar.foo_id WHERE ...
SELECT foo.foo_id FROM foo JOIN baz ON foo.foo_id = baz.foo_id WHERE ...
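
For anyone who wants to try the "id" scheme, here is a small runnable sketch using Python's sqlite3 module. It assumes the pseudo DDL above is rewritten with parentheses and an explicit foo_id column; the sample rows and the `ids` helper are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE foo (id INTEGER PRIMARY KEY);
    CREATE TABLE bar (
        id INTEGER PRIMARY KEY,
        foo_id INTEGER,
        FOREIGN KEY (foo_id) REFERENCES foo
    );
    CREATE TABLE baz (
        foo_id INTEGER,
        FOREIGN KEY (foo_id) REFERENCES foo
    );
    INSERT INTO foo (id) VALUES (1);
    INSERT INTO foo (id) VALUES (2);
    INSERT INTO bar (id, foo_id) VALUES (10, 1);
    INSERT INTO baz (foo_id) VALUES (2);
""")

def ids(sql):
    """Run a query and return the first column of every row."""
    return [row[0] for row in conn.execute(sql)]

# Single table: a bare "id" is enough.
only_foo = ids("SELECT id FROM foo ORDER BY id")

# bar also has an "id" column, so the table name is required here.
with_bar = ids("SELECT foo.id FROM foo JOIN bar ON foo.id = bar.foo_id")

# baz has no "id" column, so a bare "id" still resolves to foo.id.
with_baz = ids("SELECT id FROM foo JOIN baz ON foo.id = baz.foo_id")

# Leaving out the table name in the foo/bar join is rejected by SQLite.
try:
    ids("SELECT id FROM foo JOIN bar ON foo.id = bar.foo_id")
    ambiguous = False
except sqlite3.OperationalError:
    ambiguous = True

print(only_foo, with_bar, with_baz, ambiguous)
```

The table name is needed only where both joined tables carry an "id" column, which is the trade-off described above.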


> 2) moz_history is clearly the star of the show; it contains an entry
> for every url I've ever visited. And the data model suggests that some
> subset of those url's might be bookmarked in various ways. But what
> about bookmarks for url's that I haven't yet visited? Presuming that I
> can enter an arbitrary url as a bookmark (or edit the url of a bookmark
> I've already created), I think that means that moz_history contains
> some urls that I've never visited. Which suggests to me that
> moz_history is the wrong name for this table of urls. How 'bout
> moz_urls instead? Again, if the name points to the function, you have
> less to document.

Agreed. In fact, I'd go even farther and say that the table should be
called moz_places, since its rows represent the central concept behind
the new architecture--the "place"--and they may well contain not only
history and bookmarks but also other URLs with which the user has some
different kind of relationship.

-myk

tod...@gmail.com

Apr 26, 2006, 1:15:13 PM
Myk Melez wrote:
>
> I agree that we should name foreign keys consistently, but I think
> calling all primary keys "id" has the advantage of allowing simpler,
> less redundant queries, since you only have to include the name of the
> table when joining across other tables with "id" primary keys.
>
> In other words, given these tables (in pseudo SQL DDL):
>
> CREATE TABLE foo {
> id INT PRIMARY KEY,
> ...
> }
>
> CREATE TABLE bar {
> id INT PRIMARY KEY,
> FOREIGN KEY (foo_id) REFERENCES foo,
> ...
> }
>
> CREATE TABLE baz {
> FOREIGN KEY (foo_id) REFERENCES foo,
> ...
> }
>

+1 on the simplification of queries. I'll note that SQLite at present
doesn't enforce the foreign key reference
(http://www.sqlite.org/omitted.html), but at least it can parse it
without choking. The goal of this exercise was to make the
relationships more obvious, and your approach does just that.

(as an aside, I'm not really familiar with the syntax, but shouldn't it
be:

FOREIGN KEY (foo_id) references foo(id)

)
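
The "parses but does not enforce" behavior can be reproduced from Python's sqlite3 module. This is a sketch that assumes a SQLite build of 3.6.19 or later, where enforcement exists but is switched per connection with PRAGMA foreign_keys; with the pragma off, the declaration is accepted and ignored, as described above. The `orphan_insert_allowed` helper is a name made up for this example.

```python
import sqlite3

def orphan_insert_allowed(enforce):
    """Try to insert a bar row whose foo_id matches no row in foo."""
    conn = sqlite3.connect(":memory:")
    # Set the pragma explicitly so the result does not depend on build defaults.
    conn.execute("PRAGMA foreign_keys = " + ("ON" if enforce else "OFF"))
    conn.execute("CREATE TABLE foo (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE bar ("
        " id INTEGER PRIMARY KEY,"
        " foo_id INTEGER,"
        " FOREIGN KEY (foo_id) REFERENCES foo(id))"
    )
    try:
        conn.execute("INSERT INTO bar (id, foo_id) VALUES (1, 999)")
        return True
    except sqlite3.IntegrityError:
        return False
    finally:
        conn.close()

print(orphan_insert_allowed(enforce=False))
print(orphan_insert_allowed(enforce=True))
```

Either way, the declaration still documents the relationship between the tables, which was the goal of the naming exercise.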

>
> Agreed. In fact, I'd go even farther and say that the table should be
> called moz_places, since its rows represent the central concept behind
> the new architecture--the "place"--and they may well contain not only
> history and bookmarks but also other URLs with which the user has some
> different kind of relationship.
>

Yes! This is a vast improvement over moz_history in terms of accurately
reflecting the table's contents and now also makes its central role
more obvious.

Thanks for the good suggestions.

Myk Melez

Apr 26, 2006, 3:26:12 PM
to tod...@gmail.com
tod...@gmail.com wrote:

> (as an aside, I'm not really familiar with the syntax, but shouldn't it
> be:
>
> FOREIGN KEY (foo_id) references foo(id)

Yes, you're right, it should be this syntax.


> Thanks for the good suggestions.

Another issue that tends to come up when designing a database schema is
whether to give tables singular or plural names (i.e. moz_place vs.
moz_places). I've tended to give them plural names, but it's worth
considering the argument for singular names, f.e. on this site:

http://justinsomnia.org/writings/naming_conventions.html

(FWIW, that site also advocates "<table>_id" primary keys. In fact, it
suggests that all column names include the name of the table, which
seems silly to me.)

The argument for plural names is here:

http://weblogs.asp.net/jamauss/articles/DatabaseNamingConventions.aspx

Currently the Places code uses a mix of singular names (f.e. moz_anno
and moz_favicon) and plural names (f.e. moz_bookmarks and moz_keywords).
We should probably make these more consistent.

-myk

Nickolay Ponomarev

May 14, 2006, 5:10:45 PM
Todd Agulnick wrote:
> Rather than complain about that lack, I thought it would be more in the
> community spirit to create some doc, and by doing so also gain a better
> understanding of what's going on. The paltry fruits of my efforts can
> be found here: http://www.foxcloud.com/Places/ -- the original is a
> Visio document (yikes) but there's also a PDF.
>
Is it something that should be added to
http://developer.mozilla.org/en/docs/Places:Design ? Is it possible to make a
PNG version?

Nickolay
