A few questions on FleetDB

Tim Robinson

Jul 12, 2010, 9:24:30 PM

to FleetDB

Smooth. Took me under an hour to get FleetDB up and running, both
as a service and embedded (+ I am running Ring embedded too).
Pretty cool. The idea that I have direct access to the data as the app
developer, yet can allow other services to access the data via HTTP, is
pretty powerful, in my mind.

That being said, after playing around I do have a few questions:

* I notice when I add records via the API, I see them show up in the
dbf, but if I delete the records directly from the dbf file and run an
API select query, they are gone. Shouldn't the data structure be
independent of the file? I would have thought it would just log the
occurrences to the dbf, but leave the actual data in memory. Otherwise
it's not really an in-memory db, it's doing a file read every time,
correct? To me, this means that as the file becomes bigger the
performance will degrade, correct? Oddly, if I add a record to the file
without using the API, it does not show up in select queries and the
record is later removed from the file. Is this expected? Normal?

* I see, glancing at the code, you're writing to a tmp file then, upon
completion, re-naming the file to replace the dbf. How atomic is this?
At what point would there be concurrency conflicts? Do you have any
guidance for what kind of projects this kind of a database system
would be good for?

* If durability is achieved via file writes, would it not be best to
leave the data in memory, then write a separate file for each record?
Queries could go against the in-memory data structure to be fast, and
file records would just be overwritten (all small files), not appended.
Reboots could optionally load records (maybe even in a range).
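
[Editor's sketch] The write-to-a-tmp-file-then-rename pattern raised in the second question can be illustrated generically. This is a minimal Python sketch, not FleetDB's code: the helper name atomic_write is hypothetical, and the atomicity guarantee assumes a POSIX filesystem with the temp file in the same directory as the target.

```python
import os
import tempfile


def atomic_write(path, data):
    """Write data to path so readers see the old or the new file, never a partial one."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before swapping files
        os.replace(tmp_path, path)  # atomic replacement of the target on POSIX
    except BaseException:
        # On failure the original file is untouched; only the temp file is removed.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

The rename itself is a single step, so a reader opening the file sees either the complete old contents or the complete new contents. Two writers running this at the same time would still need a lock, since the last rename wins.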

This is all really new to me so I could be missing the knowledge to
know what's better, or even what's going on.
Or maybe I've mucked up implementing it?

Thanks,
Tim

P.S. Allowing values to be vectors or maps would be a great addition.

Mark McGranaghan

Jul 12, 2010, 11:12:10 PM

to fle...@googlegroups.com

On Mon, Jul 12, 2010 at 6:24 PM, Tim Robinson <tim.bl...@gmail.com> wrote:
> Smooth. Took me under an hour to get FleetDB up and running, both
> as a service and embedded (+ I am running Ring embedded too).
> Pretty cool. The idea that I have direct access to the data as the app
> developer, yet can allow other services to access the data via HTTP, is
> pretty powerful, in my mind.

I'm glad you've liked it.

> That being said, after playing around I do have a few questions:
>
> * I notice when I add records via the API, I see them show up in the
> dbf, but if I delete the records directly from the dbf file and run an
> API select query, they are gone. Shouldn't the data structure be
> independent of the file? I would have thought it would just log the
> occurrences to the dbf, but leave the actual data in memory. Otherwise
> it's not really an in-memory db, it's doing a file read every time,
> correct? To me, this means that as the file becomes bigger the
> performance will degrade, correct? Oddly, if I add a record to the file
> without using the API, it does not show up in select queries and the
> record is later removed from the file. Is this expected? Normal?

The behavior of FleetDB is undefined if you modify the .fdb file
yourself from under the database. You can always read it but should
never write it. I would expect that if you deleted a record from the
.fdb file and then reloaded the database then the record would not
appear. I'm not sure what to expect if you edited the file under a
running database; you just shouldn't do that.
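
[Editor's sketch] The "log to the file, keep the data in memory" design Tim describes can be illustrated with a toy store. This is a generic Python sketch under stated assumptions, not FleetDB's implementation: the class name LoggedStore, the one-JSON-entry-per-line log format, and the operation names are all hypothetical. It does not explain the behavior Tim observed, which remains undefined as noted above.

```python
import json
import os


class LoggedStore:
    """Hypothetical in-memory store that replays an append-only log on startup."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.records = {}
        # The file is read only here, at load time.
        if os.path.exists(log_path):
            with open(log_path) as log:
                for line in log:
                    if line.strip():
                        self._apply(json.loads(line))

    def _apply(self, entry):
        op, record_id, value = entry
        if op == "insert":
            self.records[record_id] = value
        elif op == "delete":
            self.records.pop(record_id, None)

    def write(self, op, record_id, value=None):
        entry = [op, record_id, value]
        # Append to the log first for durability, then update memory.
        with open(self.log_path, "a") as log:
            log.write(json.dumps(entry) + "\n")
        self._apply(entry)

    def select(self, record_id):
        # Reads are served from memory and never touch the file.
        return self.records.get(record_id)
```

In a design like this one, query cost does not grow with the size of the file, and hand edits to the file only take effect the next time the log is replayed.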

> * I see, glancing at the code, you're writing to a tmp file then, upon
> completion, re-naming the file to replace the dbf. How atomic is this?
> At what point would there be concurrency conflicts?

> Do you have any
> guidance for what kind of projects this kind of a database system
> would be good for?

FleetDB is suitable for many types of projects. It is specifically
designed for those that need a powerful and flexible data model,
robust concurrency and durability stories, ease of use, and high
performance. Many "OLTP" type applications fit into this category,
though I have found FleetDB useful in other areas as well.

> * If durability is achieved via file writes, would it not be best to
> leave the data in memory, then write a separate file for each record?
> Queries could go against the in-memory data structure to be fast, and
> file records would just be overwritten (all small files), not appended.
> Reboots could optionally load records (maybe even in a range).

FleetDB is designed to handle databases of up to at least 10s of
millions of records; that is too many records for a 1-file-per-record
persistence model.

> This is all really new to me so I could be missing the knowledge to
> know what's better, or even what's going on.
> Or maybe I've mucked up implementing it?

Thanks for sharing your thoughts and asking these questions. Let us
know if you have any others.

> Thanks,
> Tim
>
> P.S. Allowing values to be vectors or maps would be a great addition.

I'm strongly considering this for post 0.2.0. If you have specific use
cases in mind I'd love to hear about them as an aid to designing the
API.

Thanks,
- Mark

Tim Robinson

Jul 13, 2010, 12:10:34 AM

to FleetDB

Thanks for the reply.

> I'm not sure what to expect if you edited the file under a
> running database; you just shouldn't do that.

Lol. I know. I only did so to test it out (to see side effects).
I would never do that in production.

> FleetDB is designed to handle databases of up to at least 10s of
> millions of records; that is too many records for a 1-file-per-record
> persistence model.

Understood.

> If you have specific use cases in mind...

In most cases I can't see needing to query the non-atom data
structures, and I wonder how much of a performance hit that would take.

As an example:

A blog/forum: a user table with a field that contains the sorted ids of
the last 'x' comments made by the user.
The kind of thing where I could get the list then push an item on it:

{"id" "some-person", "most-recent-comments" [ 72 64 56 24 36 16 10 ]}

In this case I would just yank out the list and use it, but not query
it.

The only other value I see is to push JSON data out such that other
libraries/plugins can consume it.
I don't have a spec on hand, but there's a tonne of jQuery plugins
that consume JSON. jqGrid as an example.

Being able to store a document in a close-to-expected format makes for
less, if any at all, data parsing/combining as an intermediate step.
Some plugins can be lightly configured to handle some customization,
but generally they expect nested lists and arrays.

Thanks again for your replies.
Tim

Eric Lavigne

Jul 13, 2010, 7:48:38 AM

to fle...@googlegroups.com

>> P.S. Allowing values to be vectors or maps would be a great addition.
>
> I'm strongly considering this for post 0.2.0. If you have specific use
> cases in mind I'd love to hear about them as an aid to designing the
> API.
>

I expect to store hierarchical data with nested vectors and maps, but
FleetDB is already suitable for this when combined with Clojure's str
and read-string.

What I would find even more interesting is a query language that would
make it easy to extract hierarchical data from a table so that a bunch
of records like {:id 398752 :parent-id 9872367 :content "Node13"}
could be used to build a query result like

{:id 320857 :content "Node1"
 :children [{:id 269364 :parent-id 320857 :content "Node2"}
            {:id 9763655 :parent-id 320857 :content "Node3"
             :children [{:id 67463 :parent-id 9763655 :content "Node4"}]}]}

Recursive queries? Move the query language towards resembling Scheme
more than SQL?
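
[Editor's sketch] Until a query language offers this, the nesting can be done on the client after a plain select. A minimal Python sketch, assuming the flat records have already been fetched as dictionaries with "id", "parent-id", and "content" keys; the function name build_tree is hypothetical and the data is assumed to contain no cycles.

```python
def build_tree(records, root_id):
    """Nest flat records under their parents using the "parent-id" field."""
    by_id = {}
    children_of = {}
    for record in records:
        by_id[record["id"]] = record
        # Group records by parent; roots are grouped under None.
        children_of.setdefault(record.get("parent-id"), []).append(record)

    def expand(record):
        node = dict(record)  # copy so the input records are not modified
        kids = [expand(child) for child in children_of.get(record["id"], [])]
        if kids:
            node["children"] = kids
        return node

    return expand(by_id[root_id])


records = [
    {"id": 320857, "content": "Node1"},
    {"id": 269364, "parent-id": 320857, "content": "Node2"},
    {"id": 9763655, "parent-id": 320857, "content": "Node3"},
    {"id": 67463, "parent-id": 9763655, "content": "Node4"},
]
tree = build_tree(records, 320857)
```

The recursion here is exactly what a recursive query would have to do inside the database; doing it client-side costs one pass to group by parent plus one visit per node.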

Anyway, I already love FleetDB and expect to use it for my next project.

Tim Robinson

Jul 13, 2010, 12:45:43 PM

to FleetDB

Hi Eric,

> I expect to store hierarchical data with nested vectors and maps, but
> FleetDB is already suitable for this when combined with Clojure's str
> and read-string.

Storing a data structure in a string is only suitable if you are the
only consumer and know what to look for.
It may make it suitable for your specific needs, but I don't think
it's suitable for FleetDB, which can act as
a service for multiple consumers. IMHO.

Tim

Tim Robinson

Jul 17, 2010, 12:40:32 PM

to FleetDB

Hi Mark,

As per your request.

I gave it a shot using FleetDB for a blog app I am writing
and I found an example for you today.

Each posting in the blog has a variable set of tags input by a user,
where I jammed them in as
a string because I couldn't put a list in.

For example a posting that's tagged as follows:

{"tags" "Clojure Unix FleetDB"}

However, try to run a query that returns postings having a specific
tag:

(client ["select" "postings"
{"order" ["id" "desc"] "where" ["=" "tags" "Clojure"]}])

Obviously this will not work.
Nor would storing the data structure as a string help.
i.e. :tags "[Clojure Unix FleetDB]"
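
[Editor's sketch] The mismatch is that "=" compares the whole string, while the intent is membership in a set of tags. A minimal Python sketch of the client-side workaround, assuming the postings have already been selected as dictionaries; the helper name postings_with_tag and the sample data are hypothetical.

```python
def postings_with_tag(postings, tag):
    """Return postings whose space-separated "tags" string contains tag as a whole word."""
    return [p for p in postings if tag in p.get("tags", "").split()]


postings = [
    {"id": 1, "tags": "Clojure Unix FleetDB"},
    {"id": 2, "tags": "Unix"},
    {"id": 3, "tags": "ClojureScript"},
]

# An equality test compares the whole string, so it finds nothing.
exact = [p for p in postings if p["tags"] == "Clojure"]

# Splitting first matches the single tag; sort by id, descending, as in the query.
tagged = sorted(postings_with_tag(postings, "Clojure"),
                key=lambda p: p["id"], reverse=True)
```

This works only by fetching every posting and filtering afterwards, which is the cost a multi-valued attribute in the database would avoid.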

Regards,

Mark McGranaghan

Jul 18, 2010, 3:29:31 PM

to fle...@googlegroups.com

Hi Tim,

Thanks for sharing this example. I can definitely see how records with
multi-valued attributes could be better supported in FleetDB.

- Mark

Tim Robinson

Jul 18, 2010, 10:32:04 PM

to FleetDB

No problem.

Thought I'd mention, while I am looking at it, that substring searches
would be really useful too.

(client ["select" "postings"
{"order" ["id" "desc"] "where" ["substring" "tags" "Clojure"]}])

I can see wanting to allow users to search my blog site for general
text snippets. This kind of searching would be much slower, so I'm not
sure if it's in line with your vision of the product.

Thanks for making FleetDB. I look forward to using it more.
Tim

Mark McGranaghan

Jul 18, 2010, 11:06:56 PM

to fle...@googlegroups.com

Hi Tim,

Yeah, full-text search would be a much more ambitious feature (: I've
thought about it before, but it's not at the top of the list right now.
I'll let you know if I start working on it though.

- Mark
