Session management with multiple databases

ed

unread,

Sep 3, 2010, 1:28:33 AM9/3/10

to Squeryl

Currently the standard way to make squeryl to a db is through setting
SessionFactory.concreteFactory with a closure that creates new
Sessions, from which transaction blocks grab all the session they
use. While this is nice for abstracting away db access it also makes
it difficult to work with multiple DB's in one application. If I have
a webapp that needs access to multiple DB's its not really possible to
reset concreteFactory in a concurrent environment, unless I wanted to
lock it down for the duration of the request, which isn't much of an
option. I took a look at the code ended up making my own transaction
block that takes in a Session as a parameter and uses that instead of
SessionFactory.newSession call the regular transaction block does.
This works fine for my purposes now but my real question is 1) is
there currently a better way to handle mutliple session from different
sources? This would also be useful in implementing sharding and
loading balance across master/slave setups 2) if there isn't is there
work being put into making a more robust option or are there any ideas
about how such a thing would be implemented?

Maxime Lévesque

unread,

Sep 3, 2010, 5:37:13 PM9/3/10

to squ...@googlegroups.com

I would breakdown the concept of "talking to multiple databases"
into two categories, based on the motivation for doing it :

1) One has to interact with data that lives in different schemas and/or applications.

2) The need for higher throughput, two common ways of doing this are :

    i) Master Slave replication
        This has been discussed here : http://groups.google.com/group/squeryl/browse_thread/thread/6fff74ab17f598f9/fce0a977564bbb7f?lnk=gst&q=master#fce0a977564bbb7f
        Pros: simple to implement
        Cons: won't buy you much if you need a high write throughput

    ii) Sharding
        In my opinion, sharding is the only way to get a relational database to attain "web scale" throughput,
        but it requires that the data model be intrinsincally "shardable", by this I mean that cross shard
        queries are rare, or preferably non existent. The more "cross shard" queries you have, the less benefit
        one will gain from sharding.

    I am currently implementing a system that will eventually need to be sharded, so I have been
    reflecting for a while about how Squeryl could provide facilities for it.

    Any decent sharding mechanism needs to adress the following :

    a) redundancy (a shard must have at least one replica that lives on a different dababase on a different machine)
    b) ideally the function that maps an object to its shard must be decentralized (i.e. to be replicatable while not not
       relying on shared state).
    c) if the host of a shard goes down, no rehashing is needed.

    To my knowledge, there is only one known algorythm that has all those properties, Consistent Hashing :

    http://en.wikipedia.org/wiki/Consistent_hashing
    http://www.tomkleinpeter.com/2008/03/17/programmers-toolbox-part-3-consistent-hashing/

    Like I wrote earlyer, I don't think that sharding is a generalized solution for scaling any data model,
    some can easily benefit from it, some can't, depending on the inherent characteristics of the data.
    However, the cases where it can applies form an interesting niche, because one can scale like some NoSql database will,
    without sacrificing the mathematical model of the relational paradigm.

    I've just scratched the surface of the sharding question, I've not discussed about the "how" a sharding mechanism should
    be implemented in Squeryl, I leave that for a future post.

Cheers

abc123ninja

unread,

Sep 18, 2010, 11:40:37 AM9/18/10

to Squeryl

On Sep 3, 7:37 pm, Maxime Lévesque <maxime.leves...@gmail.com> wrote:
> I would breakdown the concept of "talking to multiple databases"
> into two categories, based on the motivation for doing it :
>
> 1) One has to interact with data that lives in different schemas and/or
> applications.
>

What is your recommended way of doing 1)?
Many don't need an immense scalability but rather a separation for
security and architectural reasons.
One database per customer for instance is, in my experience, a rather
common model.

Ed it would be awesome if you could give some details to your
implementation. Does your solution require changing the source or can
something simply be extended?

Thanks to both of you and your friends.

Maxime Lévesque

unread,

Sep 19, 2010, 1:51:41 AM9/19/10

to squ...@googlegroups.com

A "one database per customer" setup can be implemented like follows :

1) You store your customerId in a thread local variable, in a web app the first thing
you do is make sure that this variable is set before you do any database interaction.

ex.: this method

CustomerIdHolder.id

always returns the correct customer Id

2) You have a map that associates a connection url for each Id :

customerId2DatabaseUrlMap

SessionFactory.concreteFactory = Some(
()=>new Session(customerId2DatabaseUrlMap.get(CustomerIdHolder.id), new YourDBAdapter)
)

Then provided that CustomerIdHolder.id is maintained, each customer
will be taking with it's own database.

Cheers

abc123ninja

unread,

Sep 20, 2010, 8:45:23 PM9/20/10

to Squeryl

Is there a more general approach?
My use case is this;
I have a master db and several client dbs.
When user from a client signs we first check in master db for client's
id.
After retrieving it user will use his client's db for the rest of his
session.
Is something like that doable with squeryl?
This is a concurrent environment where users from different clients
can login at the same webserver and operate simultaneously-

On Sep 19, 3:51 am, Maxime Lévesque <maxime.leves...@gmail.com> wrote:
> A "one database per customer" setup can be implemented like follows :
>
> 1) You store your customerId in a thread local variable, in a web app the
> first thing
> you do is make sure that this variable is set before you do any database
> interaction.
>
> ex.: this method
>
> CustomerIdHolder.id
>
> always returns the correct customer Id
>
> 2) You have a map that associates a connection url for each Id :
>
> customerId2DatabaseUrlMap
>
> SessionFactory.concreteFactory = Some(
> ()=>new Session(customerId2DatabaseUrlMap.get(CustomerIdHolder.id), new
> YourDBAdapter)
> )
>
> Then provided that CustomerIdHolder.id is maintained, each customer
> will be taking with it's own database.
>
> Cheers
>

abc123ninja

unread,

Sep 21, 2010, 1:09:49 AM9/21/10

to Squeryl

Didn't notice the function type!
Hehe, I see now concreteFactory is a closure :)

Maxime Lévesque

unread,

Sep 21, 2010, 1:19:51 AM9/21/10

to squ...@googlegroups.com

I will assume that you have some mechanism to know

Before the loggin you manually create a session

val session = new Session(...url to master DB...)

using(session) {
... log in with the master db...
}

and when you are logged in, you use the mechanism from my earlyer post.

Cheers

Reply all

Reply to author

Forward