Sorry for the late response. I posted one earlier, but it looks like
it got moderated and I'm not sure why. Anyway, thanks for the
suggestions. It is a big concern for me that I put this into the right
place in the rails framework, whether it be directly integrated or
handled as an attachment. I would love to hear other people's opinions
on this. I think there are pros and cons to both approaches.
One of the biggest pros I can think of for it being integrated though
is that people will be more likely to use it, because it won't have to
be maintained separately from rails. That means if I want to upgrade my
rails version, I won't be stuck with the version I am on until
somebody updates the scalability aspects to support my setup. I think
this one thing has a high appeal to companies, especially those who
are apprehensive about using rails, because of its lack of scalability
when communicating with distributed databases.
Of course all of your points are valid, and I don't see myself fully
making a decision until I've talked to my sponsor and gotten more
feedback from the rest of the rails community. You and I are 2 developers
out of dozens and I could implement a solution that fits us (not to
say you were suggesting that), but completely miss the requirements of
the vast majority of the rails community.
I do have a couple of additional comments on some of the things you
mentioned, which I'll state next to your comments below. Forgive me if
I sound like I'm attacking you; I don't mean to. However, I would like
to encourage you to attack my ideas. That way only the strongest will
survive.
On Mar 31, 7:45 am, Cezary Bagiński <
cezary.bagin...@gmail.com> wrote:
> Hi Allen,
>
> First of all, forgive me my ignorance about databases and scalability.
>
> The main question I have is: should handling multiple connections really
> be in rails/ActiveRecord?
>
> I imagine most of the work (routing queries, maintaining connections,
> proxying the results) could be done on the driver/adapter side and the
> current ActiveRecord implementation could be left as it is.
That is very possible. Could you point me to some more information on
adapters? I think if I were to go with this approach there would still
be a lot of functionality that I could reuse from active record, so I
think an adapter would be a better approach than a driver. In the case
that I do extend active record, I think I may need to modify it to
allow some of the extensions that would be required. This is purely
theoretical though.
> By driver/adapter I mean either an ActiveRecord adapter or an external
> native library with an AR adapter interface.
>
> If at all, hints about which table a query is for could be passed to the
> driver, if parsing the queries on the driver side is too big of a
> problem. The whole configuration IMHO should be on the binary driver
> side anyway to make use of database-specific support for scalability
> related features on a case-by-case basis.
I think the first two milestones are for the most part database
agnostic (I could be wrong). The third milestone, however, may benefit
from database-specific support, so maybe this should be externalized
into an adapter. I would argue against a binary driver approach only
because not all databases support sharding and I think rails can still
provide a solution, though I think you are right that
database-specific stuff should not be in rails.
So I guess what I am suggesting would be somewhat of a hybrid. The
first two milestones would be built into rails. The third could be
a vendor-specific adapter or a generic non-vendor-specific adapter
(probably what I would write). In that case I could write a generic
framework for a database sharding adapter, which other people could
extend to make database-specific adapters.
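To give a rough idea of what I mean by a generic framework, here is a
minimal sketch in plain Ruby. The class and method names
(GenericShardingAdapter, ModuloShardingAdapter, shard_for,
connection_for) are made up for illustration and nothing like them
exists in rails today. The modulo strategy is only one assumed way of
picking a shard.

```ruby
# Hypothetical sketch: none of these classes exist in rails or ActiveRecord.
# The generic base class owns the shard bookkeeping; subclasses only decide
# which shard a given key belongs to.
class GenericShardingAdapter
  def initialize(shards)
    raise ArgumentError, "at least one shard is required" if shards.empty?
    @shards = shards # Hash of shard name => connection-like object
  end

  # Subclasses must return one of the configured shard names.
  def shard_for(key)
    raise NotImplementedError, "#{self.class} must implement shard_for"
  end

  # Look up the connection that should handle the given key.
  def connection_for(key)
    @shards.fetch(shard_for(key))
  end
end

# A non-vendor-specific strategy: spread integer keys across the shards
# by taking the key modulo the number of shards.
class ModuloShardingAdapter < GenericShardingAdapter
  def shard_for(key)
    @shards.keys[key % @shards.size]
  end
end
```

A database-specific adapter would then only need to subclass the
generic one and override shard_for (and whatever else that database
needs), rather than starting from scratch.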
> It kind of seems like pulling the configuration of vendor specific
> options into rails, when this can all be handled by a special driver.
> One that might have its own YAML file for a richer set of configuration
> options that are overhead, as you mentioned.
You are right that I don't want to produce overhead, but I was talking
specifically about when an application is first created. There is
almost a 0% chance that you want to set up scalable database
interaction when you first create a rails application. That is why I
mentioned that the configuration should be invisible until it is
needed (though I may not have done a very good job of it). So it is
only when you do need to implement something scalable that the richer
set of configuration options needs to be used, and at that time it
won't be overhead, it will be necessary. And again, not to be pedantic,
but I don't think this is vendor-specific, as I stated above. Perhaps
you can explain further how you think it is?
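As a rough illustration of "invisible until it is needed", here is a
small sketch in plain Ruby. The "connections" key and the
resolve_connections helper are hypothetical names I am inventing here;
they are not part of how rails reads database.yml. The host names are
placeholders too.

```ruby
# Hypothetical sketch: the "connections" key and this helper are made up
# for illustration and are not part of rails' configuration handling.
def resolve_connections(config)
  base  = config.reject { |key, _| key == "connections" }
  extra = config["connections"]

  # No extra section: behave exactly like a plain single-database app.
  return { "default" => base } if extra.nil? || extra.empty?

  # Extra section present: each named connection inherits the top-level
  # settings and overrides only what it specifies.
  extra.each_with_object({}) do |(name, overrides), resolved|
    resolved[name] = base.merge(overrides)
  end
end

# What a freshly created application would have: nothing new to learn.
simple = { "adapter" => "mysql", "database" => "blog_production" }

# What the same application might grow into once it needs to scale.
scaled = simple.merge(
  "connections" => {
    "users"  => { "host" => "db1.example.com" },
    "orders" => { "host" => "db2.example.com" }
  }
)
```

With the simple hash, resolve_connections returns a single "default"
entry, so a new application never sees the richer options. They only
show up once somebody adds the extra section.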
> If an external driver/adapter is created, this can be used in any DB
> related project, relational also, and not just rails projects. And not
> just one DB vendor at a single time. This could also allow custom
> fine-tuning by anyone without patching rails.
I do think that would be cool, to have it supportable by more
frameworks. However, I think that may be out of the scope of what I am
suggesting. What I am really talking about is an intermediary between
the actual adapters and rails whose sole purpose is to direct queries
to the right place. That may sound simplistic, but determining the
right place to send a query is a very complicated task. I hope that
makes sense. I explicitly don't want to do something vendor-specific,
rather I want it to work for any sort of database. I think the drivers/
adapters already available do a much better job than I could at
handling vendor-specific databases.
> Since this is only relevant for production environments, this will
> require special database setups that rails wouldn't really be able to
> automate to begin with.
I disagree that this is only relevant to production environments. Despite
the work that I would be doing, there would be some configuration that
a developer would have to do to get such a complicated setup working.
Basically, joining between databases and other things that arise with
such a configuration would be suboptimal, and the developer would need
to be able to test that functionality to ensure those kinds of things
don't happen.
You are right that rails wouldn't be able to automate this sort of
setup, but I would argue that you wouldn't want it to. The type of
setup you use would depend on how your data is laid out and I think it
would be a task beyond rails' ability to take the number of databases
available and determine an appropriate setup based on how your data is
laid out.
> My guess is putting this into rails might break a few things very
> quickly, e.g:
> - migration
> - associations
> - synchronizing data at ORM level
I absolutely agree with this assertion, but I think this is true of
extending the functionality of almost anything. I fully plan to fix
these as part of my GSoC, though.
> What I do see as useful to change in Rails for the problems you mentioned:
> - passing the model table along with the query to the
> "aggregating/routing" driver, so you don't have to extract it from the
> query.
> - creating an adapter for this driver that will additionally support
> table->connection options and pass them to the driver
I definitely would like to pass the model table. I want to keep this
as DRY as possible.
Here is where I think our ideas are disconnected, if I am
understanding what you're saying correctly. You think the driver
should handle the multiple connections and if I were to implement such
a thing, it would have to be vendor-specific. However, I think it
would be much more beneficial if rails handled the multiple
connections and submitted queries to the driver specified. I think
this approach is more generic and would allow all drivers currently
available to be used and still offer scalability options.
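Here is a minimal sketch of the kind of thing I am picturing, in plain
Ruby so it stands on its own. QueryRouter and FakeAdapter are made-up
names, and execute(sql) is only a stand-in for whatever a real adapter
exposes. The point is just that the table name travels with the query,
so nothing has to parse the SQL, and the existing adapters are used
as they are.

```ruby
# Hypothetical sketch: QueryRouter and FakeAdapter are made-up names, and
# execute(sql) stands in for whatever a real adapter would expose.
class FakeAdapter
  attr_reader :name, :log

  def initialize(name)
    @name = name
    @log = []
  end

  # Record the query instead of talking to a real database.
  def execute(sql)
    @log << sql
    "#{name}: #{sql}"
  end
end

class QueryRouter
  def initialize(default:, tables: {})
    @default = default
    @tables = tables # table name => adapter
  end

  # The caller passes the table name, so the SQL never has to be parsed.
  def execute(table_name, sql)
    adapter_for(table_name).execute(sql)
  end

  # Tables without an explicit mapping fall back to the default adapter.
  def adapter_for(table_name)
    @tables.fetch(table_name.to_s, @default)
  end
end

main   = FakeAdapter.new("main")
orders = FakeAdapter.new("orders_db")
router = QueryRouter.new(default: main, tables: { "orders" => orders })

router.execute("orders", "SELECT * FROM orders")
router.execute(:users, "SELECT * FROM users")
```

Because the router only calls execute on whatever adapter it is given,
any adapter that already exists could be plugged in without knowing it
is part of a multi-database setup.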
> The driver could just be an adaptor that makes use of other ActiveRecord
> adapters - which will keep the fun of writing everything in Ruby.
I guess I restated what you said here up above (whoops!).
> But then again, I might be completely missing the point.
If you missed the point, it's only because I didn't explain myself well
enough.
> ...