Re: Question regarding OOP and database access

Message has been deleted

topmind

unread,

Jul 5, 2005, 8:39:53 PM7/5/05

to

I don't know if this fully addresses your issue, but I would like to
point out that schema is probably a poor design. There are probably not
enough difference between cars and trucks to make them separate tables
(and perhaps some vehicles that could be classified either way). There
should be a single Vehicle table. If there are lots of columns that
only trucks or only cars have (and likely to stay that way), then make
a separate Truck-features or Car_features table with a foriegn key to
the Vehicles table.

Somebody probably got category-happy when making those tables.

-T-

leodip...@gmail.com

unread,

Jul 5, 2005, 8:50:23 PM7/5/05

to

topmind, thanks for the reply.

My business is not dealing exactly with cars and vehicles (it's about
management of people, processes, etc, for the government).

My point was just to illustrate the situation where I have an object A
with an associated object B, and also other objects C, D and E that
have a B associated.

If B entity changes, what do I do with the SQL code spread all over A,
C, D and E data access layers? I'd have to track and change them all,
in the SQL joins...

If I have a lot of entities, and modules (in other words, a big
system), with entities changing now and then, I will have a cumbersome
maintenance in the DB layers.

What are the best practices to avoid these problems?

Thanks for all.

Phlip

unread,

Jul 5, 2005, 9:25:38 PM7/5/05

to

leodippolito wrote:

> What are the best practices to avoid these problems?

Identify any kind of duplication in your program, even the duplication
notion "build an SQL string that...", and merge the duplication into
abstractions.

http://www.c2.com/cgi/wiki?PerniciousIngrownSql

--
Phlip
http://www.c2.com/cgi/wiki?ZeekLand

leodip...@gmail.com

unread,

Jul 6, 2005, 10:02:50 AM7/6/05

to

Philip, I think I don't get your point...

Do you have an example of merging these duplications into abstractions?

Phlip

unread,

Jul 6, 2005, 10:30:43 AM7/6/05

to

leodippolito wrote:

> Do you have an example of merging these duplications into abstractions?

You said:

> If B entity changes, what do I do with the SQL code spread all over A,
> C, D and E data access layers? I'd have to track and change them all,
> in the SQL joins...

When code is spread all over several modules, you should refactor to merge
it together. But you don't just refactor anything, you refactor the
duplications.

If C says "SELECT socks from sockDrawer WHERE color = blue", and D says
"SELECT tie from tieDrawer WHERE color = green", the duplication is "SELECT
? from ?Drawer WHERE color = ?".

Ordinarily, when we seek duplication to merge into abstractions, we attempt
to approach the advice from the book /Design Patterns/, to abstract the
thing that varies. In this case, we throw the definition of duplication as
wide as possible, to pull in things that we might have missed. "socks" is
not a code statement to morph into a virtual method call. But it's still
duplication to merge.

--
Phlip
http://www.c2.com/cgi/wiki?ZeekLand

topmind

unread,

Jul 6, 2005, 1:07:00 PM7/6/05

to

SQL is fairly compact in most cases. Putting wrappers around minor
duplication is not going to save much code or reduce change-points. It
may just result in more interfaces that have to be changed down the
road. I would suggest only wrapping duplicate clauses at factor of
about 8 repeats or so. A utility to help generate UPDATE and INSERT
clause strings may helpful.

-T-

H. S. Lahman

unread,

Jul 6, 2005, 1:28:52 PM7/6/05

to

Responding to Leodippolito...

> My problem: if the Wheel table changes in the database, I have
> wheel-related SQL code spread all over different DB layers (those DB
> layers that 'join' with the wheel table). For example, the Truck DB
> layer. If this happens I would have to track where I have these joins
> and modify the sql code.

Isolate everything related to accessing the RDB into a single subsystem
or layer. That layer has semantics for things like Schema, Dataset,
Table, and Tuple. The paradigm for talking to the RDB itself is SQL
queries. Note that the semantics of Car, Wheel, etc. is not relevant in
this subject matter. That semantics is represented purely in terms of
identity for the instances.

OTOH, your business problem does not care whether your data is stored in
an RDB, an OODB, flat files, or crop circles. The semantics are pure
business semantics like Car, Truck, and Wheel. You want that logic to
be completely independent of the semantics of storage mechanisms so you
don't want anything remotely resembling a SQL query polluting your
business logic.

Separate the subject matters with subsystem interfaces that are
message-based and pure by-value data transfer. That decouples to two
disparate views of semantics. Then all you need to do is encode/decode
message data packets on each side of the boundary. Of course each side
needs its own view of identity for the data packets. That view may be
1:1 for simple situations or it may be more complex (e.g., an object on
the business side is an amalgam of data from two tables combined via a
join on the RDB access side).

Let the DB access subsystem/layer provide the mapping of interface
identity to tables via joins and whatnot. Because schemas, joins, and
whatnot can be expressed in terms of data, if you get clever about the
DB access the same subsystem can be reused across applications by simply
modifying the configuration data that maps identity to the RDB.

>
> I could solve this by making the DB layers touch only data related to
> the object (Car DB layer access only Car table and Wheel DB layer
> access only Wheel table), and making different trips in the business
> layer to build the object:
>
> // CAR BUSINESS LAYER
>
> public Car getCarById(int p_id)
> {
> CarDataLayer carDB = new CarDataLayer();
> Car objCar = carDB.getCarById(p_id);
>
> WheelDataLayer wheelDB = new WheelDataLayer();
> Wheel objWheel = wheelDB.getWheelById(objCar.aWheel.ID);
>
> objCar.aWheel = objWheel;
>
> return objCar;
> }

This is basically the right idea. This is the sort of code one would
have in a factory object that creates a Car instance. (You probably
want to create both the Car and Wheel instances in the same method scope
to ensure data and referential integrity.) However, rather than having
distinct carDB and wheelDB interfaces, you can do something more generic
like (I'm not sure what language you are using, so I'll use C++),

public Car CarFactory::getWholeCarByID(in p_id)
{
WholeCarDataSet* ds;
Car* car;
Wheel* wheel;

ds = DBAccess.getWholeCar(p_id); // get data from DB

wheel = new Wheel (ds.wheelID, ds.wheelSize, ...)
car = new Car (p_id, wheel, ds.carModel, ...)
// the wheel reference is used to instantiate the relationship
// between Car and Wheel.

return car;
}

Here WholeCarDataSet is an aggregate of all the data obtained from the
DB related to a Car and its Wheels. In effect it just maps a by-value
data packet. That data packet is loaded by the DB access
subsystem/layer in response to the getWholeCar(p_id) request. The DB
access subsystem knows that for this request it requires a particular
join between the Car and Wheel tables.

Note that the CarFactory needs to know nothing about how the data is
stored or organized. It just asks for what /it/ needs, which is all the
data related to a whole Car with a particular ID. It relies on the DB
access subsystem to get it so all it has to do is pick the data out of
the returned data packet.

>
> This way, if the Wheel table changes I am ok with the sql code. I don't
> need to track anything in DB layers. If I ask for a car or truck
> object, it will return a car or truck with the new-version wheel.

Exactly. In addition, in my version you don't even depend on the DB
recognizing Car and Wheel as separate database entities (tables).

> The cost of this is the n-round-trips that I have to make to the DB
> layer in the business layer. I lose the power of SQL join.

The SQL join is still there. It is just fully encapsulated in the DB
access subsystem. An obvious application-specific way to address that
is to have a set of Query subclasses, one for each pile of data the
business layer needs. Then the Query instance knows how to formulate
its SQL query and pack its dataset. Then the getWholeCar interface
method just dispatches to the right Query instance.

The next step is to realize that the SQL join string for each unique
Query can be stored externally and read in at startup to create the
Query instances. Now the only application-specific thing is the packing
of the data set for the request return. That's a bit more complicated
but it can also be specified parametrically using external data. [My
blog has some examples in the category on persistence.] At that point,
as I indicated above, the access layer no longer has encoded business
semantics (it is now all in the configuration data) so it can be reuse
as-is across applications.

Bottom line, if you have multiple applications that access DB, you
should be able to engineer that access once (albeit at a somewhat higher
cost than for a single application).

*************
There is nothing wrong with me that could
not be cured by a capful of Drano.

H. S. Lahman
h...@pathfindermda.com
Pathfinder Solutions -- Put MDA to Work
http://www.pathfindermda.com
blog: http://pathfinderpeople.blogs.com/hslahman
(888)OOA-PATH

Daniel Parker

unread,

Jul 7, 2005, 10:56:54 AM7/7/05

to

leodip...@gmail.com wrote:
>
> This way, if the Wheel table changes I am ok with the sql code. I don't
> need to track anything in DB layers. If I ask for a car or truck
> object, it will return a car or truck with the new-version wheel.
>

There may be specific cases where you might want to do something like
this, but not in general. You don't want to approach software design
in a way that rules out using features of the RDBMS. In general,
prefer joins.

> The cost of this is the n-round-trips that I have to make to the DB
> layer in the business layer. I lose the power of SQL join.
>

Right, not that it always matters. But there are examples of projects
that have failed because less gifted developers made all database
access work this way.

> Another possibility I can see would be to use an O/R mapping library.
> The cost would be runtime processing.

These tools bind result sets to program variables, which does incur
runtime processing, but is useful, and the cost is small relative to
the cost of the query. Unfortunately, they often do a lot more which
is not helpful, like caching, mapping data members directly to tables,
generating SQL which can not then be tuned, etc.
>
The following approach is widely used, and works well.

One, move all of your SQL text out of your software and into
configuration files. Parameterize it with placeholders for arguments.
In Java, the simplest approach is to use property files, which support
parameterized values. Access the SQL values by name in your
application.

Two, make sure that the queries and updates in your app use names
rather than position, and that insert statements specify the column
names to receive values.

You now have a certain amount of flexibility where you can isolate
changes to the configuration file, without immediately changing your
code. You can rename columns in select statements, since you can alias
them to the old names. You can modify and fine tune the SQL statement
as long as select statements return no fewer columns. You can add
columns to select statements. Insert and update statements can be
modified to take default values for new columns, if table structures
change.

If you can find an O/R mapping tool that allows you to specify your SQL
statements in XML files, and doesn't do much more apart from binding
the results by name to data members, that would probably be the best
choice.

Regards,
Daniel Parker

frebe

unread,

Jul 9, 2005, 11:27:14 AM7/9/05

to

First, trying to ensapsulate access to one table inside one class is
impossible as you pointed out by yourself. You should not stop using
joins. Otherwise performance will be reduced.

> if the Wheel table changes in the database, I have
> wheel-related SQL code spread all over different DB layers

There are two kind of changes that can be made in a database schema:
Backward compatible and non-backward compatible. If you add a table or
column, all old SQL statements will still work. The fact that SQL
statements is spread is not a problem.

If you remove tables or columns or change primary keys, you will have
to do major refactoring in you application, whatever architecture you
are using. It is not only your DB layer that would be affected.

Separating database access into DB classes will not make it easier for
you to database schema changes. You have to consider the database
schema in the same way as a API. If you add classes or methods, all old
code will be unaffected. But if you start to remove classes or methods,
you have some work to do.

Fredrik Bertilsson
http://butler.sourceforge.net