Multiple Fénix instances


Damián

Jun 18, 2009, 8:03:33 AM
to Fénix Framework
Hi all,

After numerous exchanges with João via private e-mail, I am finally
using this new communication channel, as it was suggested some time
ago.

I was wondering how hard it would be to handle several Fénix instances
in the same JVM process. That would be most useful for the work I'm
doing around domain object replication.

Thanks,


Damián

Stephane Le Dorze

Jun 18, 2009, 8:59:56 AM
to fenix-f...@googlegroups.com
We've already had a lengthy discussion about that with Joao; here's the relevant part:


> On Fri, May 8, 2009 at 11:51 AM, Joao Cachopo <joao.cachopo@inesc-id.pt> wrote:
>
>     So, what you are saying is that you need to get data from somewhere else
>     and add it to your domain model.  That can be done via a Java program
>     (or something compilable to the JVM), right?
>
> Two different use cases: 
> 1) merge databases (same DML) but with different updates.

OK, I had not understood this previously, but now I see what you want.
I apologize if I sometimes cause some confusion, but discussing these
things only by email leads to that...

In the Fénix web application we have similar needs for a very particular
part of our domain, but the solution that we have is not applicable in
general.

Except for that particular part of the domain, we don't have this use
case, because we don't have multiple instances of the web application
running against different databases.


> 2) change the dml and transform the related domain instances in the DB.

This is what we do on a daily basis in the Fénix web application.

Most of the time, when the DML file describing the domain model changes,
we need to transform the data in the DB representing our domain entities
to bring it in sync with the latest DML.

Currently, we transform the domain entities via SQL scripts that we run
on each deploy.  You can see examples of those in subdirectories of

 https://fenix-ashes.ist.utl.pt/open/trunk/fenix/etc/database_operations/

I don't like this solution, though, and it will become more difficult to
use when we have multiple versions of the data on the database.

Also, according to your description of your usage of the application,
things are much more complex for you because you may have several
instances of the database, each one of them running on a different
version of the application, and you want to migrate data from one
instance to another (eventually, from a more recent version of the
application to an older version?).


>     Well, we don't have all the things worked out yet, and there are a few
>     corner cases to deal with, but the general idea was laid out by my
>     student in a very short paper that he will present in a month:
>
>  Ok, I've read it and it reminds me of something similar I saw a few
> months ago (can't remember what, but it was also on the JVM and based on
> STM).

If you remember it, please send me a link, as we are obviously very
interested in seeing what other approaches exist.


> I would say this looks very, very nice: however I can see some issues; namely:
>
> - (minor/avoidable) name-generation clash with Scala's generated naming
> (Scala uses the same "$x" approach to name anonymous classes for functions
> / closures) 

Yes, this should be trivial to change.

> - (The big one:) Even if the system handles DML object versions well,
> it only handles internal upgrades of DML objects.  So it handles
> a subset of possible upgrades; the superset would include the ability
> to change the very structure of the data graph (different types of
> objects and different relations); think about what can be changed with
> XSLT.

No.  The solution that we are designing intends to allow all kinds of
transformations.  For instance, a use case that we are using for
discussing this is the following: imagine that we have a class Person
with lots of slots (name, address, email, mobile phone, work phone,
etc), and that we want to refactor this to introduce a new Contact class
that has a 1-to-many relationship with Person.  The idea is that the
class Person will no longer have the address, email, phones, etc., but
will have a set of contacts instead.  Our Dynamic Software Upgrade (DSU)
system should be able to handle this.
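To make the example concrete, the refactoring could look like this in DML (a hypothetical sketch, not taken from the actual Fénix domain model):

```dml
// Before: Person holds all the contact slots directly.
class Person {
    String name;
    String address;
    String email;
    String mobilePhone;
    String workPhone;
}

// After: contact data moves to a new Contact class,
// related 1-to-many with Person.
class Person {
    String name;
}

class Contact {
    String type;    // e.g. "email", "mobile", "work"
    String value;
}

relation PersonHasContacts {
    Person playsRole person;
    Contact playsRole contacts {
        multiplicity *;
    }
}
```

The DSU system would then have to carry each existing Person's old slot values over into newly created Contact instances.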

Nevertheless, this DSU system will not be ready for production on time
for you to use it once you go live.  Moreover, it does not deal with the
merging of data.  So, you will have to use something else.

> I think your answer shows I have not explained our use case correctly.

Yes, I was thinking along other lines, but now I believe that we are in
sync.

I have a few questions, though...

> When doing a major upgrade:
> - We want to introduce new stuff done by our artists (in one or
> more studios).
> - Make it available to our designers (in another studio) (so we need
> to migrate the new objects and update the changed ones)

And how do you deal with different changes made to the same entity by
two different sites?  For instance, what if studio1 changed the color of
object A from red to blue, and studio2 changed it from red to green?

What do you want to do when you merge both?

> So you see there are a lot of versions, and to migrate data changes from
> one to another I wonder what we should best use. (I think we could apply
> the change sets of all transactions done since the last synchronization.)
> A possibility would be to listen to all transactions and log the
> changes to create the data for replication; however, this introduces a
> weak point if this data is lost or the machine shuts down while the
> Fenix Framework is still running; so a natural solution would be to keep
> all transaction information within some database rows for this
> special purpose (would it be easy?).

Keeping the write-sets (that describe what changed within each
transaction) on the database should be quite easy.  The major difficulty
that I see is in dealing with them later and knowing how to apply them
if they correspond to different versions of the DML source code.
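A minimal sketch of what persisting write-sets might look like, in plain Java (the class and its methods are hypothetical, not Fenix Framework API; a real version would store the entries in database rows rather than a list):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical write-set log: each committed transaction records what it
// changed, tagged with the DML version it ran under, so a replica can
// decide whether the changes can be replayed directly.
public class WriteSetLog {

    public record Entry(String txId, String dmlVersion, Map<String, String> changes) {}

    private final List<Entry> log = new ArrayList<>();

    // Called after each successful commit: record the write-set and the
    // DML version the transaction was executed against.
    public void append(String txId, String dmlVersion, Map<String, String> changes) {
        log.add(new Entry(txId, dmlVersion, new LinkedHashMap<>(changes)));
    }

    // Replaying is only straightforward when the DML versions match;
    // otherwise the changes must first be translated to the new schema.
    public List<Entry> entriesFor(String dmlVersion) {
        return log.stream()
                  .filter(e -> e.dmlVersion().equals(dmlVersion))
                  .toList();
    }
}
```

The hard part, as noted above, is not storing these entries but interpreting an entry whose `dmlVersion` differs from the one the replaying site is running.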


> My main concerns are automation; safety; and the fact that upgrading
> to a newer Fenix system should be transparent.

If you are working at the database level, looking into how domain
entities are represented in the database, I can't promise you that
upgrading to a newer fenix-framework version would be transparent.  In
fact, given that we are working precisely on changing the data
representation in the persistent store, most probably it will not.

The idea is that the upgrades should be transparent as long as people
use the DML to describe their domain entities, and stick to Java for
manipulating the public interface of the classes generated.


> I think it would also be valuable to handle this kind of pattern.

Maybe, but I don't have many ideas now on how to do it (other than
dealing with upgrades of source code).

> Also the id collisions need to be addressed; we think that if a
> database is assigned a uid, it could be used as a prefix to
> generate uids at object creation.

Yes, that may be a solution, but once again it is dealing with the
internals of the fenix-framework, that may collide with later changes to
it.

However, ids probably need not remain the same when objects go from
one site to another.  If you know that a new object was created, you may
recreate it in another site with a different id.

Ids of objects may be used for two things:

 1. internally, to maintain relationships between objects
 2. externally, to export references to objects in such things as URLs
    and web pages

In the first case, if, when you create an object (that comes from another
site), you also connect it consistently to the rest of the graph of
objects, then everything should be fine.

In the second case, keeping the same ids is important only if you want
to use a URL (or something like it) that was generated for one site for
accessing another site.  Even in this case, maybe you could externalize
the ids with the site prefix, and keep a map on each site that maps ids
from other sites to ids of the current site.
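That id-mapping scheme could be sketched as follows (plain Java with hypothetical names; nothing here is actual fenix-framework API):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical site-prefixed ids: each site externalizes its own ids with
// its prefix, and keeps a map from foreign external ids to the local ids
// under which the corresponding objects were recreated.
public class SiteIdMapper {

    private final String sitePrefix;                       // e.g. "site1"
    private final Map<String, Long> foreignToLocal = new HashMap<>();

    public SiteIdMapper(String sitePrefix) {
        this.sitePrefix = sitePrefix;
    }

    // External form of a local id, e.g. for use in URLs: "site1:42".
    public String externalize(long localId) {
        return sitePrefix + ":" + localId;
    }

    // When an object from another site is recreated here with a fresh
    // local id, remember the correspondence.
    public void register(String foreignExternalId, long localId) {
        foreignToLocal.put(foreignExternalId, localId);
    }

    // Resolve any external id: our own decode directly, foreign ones go
    // through the map.
    public long resolve(String externalId) {
        String[] parts = externalId.split(":", 2);
        if (parts[0].equals(sitePrefix)) {
            return Long.parseLong(parts[1]);
        }
        Long local = foreignToLocal.get(externalId);
        if (local == null) {
            throw new IllegalArgumentException("unknown foreign id: " + externalId);
        }
        return local;
    }
}
```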

Note, though, that this is just me talking off the top of my head.
I haven't given much thought to this.

>
> [... some parts elided ...]
>
> Writing this I am just realizing that we just need to rename the new
> DML definition to avoid collisions between both DMLs and run the
> transformation locally. A second pass would rename the DML objects afterwards.

If I understood you correctly, that's what we do for bigger refactorings
to the Fénix web application.

For instance, suppose we want to significantly change a portion of the
domain model that deals with classes A, B, and C, such that A is
transformed into a changed version of itself, three new classes D, E,
and F appear, and B and C disappear.  In that case we create a class
ANew with the new relationships to D, E, and F, and write Java code that
creates instances of ANew (and D, E, and F) from the old instances of A,
B, and C.

This approach is better in the sense that it is safer to do, but it is
also a little bit awkward because the domain model becomes a little more
cluttered with names.  Of course, we may remove A later and rename
ANew to A, but often that is not done and old classes linger
around.
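A minimal sketch of such a migration pass, reusing the hypothetical class names above (plain Java records stand in for domain classes; a real version would operate on DML-generated entities inside a transaction):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical "parallel classes" migration: build ANew instances (with
// their D parts) from old A/B data, so the old instances can later be
// deleted. All names are illustrative.
public class Migration {

    // Old model: A directly holds the data that B carried for it.
    record A(String name, String bData) {}

    // New model: ANew refers to separate D objects instead.
    record D(String data) {}
    record ANew(String name, List<D> parts) {}

    // One migration pass: create an ANew (and its Ds) per old A.
    static List<ANew> migrate(List<A> oldInstances) {
        List<ANew> result = new ArrayList<>();
        for (A a : oldInstances) {
            List<D> parts = new ArrayList<>();
            parts.add(new D(a.bData()));            // data that lived in B
            result.add(new ANew(a.name(), parts));
        }
        return result;
    }
}
```

The second pass mentioned in the quote (renaming ANew back to A) would then be a pure rename refactoring once the old classes are gone.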

> I am sorry for not having been able to answer more quickly; I am
> waiting on you for this very subject, which is the last big step to
> enable us to produce our data.

Well, I hope that my email helped you (despite not having given any
solutions), but I must confess that I don't see this as very easy.
Whatever solution I envision right now seems to require extreme
discipline to prevent errors from creeping in, and there are issues,
such as knowing how to merge two colliding changes, that I don't know
how to solve elegantly at all.

But, please, tell me if you think that I can help...

Joao Cachopo

Jun 19, 2009, 5:16:12 AM
to fenix-f...@googlegroups.com
Damián <damian....@gmail.com> writes:

> After numerous exchanges with João via private e-mail, I am finally
> using this new communication channel, as it was suggested some time
> ago.

Great! Thanks Damián.

> I was wondering how hard would it be to handle several Fénix instances
> in the same JVM process. That would be most useful for the work I'm
> doing around domain object replication.

By multiple Fénix instances, I suppose that you mean having more than
one domain model and transactional system (connected to a given
database) working at the same time in the same process, but not in the
same way as is done by Tomcat, for instance, where you have different
classloaders that isolate the various applications from one another,
right?

You want to have code that handles simultaneously instances of both
domain models, is that it?

I haven't thought much about this, but from the top of my head it
doesn't seem very easy to do.

Just to make sure that I understood you correctly, you would like to be
able to initialize the FenixFramework more than once, with different
Config instances, each one corresponding to a different domain model and
database. To be able to distinguish among the various initializations,
we could pass an id to the initialize method. Something along the lines
of:

    Config config1 = new Config() {{
        domainModelPath = "/domain1.dml";
        dbAlias = "//localhost:3306/db1";
        dbUsername = "db1User";
        dbPassword = "db1Pass";
        rootClass = App1.class;
    }};
    FenixFramework.initialize("ctxt1", config1);

    Config config2 = new Config() {{
        domainModelPath = "/domain2.dml";
        dbAlias = "//localhost:3306/db2";
        dbUsername = "db2User";
        dbPassword = "db2Pass";
        rootClass = App2.class;
    }};
    FenixFramework.initialize("ctxt2", config2);


But how would you use this, then?

Would you have a transaction for each DB? I imagine that we could
create a transaction for a given id, such as

@Atomic(context = "ctxt1")
void m() {
// do something that accesses stuff in db1
}

but probably you would want to access entities that come from different
DBs within the same transaction, right?

That means that probably you want transactions to be independent of the
FenixFramework's initializations made before, and we would know which DB
to use from the classes that we are using, assuming that the set of
classes is disjoint among the various domain models.

But if I recall correctly from our previous emails, both of your domain
models will have the same classes (more or less), meaning that you would
not know, from the classes alone, from which DB you should load them.

Once we have objects from different DBs, I imagine that it would be
possible for us to know which DB to use when writing the changes to an
object, if we kept track of which DB each object came from. It may not
be easy to do, but it may be possible (I don't know, I would have to
think much more about this).

But, what would happen if an object from DB1 was connected to an object
from DB2? That should not be possible, right? But then, the framework
would have to detect those cases and prevent them...
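One way the framework could detect and prevent such cross-database connections is sketched below (illustrative Java only; `DomainObject` and `setOther` are hypothetical stand-ins for generated relation setters):

```java
// Hypothetical guard: each object remembers which context (DB) it was
// loaded from, and relation setters refuse to connect objects from
// different contexts.
public class ContextGuard {

    static class DomainObject {
        final String contextId;
        DomainObject other;   // stand-in for a to-one relation

        DomainObject(String contextId) {
            this.contextId = contextId;
        }

        void setOther(DomainObject o) {
            if (o != null && !o.contextId.equals(this.contextId)) {
                throw new IllegalStateException(
                    "cannot connect object from context " + o.contextId
                    + " to object from context " + this.contextId);
            }
            this.other = o;
        }
    }
}
```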

Also, what about new objects? Where would we write them?

This was just me "thinking out loud", and I must confess that I don't
have it very clear in my mind yet. Do you have any clearer ideas of
how things should work?

Wouldn't it be much easier to have two independent JVM processes
communicating between them? How would that be harder than your
suggestion? Can you make a sketch of code for both approaches?

Comments/ideas from anyone else are welcomed...

--
João Cachopo

Joao Cachopo

Jun 19, 2009, 5:42:50 AM
to fenix-f...@googlegroups.com
Stephane Le Dorze <stephane...@gmail.com> writes:

> We've already had a lengthy discussion about that with Joao; Here's
> the relevant part

Meanwhile, since this email that Stephane included in his message, I've
added to the Fenix Framework the TxIntrospector interface that helps in
solving some of the problems that were being discussed here. I believe
that Damián and Stephane are using this new interface now to capture
changes made in one server, so that they may reapply them in another
server.


I would like to add, as a followup to what I wrote before:

> This is what we do on a daily-basis in the Fénix web application.
>
> Most of the time, when the DML file describing the domain model changes,
> we need to transform the data in the DB representing our domain entities
> to bring it in sync with the latest DML.
>
> Currently, we transform the domain entities via SQL scripts that we run
> on each deploy.  You can see examples of those in subdirectories of
>
>  https://fenix-ashes.ist.utl.pt/open/trunk/fenix/etc/database_operations/
>
> I don't like this solution, though, and it will become more difficult to
> use when we have multiple versions of the data on the database.


Yesterday I talked with Luis Cruz (I believe that he is reading this
also) again about this same subject and we are more or less coming to
the conclusion that this approach will not work much longer.

As the schema of the DB becomes more complex and farther away from the
traditional relational schema, it becomes harder to change manually.

But I think that continuing to refrain from adding new features to the
Fenix Framework to remain within the comfort zone of the relational
approach is not worth it. We have many things planned or in development
that are continuously set back because of the database. So, I think
that it is time to move on and leave behind the relational schema.

This means that we have to deal with the changes in the persistent data,
when the code changes, entirely at the Java level, rather than at the DB
level.

Best regards,
--
João Cachopo

Damián Arregui

Jun 19, 2009, 5:54:31 AM
to fenix-f...@googlegroups.com
2009/6/19 Joao Cachopo <joao.c...@ist.utl.pt>

You want to have code that handles simultaneously instances of both
domain models, is that it?

That would give the most flexibility, but even being able to access different DBs sequentially would be an improvement. Something like:

    FF.initialize(config1);
    // do some stuff with DB1
    FF.initialize(config2);
    // do some stuff with DB2

Of course, transactional operations on Domain Objects would only be possible while the corresponding configuration is active.
 
I haven't thought much about this, but from the top of my head it
doesn't seem very easy to do.

That's what I guessed ;-)
 
Just to make sure that I understood you correctly, you would like to be
able to initialize the FenixFramework more than once, with different
Config instances, each one corresponding to a different domain model and
database.  To be able to distinguish among the various initializations,
we could pass an id to the initialize method.  Something along the lines
of:
 
...

Yes, you did understand correctly what I was thinking about.
 
But how would you use this, then?

Would you have a transaction for each DB?  I imagine that we could
create a transaction for a given id, such as

    @Atomic(context = "ctxt1")
    void m() {
        // do something that accesses stuff in db1
    }

but probably you would want to access entities that come from different
DBs within the same transaction, right?

Mmmh, not sure what the exact semantics of that would be, transactionally speaking. Simultaneous access to Domain Objects coming from different DBs in the same transaction is not required. We could have the following pattern:

@Atomic(context = "ctxt1")
void m() {
    // do something that accesses stuff in db1
}
 
That means that probably you want transactions to be independent of the
FenixFramework's initializations made before, and we would know which DB
to use from the classes that we are using, assuming that the set of
classes is disjoint among the various domain models.
 

But if I recall correctly from our previous emails, both of your domain
models will have the same classes (more or less), meaning that you would
not know, from the classes alone, from which DB you should load them.

Right, that would definitely be a use case.

Damián Arregui

Jun 19, 2009, 6:28:19 AM
to fenix-f...@googlegroups.com
I accidentally sent my last email while it was still incomplete. Here's the rest of it.

2009/6/19 Damián Arregui <damian....@gmail.com>
2009/6/19 Joao Cachopo <joao.c...@ist.utl.pt>

Mmmh, not sure what the exact semantics of that would be, transactionally speaking. Simultaneous access to Domain Objects coming from different DBs in the same transaction is not required. We could have the following pattern:

@Atomic(context = "ctxt1")
UsefulValues m() {
    // do something that accesses stuff in db1
    return usefulValues;
}

@Atomic(context = "ctxt2")
void m2(UsefulValues usefulValues) {
    // do something that accesses stuff in db2
}

That means that probably you want transactions to be independent of the
FenixFramework's initializations made before, and we would know which DB
to use from the classes that we are using, assuming that the set of
classes is disjoint among the various domain models.
... 
This was just me "thinking out loud", and I must confess that I don't
have it very clear in my mind yet.  Do you have any clearer ideas of
how things should work?


Again, the semantics of those "cross-DB" transactions are still unclear to me. In my mind, you need to set a transactional context which gives you access to the associated Domain Objects. Accessing Domain Objects from another context should throw an Exception.

The problem which remains is the one you raised about class name collisions, in particular if we are migrating data between two different versions of a DML definition. Including a version number in the package name, at least for the migration process, could be a workaround.
    
Wouldn't it be much easier to have two independent JVM processes
communicating between them?  How would that be harder than your
suggestion?  Can you make a sketch of code for both approaches?

At the moment, I'm doing the following to test a simple Master/Slave setting:

1. Init db1 and db2
2. Update db1
3. Export db1 changelog to file
4. Replay changelog on db2 from file
5. Dump db1 to file
6. Dump db2 to file
7. Compare dumps

I'd like to be able to do all that in-memory. That would not require simultaneous access to both db1 and db2 in the same transaction.
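An in-memory version of that test could be sketched like this (plain Java maps stand in for db1 and db2; the export/replay of the changelog is what the real version would do via the framework):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// In-memory sketch of the master/slave test: db1 is updated, the changes
// are captured as a changelog, replayed on db2, and the two "dumps"
// compared. Nothing here is real Fenix Framework API.
public class ReplicationTest {

    record Change(String key, String value) {}

    // Steps 2-3: apply edits to db1 and export them as a changelog.
    static List<Change> update(Map<String, String> db, Map<String, String> edits) {
        List<Change> changelog = new ArrayList<>();
        for (Map.Entry<String, String> e : edits.entrySet()) {
            db.put(e.getKey(), e.getValue());
            changelog.add(new Change(e.getKey(), e.getValue()));
        }
        return changelog;
    }

    // Step 4: replay the changelog on db2.
    static void replay(Map<String, String> db, List<Change> changelog) {
        for (Change c : changelog) {
            db.put(c.key(), c.value());
        }
    }
}
```

Steps 5-7 then reduce to comparing the two maps for equality.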

Going further, in a Master/Master setting we'll need both DBs to agree upon the changesets they need to replay, and eventually to detect and resolve conflicts. Access to both DBs from the same process would come in handy then.

Regards,


Damián

