Model proposal

0 views
Skip to first unread message

Dimitris Glezos

unread,
Jul 22, 2008, 8:57:22 PM7/22/08
to Transifex devel list
Ah, the dreadful subject of database-mapped models.

I spent a few hours today thinking about Transifex models, and how we
can have a basic set which can be used to incrementally add features
on top of it. Existing ideas I studied include Stéphane's ones at
Development/DataModel and Asgeir's -lies suggestions.

Here's an idea then:

http://transifex.org/attachment/wiki/Development/DataModel/glz_model_v1.png

The drawing has some notes, but here are some more design principles used:

* Python Classes can exist without related DB tables/rows. Eg. are builders.
* Abstraction isn't particularly useful in databases (and causes
chaos), so similar models with different can have different tables
with no common parent (eg. repository types).
* Packaged applications can have their own models and functionality
(think: plugins), which make sense in the package itself. Thinking of
Tx as packaged functionality will help keeping things organized.
* Certain models are common in relations to other models. Think for
example, comments: Ideally you could attach comments to any other
model. You can achieve this by 'abstract' relations, ie. no foreign
keys. Just a reference to the app:table:row it refers to.
* Good functionality *can* replace hierarchy. Google search is a good
example: it beats up organized directories. In our context, linking a
module:branch combination to a release might mean that with every new
release you'll have to re-add them, but a clever application could do
that automatically anyway (think: checkbox saying "auto-add to next
Collection Release").
* Simplicity over complexity: A PO file is identifiable from its
ComponentInstance and the filename, and some of its attributes are the
translation statistics themselves.

I believe a model of this kind can serve us with many features added,
and can be tweaked with new features landing.

Migration from DL shouldn't be a problem using this model.

Comments welcome.

-d


Dimitris Glezos
Jabber ID: gle...@jabber.org, GPG: 0xA5A04C3B
http://dimitris.glezos.com/

"He who gives up functionality for ease of use
loses both and deserves neither." (Anonymous)
--

Asgeir Frimannsson

unread,
Jul 28, 2008, 5:35:28 PM7/28/08
to transif...@googlegroups.com
On Wed, Jul 23, 2008 at 10:57 AM, Dimitris Glezos <dimi...@glezos.com> wrote:
Ah, the dreadful subject of database-mapped models.

Thanks for your thoughs on the subject :)
 
I spent a few hours today thinking about Transifex models, and how we
can have a basic set which can be used to incrementally add features
on top of it. Existing ideas I studied include Stéphane's ones at
Development/DataModel and Asgeir's -lies suggestions.

Here's an idea then:

 http://transifex.org/attachment/wiki/Development/DataModel/glz_model_v1.png

The drawing has some notes, but here are some more design principles used:

 * Python Classes can exist without related DB tables/rows. Eg. are builders.
 * Abstraction isn't particularly useful in databases (and causes
chaos), so similar models with different can have different tables
with no common parent (eg. repository types).

Just a note that we might want to consider: SqlAlchemy allows a single class to represent a join. So say for example we have a common VCSRepository table, and then the cvs-plugin contributes e.g. a CVSRepository class that shares primary key + a type='cvs' attribute with VCSRepository. Here, say the Module table had a repository_id foreign key, and a 'repository' relation.

c = session.query(Component).filter_by(id=1)

repo = c.repository # in sqla this could be returned as a VCSRepositoryJoinedWithCVSRepository class

This might be more efficient than the proposed disconnected approach.
 
* Packaged applications can have their own models and functionality
(think: plugins), which make sense in the package itself. Thinking of
Tx as packaged functionality will help keeping things organized.
+1.
 

 * Certain models are common in relations to other models. Think for
example, comments: Ideally you could attach comments to any other
model. You can achieve this by 'abstract' relations, ie. no foreign
keys. Just a reference to the app:table:row it refers to.

Interesting concept, but I'm wondering how efficient this approach is when using sqlalchemy. Now, the speed of the web-app is very dependent on how many queries are made to the database, and how efficient these queries are. I'm doubting if you could do e.g. eager loading with this strategy. Say for example you want to list the project together with their maintainers in a table. With foreign keys you can get away with a single ORM query for this relation.
 
* Good functionality *can* replace hierarchy. Google search is a good
example: it beats up organized directories. In our context, linking a
module:branch combination to a release might mean that with every new
release you'll have to re-add them, but a clever application could do
that automatically anyway (think: checkbox saying "auto-add to next
Collection Release").

Good idea!

* Simplicity over complexity: A PO file is identifiable from its
ComponentInstance and the filename, and some of its attributes are the
translation statistics themselves.

I believe a model of this kind can serve us with many features added,
and can be tweaked with new features landing.

Migration from DL shouldn't be a problem using this model.

Comments welcome.

I've found Damned Lies to be a great test for how a model works, and a script that imports all relations to this model would be a great test of how feasible it is. DL has some crazy inheritance rules (default modules/branches) that makes porting to a relational model a very interesting experience :)

cheers,
asgeir

Stéphane Raimbault

unread,
Jul 28, 2008, 6:07:51 PM7/28/08
to transif...@googlegroups.com
I like your data model because I can port the Vertimus features to that one.
We only need some use cases to test it.

2008/7/28 Asgeir Frimannsson <asg...@gmail.com>
On Wed, Jul 23, 2008 at 10:57 AM, Dimitris Glezos <dimi...@glezos.com> wrote:

Just a note that we might want to consider: SqlAlchemy allows a single class to represent a join. So say for example we have a common VCSRepository table, and then the cvs-plugin contributes e.g. a CVSRepository class that shares primary key + a type='cvs' attribute with VCSRepository. Here, say the Module table had a repository_id foreign key, and a 'repository' relation.

c = session.query(Component).filter_by(id=1)


c = session.query(Component).get(1)

;)



 * Certain models are common in relations to other models. Think for
example, comments: Ideally you could attach comments to any other
model. You can achieve this by 'abstract' relations, ie. no foreign
keys. Just a reference to the app:table:row it refers to.

Interesting concept, but I'm wondering how efficient this approach is when using sqlalchemy. Now, the speed of the web-app is very dependent on how many queries are made to the database, and how efficient these queries are. I'm doubting if you could do e.g. eager loading with this strategy. Say for example you want to list the project together with their maintainers in a table. With foreign keys you can get away with a single ORM query for this relation.

I'm sceptical too.
On the performance side, I used a table named category_stats to do the sum of statistics of its modules. With this intermediate table it's fast to show the numbers of a category. Vertimus only handles the PO files so the problem was easier. It's a bit premature to add a such table for now but we really need to have speed in mind, SA is convenient but add an overhead.

Stéphane


 


Christos Τrochalakis

unread,
Jul 29, 2008, 4:32:56 AM7/29/08
to transif...@googlegroups.com
On Tue, Jul 29, 2008 at 1:07 AM, Stéphane Raimbault
<stephane....@gmail.com> wrote:
> I like your data model because I can port the Vertimus features to that one.
> We only need some use cases to test it.
>
> 2008/7/28 Asgeir Frimannsson <asg...@gmail.com>
>>
>> On Wed, Jul 23, 2008 at 10:57 AM, Dimitris Glezos <dimi...@glezos.com>
>> wrote:
>>
>> Just a note that we might want to consider: SqlAlchemy allows a single
>> class to represent a join. So say for example we have a common VCSRepository
>> table, and then the cvs-plugin contributes e.g. a CVSRepository class that
>> shares primary key + a type='cvs' attribute with VCSRepository. Here, say
>> the Module table had a repository_id foreign key, and a 'repository'
>> relation.
>>
>> c = session.query(Component).filter_by(id=1)
>
> c = session.query(Component).get(1)
>
> ;)
>

lol, I'd write it as
c = Component.query.get(1)
It is more object oriented :)

So, we have 3 diffent ways to write the same think! I am sure dimitris
and diego would choose something different!
It seems there is an urgent need to close this ticket in trac
(http://transifex.org/ticket/12) about using a consistent coding style
:D:D

>
>>>
>>> * Certain models are common in relations to other models. Think for
>>> example, comments: Ideally you could attach comments to any other
>>> model. You can achieve this by 'abstract' relations, ie. no foreign
>>> keys. Just a reference to the app:table:row it refers to.
>>
>> Interesting concept, but I'm wondering how efficient this approach is when
>> using sqlalchemy. Now, the speed of the web-app is very dependent on how
>> many queries are made to the database, and how efficient these queries are.
>> I'm doubting if you could do e.g. eager loading with this strategy. Say for
>> example you want to list the project together with their maintainers in a
>> table. With foreign keys you can get away with a single ORM query for this
>> relation.
>

Hm, yes eager loading would be impossible probably. However I don't
think that this introduces a serious performance overhead. There are
other things that you can do to increase performance more efficiently,
like template caching/ reducing the huge disk IO in widgets
(getFiles()) etc.

My concern is if this kind of relation can be represented in sqla.
That is, can we do a `my_projects.comments` easily?

Christos Τrochalakis

unread,
Jul 29, 2008, 5:04:33 AM7/29/08
to transif...@googlegroups.com
>
> My concern is if this kind of relation can be represented in sqla.
> That is, can we do a `my_projects.comments` easily?
>

* Sqlalachemy supports this relation, it names it Polymorphic
Assosiation but is also known as Generic Relation in the django world.

Here is an example included in the sqla repository:
http://www.sqlalchemy.org/trac/browser/sqlalchemy/trunk/examples/poly_assoc/poly_assoc_generic.py

and the rationale behind this code: http://techspot.zzzeek.org/?p=13

* Elixir also supports Pol. Assosiations(using almost the same code
underneath):
http://elixir.ematia.de/apidocs/elixir.ext.associable.html

the api for elixir is really clear!

Reply all
Reply to author
Forward
0 new messages