Stateless Trytond

Mikhail Savushkin

Nov 15, 2016, 6:35:03 AM
to tryton-dev
Hi there!
I'm a developer from semilimes. We're trying to use Tryton as a cloud ERP.
So, unchaining trytond from a particular DB at launch was our first problem, which was solved thanks to Cedric's advice during TUB'16 (thank you a lot for that, greatly appreciated!)

Now we have another blocker on the road - the Pool building process is too slow to run on every request. And we can't afford to keep Pools for all of the DBs in memory, since that will eat all the available memory pretty quickly. So we have an idea: rewrite a small part of the pool.py module so that it builds only the classes needed for the particular request, i.e. mostly patch the "Pool.get()" method, and throw the data away after the request. We hope that building only a small part of all the classes will cut the building time a lot and allow us to do so on each request without significant response time lag. The difficulty is building only a specific class rather than all at once, since right now Pool just loops through all of them. We are thinking of building some sort of "maps" for each class, which will let us quickly work out what exactly to build, and how, for each needed class.

So, the question is - what do you guys think about it? Are we just plain crazy, or is this a possible way to go?
Or maybe you know some simpler solution for this problem?

Really, just interested in your opinion, guys. 
Thank you!

Cédric Krier

Nov 15, 2016, 7:20:03 AM
to tryton-dev
On 2016-11-15 03:14, Mikhail Savushkin wrote:
> Now we have another blocker on the road - the Pool building process is too
> long to be fired on every request. And we cant afford collecting Pools for
> all of the DBs in memory, since it will eat all the available memory pretty
> quick.

Do you have measurements for that?
Maybe a solution would be to have a way to clear the pool for
databases which have not been accessed for a long time.
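
The idea of clearing pools that have been idle for a long time could be sketched roughly as follows. This is only an illustration, not trytond code: `IdlePoolCache`, `build_pool` and `max_idle_seconds` are hypothetical names, and the "pool" is whatever object the builder callable returns.

```python
import time
from collections import OrderedDict


class IdlePoolCache:
    """Keeps one pool per database and drops pools that were idle too long.

    Hypothetical helper for illustration; not part of trytond.
    """

    def __init__(self, build_pool, max_idle_seconds, clock=time.monotonic):
        self._build_pool = build_pool   # callable: database name -> pool object
        self._max_idle = max_idle_seconds
        self._clock = clock
        self._pools = OrderedDict()     # name -> (pool, last access time)

    def get(self, database_name):
        now = self._clock()
        self.evict_idle(now)
        if database_name in self._pools:
            pool, _ = self._pools.pop(database_name)
        else:
            pool = self._build_pool(database_name)
        # Re-insert at the end so entries stay ordered oldest access first.
        self._pools[database_name] = (pool, now)
        return pool

    def evict_idle(self, now=None):
        now = self._clock() if now is None else now
        # The oldest entries come first, so stop at the first fresh one.
        while self._pools:
            name, (_, last_access) = next(iter(self._pools.items()))
            if now - last_access <= self._max_idle:
                break
            del self._pools[name]

    def __len__(self):
        return len(self._pools)
```

With this shape, the cost of rebuilding a pool is only paid for databases that come back after being idle, while memory stays bounded by the number of recently active databases.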

> So we have an idea: rewrite a small part of the pool.py module, so
> that it will build only the currently needed classes for the particular
> request, i.e. patch the "Pool.get()" method mostly. And throw the data away
> after request. So, we hope that building only a small part of all the
> classes out there will cut the building time a lot, and will allow us to do
> so on each request without significant response time lags.

Some requests may need to access many classes; in such cases it will
not be very efficient to build the classes on the fly.
Also, to build a class, you need to query the database to know which
modules are activated and build the dependency graph, so this will
generate a lot of queries.

--
Cédric Krier - B2CK SPRL
Email/Jabber: cedric...@b2ck.com
Tel: +32 472 54 46 59
Website: http://www.b2ck.com/

Jean Cavallo

Nov 15, 2016, 9:19:24 AM
to tryto...@googlegroups.com
2016-11-15 13:19 GMT+01:00 Cédric Krier <cedric...@b2ck.com>:
> On 2016-11-15 03:14, Mikhail Savushkin wrote:
>> Now we have another blocker on the road - the Pool building process is too
>> long to be fired on every request. And we cant afford collecting Pools for
>> all of the DBs in memory, since it will eat all the available memory pretty
>> quick.
>
> Do you have measures for that?

I would be interested as well to get some metrics! How many DBs are you
talking about?
 
>> So we have an idea: rewrite a small part of the pool.py module, so
>> that it will build only the currently needed classes for the particular
>> request, i.e. patch the "Pool.get()" method mostly. And throw the data away
>> after request. So, we hope that building only a small part of all the
>> classes out there will cut the building time a lot, and will allow us to do
>> so on each request without significant response time lags.
>
> Some requests may require to access many classes in such case it will
> not be very efficient to build the class on the fly.
> Also to build a class, you need to query the database to know which
> modules are activated and build the dependency graph, so this will
> generate a lot of queries.

A hybrid solution could maybe be to select some "core" models that will
always be loaded in memory (typically all ir / res models), and load the
others on demand. But I agree with Cedric that I am not sure the gain would
be that interesting even in the best case scenario (i.e. only one or two models
to build).
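
To make the hybrid idea concrete, here is a minimal sketch, assuming each model can be built independently from a zero-argument builder (which the thread suggests is not trivially true in trytond). `LazyRegistry` and `builders` are hypothetical names used only for illustration.

```python
class LazyRegistry:
    """Builds "core" models eagerly and all other models on first access.

    Hypothetical illustration; real trytond classes depend on each other
    and on the activated modules, which this sketch ignores.
    """

    def __init__(self, builders, core_prefixes=("ir.", "res.")):
        self._builders = builders   # model name -> zero-argument builder
        self._built = {}
        for name, builder in builders.items():
            if name.startswith(core_prefixes):
                self._built[name] = builder()

    def get(self, name):
        # Build on demand and remember the result for later calls.
        if name not in self._built:
            self._built[name] = self._builders[name]()
        return self._built[name]

    def built_names(self):
        return sorted(self._built)
```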

Jean Cavallo
Coopengo

Mikhail Savushkin

Nov 15, 2016, 10:20:04 AM
to tryton-dev
Metrics are:

"""
DBs in Pool    RSS, kBytes    Added to memory, kBytes
0               65,020        -
1               86,576        +21,500
2              105,520        +19,000
3              114,520        + 9,000
4              123,708        + 9,000
5              132,444        + 9,000
"""

And regarding the efficiency - that's why we want to use "maps", possibly stored in something like Redis.
Btw, we would be totally happy to just store the entire Pool in Redis and fetch it from Redis per request, but as far as I understand, serializing the Pool is practically impossible.

Since we're talking about millions of DBs, the straightforward approach of keeping every Pool in memory won't work.
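
A back-of-the-envelope extrapolation of the RSS figures above shows the scale of the problem. It assumes that memory keeps growing linearly at the rate observed from the third DB onward, that the kBytes are units of 1024 bytes, and that "millions" means exactly one million; all three are assumptions made only for this estimate.

```python
# RSS in kBytes for 0..5 databases loaded in the Pool, from the table above.
rss_kb = [65020, 86576, 105520, 114520, 123708, 132444]

# Exact growth per added database (the table shows rounded values).
deltas = [after - before for before, after in zip(rss_kb, rss_kb[1:])]

# The first two databases cost more; use the later ones as the steady rate.
steady_kb_per_db = sum(deltas[2:]) / len(deltas[2:])

# Assumption: one million databases, linear growth, 1 kByte = 1024 bytes.
databases = 1_000_000
estimate_tib = steady_kb_per_db * databases / 1024 ** 3

print(f"about {steady_kb_per_db:.0f} kB per DB, "
      f"about {estimate_tib:.1f} TiB for {databases:,} DBs")
```

Under these assumptions the total lands in the range of several TiB, which is why keeping every Pool resident in a single process is not an option at that scale.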

On Tuesday, November 15, 2016 at 17:19:24 UTC+3, Giovanni wrote:

Cédric Krier

Nov 15, 2016, 11:10:03 AM
to tryton-dev
On 2016-11-15 07:02, Mikhail Savushkin wrote:
> Metrics are:
>
>
> """
> DBs in Pool    RSS, kBytes    Added to memory, kBytes
> 0               65,020        -
> 1               86,576        +21,500
> 2              105,520        +19,000
> 3              114,520        + 9,000
> 4              123,708        + 9,000
> 5              132,444        + 9,000
> """
>
>
> And regarding the efficiency - thats why we wanna use "maps", possibly stored in some Redis.
>
> Btw, we would be totally happy to just store the entire Pool in Redis and fetch it from Redis per request, but as far as I get, serializing Pool is practically impossible.
>
>
> As long as we're talking about millions of DBs, this straightforward
> approach wont work.

On a standard Tryton setup, you should have about 500 classes and I
think a standard RPC call will use at least 20 classes. I do not think
you will have a huge gain by limiting the number of classes you set up in
the pool. In fact, I think you should focus on speeding up the init of the
Pool for a database (and implement garbage collection of unused DBs).
You could also try to dispatch requests for the same DB to a subset of
servers.
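
Dispatching requests for the same DB to the same server could, for instance, be done by hashing the database name. This is a minimal sketch with a hypothetical `server_for` function and made-up server names; a real deployment would more likely use consistent hashing in the load balancer so that adding a server does not remap most databases.

```python
import hashlib


def server_for(database_name, servers):
    """Pick a server deterministically from the database name.

    Hypothetical helper: a stable hash is used instead of Python's hash(),
    which is randomised per process for strings.
    """
    digest = hashlib.sha256(database_name.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


servers = ["trytond-1", "trytond-2", "trytond-3"]   # made-up names
chosen = server_for("customer_42", servers)
```

Every request for `customer_42` then reaches the same server, so only that server has to keep the corresponding Pool in memory.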

By the way, did you test with this changeset:
http://hg.tryton.org/trytond/rev/f1cbb165eefa

Here are some places where I think there could be improvements:

- Cache the result of create_graph. This is built using only static
  data.

- Cache the module_list to avoid querying the database; it needs to be
  cleared when a new module is activated (on update).

- Improve the __setup__ methods:

    - find something better than deepcopy:
      http://hg.tryton.org/trytond/file/default/trytond/model/model.py#l46

    - idem for ModelView

- Improve the __post_setup__ methods:

    - _defaults could maybe be removed

    - idem for _fields (or maybe filled on first use)
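
The first two caching ideas in the list above could look roughly like this. It is a sketch only: `DEPENDS`, `load_order`, `activated_modules` and `on_module_update` are hypothetical stand-ins and do not reproduce the real create_graph or module_list code in trytond.

```python
from functools import lru_cache

# Hypothetical static dependency data (module name -> dependencies).
DEPENDS = {
    "ir": (),
    "res": ("ir",),
    "party": ("ir", "res"),
    "account": ("party",),
}


@lru_cache(maxsize=None)
def load_order(modules):
    """Return the modules and their dependencies in dependency order.

    `modules` must be a tuple so that lru_cache can use it as a key.
    The result depends only on static data, so it can be cached forever.
    """
    ordered, seen = [], set()

    def visit(name):
        if name in seen:
            return
        seen.add(name)
        for dependency in DEPENDS[name]:
            visit(dependency)
        ordered.append(name)

    for name in sorted(modules):
        visit(name)
    return tuple(ordered)


_module_list_cache = {}


def activated_modules(database_name, query):
    """Return the activated modules, querying the database only once."""
    if database_name not in _module_list_cache:
        _module_list_cache[database_name] = tuple(sorted(query(database_name)))
    return _module_list_cache[database_name]


def on_module_update(database_name):
    """Clear the cached module list after a module was activated."""
    _module_list_cache.pop(database_name, None)
```

Because the module list is normalised to a sorted tuple, two databases with the same activated modules share one cached graph result.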


PS: Please do not top-post on this mailing list, see
https://groups.tryton.org/netiquette

Jean Cavallo

Nov 15, 2016, 11:10:42 AM
to tryto...@googlegroups.com

2016-11-15 16:02 GMT+01:00 Mikhail Savushkin <kul...@gmail.com>:
> As long as we're talking about millions of DBs, this straightforward approach wont work.

Maybe another approach could be to group the pools per module combination?
I.e. if DB1 has the same installed modules as DB2, both could use the
same pool.

I do not think there are that many "usable" combinations, and you could probably
make it work. You could also dispatch to different servers depending on the installed
modules of the target DB; you would just have to maintain a cache on your load
balancer to match db_name / server.
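
Sharing one pool between all databases that have the same set of installed modules could be sketched as below. `SharedPools` and `build_pool` are hypothetical names; the sketch assumes that a pool depends only on the module set, which would have to be verified against trytond itself.

```python
class SharedPools:
    """Caches one pool per distinct combination of installed modules.

    Hypothetical illustration; not part of trytond.
    """

    def __init__(self, build_pool):
        self._build_pool = build_pool   # callable: frozenset of modules -> pool
        self._by_modules = {}

    def get(self, modules):
        # A frozenset ignores ordering, so equal module sets share a key.
        key = frozenset(modules)
        if key not in self._by_modules:
            self._by_modules[key] = self._build_pool(key)
        return self._by_modules[key]

    def __len__(self):
        return len(self._by_modules)
```

The memory cost then grows with the number of distinct module combinations rather than with the number of databases.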

Jean Cavallo
Coopengo

Albert Cervera i Areny

Nov 17, 2016, 3:55:57 AM
to tryto...@googlegroups.com
I'm not sure of what I'll say, but AFAIU deepcopy is used to support
several databases (with different modules). Would it be possible to
pass a "--single-database" parameter to trytond, so it would just
support a single database but it would also avoid deepcopying? Or is
there much more work than that?

--
Albert Cervera i Areny
http://www.NaN-tic.com
Tel. 93 553 18 03

Sergi Almacellas Abellana

Nov 17, 2016, 4:01:55 AM
to tryto...@googlegroups.com
On 17/11/16 at 09:55, Albert Cervera i Areny wrote:
> 2016-11-15 17:05 GMT+01:00 Cédric Krier <cedric...@b2ck.com>:
>> On 2016-11-15 07:02, Mikhail Savushkin wrote:
>>> As long as we're talking about millions of DBs, this straightforward
>>> approach wont work.
>>
>> - Improve the __setup__ methods:
>>
>> - find better than deepcopy:
>> http://hg.tryton.org/trytond/file/default/trytond/model/model.py#l46
>
> I'm not sure of what I'll say, but AFAIU deepcopy is used to support
> several databases (with different modules). Would it be possible to
> pass a "--single-database" parameter to trytond, so it would just
> support a single database but it would also avoid deepcopying? Or is
> there much more work than that?

Indeed, there is an issue to remove multiple database support [1], so if
deepcopy is only used to support several databases, it should be removed
as part of that issue, and it would be good to add a note there so that
it is not forgotten.

[1] https://bugs.tryton.org/issue5694


--
Sergi Almacellas Abellana
www.koolpi.com
Twitter: @pokoli_srk

Cédric Krier

Nov 17, 2016, 4:30:03 AM
to tryto...@googlegroups.com
On 2016-11-17 09:55, Albert Cervera i Areny wrote:
> > - find better than deepcopy:
> > http://hg.tryton.org/trytond/file/default/trytond/model/model.py#l46
>
> I'm not sure of what I'll say, but AFAIU deepcopy is used to support
> several databases (with different modules). Would it be possible to
> pass a "--single-database" parameter to trytond, so it would just
> support a single database but it would also avoid deepcopying? Or is
> there much more work than that?

It is not only for multi-database. For example, a common pattern (which
probably should be removed) is to have a global variable STATES which is
used in a Selection. If another module extends the Selection field, we do
not want it to change the global STATES variable.

In fact, what I was suggesting is not to remove the feature, because it is
quite convenient (remember old versions, where a copy had to be made of any
modified field attribute). I think the deepcopy could be improved by
limiting what is copied. This could be done by creating a __copy__ method
on fields (like Function and Property).
But first, it has to be proven that deepcopy in this case is slow and
memory consuming.
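
The STATES problem and the idea of limiting what is copied can be illustrated with a toy field class. Note the assumptions: `Field` here is a hypothetical stand-in, not trytond's field class, and the sketch uses the `__deepcopy__` hook (the one `copy.deepcopy` actually consults) rather than `__copy__`. Whether this is faster in trytond would still have to be measured.

```python
import copy

# Module-level value shared by several field definitions.
STATES = {"readonly": True}


class Field:
    """Toy field; hypothetical stand-in for a trytond field."""

    def __init__(self, string, states, help_text=""):
        self.string = string
        self.states = states
        self.help_text = help_text

    def __deepcopy__(self, memo):
        # Copy only the attribute an extending module may mutate in place;
        # the immutable strings are simply shared with the original.
        clone = type(self).__new__(type(self))
        clone.string = self.string
        clone.help_text = self.help_text
        clone.states = copy.deepcopy(self.states, memo)
        return clone


original = Field("Name", STATES)

# An extending module works on a copy, so its change stays local.
extended = copy.deepcopy(original)
extended.states["readonly"] = False
```

After this, `STATES` still holds its original value, because the extension only touched the copied `states` dictionary.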

Mikhail Savushkin

Nov 17, 2016, 8:10:03 AM
to tryton-dev


On Thursday, November 17, 2016 at 12:30:03 UTC+3, Cédric Krier wrote:

I can confirm that `deepcopy` is indeed slow. It takes about 95% of the whole running time of __setup__, and removing it speeds up the whole Pool building process by about 60% (in my tests it went from 1 s "before" to 400 ms "after"). We're now trying to understand how we can remove it or make it lighter. We will definitely share our conclusions later on, when ready.
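
A measurement of this kind can be reproduced in miniature with `timeit`. The data below is synthetic (the `FIELDS` dictionary and both setup functions are made up for illustration), so the numbers it prints say nothing about trytond itself; only the last line uses the figures quoted above.

```python
import copy
import timeit

# Synthetic stand-in for field definitions; not real trytond data.
FIELDS = {
    f"field_{i}": {
        "states": {"readonly": [("state", "=", "done")]},
        "depends": ["state"],
    }
    for i in range(200)
}


def setup_with_deepcopy():
    # Every attribute dictionary is copied recursively.
    return {name: copy.deepcopy(attrs) for name, attrs in FIELDS.items()}


def setup_without_copy():
    # Only the outer mapping is copied; attribute dictionaries are shared.
    return dict(FIELDS)


with_copy = timeit.timeit(setup_with_deepcopy, number=50)
without_copy = timeit.timeit(setup_without_copy, number=50)
print(f"with deepcopy: {with_copy:.4f} s, without: {without_copy:.4f} s")

# Reduction for the figures quoted in the thread: 1 s before, 400 ms after.
reduction = (1.0 - 0.4) / 1.0
print(f"reduction: {reduction:.0%}")
```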