openerp2tryton : painfully long in migrate_account_balance


Richard PALO

Jul 18, 2017, 7:01:35 AM7/18/17
to tryton
I'm looking for hints as to how to prod move.save()
to complete or at least to give some info as to why it isn't.

It appears to be cpu bound (indeed, the openerp base is quite large).

move.save is at the end of the following, from line 1065..1070 in migrate_account_balance
> move.lines.append(new_line)
> new_line.account = get_account(line.code, code2account)
> if new_line.account.party_required:
>     new_line.party = code2party[str(line.partner_id)]
>
> move.save()

phpPgAdmin seems to believe that the following query is cycling:
> SELECT "a"."rounding" AS "rounding", "a"."active" AS "active", "a"."code" AS "code", "a"."create_date" AS "create_date", "a"."create_uid" AS "create_uid", "a"."digits" AS "digits", "a"."id" AS "id", "a"."mon_decimal_point" AS "mon_decimal_point", "a"."mon_grouping" AS "mon_grouping", "a"."mon_thousands_sep" AS "mon_thousands_sep", "a"."n_cs_precedes" AS "n_cs_precedes", "a"."n_sep_by_space" AS "n_sep_by_space", "a"."n_sign_posn" AS "n_sign_posn", "a"."negative_sign" AS "negative_sign", "a"."numeric_code" AS "numeric_code", "a"."p_cs_precedes" AS "p_cs_precedes", "a"."p_sep_by_space" AS "p_sep_by_space", "a"."p_sign_posn" AS "p_sign_posn", "a"."positive_sign" AS "positive_sign", "a"."symbol" AS "symbol", "a"."write_date" AS "write_date", "a"."write_uid" AS "write_uid" FROM "currency_currency" AS "a" WHERE (("a"."id" IN (44)))

when I say long, it is already over 5 hours!
--

Richard PALO

Cédric Krier

Jul 18, 2017, 7:28:13 AM7/18/17
to tryton
On 2017-07-18 13:01, Richard PALO wrote:
> I'm looking for hints as to how to prod move.save()
> to complete or at least to give some info as to why it isn't.
>
> It appears to be cpu bound (indeed, the openerp base is quite large).
>
> move.save is at the end of the following, from line 1065..1070 in migrate_account_balance
> > move.lines.append(new_line)
> > new_line.account = get_account(line.code, code2account)
> > if new_line.account.party_required:
> >     new_line.party = code2party[str(line.partner_id)]
> >
> > move.save()

How do you know it is the problem?

> phppgadmin seems to believe that the following is cycling:
> > SELECT "a"."rounding" AS "rounding", "a"."active" AS "active", "a"."code" AS "code", "a"."create_date" AS "create_date", "a"."create_uid" AS "create_uid", "a"."digits" AS "digits", "a"."id" AS "id", "a"."mon_decimal_point" AS "mon_decimal_point", "a"."mon_grouping" AS "mon_grouping", "a"."mon_thousands_sep" AS "mon_thousands_sep", "a"."n_cs_precedes" AS "n_cs_precedes", "a"."n_sep_by_space" AS "n_sep_by_space", "a"."n_sign_posn" AS "n_sign_posn", "a"."negative_sign" AS "negative_sign", "a"."numeric_code" AS "numeric_code", "a"."p_cs_precedes" AS "p_cs_precedes", "a"."p_sep_by_space" AS "p_sep_by_space", "a"."p_sign_posn" AS "p_sign_posn", "a"."positive_sign" AS "positive_sign", "a"."symbol" AS "symbol", "a"."write_date" AS "write_date", "a"."write_uid" AS "write_uid" FROM "currency_currency" AS "a" WHERE (("a"."id" IN (44)))

I do not think it is a problem; I guess it just shows you the last
query run by the session.

> when I say long, it is already over 5 hours!

Yep, proteus is not specially designed for performance.
--
Cédric Krier - B2CK SPRL
Email/Jabber: cedric...@b2ck.com
Tel: +32 472 54 46 59
Website: http://www.b2ck.com/

Cédric Krier

Jul 18, 2017, 7:35:06 AM7/18/17
to tryton
On 2017-07-18 13:27, Cédric Krier wrote:
> On 2017-07-18 13:01, Richard PALO wrote:
> > I'm looking for hints as to how to prod move.save()
> > to complete or at least to give some info as to why it isn't.
> >
> > It appears to be cpu bound (indeed, the openerp base is quite large).

It would be good to have an estimate of the number of accounts and parties,
because the query for the balance could return a huge number of lines
and may not fit well in the memory of the host. In that case, it may
be better to create the move without lines and create the lines in small
bunches.
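A minimal sketch of that idea, assuming proteus' raw Model.create is available on the line model and takes the context as last argument (helper names, field names, and the chunk size are all illustrative, not code from the actual migration script):

```python
from itertools import islice


def chunked(iterable, size):
    """Yield successive lists of at most `size` items."""
    it = iter(iterable)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def create_lines_in_batches(Line, move_id, values, context, size=500):
    """Create move lines in small bunches: one RPC call per bunch.

    `Line` is assumed to be proteus' Model.get('account.move.line');
    each dict in `values` describes one line and gets the move attached.
    """
    ids = []
    for batch in chunked(values, size):
        for v in batch:
            v['move'] = move_id
        # raw Model.create: one round trip per batch instead of per line
        ids.extend(Line.create(batch, context))
    return ids
```

The move itself would be saved first, without lines; each batch then lands on the server in a single create call.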

Richard PALO

Jul 18, 2017, 7:35:30 AM7/18/17
to try...@googlegroups.com
On 18/07/2017 at 13:27, Cédric Krier wrote:
> On 2017-07-18 13:01, Richard PALO wrote:
>> I'm looking for hints as to how to prod move.save()
>> to complete or at least to give some info as to why it isn't.
>>
>> It appears to be cpu bound (indeed, the openerp base is quite large).
>>
>> move.save is at the end of the following, from line 1065..1070 in migrate_account_balance
>>> move.lines.append(new_line)
>>> new_line.account = get_account(line.code, code2account)
>>> if new_line.account.party_required:
>>>     new_line.party = code2party[str(line.partner_id)]
>>>
>>> move.save()
>
> How do you know it is the problem?
>
I added print statements to trace it, confirming that the 'for line in cur' loop had completed.

> ...
> I do not think it could be a problem, I guess it just show you the last
> query run by the session.
>
yeah, that's probably the case.

>> when I say long, it is already over 5 hours!
>
> Yep, proteus is not specially designed for performance.
>
yikes, guess I'll need to try breaking things down a bit differently,
that is, instead of putting all the move lines in one monster 'migration' move.

cheers,
--

Richard PALO

E. Boer

Jul 18, 2017, 4:16:34 PM7/18/17
to tryton, richar...@free.fr

>>> when I say long, it is already over 5 hours!
>>
>> Yep, proteus is not specially designed for performance.
>>
> yikes, guess I'll need to try breaking things down a bit differently,
> that is instead of putting all move lines in a monster 'migration' move.

We ran into the same problem and created a client script which talks to the Tryton server over JSON-RPC. We are now migrating 300,000 account_move_lines within 1 hour 45 minutes. It's a bit more work to get it running. Another option is to create a migration module which does the job.

Richard PALO

Jul 20, 2017, 11:32:00 AM7/20/17
to try...@googlegroups.com
A bit more info now: it seems the French account plan has some issues where some
accounts have "party_required == true" (such as 421, 4387, 4687, and 486)
but these don't necessarily have parties identified.

This was the error:

> Traceback (most recent call last):
>   File "migration.py3", line 1162, in <module>
>     'account': args.load_account,
>   File "migration.py3", line 79, in main
>     migrate_account_balance(installed, cur)
>   File "migration.py3", line 1098, in migrate_account_balance
>     move.save()
>   File "/usr/lib/python3.6/site-packages/proteus-4.4.1-py3.6.egg/proteus/__init__.py", line 101, in newfunc
>   File "/usr/lib/python3.6/site-packages/proteus-4.4.1-py3.6.egg/proteus/__init__.py", line 769, in save
>   File "/usr/lib/python3.6/site-packages/proteus-4.4.1-py3.6.egg/proteus/config.py", line 172, in __call__
>   File "/opt/trytond/trytond/modules/account/move.py", line 248, in write
>     super(Move, cls).write(*args)
>   File "/opt/trytond/trytond/model/modelsql.py", line 968, in write
>     field.set(cls, fname, *fargs)
>   File "/opt/trytond/trytond/model/fields/one2many.py", line 217, in set
>     Target.create(to_create)
>   File "/opt/trytond/trytond/modules/account/move.py", line 1454, in create
>     lines = super(Line, cls).create(vlist)
>   File "/opt/trytond/trytond/model/modelsql.py", line 604, in create
>     cls._validate(sub_records)
>   File "/opt/trytond/trytond/model/modelstorage.py", line 1015, in _validate
>     field_name)
>   File "/opt/trytond/trytond/model/modelstorage.py", line 999, in required_test
>     error_args=cls._get_error_args(field_name))
>   File "/opt/trytond/trytond/error.py", line 74, in raise_user_error
>     raise UserError(error)
> trytond.exceptions.UserError: The field "Party" on "Account Move Line" is required. -

I'll file an issue, but in the meanwhile I presume I'll need to see if I can programmatically
set "party_required" to false for these accounts.

Can't directly explain why it takes so long before throwing this error, but oh well,
it may be because there are between 15K and 25K move lines per fiscal year. So far I split
by year, but might as well go for year/period/journal/move while I'm at it (for a full, deep migration).
--

Richard PALO

E. Boer

Jul 20, 2017, 5:00:54 PM7/20/17
to tryton, richar...@free.fr


On Thursday, 20 July 2017 17:32:00 UTC+2, Richard PALO wrote:
> A bit more info now, seems the French Account plan has some issues where some
> accounts have "party_required == true" (such as 421, 4387, 4687, 486)
> but these don't necessarily have parties identified.

You are running into the same problems as we had, because OpenERP is very forgiving when you put garbage in. Tryton is very strict, which we really like because it keeps the data "clean". I think you will also run into trouble when migrating sales and invoices, because maybe you added the wrong taxes or a wrong account to an invoice or sale order.

> Can't directly explain why it takes so long before throwing this error, but oh well,
> it may be because there are between 15K and 25K move lines per fiscal year. So far I split
> by year, but might as well go for year/period/journal/move why I'm at it (for a full, deep migration).

Like Cédric already said, Proteus is not meant to be used for a huge amount of data. You can create a script as explained in the (very) old wiki: http://hg.tryton.org/deprecated/tryton.wiki/file/tip/RemoteCalls.wiki

Cédric Krier

Jul 20, 2017, 6:45:08 PM7/20/17
to tryton
On 2017-07-20 14:00, E. Boer wrote:
> Like Cedric already said, Proteus is not ment to be used for a huge amount
> of data. You can create a script like explained in the (very) old wiki:
> http://hg.tryton.org/deprecated/tryton.wiki/file/tip/RemoteCalls.wiki

Indeed, such calls are doable with proteus by using the raw functions of the
Model, like Model.create or Model.write.
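A minimal sketch of such a raw call, assuming a configured proteus connection and that proteus forwards Model class methods like create() to the server with the context as last argument; the model names match the thread, but the helper and field values are illustrative:

```python
from decimal import Decimal


def to_line_values(move_id, rows):
    """Turn (account_id, debit, credit) tuples into create() dictionaries."""
    return [{'move': move_id,
             'account': account,
             'debit': Decimal(str(debit)),
             'credit': Decimal(str(credit))}
            for account, debit, credit in rows]


def create_move(conf, journal_id, period_id, rows):
    """Create a move and all its lines in two raw RPC calls (a sketch).

    `conf` is assumed to come from proteus' config.set_xmlrpc(...);
    `rows` is an iterable of (account_id, debit, credit) tuples.
    """
    from proteus import Model  # requires proteus to be installed/configured
    Move = Model.get('account.move')
    Line = Model.get('account.move.line')
    # one call for the move, one call for the whole list of lines
    move_id, = Move.create(
        [{'journal': journal_id, 'period': period_id}], conf.context)
    Line.create(to_line_values(move_id, rows), conf.context)
    return move_id
```

Compared with building proteus instances and calling move.save(), this bypasses the per-record instance handling entirely.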

Richard PALO

Jul 21, 2017, 9:47:26 AM7/21/17
to try...@googlegroups.com
On 21/07/2017 at 00:42, Cédric Krier wrote:
> On 2017-07-20 14:00, E. Boer wrote:
>> Like Cedric already said, Proteus is not ment to be used for a huge amount
>> of data. You can create a script like explained in the (very) old wiki:
>> http://hg.tryton.org/deprecated/tryton.wiki/file/tip/RemoteCalls.wiki
> Indeed such call is doable with proteus by using the raw function of the
> Model like Model.create or Model.write.
>
E. Boer, nice to know you've been able to get such throughput, which seems much more reasonable.
You don't happen to have a version of your script that could work with oerp6.1->tryton4.4?

Cedric, since I'm already in proteus, the Model.write() seems quite worthwhile to try out first.

Would it be possible to have a very simple example snippet with a move record including a number of move lines
such that I can better understand the syntax of what is expected?

As to .save(), it appears to suffer seriously, even exponentially, with large numbers of move lines.

For info, I placed the move.save() inside the loop to see via the database how things advanced with 15K lines
(having already split by fiscal year): it would do a couple hundred relatively fast, but by a thousand it was slowing down
fast, and many hours later only a total of roughly 2500 move lines had been saved, so extrapolating, 15K is probably close to
infinity, time-wise. *I presume* multiple moves would already help greatly (but I haven't had a chance yet to try it).

cheers

--

Richard PALO

Sergi Almacellas Abellana

Jul 21, 2017, 9:59:38 AM7/21/17
to try...@googlegroups.com
On 21/07/17 at 15:47, Richard PALO wrote:
> Le 21/07/2017 à 00:42, Cédric Krier a écrit :
>> On 2017-07-20 14:00, E. Boer wrote:
>>> Like Cedric already said, Proteus is not ment to be used for a huge
>>> amount
>>> of data. You can create a script like explained in the (very) old wiki:
>>> http://hg.tryton.org/deprecated/tryton.wiki/file/tip/RemoteCalls.wiki
>> Indeed such call is doable with proteus by using the raw function of the
>> Model like Model.create or Model.write.
>>
> E.Boer, nice to know you've been able to get such throughput, which seems
> much more reasonable.
> You don't happen to have a version of your script that could work with
> oerp6.1->tryton4.4?

If you are using proteus 4.4, it already does a single call to the
server, using create with the list of all lines.

IIUC, you are creating a move with 15k lines. So I'm wondering if
increasing the record cache size [1] to a number higher than 15k for the
migration would speed things up a little bit. Of course, this will
increase memory usage, so just take care you don't run out of memory.

Hope it helps.

[1]
http://doc.tryton.org/4.4/trytond/doc/topics/configuration.html?highlight=cache#record
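For reference, the setting lives in the [cache] section of trytond.conf; the value below is only an illustration sized for a 15k-line move, not a recommended production value:

```ini
; trytond.conf (sketch): raise the record cache for the migration only
[cache]
record = 20000
```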

--
Sergi Almacellas Abellana
www.koolpi.com
Twitter: @pokoli_srk

Richard PALO

Jul 23, 2017, 2:14:12 PM7/23/17
to try...@googlegroups.com
On 21/07/2017 at 15:59, Sergi Almacellas Abellana wrote:
> If you are using proteus 4.4, it's already doing one single call to the
> server using the create and the list of all lines.
>
> IIUC, you are creating a move with 15k lines. So I'm wondering if
> increasing the record cache size [1] to a number higher than 15k for the
> migration will speedup things a little bit. Of course, this will
> increase the memory usage, so just take care you don't get out of memory.
>
> Hope it helps.
>
> [1]
> http://doc.tryton.org/4.4/trytond/doc/topics/configuration.html?highlight=cache#record
>

This was a nice tip, and it allowed me to get much much further. Thanks.

In the end, once I created a Tryton move per OpenERP move, it seemed I
could live without modifying the cache and get 5 1/2 years done in under a
couple of hours (fiscal year by fiscal year).

--

Richard PALO
"Believe nothing, no matter where you read it, or who said it,
unless it agrees with your own reason."

E. Boer

Jul 25, 2017, 6:18:32 AM7/25/17
to tryton, richar...@free.fr
> E.Boer, nice to know you've been able to get such throughput, which seems much more reasonable.
> You don't happen to have a version of your script that could work with oerp6.1->tryton4.4?

We are migrating from OpenERP v5 to Tryton 4.0. Yes, we have been working on it for a long time, but we want to do it right. After the migration we are going to update to Tryton 4.4. I can send you parts of the script so you can get an idea of how we did it, but just to mention that it's very specific to our needs. The main thing is how we connect to the OpenERP database and connect to Tryton.

So please let me know if you want parts of the script.


Richard PALO

Sep 20, 2017, 1:25:56 PM9/20/17
to tryton

Thought I would mention that I'm seeing roughly a 2.5x speedup using pgbouncer
over vanilla PostgreSQL socket connections (via tryton.conf) where pg and
proteus are running on the same iron.

Pulling complete moves/lines by period by fiscal year, I'm averaging ~100 seconds
per period (roughly 30 minutes per fiscal year, with a rough average of 21-22K lines/year).

cheers,

--

Richard PALO

Albert Cervera i Areny

Oct 6, 2017, 5:32:25 AM10/6/17
to try...@googlegroups.com
2017-09-20 19:25 GMT+02:00 Richard PALO <richar...@free.fr>:

Thought I would mention that I'm seeing roughly a 2,5x speedup using pgbouncer
over vanilla postgresql socket connections (via tryton.conf) where pg and
proteus are running on the same iron.

I'm surprised you're seeing this speedup using pgbouncer on the same machine. Pgbouncer is just a connection pooler, and Tryton already has a connection pooler itself, so it would be great if you could investigate further. Some things that come to mind:

- Tryton is not handling the pool of connections correctly and is thus creating new connections more frequently than it should
- You used TCP/IP sockets when you worked with PostgreSQL but use Unix sockets now that you talk to pgbouncer (so the difference would come from the type of socket, not pgbouncer itself)
- You changed other parameters in Postgres (or Tryton). For example, you should usually change the following default postgresql.conf parameters:

  - Increase work_mem -> This is the amount of memory that can be used per connection, so you may put one or two hundred megabytes
  - Increase shared_buffers -> Depending on database size, but you may want 1GB, for example (if you've got enough memory, of course)
  - Consider using synchronous_commit = off -> If you use it in production I recommend you try to understand its implications (we use it in our installs)
  - If you run the migration process on a non-production machine you can use "fsync = off". This can have a huge impact on performance, but do NOT use it in production EVER. I recommend it for development environments if you know that no database you use is critical. It will also have a huge impact when restoring databases.

Pulling complete moves/lines by period by fiscalyear I'm averaging ~100 seconds
per period (roughly 30 minutes per fiscalyear with a rough average of 21-22K lines/year)

Don't have numbers to compare, so it's just intuition, but it does not sound especially fast.
 

cheers,

--

Richard PALO


Richard PALO

Oct 6, 2017, 8:39:41 AM10/6/17
to try...@googlegroups.com
On 06/10/2017 at 11:32, Albert Cervera i Areny wrote:
> 2017-09-20 19:25 GMT+02:00 Richard PALO <richar...@free.fr <mailto:richar...@free.fr>>:
>
>
> Thought I would mention that I'm seeing roughly a 2,5x speedup using pgbouncer
> over vanilla postgresql socket connections (via tryton.conf) where pg and
> proteus are running on the same iron.
>
>
> I'm surprised you're seeing this speedup using pgbouncer on the same machine. Pgbouncer is just a connection pooler and Tryton already has a connection pooler itself, so it would be great if you could investigate it further. Some things that come to mind:
>
> - Tryton is not handling correctly the pool of connections and thus creating a new connection more frequently than it should
> - You used TCP/IP sockets when you worked with PostgreSQL but use UNIX sockets now that you talk to pgbouncer (so the difference would come from the type of socket, not pgbouncer itself)

This is not quite true.
Initially I used Unix sockets directly:
> uri = postgresql://tryton@/run/postgresql/
and now as well, but underneath pgbouncer:
> uri = postgresql://tryton@:6432

pgbouncer.ini uses unix_socket_dir = /run/postgresql

> - You changed other parameters in Postgres (or Tryton). For example, you should usually change the following default postgresql.conf parameters:
>
>   - Increase work_mem -> This is the amount of memory to be used per connection so you may put one or two hundred megabytes
work_mem = 256MB # min 64kB

>   - Increase shared_buffers -> Depending on database size, but you may want 1GB, for example (if you've got enough memory, of course)

shared_buffers = 4GB # min 128kB
NB: 32GB main memory on my workstation

>   - Consider using synchronous_commit = off. -> If you use it in production I recommend you try to understand its implications (we use it in our installs)

This is indeed interesting, will look into it more deeply.

>   - If you run the migration process on a non-production machine you can use "fsync =off". This can have a huge impact on performance but do NOT use it in production EVER. But I recommend it for development environments if you know that no database you use is critical. It will also have a huge impact when restoring databases.

I never use these...

In any event, once I get all the gritty conversion details completed, I can focus a bit more on runtime tuning.
I'm also wondering if psycopg2 usage needs to be looked at too. I came across this 'best practice':
>
> When should I save and re-use a cursor as opposed to creating a new one as needed?
> Cursors are lightweight objects and creating lots of them should not pose any kind of problem. But note that cursors used to fetch result sets will cache the data and use memory in proportion to the result set size. Our suggestion is to almost always create a new cursor and dispose old ones as soon as the data is not required anymore (call close() on them.) The only exception are tight loops where one usually use the same cursor for a whole bunch of INSERTs or UPDATEs.

So perhaps I'll start there first.
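That advice applies to any DB-API driver; a minimal sketch of the two patterns, using the stdlib sqlite3 module to stand in for psycopg2 (psycopg2 would use %s placeholders instead of ?, but the cursor interface is the same), with illustrative table and column names:

```python
import sqlite3


def insert_lines(conn, rows):
    """Tight loop: reuse one cursor for the whole batch of INSERTs,
    then close it, per the psycopg2 FAQ advice."""
    cur = conn.cursor()
    try:
        cur.executemany('INSERT INTO line (debit, credit) VALUES (?, ?)', rows)
    finally:
        cur.close()
    conn.commit()


def fetch_total(conn):
    """One-off query: create a fresh cursor and dispose of it promptly,
    so its cached result set does not hold memory."""
    cur = conn.cursor()
    try:
        cur.execute('SELECT SUM(debit) FROM line')
        total, = cur.fetchone()
        return total
    finally:
        cur.close()


# demo on an in-memory database
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE line (debit REAL, credit REAL)')
insert_lines(conn, [(100.0, 0.0), (0.0, 100.0), (50.0, 0.0)])
```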
>
>
> Pulling complete moves/lines by period by fiscalyear I'm averaging ~100 seconds
> per period (roughly 30 minutes per fiscalyear with a rough average of 21-22K lines/year)
>
>
> Don't have numbers to compare, so it's just intuition, but it does not sound specially fast.
>
At this stage, I don't believe these imports are comparable any longer to 'vanilla' openerp2tryton,
as I'm doing a 'deep' copy now, including journals, periods, reconciliations and all...
(in France, my aim is to be able to use Tryton exclusively for previous fiscal year reports,
including the 'FEC'... I hope to heave OpenERP as far away as possible when fully migrated :P)

cheers,

--

Richard PALO