trytond-cron sleeping forever

Sergi Almacellas Abellana

May 30, 2017, 3:32:12 AM
to try...@googlegroups.com
Hi,

I have found a strange issue when using trytond-cron. The process
runs correctly, but after a random interval (3-4 days) it starts sleeping
forever. After killing and restarting the process, everything works well again.

I'm using Ubuntu 16.04 and I'm launching the trytond-cron process using
supervisord.

Has anyone found something similar? Any advice?

Thanks,
--
Sergi Almacellas Abellana
www.koolpi.com
Twitter: @pokoli_srk

Sergi Almacellas Abellana

May 30, 2017, 3:53:15 AM
to try...@googlegroups.com
On 30/05/17 at 09:32, Sergi Almacellas Abellana wrote:
> Hi,
>
> I have found some strange issue when using trytond-cron. The process
> runs correctly, but after some random time (3-4 days) it starts sleeping
> forever. After killing the process and restarting everything works well.
I just found one difference in the ps output. When the process is working
correctly it's in the S (sleeping) state, but when the process is
sleeping forever it is in the Sl state. From the man page I see that
the "l" means that the process is multi-threaded.

So is it possible that some cron thread did not finish correctly, and that
this prevents the main process from waking up again?

Any comments will be much appreciated.
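
One way to check that hypothesis is to have the process dump the stack of every thread on demand. A minimal sketch, assuming Python 3 on Linux and that a few lines can be added to code loaded by the process; the name `enable_stack_dump` and the choice of SIGUSR1 are illustrative and not part of trytond:

```python
import faulthandler
import signal
import sys


def enable_stack_dump(signum=signal.SIGUSR1, stream=sys.stderr):
    """Write the traceback of every thread to `stream` when `signum` arrives.

    `stream` must be a real file (it needs a file descriptor) and must stay
    open for as long as the handler is registered.
    """
    faulthandler.register(signum, file=stream, all_threads=True)
```

With this enabled, sending SIGUSR1 to the stuck process (`kill -USR1 <pid>`) writes one traceback per thread, which shows where each thread is blocked.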

Cédric Krier

May 30, 2017, 4:45:13 AM
to try...@googlegroups.com
The cron code is full of logging; do you have anything in the logs?
I guess it is a cron task for one database that is running indefinitely.
Maybe a deadlock?

--
Cédric Krier - B2CK SPRL
Email/Jabber: cedric...@b2ck.com
Tel: +32 472 54 46 59
Website: http://www.b2ck.com/

Sergi Almacellas Abellana

May 30, 2017, 5:29:47 AM
to try...@googlegroups.com
On 30/05/17 at 10:42, Cédric Krier wrote:
> On 2017-05-30 09:53, Sergi Almacellas Abellana wrote:
>> On 30/05/17 at 09:32, Sergi Almacellas Abellana wrote:
>>> Hi,
>>>
>>> I have found some strange issue when using trytond-cron. The process
>>> runs correctly, but after some random time (3-4 days) it starts sleeping
>>> forever. After killing the process and restarting everything works well.
>> I just found one difference in the ps output. When the process is working
>> correctly it's in the S (sleeping) state, but when the process is
>> sleeping forever it is in the Sl state. From the man page I see that
>> the "l" means that the process is multi-threaded.
>>
>> So it's possible that some cron thread did not end up correctly and this
>> causes the main process to not recover from sleep?
>>
>> Any comments will be much appreciated.
> The cron code is full of logging, have you anything?

No, we did not have any logging for the cron. I've set up the logging
and added it to our task. Let's see.
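
As an illustration of what logging around a task can look like, here is a minimal sketch using only the standard `logging` module. The decorator `log_task` and the task `import_files` are hypothetical names, not the code actually used here; the log format includes the thread name so that a task that starts but never ends can be tied to a thread:

```python
import functools
import logging
import time

# Include the thread name so each message can be matched to a cron thread.
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(threadName)s %(name)s %(levelname)s %(message)s",
)
logger = logging.getLogger(__name__)


def log_task(func):
    """Log when a task starts, fails and ends, together with its duration."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.monotonic()
        logger.info("task %s started", func.__name__)
        try:
            return func(*args, **kwargs)
        except Exception:
            logger.exception("task %s failed", func.__name__)
            raise
        finally:
            elapsed = time.monotonic() - start
            logger.info("task %s ended after %.1fs", func.__name__, elapsed)
    return wrapper


@log_task
def import_files(folder):
    """Hypothetical task body: scan `folder` and import what is found."""
    return []
```

A "started" line with no matching "ended" line points at a task that never returned.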

> I guess it is a cron task for one database that is running indefinitely.

Yes, it is. The cron task runs every minute and scans a folder to import
files from it. If no files are found, it does nothing.

> Maybe a dead lock?

Still investigating, will provide more information when I have collected
it.

Thanks for the input.

Sergi Almacellas Abellana

Jun 2, 2017, 4:56:46 AM
to try...@googlegroups.com
On 30/05/17 at 11:29, Sergi Almacellas Abellana wrote:
>
> Still investigating, will provide more information when I have collected
> it.
I managed to reproduce it again today, and from the cron logs I
see that the task thread is not finishing correctly, which is why new
tasks never run.

Still not sure about the details, but I believe it's on our side.
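
To make the symptom concrete: a scheduler that refuses to start a new run while the previous thread is still alive will stay idle forever once one task blocks. The sketch below only assumes that general shape; it is not a copy of trytond's code, and `start_if_idle` is a hypothetical name:

```python
import threading


def start_if_idle(threads, name, task):
    """Start `task` in a thread for `name` unless its previous run is alive.

    `threads` maps a name (for example a database name) to the last thread
    started for it. Returns True if a new thread was started.
    """
    previous = threads.get(name)
    if previous is not None and previous.is_alive():
        # The previous run never returned, so this round is skipped.
        return False
    thread = threading.Thread(target=task, name=f"cron-{name}", daemon=True)
    thread.start()
    threads[name] = thread
    return True
```

If `task` blocks indefinitely, every later call returns False, which from the outside looks like a process that sleeps forever.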

Sergi Almacellas Abellana

Jun 6, 2017, 8:55:24 AM
to try...@googlegroups.com
On 02/06/17 at 10:56, Sergi Almacellas Abellana wrote:
> On 30/05/17 at 11:29, Sergi Almacellas Abellana wrote:
>>
>> Still investigating, will provide more information when I have
>> collected it.
> I managed to reproduce it another time today and from the cron logs I
> see that the thread task is not finishing correctly, so that's why new
> task never run.
>
> Still not sure about the details, but I believe it's on our side.
Now I'm sure about the cause of the error. Our process sends some
email notifications, which may cause it to wait forever if there is a
network outage. Setting a timeout will fix the issue, which is why
we created:

https://bugs.tryton.org/issue6540
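
For illustration, this is what a timeout looks like with the standard library's `smtplib`, which blocks indefinitely on an unresponsive server when no timeout is given. It is a minimal sketch and not the change proposed in the issue; `send_notification` and its default host, port and timeout are assumptions:

```python
import smtplib
from email.message import EmailMessage


def send_notification(message, host="localhost", port=25, timeout=10.0):
    """Try to send `message`; give up after `timeout` seconds of silence.

    Returns True on success and False on any network or SMTP error, so the
    caller is never blocked by an unreachable or silent server.
    """
    try:
        with smtplib.SMTP(host, port, timeout=timeout) as server:
            server.send_message(message)
    except (OSError, smtplib.SMTPException):
        return False
    return True


def build_message(sender, recipient, subject, body):
    """Build a plain-text email message."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = subject
    message.set_content(body)
    return message
```

Returning False means the email is lost, which matches the trade-off of allowing failures rather than blocking the cron thread.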

Regards,

Raimon Esteve

Jun 7, 2017, 3:19:39 AM
to try...@googlegroups.com
Hi Sergi,

Why not send the email with the two-phase commit protocol?

Regards,

Sergi Almacellas Abellana

Jun 7, 2017, 3:59:46 AM
to try...@googlegroups.com
On 07/06/17 at 09:19, Raimon Esteve wrote:
> Hi Sergi,
>

Hi Raimon,

> why not send email with two phase commit protocol?

I don't see how that is related to the thread. The problem occurs when
performing an operation with the SMTP server, which is exactly the
same with or without 2PC.

Cédric Krier

Jun 7, 2017, 5:40:07 AM
to try...@googlegroups.com
On 2017-06-06 14:55, Sergi Almacellas Abellana wrote:
> On 02/06/17 at 10:56, Sergi Almacellas Abellana wrote:
> > On 30/05/17 at 11:29, Sergi Almacellas Abellana wrote:
> >>
> >> Still investigating, will provide more information when I have
> >> collected it.
> > I managed to reproduce it another time today and from the cron logs I
> > see that the thread task is not finishing correctly, so that's why new
> > task never run.
> >
> > Still not sure about the details, but I believe it's on our side.
> Now I'm sure about the cause of the error. Our processes, sends some
> email notification, which may cause the process wait forever in case of
> some network outage.

The sendmail was not designed to handle failure, especially with 2PC.
If there is an exception, the emails are lost.
In my opinion, if you want to send email with Tryton, you should have an
email server running on the same machine. That way, this server is in
charge of retrying when the network comes back, etc.
I suggest using OpenSMTPD because it is very light and easy to
configure.

Sergi Almacellas Abellana

Jun 7, 2017, 6:06:32 AM
to try...@googlegroups.com
On 07/06/17 at 11:35, Cédric Krier wrote:
> On 2017-06-06 14:55, Sergi Almacellas Abellana wrote:
>> On 02/06/17 at 10:56, Sergi Almacellas Abellana wrote:
>>> On 30/05/17 at 11:29, Sergi Almacellas Abellana wrote:
>>>>
>>>> Still investigating, will provide more information when I have
>>>> collected it.
>>> I managed to reproduce it another time today and from the cron logs I
>>> see that the thread task is not finishing correctly, so that's why new
>>> task never run.
>>>
>>> Still not sure about the details, but I believe it's on our side.
>> Now I'm sure about the cause of the error. Our processes, sends some
>> email notification, which may cause the process wait forever in case of
>> some network outage.
>
> The sendmail was not designed to allow failure, especially with the 2PC.
> If there is an exception, the emails are lost.

For us, losing the email is not a big problem; the big problem is
blocking the process, leaving transactions open, and so on.

> For me, if you want to send email with Tryton, you should have a email
> server running on the same machine. This way, this server will be in
> charge of the retry when the network comes back etc.
> I suggest to use OpenSMTPd because it is very light and easy to
> configure.

Thanks for the suggestion. I was aware of this mechanism, but for now
we prefer to tolerate failures to keep the system simple. I don't
rule out using a local SMTP server in the (near) future.