Database connection retry


James Pic

Mar 1, 2017, 4:00:12 PM
to django-d...@googlegroups.com
Hi all,

It seems like runserver won't retry connecting to the database after a failed connection. Once the db server is up, it looks like I have to restart runserver manually.

If this is correct, may I suggest that we make runserver retry connecting to the database when the connection fails?

Thanks

Tim Graham

Mar 1, 2017, 4:21:41 PM
to Django developers (Contributions to Django itself)
Could you explain the use case a bit more? Why is your database failing on a regular basis?

James Pic

Mar 1, 2017, 6:52:48 PM
to django-d...@googlegroups.com
Sometimes the database isn't started yet, because some modern orchestration tools such as ansible-container and docker-compose (perhaps more) start everything at once, and Django might be up before the db. Other times I have to fix something with the db orchestration tool.

I noted we might have the same issue with redis+channels (if I take action too soon instead of waiting for channels to retry). Retrying seems like a reasonable way to improve the development experience with Django while keeping up with orchestration tools: already two years ago I heard about users building tools in Python on top of docker-compose just to add the orchestration their Django project needed (I swear).

Tim Graham

Mar 1, 2017, 10:48:36 PM
to Django developers (Contributions to Django itself)
I don't know. Can you propose a patch so we can see what's involved? How would a "production" web server (nginx, apache, etc.) handle the issue?

I'm more interested in moving runserver toward using gunicorn [0] (Windows support seems the main blocker to proceeding there) than adding more features to our own webserver, although it's not clear if the fix would actually involve the webserver.

[0] https://code.djangoproject.com/ticket/21978

Aymeric Augustin

Mar 2, 2017, 3:08:01 AM
to django-d...@googlegroups.com
Hello James,

If I understand correctly, the problem is that runserver fails to boot because it cannot perform checks, some of which connect to the database (e.g. the check that all migrations were run).

Is that what you're talking about?

I suppose the fix involves something like:

while True:
    try:
        perform_system_checks()  # I made this up
    except Exception:
        log_exception()  # ditto
        time.sleep(1)
    else:
        break
    
Assuming the exception is clearly reported, that would be more beginner-friendly, which is a design goal of runserver. I think it's worth exploring, provided it doesn't create too much complexity.
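
A slightly more concrete version of that loop, as a minimal sketch only. The helper name retry_until_ok and its parameters are hypothetical, not Django APIs, and check stands in for whatever runserver would actually call to run its system checks. Unlike the loop above, it can optionally give up after a fixed number of attempts so that a permanent failure is still reported:

```python
import logging
import time

logger = logging.getLogger(__name__)


def retry_until_ok(check, delay=1.0, max_attempts=None, sleep=time.sleep):
    """Call check() until it stops raising; return the number of attempts.

    Hypothetical helper: check is a stand-in for the startup system checks.
    max_attempts=None retries forever, like the loop it is based on.
    """
    attempt = 0
    while True:
        attempt += 1
        try:
            check()
        except Exception:
            # Report every failure so the user can see why startup is delayed.
            logger.exception("Check failed (attempt %d)", attempt)
            if max_attempts is not None and attempt >= max_attempts:
                raise
            sleep(delay)
        else:
            return attempt
```

The sleep parameter is there only so the delay can be replaced in tests; whether a bounded or unbounded retry is the better default for runserver is an open question.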

Best regards,

-- 
Aymeric.



--
You received this message because you are subscribed to the Google Groups "Django developers (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-develop...@googlegroups.com.
To post to this group, send email to django-d...@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/CAC6Op19Q3nGUU1wCCsmdJbHhUvqEuaS-wNc9htAE1QXLQocsBw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

James Pic

Mar 5, 2017, 5:33:00 AM
to django-d...@googlegroups.com
Thanks for your feedback. The use case I was talking about is not quite valid anymore, since docker-stack and docker-compose v3 do handle dependencies.

However, perhaps runserver could just exit if checks don't pass, which makes sense I think, and would allow the optional use of a shell loop. I'd prefer that. What do you think is better, exit or retry?

Shai Berger

Mar 5, 2017, 5:50:22 AM
to django-d...@googlegroups.com, James Pic
On Sunday 05 March 2017 12:32:34 James Pic wrote:
>
> However, perhaps runserver could just exit if checks don't pass, which
> makes sense I think, allowing the optional use of a shell loop. I'd prefer
> that, what do you think is better, exit or retry ?

... then you can do that yourself already by running

manage.py check && manage.py runserver

My 2 cents,
Shai.

James Pic

Mar 5, 2017, 6:28:23 AM
to Shai Berger, django-d...@googlegroups.com
Then this would work for me:

until manage check; do sleep 1; done; manage runserver

Thanks Shai!
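
For setups where a shell loop is not convenient, the same "retry the check, then start the server" idea can be sketched in Python. This is an illustration under assumptions: wait_for_command is a hypothetical helper, and whether the check command actually exercises the database connection depends on the Django version and options, so treat the command itself as a placeholder.

```python
import subprocess
import time


def wait_for_command(cmd, delay=1.0, max_attempts=None):
    """Re-run cmd until it exits with status 0; return the attempt count.

    Returns None if max_attempts is reached without a successful run.
    max_attempts=None keeps retrying forever, like the shell loop.
    """
    attempt = 0
    while max_attempts is None or attempt < max_attempts:
        attempt += 1
        if subprocess.run(cmd).returncode == 0:
            return attempt
        time.sleep(delay)
    return None
```

A wrapper script would call wait_for_command([sys.executable, "manage.py", "check"]) and, once it returns, start runserver with another subprocess.run call.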

However, I'm still a bit puzzled by having a process that's just stuck when checks fail (if I understand correctly). Is there any particular reason why it is this way? If not, perhaps a retry or exit could improve the developer experience.

Aymeric Augustin

Mar 6, 2017, 3:36:19 AM
to django-d...@googlegroups.com, Shai Berger
On 5 Mar 2017, at 12:27, James Pic <jp...@yourlabs.org> wrote:

> However, I'm still a bit puzzled by having a process that's just stuck when checks fail (if I understand correctly). Is there any particular reason why it is this way? If not, perhaps a retry or exit could improve the developer experience.

To me this looks like a bug in the implementation of the auto-reloader.

Handling correctly both the initial invocation and subsequent invocations (after reloads) isn't as easy as it seems.

-- 
Aymeric.

James Pic

Mar 7, 2017, 6:32:45 AM
to django-d...@googlegroups.com
Thanks for sharing some of your insight, Aymeric. If I'm not mistaken, the auto-reload case invalidates Shai's suggestion. Would you recommend that the runserver process exit with a non-zero status when a check fails, rather than staying stuck waiting for another code change to trigger a reload, so that we could wrap it in an until loop in bash?

Tim Graham

Mar 7, 2017, 7:00:42 AM
to Django developers (Contributions to Django itself)
The behavior of runserver hanging on a check error seems fine to me. That gives you an opportunity to fix the error without having to manually restart the server afterward -- just the same as if you had a SyntaxError. Am I missing the reason why the behavior is problematic?

James Pic

Mar 7, 2017, 7:31:06 AM
to django-d...@googlegroups.com
It works for a SyntaxError because updating the code triggers a reload. But if the check fails for something that's not related to code (db conn, redis conn...), then it's stuck and we have to manually interrupt runserver and start it again, unless we touch some code just to trigger the reload as you mention. My question is: is there anything we can do to automate this?

Aymeric Augustin

Mar 7, 2017, 8:04:15 AM
to django-d...@googlegroups.com
Hello,

On 7 Mar 2017, at 13:30, James Pic <jp...@yourlabs.org> wrote:

> My question is: is there anything we can do to automate this?

I'm not seeing an obvious solution to this problem.

Django has no way to tell it's a temporary issue.

-- 
Aymeric.

James Pic

Mar 7, 2017, 8:30:27 AM
to django-d...@googlegroups.com
It seems like we have two kinds of issues:

- code broke runserver,
- network broke runserver.

In the first case, runserver waits for a code reload event which is perfect ;)
In the second case, runserver also waits for a code reload event, which is not very intuitive after fixing a network error.

So if we want to handle both cases, we indeed need to detect whether an error is caused by code or by networking; the networked services are those defined by CACHES, DATABASES and CHANNEL_LAYERS.

Perhaps we could add a special attribute to the exception, so DatabaseWrapper.get_new_connection()'s call of:

    connection = Database.connect(**conn_params) 

Would become something like:

    try:
        connection = Database.connect(**conn_params)
    except Exception as e:
        e.network_error = True
        raise

Another way would be to inspect the exc info, or to have a pre-defined list of exceptions that are considered network errors, which involves referencing exceptions potentially defined by other packages such as redis.
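
To make the two options concrete, here is a self-contained sketch. Everything in it is illustrative: FakeOperationalError stands in for a driver exception such as psycopg2.OperationalError, connect_tagged mimics the try/except proposed above without touching Django's DatabaseWrapper, and TRANSIENT_ERRORS and is_network_error are hypothetical names.

```python
class FakeOperationalError(Exception):
    """Stand-in for a driver error such as psycopg2.OperationalError."""


def connect_tagged(connect, **conn_params):
    """Call connect() and mark any failure as a network-level error."""
    try:
        return connect(**conn_params)
    except Exception as e:
        e.network_error = True  # hypothetical attribute from the proposal
        raise


# Alternative: a pre-defined tuple of exception types treated as transient.
TRANSIENT_ERRORS = (ConnectionError, TimeoutError, FakeOperationalError)


def is_network_error(exc):
    """True if exc was tagged by connect_tagged() or is a known transient type."""
    return getattr(exc, "network_error", False) or isinstance(exc, TRANSIENT_ERRORS)
```

A caller such as the auto-reloader could then retry when is_network_error(exc) is true and keep waiting for a code change otherwise. One caveat of tagging every exception raised by connect is that permanent problems, such as a wrong password, would be treated as transient too.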

That may seem like a lot for runserver. So far in the discussion I've restrained myself from talking about what this could look like in production, but I feel that even production deployments could benefit from this at some point, so it might be worth the effort after all.

Chris Foresman

Mar 8, 2017, 3:23:03 PM
to Django developers (Contributions to Django itself)
I'll chime in to say I've had a similar problem related to the shell and I couldn't sort out how to address it.

Our database servers will drop connections that last longer than 10 minutes. So I basically can never use the shell for a task that takes longer than 10 minutes of typing. The workaround I eventually arrived at was copying all the data I pulled from previous runs into a doc in my text editor, and restarting the shell every time the database connection dropped. Eventually I was able to just copy and paste enough stuff from that doc to get everything done within the 10 minute limit. Is there a way (e.g. the DatabaseWrapper mentioned above) to get the shell to reconnect without stopping the shell and restarting from scratch every time?

Adam Johnson

Mar 8, 2017, 3:26:29 PM
to django-d...@googlegroups.com
Chris, whilst I'm sure you could work something out, it probably wouldn't work in general, as database connections contain a lot of state, such as variables and whether or not we're in a transaction.




--
Adam

Aymeric Augustin

Mar 8, 2017, 4:20:31 PM
to django-d...@googlegroups.com
Hello,

On 8 Mar 2017, at 21:23, Chris Foresman <fore...@gmail.com> wrote:

> I'll chime in to say I've had a similar problem related to the shell and I couldn't sort out how to address it.

In such situations, AFAIK, the following works:

from django.db import connection
connection.close()

Then Django will reopen the connection for the next database query.

HTH,

-- 
Aymeric.


Chris Foresman

Mar 9, 2017, 10:27:24 AM
to Django developers (Contributions to Django itself)
Thanks Aymeric! I'll give that a try next time!

Jamesie Pic

Dec 15, 2018, 2:52:15 AM
to django-d...@googlegroups.com
Hi all,

Sorry to bump this, but I didn't find another thread, and I'm pretty sure retrying the database connection is the sane thing to do.

Otherwise, Django will remain in a failed state after a reboot in some cases: psycopg2.OperationalError: FATAL:  the database system is starting up

It would be better if Django could tolerate this situation and retry; then the server would be reboot-proof.

At least showing a 500 page until the DB server has started would be more reasonable.

Are you sure Django should not be a bit more tolerant of databases starting up?

Or should we open a ticket?

Thanks in advance for your reply

Have a great weekend

ludovic coues

Dec 15, 2018, 3:04:03 AM
to django-d...@googlegroups.com
Are you using runserver in production?


Jamesie Pic

Dec 15, 2018, 3:04:46 AM
to django-d...@googlegroups.com
Nope, uwsgi in docker...

Carlton Gibson

Dec 15, 2018, 3:42:51 AM
to Django developers (Contributions to Django itself)
Hi Jamesie.

Two things stick out from the thread: 

Aymeric: might be more user-friendly, which is a goal. (If not too complex...)
Tim: can you show us a patch to see what it would look like. 

Fancy putting together a POC? 
C.