So far there are two main ways I see we can implement this:
- Use existing backend utilities, e.g. mysqlpump instead of mysqldump
- Use a normal multiprocessing pool on top of our existing cloning code
--
Ticket URL: <https://code.djangoproject.com/ticket/31804>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
* owner: nobody => Ahmad A. Hussein
--
Ticket URL: <https://code.djangoproject.com/ticket/31804#comment:1>
Old description:
> Parallelizing database cloning processes would yield a nice speed-up for
> running Django's own test suite (and all django projects that use the
> default test runner)
>
> So far there are two main ways I see we can implement this:
> - Use existing backend utilities e.g mysqlpump instead of mysqldump
> - Use a normal multiprocessing pool on top of our existing cloning code
New description:
Parallelizing database cloning processes would yield a nice speed-up for
running Django's own test suite (and all Django projects that use the
default test runner).
So far there are three main ways I see we can implement this:
- Use a multiprocessing pool at the setup_databases level that'll create
workers which run ```clone_test_db``` for each method
- Use a pool at the ```clone_test_db``` level which parallelizes the
internal ```_clone_test_db``` call
- Scrap parallelizing the cloning in general, and instead parallelize the
internals of specific backends (at least MySQL fits here)
In the first two options, we'd have to refactor MySQL's cloning process,
since it makes another call to ```_clone_db```. Otherwise a dump would be
created inside each parallel process, slowing the workers greatly.
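A rough sketch of the first option, under stated assumptions rather than
as the actual implementation. `FakeCreation` and `clone_in_parallel` are
hypothetical names used only for illustration; a real backend's
`clone_test_db` would copy the database instead of recording a name. The
sketch also swaps in `multiprocessing.pool.ThreadPool` for a process pool,
so that it runs the same way on every platform without pickling concerns.

```python
from multiprocessing.pool import ThreadPool


class FakeCreation:
    """Hypothetical stand-in for a backend's creation object."""

    def __init__(self):
        self.cloned = []

    def clone_test_db(self, suffix, verbosity=1, keepdb=False):
        # A real backend would copy the test database here.
        # The "test_default_" prefix is an assumed naming scheme.
        self.cloned.append(f"test_default_{suffix}")


def clone_in_parallel(creation, parallel, verbosity=0, keepdb=False):
    """Create `parallel` clones concurrently instead of one by one."""
    suffixes = [str(index + 1) for index in range(parallel)]
    with ThreadPool(processes=parallel) as pool:
        # Each worker clones one database; map() blocks until all finish.
        pool.map(
            lambda suffix: creation.clone_test_db(
                suffix=suffix, verbosity=verbosity, keepdb=keepdb
            ),
            suffixes,
        )
    return suffixes
```

With a real process pool, the callable and its arguments would need to be
picklable, and each worker would need its own database connection.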
In the last option, we could consider using mysqlpump instead of mysqldump
for both exporting the database and restoring it. The con of this approach
is that it isn't general enough to apply to the other backends.
Oracle's cloning process (although not merged into the current master) has
internal support for option 3 (users can specify a PARALLEL variable to
speed up the expdp/impdp utilities), and it can also use the first two
options.
The major con though with the first two options is forcing parallelization
--
Ticket URL: <https://code.djangoproject.com/ticket/31804#comment:2>
* stage: Unreviewed => Accepted
Comment:
Hi Ahmad. Yes: if you can get this going, super.
One thing that's been bugging me about #31169 is how slow the DB cloning
appears on Windows. (I need to measure exact times...) So if we can speed
that up, it would be a big win.
Thanks.
--
Ticket URL: <https://code.djangoproject.com/ticket/31804#comment:3>
Comment (by Ahmad A. Hussein):
[https://github.com/django/django/pull/13217 PR]
Still needs more work
--
Ticket URL: <https://code.djangoproject.com/ticket/31804#comment:4>
* owner: Ahmad A. Hussein => (none)
* status: assigned => new
--
Ticket URL: <https://code.djangoproject.com/ticket/31804#comment:5>