Hi Jonathan,
It sounds like you were adding a column with a default value, since the column was still there after you cancelled. Adding a column with a default means your database must rewrite every single one of your rows; in MySQL's case, even adding a NULLable column forces this full rewrite, which is why it "normally" takes 10-15 minutes.
MySQL is also just this slow, and there's no way of cleaning up once these operations die, since MySQL's support for schema-altering statements (DDL) is very poor. PostgreSQL not only guarantees to add a NULLable column without locking the table, in a single quick operation (it usually takes a millisecond or two); it also allows DDL statements to be wrapped in transactions, so if you had killed the command under PostgreSQL, it would have rolled back to the previous state. (It would likely have been faster too, though if you're adding a column with a default value PostgreSQL still has to rewrite every row, which takes some time.)
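To make the transactional-DDL point concrete, here's a small sketch. It uses SQLite purely as a stand-in, since, like PostgreSQL, it supports rolling back DDL; the table and column names are made up. On MySQL the ALTER TABLE would implicitly commit and the rollback would do nothing.

```python
import sqlite3

# SQLite as a stand-in for PostgreSQL's transactional DDL (illustrative only).
conn = sqlite3.connect(":memory:", isolation_level=None)  # manual transaction control
conn.execute("CREATE TABLE books (id INTEGER PRIMARY KEY)")

conn.execute("BEGIN")
conn.execute("ALTER TABLE books ADD COLUMN subtitle TEXT")
# Inside the transaction, the new column is visible...
cols = [row[1] for row in conn.execute("PRAGMA table_info(books)")]
assert "subtitle" in cols

conn.execute("ROLLBACK")  # simulates killing the migration mid-flight
# ...and after rolling back, the schema is exactly as it was before.
cols = [row[1] for row in conn.execute("PRAGMA table_info(books)")]
assert cols == ["id"]
```

This is what "wrapping DDL in transactions" buys you: a killed or failed migration leaves no half-applied schema behind.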
Don't worry about memory usage - only a single query is being run, so South won't be getting slow because of that.
If you want my honest advice for changing schema on large databases:
- Use PostgreSQL. We moved Lanyrd to it because we had a ~10 million row table which we needed to add columns to, and it was taking 20 minutes per column under MySQL. It took less time to move the data across than to add our six new columns.
- If you must use MySQL, consider using OneToOneFields rather than adding columns. It's incredibly inefficient to store things this way and slow to query, but it won't lock the entire old table while you add new data.
- Generally, never, ever add columns with default values - especially not ones like NOW(). South does its best to stop you doing this; Django applies defaults in its own ORM before anything hits the database, so defaults aren't needed on the DB side anyway. But if you add a null=False column (the default), South has to add the column with a database default, which is very, very slow, and then remove the default afterwards (which is fast), so that the new column has data in all of the existing rows.
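To illustrate the OneToOneField idea at the SQL level: in Django it's a new model with a OneToOneField back to the original, and underneath it's just a new table keyed by the old table's primary key, so the existing (large) table is never ALTERed. A sketch, using SQLite and made-up names:

```python
import sqlite3

# The "side table instead of a new column" pattern (a Django OneToOneField,
# roughly, seen from the SQL side). Illustrative names throughout.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany("INSERT INTO book VALUES (?, ?)", [(1, "Dune"), (2, "Ubik")])

# Instead of: ALTER TABLE book ADD COLUMN page_count INTEGER
conn.execute("""
    CREATE TABLE book_extra (
        book_id INTEGER PRIMARY KEY REFERENCES book(id),
        page_count INTEGER
    )
""")
conn.execute("INSERT INTO book_extra VALUES (1, 412)")

# The cost: reading the "new column" now takes a join on every query.
rows = conn.execute("""
    SELECT book.title, book_extra.page_count
    FROM book LEFT JOIN book_extra ON book_extra.book_id = book.id
    ORDER BY book.id
""").fetchall()
assert rows == [("Dune", 412), ("Ubik", None)]
```

Creating book_extra is instant regardless of how many rows book has, which is the whole trick; you pay for it with the join at query time.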
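And here's roughly what the add-with-default-then-drop dance looks like. SQLite stands in for MySQL (the names are made up); step 1 is the part that rewrites every row on a big MySQL table, and step 2 is shown only as a comment since SQLite can't drop a default:

```python
import sqlite3

# What a null=False column effectively costs: the column is added WITH a
# default so every existing row can be filled, then the default is dropped.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO book (id) VALUES (?)", [(1,), (2,)])

# Step 1 (slow on a big MySQL table): existing rows all get the default.
conn.execute("ALTER TABLE book ADD COLUMN in_print INTEGER NOT NULL DEFAULT 1")
assert [r[0] for r in conn.execute("SELECT in_print FROM book")] == [1, 1]

# Step 2 (fast) on MySQL would then be:
#   ALTER TABLE book ALTER COLUMN in_print DROP DEFAULT;
# leaving Django's ORM to supply the default for future inserts.
```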
Large datasets are generally a pain to handle, but these are my top tips. Most large MySQL installations aren't changing their schema - they're usually implementing a pseudo-database on top of it, with EAV or something like that. Your dataset may well be better suited to a different storage architecture than plain relational, too - perhaps JSON blobs, if all you're doing is adding lots of sparse, optional attributes - but without a deeper knowledge of what you're trying to do it's hard to say anything there.
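For what the JSON-blob idea means in practice, here's a tiny sketch (SQLite and made-up names, purely illustrative): one TEXT column holds whatever optional attributes each row happens to have, so a new attribute never needs a schema change.

```python
import json
import sqlite3

# Sparse, optional attributes stored as a JSON blob in one column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE event (id INTEGER PRIMARY KEY, attrs TEXT)")
conn.execute("INSERT INTO event VALUES (1, ?)",
             (json.dumps({"hashtag": "#djangocon", "capacity": 300}),))
conn.execute("INSERT INTO event VALUES (2, ?)",
             (json.dumps({"hashtag": "#pycon"}),))  # no capacity: sparse is fine

attrs = json.loads(
    conn.execute("SELECT attrs FROM event WHERE id = 1").fetchone()[0])
assert attrs["capacity"] == 300
```

The trade-off is that the database can no longer index or filter on those attributes directly, so it only suits data you read back whole.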
Andrew