I've had a deeper look at this now and think I have an API proposal. First, the state of supported vendors:
1. All vendors support adding a collation to text/varchar fields.
2. The syntax is more or less the same.
3. However, the collation names themselves are different.
4. PostgreSQL is the only vendor that allows creating custom collations at runtime.
So I'm thinking we add a new `collation` parameter to `CharField` and `TextField` that simply takes a string of the collation ID. I'm not quite sure about the implementation as I don't know the ORM that well, but my naive approach would be to add a new format string to the `data_types` dict that is calculated during the field's `__init__()`: either an empty string to use the default collation, or e.g. ' collate <collation_id>'. There may well be a better approach.
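To make the naive approach a bit more concrete, here is a rough, Django-free sketch of the idea. The `DATA_TYPES` dict, `collation_suffix()` and `column_sql()` are hypothetical names made up for illustration and are not existing ORM code; identifier quoting is backend-specific and deliberately left out.

```python
# Hypothetical sketch only: mimics a backend's data_types templates with an
# extra %(collation)s placeholder appended to the text column types.
DATA_TYPES = {
    "CharField": "varchar(%(max_length)s)%(collation)s",
    "TextField": "text%(collation)s",
}


def collation_suffix(collation=None):
    """Return '' for the default collation, else ' COLLATE <collation_id>'."""
    if collation is None:
        return ""
    # Quoting rules differ per vendor, so the ID is passed through unquoted.
    return " COLLATE %s" % collation


def column_sql(field_type, collation=None, **params):
    """Render the column type, filling in the optional collation clause."""
    params["collation"] = collation_suffix(collation)
    return DATA_TYPES[field_type] % params


print(column_sql("CharField", max_length=100))
print(column_sql("TextField", collation="nocase"))
```

The point of the empty-string default is that existing fields would render exactly as they do today.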
By using this - because the collation names are not the same across vendors - the user is saying "I'm okay with this only working on one database vendor", so there should be a warning in the docs. There is perhaps some scope in the future to let this take a callable that can figure out the collation per database. This would be useful for getting case-insensitive lookups working across all backends, for example. But I want to keep that out of scope for now because it's extra work and I'm not sure about the implementation.
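Purely to illustrate what the callable variant might look like later (it is not part of this proposal), here is a small sketch. `resolve_collation()` and `case_insensitive()` are hypothetical helpers, the vendor strings are assumed to be passed in by the caller, and the PostgreSQL entry assumes a custom collation named `case_insensitive` has already been created.

```python
def resolve_collation(collation, vendor):
    """Return the collation ID for `vendor`, or None for the default.

    `collation` may be None, a plain string (used as-is on every vendor),
    or a callable that maps a vendor name to a collation ID.
    """
    if collation is None:
        return None
    if callable(collation):
        return collation(vendor)
    return collation


def case_insensitive(vendor):
    """Hypothetical per-vendor lookup for a case-insensitive collation."""
    names = {
        "sqlite": "NOCASE",
        "mysql": "utf8mb4_general_ci",
        # Assumption: a custom collation with this name exists already.
        "postgresql": "case_insensitive",
    }
    try:
        return names[vendor]
    except KeyError:
        raise ValueError("No case-insensitive collation known for %r" % vendor)


print(resolve_collation(case_insensitive, "sqlite"))
print(resolve_collation("NOCASE", "sqlite"))
```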
Another downside is that people like to use `CharField` as a base class for other column types that might not support collations, but I think it should be the user's responsibility to make sure they aren't doing that.
We should also add a `CreateCollation` operation for Postgres, similar to the `CreateExtension` operation that currently exists. If the user wants to use a custom collation they must create it first, similar to using extensions currently.
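As a rough idea of the SQL such an operation would need to emit, here is a standalone sketch. The class name `CreateCollationSketch`, its constructor arguments and its two methods are hypothetical and not tied to the migrations framework; a real operation would presumably subclass the migration `Operation` base class the way `CreateExtension` does. The option names follow PostgreSQL's `CREATE COLLATION` syntax, and `deterministic` needs PostgreSQL 12 or later.

```python
class CreateCollationSketch:
    """Hypothetical stand-in that only builds the forward and reverse SQL."""

    def __init__(self, name, locale, provider="libc", deterministic=True):
        self.name = name
        self.locale = locale
        self.provider = provider
        self.deterministic = deterministic

    def _quoted_name(self):
        # Double any embedded double quotes, then wrap as an identifier.
        return '"%s"' % self.name.replace('"', '""')

    def forward_sql(self):
        options = ["locale = '%s'" % self.locale.replace("'", "''")]
        if self.provider != "libc":
            options.append("provider = %s" % self.provider)
        if not self.deterministic:
            options.append("deterministic = false")
        return "CREATE COLLATION %s (%s)" % (
            self._quoted_name(),
            ", ".join(options),
        )

    def reverse_sql(self):
        return "DROP COLLATION %s" % self._quoted_name()


op = CreateCollationSketch(
    "case_insensitive",
    locale="und-u-ks-level2",
    provider="icu",
    deterministic=False,
)
print(op.forward_sql())
print(op.reverse_sql())
```

A field could then refer to the collation by the same name once the migration containing this operation has run.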
The advantage of this is that users can use collations without having to write SQL migrations, which I think would be nice. The really nice thing is that it opens the door to case-insensitive lookups that work across all database vendors, rather than only on Postgres as is currently the case. And as I mentioned in the previous message, Postgres is discouraging our current method of using the citext extension in favour of this approach.
Cheers,
Tom