Customizing charset/collation per-column.

138 views
Skip to first unread message

Patrick Nelson

unread,
Sep 24, 2015, 8:57:41 PM9/24/15
to silverst...@googlegroups.com
Just wanted to throw this out there, due to my frustration today :) 

TL;DR: I needed to accomodate emoji's that were 4 bytes, MySQL is limited to 3 bytes by default for utf8 charsets/collation, had to go through rig-a-ma-role to actually hassle the framework into listening to me so I could accommodate utf8mb4 which can handle these characters.

---

We really should find a way to make it easier to custom collation on a per-column basis. I'm fine (and tried) doing this by defining my own version of "Text" (for example) which overrides  the "requireField()" method and then defines those two specialized parameters, but it appears that once it gets into the database abstraction, it doesn't utilize those settings. Is there a reason for that? 

Take this class for example:
class TextUTF8MB4 extends Text {
public function requireField() {
$parts = array(
'datatype' => 'mediumtext',
'character set' => 'utf8mb4', // Unfortunately MySQLDatabase->text() ignores this, so CustomMySQLDatabase was invented.
'collate' => 'utf8mb4_general_ci', // Same here.
'arrayValue' => $this->arrayValue
);

$values= array(
'type' => 'text',
'parts' => $parts
);

DB::requireField($this->tableName, $this->name, $values, $this->default);
}
}

In spite of that definition, it appears that MySQLSchemaManager (SS framework v3.2 and newer, methinks) or MySQLDatabase (framework v3.1) will still only ever use "utf8". Because the "text()" method (since this is a "text" type, as indicated in the method above) does this 100% of the time:
public function text($values) {
//For reference, this is what typically gets passed to this function:
//$parts=Array('datatype'=>'mediumtext', 'character set'=>'utf8', 'collate'=>'utf8_general_ci');
//DB::requireField($this->tableName, $this->name, "mediumtext character set utf8 collate utf8_general_ci");
return 'mediumtext character set utf8 collate utf8_general_ci' . $this->defaultClause($values);
}
So, you'd need to define your custom schema manager (as well as your own DBField child) which first defines the charset you need and then actually obeys it. So here's and overriding "text()" method that will accept your delightful emoji's:

public function text($values) {
// Only deal with charset/collate settings right now because that's all I need and currently have patience for . . . . .
$charset = "utf8_general_ci";
if (isset($values["character set"])) $charset = $values["character set"];

$collate = "utf8_general_ci";
if (isset($values["collate"])) $collate = $values["collate"];

return "mediumtext character set $charset collate $collate";
}

Question: Was this done in the "text()" method above on purpose for the default MySQL implementation? Just curious, since I wouldn't be surprised if maybe there were a DBField child that happened to be compatible with simply passing through the character set/collate settings.

- Patrick

Oliver Shaw

unread,
Sep 27, 2015, 3:38:50 PM9/27/15
to SilverStripe Core Development
I ran into this issue. Was also really surprised to see utf8 hardcoded. I submitted a PR to override the connection and column character set and collation with config for connection_charset, connection_collation, charset, collation.

Oliver Shaw

unread,
Sep 27, 2015, 3:39:53 PM9/27/15
to SilverStripe Core Development


On Friday, 25 September 2015 12:57:41 UTC+12, Patrick Nelson wrote:

Patrick Nelson

unread,
Sep 27, 2015, 10:00:16 PM9/27/15
to silverst...@googlegroups.com
Thanks for the effort! This is a necessary improvement. The work you did may need to be duplicated a bit since I know this abstraction has changed somewhat in master but shouldn't be significant. Anyway, I've posted a recommendation there but it's more related to separation of concerns (configuration options).

--
You received this message because you are subscribed to the Google Groups "SilverStripe Core Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email to silverstripe-d...@googlegroups.com.
To post to this group, send email to silverst...@googlegroups.com.
Visit this group at http://groups.google.com/group/silverstripe-dev.
For more options, visit https://groups.google.com/d/optout.

Reply all
Reply to author
Forward
0 new messages