So I've been using hxScout to profile some things in OpenFL, since I had noticed that text field rendering was rather slow when using -Dnext.
Maybe I'm totally confused about how I'm supposed to be using Utf8 and stuff, but here are my observations nonetheless.
First, I found my bottleneck.
Turns out nearly ALL of the time was spent in this function in OpenFL's CairoTextField.hx:
private static function getLineBreaks (textField:TextField):Int {
    // returns the number of line breaks in the text
    var lines = 0;
    for (i in 0...Utf8.length (textField.text)) {
        var char = Utf8.charCodeAt (textField.text, i);
        if (char == __utf8_endline_code) {
            lines++;
        }
    }
    return lines;
}
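As an aside: since '\n' is plain ASCII, and UTF-8 guarantees that bytes below 0x80 never appear inside a multi-byte sequence, line breaks can be counted with a raw byte scan and no per-character decoding at all. A minimal standalone C++ sketch of that idea (countLineBreaks is my own name, not OpenFL code):

```cpp
#include <cassert>
#include <string>

// Count '\n' bytes directly. This is safe for UTF-8 text because bytes
// 0x00-0x7F only ever encode ASCII characters -- continuation bytes of
// multi-byte sequences are always >= 0x80 -- so a byte scan can't miscount.
static int countLineBreaks(const std::string &text) {
    int lines = 0;
    for (unsigned char byte : text) {
        if (byte == '\n') {
            lines++;
        }
    }
    return lines;
}
```

The same trick should apply on the Haxe side: scanning for the newline byte avoids whatever allocation Utf8.charCodeAt is doing per character.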
Specifically on this single line:
var char = Utf8.charCodeAt(textField.text, i);
According to hxScout, ~90% of the entire text field rendering slowdown was coming from this one line, almost all of it the result of GCs. The first thing I did was move the var declaration outside of the loop; that seemed to help a little, but not much. Next I drilled down to see what Utf8.charCodeAt was doing. It turns out it resolves to this function in String.cpp in hxcpp:
Dynamic String::charCodeAt(int inPos) const
{
   if (inPos<0 || inPos>=length)
      return null();

   #ifdef HX_UTF8_STRINGS
   // really "byte code at" ...
   //const unsigned char *p = (const unsigned char *)(__s + inPos);
   //return DecodeAdvanceUTF8(p);
   return (int)( ((unsigned char *)__s) [inPos]);
   #else
   return (int)(__s[inPos]);
   #endif
}
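To see what that uncommented return actually does, here's a tiny standalone C++ sketch of the same byte indexing (byteCodeAt is my own name, not hxcpp's). For a two-byte character like "é" (UTF-8 bytes 0xC3 0xA9, code point U+00E9 = 233) it returns 195, the first raw byte -- a "byte code at", exactly as the comment admits:

```cpp
#include <cassert>

// Mimics the HX_UTF8_STRINGS branch above: index straight into the byte
// buffer and return the raw byte value, with no UTF-8 decoding. For any
// multi-byte character this yields a lead or continuation byte (>= 0x80),
// not the character's actual code point.
static int byteCodeAt(const char *s, int pos) {
    return (int)((const unsigned char *)s)[pos];
}
```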
This was interesting, so I did some experiments. I put a printf in each branch of the #ifdef to see which one was actually being executed: printf("utf8 block!\n") in the #ifdef branch and printf("not utf8 block!\n") in the #else branch.
When I used this:
char = Utf8.charCodeAt(textField.text, i);
It would print "utf8 block!".
When I used this:
char = textField.text.charCodeAt(i);
It would print "utf8 block!".
So regardless of whether I use the Utf8 class or not, it resolves to the same line of C++ code. Which makes sense when you look at that code -- an #ifdef decides which block gets compiled, and that's fixed at compile time, right? Either the build uses HX_UTF8_STRINGS or it doesn't, and it no longer matters whether you call charCodeAt() through Utf8 or not. So that's one thing.
But get this!
Regular (non-Utf8-prefixed) textField.text.charCodeAt(i) is waaaaaay faster! When I switched to that, my results were the same, but there were no longer any GCs happening, and suddenly text rendering became lots faster.
So my observations:
- someString.charCodeAt(index) is apparently equivalent to Utf8.charCodeAt(someString,index), when the API makes me expect them to be different.
- the (uncommented) code in the UTF8 block doesn't seem to actually be UTF8-aware...
- Utf8.charCodeAt() generates lots of garbage but regular charCodeAt() doesn't, even though they resolve to the exact same C++ code
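For contrast, here's roughly what a genuinely UTF-8-aware charCodeAt -- presumably what the commented-out DecodeAdvanceUTF8 call did -- would have to do. This is my own sketch, handling only 1- to 3-byte sequences, not hxcpp's actual decoder:

```cpp
#include <cassert>

// Decode one UTF-8 code point starting at p. Unlike the byte-indexing
// version, this combines lead and continuation bytes: "é" (0xC3 0xA9)
// decodes to 233 (U+00E9) rather than returning the raw byte 195.
static int decodeUtf8(const unsigned char *p) {
    if (p[0] < 0x80) {
        return p[0];                                   // 1-byte (ASCII)
    }
    if ((p[0] & 0xE0) == 0xC0) {
        return ((p[0] & 0x1F) << 6) | (p[1] & 0x3F);   // 2-byte sequence
    }
    if ((p[0] & 0xF0) == 0xE0) {                       // 3-byte sequence
        return ((p[0] & 0x0F) << 12) | ((p[1] & 0x3F) << 6) | (p[2] & 0x3F);
    }
    return -1;  // 4-byte sequences and error handling omitted for brevity
}
```

Note this also implies the index passed to a real Utf8.charCodeAt would be a character index, not a byte offset, which is part of why the two APIs look like they should behave differently.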
So if someone could explain what's going on here that'd be swell :)