Pretty strange that this would have ~zero size effect on ARM32, yet a large size effect on ARM64.
It appears the entire size increase is going into LineBreaker::NextLine().
My best guess is that some inlining, loop unrolling, or vectorization is kicking in because of the -O2 vs. -Os difference. If this is hot code, maybe the size increase is justified? It might go away with a NO_INLINE somewhere, but I don't think it's worth spending more than ~an hour on, since it's also likely that toolchain changes could make this go away on their own at some point.
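For reference, a rough sketch of what the NO_INLINE experiment could look like. Everything here is hypothetical: `SKETCH_NO_INLINE` is a stand-in for the project's NO_INLINE macro, and `FindBreakOpportunity` / `CountBreaks` are made-up functions, not the real LineBreaker code. The idea is just to mark the helper called from the hot loop so the compiler can't inline it into the caller, then compare binary sizes.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical stand-in for the project's NO_INLINE macro (assumption: the
// real macro expands to something similar on these compilers).
#if defined(__clang__) || defined(__GNUC__)
#define SKETCH_NO_INLINE __attribute__((noinline))
#elif defined(_MSC_VER)
#define SKETCH_NO_INLINE __declspec(noinline)
#else
#define SKETCH_NO_INLINE
#endif

// Hypothetical helper, not the real LineBreaker code. Returns the index of
// the first space at or after `start`, or text.size() if there is none.
// The attribute keeps the compiler from inlining this into its callers.
SKETCH_NO_INLINE std::size_t FindBreakOpportunity(std::string_view text,
                                                  std::size_t start) {
  for (std::size_t i = start; i < text.size(); ++i) {
    if (text[i] == ' ') {
      return i;
    }
  }
  return text.size();
}

// Hypothetical caller with a loop around the helper. Counts the spaces in
// `text` by repeatedly asking for the next break opportunity.
std::size_t CountBreaks(std::string_view text) {
  std::size_t count = 0;
  std::size_t pos = 0;
  while ((pos = FindBreakOpportunity(text, pos)) < text.size()) {
    ++count;
    ++pos;  // Step past the space that was just found.
  }
  return count;
}
```

If the size delta disappears with the attribute in place, that would point at inlining rather than unrolling or vectorization of the loop itself.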