Hello Skia Team,
I am reporting a severe bug in the SkParagraph module, together with two patches that fix it. The bug affects Urdu Nastaliq text shaping when the characters U+0600-U+0603 are combined with digits, using a custom Flutter engine with SIL Graphite enabled in HarfBuzz. It causes a complete app crash (due to an assertion failure) and incorrect rendering of Arabic marks combined with digits.
My team reached out to the developer of the font we are using. After running several tests, he concluded that the problem appears to lie in text segmentation, where characters are split into right-to-left and left-to-right segments. The Unicode characters U+0600-U+0603 need to be in the same segment as the digits they combine with, but he believes they are being put into separate segments. With this information, Copilot identified two root causes and produced two small patches that resolve the issues. We are providing the analysis and patches below because we do not have permission to file an issue directly in the Modules > Paragraph component.
1. Summary of Issues
Custom Flutter Engine: Built with SIL Graphite support (https://github.com/silnrsi/graphite) enabled in HarfBuzz (available by setting FLUTTER_STORAGE_BASE_URL="https://storage.googleapis.com/flutter-graphite-builds").
Font: Awami Nastaliq (a SIL Graphite font https://software.sil.org/awami/ and https://github.com/silnrsi/font-awami).
Test Case: Render text containing Arabic marks/signs (U+0600-U+0603) combined with Arabic or Western digits. You can use my test app with the custom Flutter engine: https://github.com/socvid/awami_test
Observed Behavior: The application crashes due to an assertion failure, and the digits fail to combine with the Arabic marks.
A. Patch 1: Fill EMPTY_INDEX Gaps in the Cluster Index Table (ParagraphImpl.cpp)
The cluster index table (fClustersIndexFromCodeUnit) is sometimes left with EMPTY_INDEX gaps after shaping, causing crashes in subsequent lookups. This patch adds a simple two-pass fill to ensure every code unit maps to a valid cluster.
Proposed Patch (Applied after line 551 in modules/skparagraph/src/ParagraphImpl.cpp)
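The idea behind the fill can be sketched in standalone C++, independent of the Skia sources. This is an illustration only and not the diff itself: kEmptyIndex and fillClusterIndexGaps are hypothetical names, and the choice to copy the nearest preceding valid index (falling back to the next valid one for leading gaps) is an assumption about what the two passes do.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-in for SkParagraph's EMPTY_INDEX sentinel.
constexpr std::size_t kEmptyIndex = std::numeric_limits<std::size_t>::max();

// Replaces every kEmptyIndex entry with a neighbouring valid cluster index.
// Assumed strategy (not confirmed by the patch text):
//   pass 1 (forward)  copies the last valid index into later gaps;
//   pass 2 (backward) fills gaps at the start, which have no predecessor.
void fillClusterIndexGaps(std::vector<std::size_t>& table) {
    std::size_t last = kEmptyIndex;
    for (std::size_t& entry : table) {
        if (entry != kEmptyIndex) {
            last = entry;
        } else if (last != kEmptyIndex) {
            entry = last;
        }
    }

    std::size_t next = kEmptyIndex;
    for (std::size_t i = table.size(); i-- > 0;) {
        if (table[i] != kEmptyIndex) {
            next = table[i];
        } else if (next != kEmptyIndex) {
            table[i] = next;
        }
    }
}
```

If the table contains no valid entry at all, both passes leave it unchanged, so a caller would still need to handle that case separately.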
B. Patch 2: Robust UTF-16 to UTF-8 Mapping (SkUnicode_icu_bidi.cpp)
The incremental mapping from UTF-16 indices (from ICU) to UTF-8 offsets (used by SkParagraph) is fragile for multi-byte codepoints, causing bidi regions to be mapped incorrectly. This patch replaces the incremental logic with an explicit, precomputed vector-based mapping.
Proposed Patch (Replaces bidi region extraction logic starting around line 117 in modules/skunicode/src/SkUnicode_icu_bidi.cpp)
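As a standalone illustration of a precomputed mapping (again, not the diff itself), the table below maps each UTF-16 code unit index to the UTF-8 byte offset of the codepoint it belongs to. The function name buildUtf16ToUtf8Map is hypothetical, and the sketch assumes well-formed UTF-8 input.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Builds a lookup table: UTF-16 code unit index -> UTF-8 byte offset.
// Assumes the input is well-formed UTF-8 (no validation is performed).
// The final entry maps the UTF-16 end index to utf8.size(), so an
// exclusive region end can be converted with the same lookup.
std::vector<std::size_t> buildUtf16ToUtf8Map(const std::string& utf8) {
    std::vector<std::size_t> map;
    std::size_t i = 0;
    while (i < utf8.size()) {
        const unsigned char lead = static_cast<unsigned char>(utf8[i]);
        // Sequence length from the lead byte: 1, 2, 3 or 4 bytes.
        const std::size_t len = lead < 0x80 ? 1 : lead < 0xE0 ? 2 : lead < 0xF0 ? 3 : 4;
        map.push_back(i);
        if (len == 4) {
            // A 4-byte sequence is a surrogate pair in UTF-16:
            // both code units map to the start of the same codepoint.
            map.push_back(i);
        }
        i += len;
    }
    map.push_back(utf8.size());
    return map;
}
```

With such a table, a bidi region reported as the UTF-16 range [start, end) would convert to the UTF-8 range [map[start], map[end]) by two direct lookups, with no running counters to keep in sync.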
Thank you for your time and consideration. Please let me know if you require more information or access to our patched engine build.
Best regards,
David Hartman