getGlyphPositionAtCoordinate returns the wrong index with consecutive emojis


Everdrone

May 4, 2023, 9:19:59 AM
to skia-discuss
I'm using getGlyphPositionAtCoordinate() to get the glyph index from the mouse cursor position, but when there are two consecutive emojis '😂😂' it behaves as if they were a single glyph.

demo:

skbug1.gif

jlav...@google.com

May 4, 2023, 9:26:30 AM
to skia-discuss
Thank you, it looks like a bug: the two emojis are being treated as if they belonged to a single grapheme cluster, which they do not.
Just in case: a code example would be nice to have.

Everdrone

May 4, 2023, 10:12:42 AM
to skia-discuss
Thank you, yes I will provide a code snippet asap.

In the meantime, I might have found another bug in getLineNumberAt(), which was added in chrome/m113 if I'm not mistaken.
Basically, with the following text:

aaa
bbb

the line numbers for each glyph will be

000
011 <- should be 111

It doesn't matter how many characters there are; it seems that the computed line number is wrong at the beginning of each new (shaped) line.

jlav...@google.com

May 4, 2023, 10:17:06 AM
to skia-discuss
There must be something else involved, I need your code sample.
I tried an example:
"Lorem 😂😂 ipsum"

emojis.png
Indexes are UTF-16 code units (so every emoji takes up 2 indexes):
getGlyphPositionAtCoordinate(0.000000, 27.000000) = 0 (downstream)
getGlyphPositionAtCoordinate(20.959976, 27.000000) = 1 (upstream)
getGlyphPositionAtCoordinate(45.159958, 27.000000) = 2 (upstream)
getGlyphPositionAtCoordinate(60.879944, 27.000000) = 3 (upstream)
getGlyphPositionAtCoordinate(83.439926, 27.000000) = 4 (upstream)
getGlyphPositionAtCoordinate(120.839890, 27.000000) = 5 (upstream)
getGlyphPositionAtCoordinate(131.239883, 27.000000) = 6 (downstream)
getGlyphPositionAtCoordinate(177.978165, 27.000000) = 8 (upstream)
getGlyphPositionAtCoordinate(224.716446, 27.000000) = 10 (downstream)
getGlyphPositionAtCoordinate(235.116440, 27.000000) = 11 (upstream)
getGlyphPositionAtCoordinate(245.436432, 27.000000) = 12 (upstream)
getGlyphPositionAtCoordinate(270.036407, 27.000000) = 13 (upstream)
getGlyphPositionAtCoordinate(289.196411, 27.000000) = 14 (upstream)
getGlyphPositionAtCoordinate(313.916382, 27.000000) = 15 (upstream)
getGlyphPositionAtCoordinate(351.316345, 27.000000) = 16 (upstream)
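
To see why the indexes above jump from 6 to 8 to 10, here is a minimal sketch that walks a UTF-8 string and records the UTF-16 offset at which each code point starts. It uses no Skia calls; `utf16Offsets` is a hypothetical helper name, and the code assumes well-formed UTF-8 input.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of Skia): for each code point in a
// well-formed UTF-8 string, returns the UTF-16 code unit offset at which
// that code point starts.
std::vector<std::size_t> utf16Offsets(const std::string& utf8) {
  std::vector<std::size_t> offsets;
  std::size_t utf16 = 0;
  for (std::size_t i = 0; i < utf8.size();) {
    unsigned char lead = static_cast<unsigned char>(utf8[i]);
    // Sequence length from the lead byte:
    // 0xxxxxxx -> 1, 110xxxxx -> 2, 1110xxxx -> 3, 11110xxx -> 4.
    std::size_t len = lead < 0x80 ? 1 : lead < 0xE0 ? 2 : lead < 0xF0 ? 3 : 4;
    offsets.push_back(utf16);
    // 4-byte sequences encode code points above U+FFFF, which need a
    // surrogate pair (2 code units) in UTF-16; everything else needs 1.
    utf16 += (len == 4) ? 2 : 1;
    i += len;
  }
  return offsets;
}
```

For "Lorem 😂😂 ipsum" this gives offsets 6 and 8 for the two emojis and 10 for the space that follows them, which matches the indexes in the log above.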

jlav...@google.com

May 4, 2023, 10:20:50 AM
to skia-discuss
getLineNumberAt does not operate on glyphs. It operates on codepoints (think "characters" in case of English text).
As far as I can see there is another codepoint "\n" between "aaa" and "bbb". This is your fourth 0 index.
You should see 7 indexes for the entire text: 0000111.
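
As a rough illustration of the counting described in this message, the sketch below assigns a line number to every code unit, including the "\n" itself. It is a toy model only, not Skia's implementation: `lineNumbersPerCodeUnit` is a hypothetical name, and it handles hard line breaks only, not wrapping.

```cpp
#include <string>
#include <vector>

// Toy model (not Skia code): one line number per code unit, with each '\n'
// counted on the line it terminates.
std::vector<int> lineNumbersPerCodeUnit(const std::string& text) {
  std::vector<int> lines;
  lines.reserve(text.size());
  int line = 0;
  for (char c : text) {
    lines.push_back(line);  // the '\n' still belongs to the current line
    if (c == '\n') {
      ++line;               // everything after it is on the next line
    }
  }
  return lines;
}
```

For "aaa\nbbb" this yields seven entries, 0 0 0 0 1 1 1, which is the sequence expected above.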

Everdrone

May 4, 2023, 10:51:21 AM
to skia-discuss
You are correct, I didn't count them as UTF-16; I was treating them directly as a "glyph index", so inside the first emoji it would give the emoji index + 2, which I then resolved to the second emoji's x coordinate. Thanks.

Is SkParagraph available on the fiddle? If so, I can share the code there.

As for the line number, here's a screenshot using the same string "Lorem 😂😂 ipsum"
Screenshot 2023-05-04 164354.png
the logs:
[2023-05-04 16:43:57.747] [multi] [trace] U16 index: 0, Line: 0 
[2023-05-04 16:43:57.747] [multi] [trace] U16 index: 1, Line: 0 
[2023-05-04 16:43:57.747] [multi] [trace] U16 index: 2, Line: 0 
[2023-05-04 16:43:57.747] [multi] [trace] U16 index: 3, Line: 0 <-- this one
[2023-05-04 16:43:57.747] [multi] [trace] U16 index: 4, Line: 1
[2023-05-04 16:43:57.747] [multi] [trace] U16 index: 5, Line: 1
[2023-05-04 16:43:57.747] [multi] [trace] U16 index: 6, Line: 1
[2023-05-04 16:43:57.747] [multi] [trace] U16 index: 8, Line: 2
[2023-05-04 16:43:57.747] [multi] [trace] U16 index: 10, Line: 3
[2023-05-04 16:43:57.747] [multi] [trace] U16 index: 11, Line: 3
[2023-05-04 16:43:57.747] [multi] [trace] U16 index: 12, Line: 4
[2023-05-04 16:43:57.747] [multi] [trace] U16 index: 13, Line: 4
[2023-05-04 16:43:57.747] [multi] [trace] U16 index: 14, Line: 4
[2023-05-04 16:43:57.747] [multi] [trace] U16 index: 15, Line: 4


part of the code:
void TextField::layoutText() {
  buildParagraph();

  // layout
  paragraph_->layout(rect_.width());
  paragraph_->ensureUTF16Mapping();

  // create mapping
  textToDisplaySequence_.clear();

  auto paraUnicode = paragraph_->getUnicode();

  for (size_t i = 0; i < text_.size(); i++) {
    auto isGraphemeStart = paragraph_->codeUnitHasProperty(
        i, SkUnicode::CodeUnitFlags::kGraphemeStart
    );

    if (isGraphemeStart) {
      auto lineNumber = static_cast<size_t>(paragraph_->getLineNumberAt(i));

      // find the UTF-8 index where the next grapheme starts (the end of this
      // one), checking the bound before querying the paragraph
      size_t utf8End = i + 1;
      while (utf8End < text_.size() &&
             !paragraph_->codeUnitHasProperty(
                 utf8End, SkUnicode::CodeUnitFlags::kGraphemeStart)) {
        utf8End++;
      }

      auto mapping = TextToDisplayIndex{
          .textIndex = i,
          .utf16Start = paragraph_->getUTF16Index(i),
          .utf16End = paragraph_->getUTF16Index(utf8End),
          .lineNumber = lineNumber,
      };

      getLogger()->trace(
          "U16 index: {}, Line: {}", mapping.utf16Start, mapping.lineNumber
      );

      textToDisplaySequence_.push_back(mapping);
    }
  }
}

jlav...@google.com

May 4, 2023, 11:19:36 AM
to skia-discuss
Yes, you are right: there is a bug in getLineNumberAt and I was able to reproduce it (I will fix it, too).
SkParagraph is available in CanvasKit, so you should be able to use it in a fiddle.

Everdrone

May 4, 2023, 11:28:54 AM
to skia-discuss
For context, here are the header file contents:

struct TextToDisplayIndex {
  size_t textIndex = 0;  // beginning of utf8 grapheme
  size_t utf16Start = 0; // beginning of utf16 grapheme
  size_t utf16End = 0;   // end of utf16 grapheme
  size_t lineNumber = 0; // line number of grapheme
};

// TODO: move somewhere more generic
enum class Direction {
  Left,
  Right,
  Up,
  Down,
};

using TextToDisplaySequence = std::vector<TextToDisplayIndex>;

class TextField : public View {
public:
  struct Caret {
    size_t index = 0;
    bool afterLast = false;
    SkRect rect;
  };

  TextField(esd::AppContext *ctx);
  ~TextField() = default;

  TextField *setText(const std::u16string &text);
  TextField *setText(const std::string &text);

  void onDraw(SkCanvas *canvas, const SkRect &rect) override;

private:
  std::string text_;
  Caret caret_;
  SimpleTimer blinkTimer_;

  bool paragraphDirty_;

  skia::textlayout::TextStyle textStyle_;
  skia::textlayout::ParagraphStyle paragraphStyle_;
  std::unique_ptr<skia::textlayout::ParagraphImpl> paragraph_;
  std::unique_ptr<skia::textlayout::ParagraphBuilderImpl> builder_;

  TextToDisplaySequence textToDisplaySequence_;

  void layoutText() override;
  void buildParagraph();

  void drawCursor(SkCanvas *canvas);
  void moveCursor(const Direction &direction);
  void insertText(const std::string &utf8);
  void removeBackward();
  void removeForward();
};

Everdrone

May 4, 2023, 11:36:37 AM
to skia-discuss
Thank you!
Which branch is going to contain the fix? Is it chrome/m114?

Julia Lavrova

May 4, 2023, 12:04:30 PM
to skia-d...@googlegroups.com
I submit changes to the Skia main branch. I don't know when it will reach a release branch (or which one),
but I will try to fix it by next week, so it should appear in main on Monday... hopefully.


Everdrone

May 5, 2023, 4:16:16 PM
to skia-discuss
Hi, I just checked out commit d749e1ca8c5 with the fix you pushed today.

I'm seeing a problem with the following string 'Hel\nlo' and I'm not sure if it's on my end or not.
For '\n' at position 2, which is supposed to be on either line 0 or 1, I see a strange value.
I'm casting the return value to size_t, so I assume that 18446744073709551615 means that getLineNumberAt is returning -1 for '\n'.
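
That reading can be checked without Skia: converting a signed -1 to an unsigned 64-bit integer wraps around to 2^64 - 1. The sketch below assumes the function returns a signed `int` and that `size_t` is 64 bits wide (hence the explicit `std::uint64_t`); `asUnsigned64` and `checkedLine` are hypothetical helper names. The second helper shows one way to test for a negative value before casting.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>

// Shows what a negative line number looks like after an unsigned 64-bit
// cast, which is what static_cast<size_t> does on a typical 64-bit platform.
std::string asUnsigned64(int lineNumber) {
  return std::to_string(static_cast<std::uint64_t>(lineNumber));
}

// Hypothetical guard: keep "no line" distinct from a valid index by checking
// the sign before converting.
std::optional<std::size_t> checkedLine(int lineNumber) {
  if (lineNumber < 0) {
    return std::nullopt;
  }
  return static_cast<std::size_t>(lineNumber);
}
```

Here `asUnsigned64(-1)` produces "18446744073709551615", the value seen in the log, while `checkedLine(-1)` returns an empty optional instead of a huge index.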

jlav...@google.com

May 5, 2023, 4:32:47 PM
to skia-discuss
SkParagraph treats '\n' (and some other controls) as formatting marks and does not consider them as text. 
I will try to think about a way to make it less inconvenient for a user but truly there is no place for '\n' on any line.
From your example I can see that you are trying to implement cursor movement/editing functionality.
I think you should edit the shaped text (meaning - operate on glyphs) rather than the initial representation (code points).
There is no glyph for '\n' in the drawn text.
Also, some glyphs are represented by several bytes/characters/code points (whatever you call them), and you are going to have a very hard time with them.
SkParagraph is not exactly an ideal tool for implementing editing functionality but Flutter does that and I also tried it myself.

Everdrone

May 5, 2023, 5:02:10 PM
to skia-discuss
Thanks for the insights.

Yeah, initially I tried looking at sktext, but I couldn't get the demos (sktext/slides and sktext/editor) to run without a bunch of errors on Windows.

Moving the cursor around is indeed tricky using the code points.
To keep it simple, I tried creating an array that maps each glyph to its UTF-8 and UTF-16 index, to translate indices back and forth for SkParagraph methods.
So far so good, except for some gotchas (like you mentioned, multi-code-point glyphs) which I'm trying to manage.

I'll see what I can do with the current implementation of SkParagraph. I also agree that '\n' has no real place on any line.
I think I'll assume the previous glyph's line number for "-1" values returned by getLineNumberAt.
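
A minimal sketch of that fallback, assuming the line numbers have already been collected into a vector of signed integers in text order. `inheritLineNumbers` is a hypothetical name, and using line 0 for a negative value that appears before any valid one is an assumption of this sketch.

```cpp
#include <vector>

// Hypothetical post-processing step: replaces negative line numbers (as
// reported for '\n' in this thread) with the line number of the nearest
// preceding valid entry. Negative entries at the very start fall back to 0.
std::vector<int> inheritLineNumbers(std::vector<int> lines) {
  int previous = 0;
  for (int& line : lines) {
    if (line < 0) {
      line = previous;   // e.g. '\n' takes the line of the glyph before it
    } else {
      previous = line;   // remember the last valid line number
    }
  }
  return lines;
}
```

For 'Hel\nlo', an input of {0, 0, 0, -1, 1, 1} becomes {0, 0, 0, 0, 1, 1}, so the newline is treated as the end of line 0.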

Thanks again for your time