canvaskit-wasm: memory leak when using paragraph builder `fontVariations` text style with some fonts

217 views
Skip to first unread message

Alan Song

unread,
Apr 15, 2024, 1:24:22 PM4/15/24
to skia-discuss

This is a strange bug that requires many conditions at the same time to reproduce

  • Builder have `fontVariations` textStyle set either through constructor or pushStyle() 
  • Builder is used to build at least one paragraph with layout() call
  • Certain fonts are used (currently found NotoSansCJK and SourceHanSans). Roboto works fine.

Reproduction:  https://alansongzoom.github.io/canvaskit-memleak-repro

Take a memory heap snapshot with dev tools after js execution finishes (“Loading…” text removed). Didn’t use jsfiddle because fonts can’t be uploaded, so please copy the script to change the used font.

Mem leak: 
When using SourceHanSans (file is variable font), js heap size is 2165MB
paragraph-sourcehansans.png
When using NotoSansCJK (file is not variable font), js heap size is 424MB
paragraph-notosanscjk.png

Normal:
When using Roboto (file is not variable font), js heap size is 138MB which is the initial size.
paragraph-roboto.png


-------------------

As a side note, performance with SourceHanSans variable font using simple input of “test” is extremely slow - taking 17 seconds to for the tests - seems to all be stuck on sfnt_init_face calls. 

paragraph-sourcehansans-profiling.pngNotoSansCJK takes ~240ms for the test while Roboto takes ~120ms. 

Alan Song

unread,
Apr 24, 2024, 3:26:49 PM4/24/24
to skia-discuss

Sorry I completely forgot to explain what I’m doing in the tests.

The test was:

  • Create a TypefaceFontProvider
  • In a loop that runs 500 times:
    • Create a builder with the provider, build a paragraph of simple string “test”, layout
    • Delete builder, delete paragraph

The expected js heap size should be the initial size (~150MB), because we’re deleting everything (builder/paragraph) every time, and we’re using the same small text input (“test”). 

The actual result is memory leak - js heap taking 400MB to 2GB depending on the font use when  `fontVariation` options was passed, and it grows with number of iterations. Both variable and non-variable fonts trigger the leak, so it might be related to the font formats.


The test script is located at https://github.com/alansongzoom/canvaskit-memleak-repro/blob/main/index.html , copying here for convenience:

<script type="text/javascript" src="https://unpkg.com/canvask...@0.39.1/bin/profiling/canvaskit.js"></script>
<script type="text/javascript">
const ckLoaded = CanvasKitInit({ locateFile: (file) => "https://unpkg.com/canvask...@0.39.1/bin/profiling/" + file });

/* Fonts that cause memleak when used with `fontVariations` */

const fontLoaded = fetch("./NotoSansCJK-Regular.ttc").then((response) =>
response.arrayBuffer(),
);
// From https://github.com/adobe-fonts/source-han-sans/releases -> Variable OTF/TTF/OTC/WOFF2
// const fontLoaded = fetch("./SourceHanSans-VF.otf.woff2").then((response) => response.arrayBuffer());

/* Font that works fine when `fontVariations` is passed */
// const fontLoaded = fetch("https://storage.googleapis.com/skia-cdn/misc/Roboto-Regular.ttf").then((response) => response.arrayBuffer());

const p = document.createElement("p");
document.body.append(p);

p.innerText = "Loading...";

async function main() {
const [CanvasKit, font] = await Promise.all([ckLoaded, fontLoaded]);
window.CanvasKit = CanvasKit;
const fontProvider = CanvasKit.TypefaceFontProvider.Make();
fontProvider.registerFont(font, "SomeFont");

// Repeatedly create a builder and use it to generate a paragraph, and then delete the builder and paragraph
// If there's no memleak, the JS heap size would remain ~150MB, when there is, it could be as high as 2GB
for (let i = 0; i < 500; i++) {
const builder = CanvasKit.ParagraphBuilder.MakeFromFontProvider(
new CanvasKit.ParagraphStyle({ textStyle: {} }),
fontProvider,
);

builder.pushStyle(
new CanvasKit.TextStyle({
fontFamilies: ["SomeFont"],
fontVariations: [{ axis: "wght", value: 700 }], // required to reproduce the mem leak
}),
);

builder.addText("test"); // required to reproduce the mem leak
const paragraph = builder.build(); // required to reproduce the mem leak
paragraph.layout(200); // required to reproduce the mem leak

paragraph.delete();
builder.delete();
}

p.innerText = `Use Devtools "Memory" tab to take heap snapshot`;
}
main();
</script>

Alan Song

unread,
May 3, 2024, 8:41:48 PM5/3/24
to skia-discuss
It seems that using `fontCollection` completely avoids the memory leak issue. Just replace `MakeFromFontProvider` with `MakeFromFontCollection` and set up the corresponding fontCollection for it.

Looking at the source, it seems `Make`(with fontMgr) and `MakeFromFontProvider` implicitly creates a FontCollection for you, so maybe that's where the memory leak is coming from when repeatedly creating/deleting a builder. 
https://skia.googlesource.com/skia/+/81ccf3ab27ee093cfe8c6896e7b75fc263b221e2/modules/canvaskit/paragraph_bindings.cpp#555

class_<para::ParagraphBuilderImpl>("ParagraphBuilder")


.class_function(


"_Make",


optional_override([](SimpleParagraphStyle style, sk_sp<SkFontMgr> fontMgr)


-> std::unique_ptr<para::ParagraphBuilderImpl> {


auto fc = sk_make_sp<para::FontCollection>();


fc->setDefaultFontManager(fontMgr);


fc->enableFontFallback();


auto ps = toParagraphStyle(style);


auto pb = para::ParagraphBuilderImpl::make(ps, fc, get_unicode());


return std::unique_ptr<para::ParagraphBuilderImpl>(


static_cast<para::ParagraphBuilderImpl*>(pb.release()));


}),


allow_raw_pointers())


.class_function(


"_MakeFromFontProvider",


optional_override([](SimpleParagraphStyle style,


sk_sp<para::TypefaceFontProvider> fontProvider)


-> std::unique_ptr<para::ParagraphBuilderImpl> {


auto fc = sk_make_sp<para::FontCollection>();


fc->setDefaultFontManager(fontProvider);


fc->enableFontFallback();


auto ps = toParagraphStyle(style);


auto pb = para::ParagraphBuilderImpl::make(ps, fc, get_unicode());


return std::unique_ptr<para::ParagraphBuilderImpl>(


static_cast<para::ParagraphBuilderImpl*>(pb.release()));


}),


allow_raw_pointers())



Julia Lavrova

unread,
May 6, 2024, 9:32:41 AM5/6/24
to skia-d...@googlegroups.com
You are right, there is a memory leak somewhere there, I think exactly in the line #555.
Of course, I need to confirm that (and to fix if there is anything to fix, not sure).
Also, FontCollection is designed to be a singleton.
It keeps the shaped text cache and we want that cache to be a single one for all the paragraphs.
I hope that for now you can use MakeFromFontCollection and avoid this memory leak.

--
You received this message because you are subscribed to the Google Groups "skia-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to skia-discuss...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/skia-discuss/e31530ee-74cf-4083-a533-effe9be3953cn%40googlegroups.com.

Alan Song

unread,
May 6, 2024, 6:58:01 PM5/6/24
to skia-discuss

Thank you for the detailed explanation, now I finally seem to understand the problem:

As you said, FontCollection should be a singleton, so paragraph builders with FontMgr or TypefaceFontProvider should always be using the same FontCollection instance.

However, that seems not the case (multiple FontCollections being created), which lead to the unresolvedCodepoints() behavior confusion in https://groups.google.com/g/skia-discuss/c/53A-RJjNzx4:

  • As you said, FontCollection caches the shaped text, and there seems to be a bug when cached result is returned, unresolvedCodepoints() returns empty, because it depends on a fresh text shaping process. This behavior explains this fiddle https://jsfiddle.net/83heb5fz/ where multiple calls to unresolvedCodepoints() with the same input (“一二三”) only returns result on the first call. If a fresh text input is provided (“一二三四”) it returns correct result because it hasn’t been previously cached.
  • This is also hinted at by the profiling screenshot in my original post - `sfnt_init_face` is called everytime a new paragraph builder is created, which suggests a new FontCollection is parsing the font binary. When switching to MakeFromFontCollection with a js-controlled singleton FontCollection, there is no more repeated font parsing.

As for the FontCollection bug for unresolvedCodepoints(), tracing backwards in the code, the logic seems to be:
  • paragraph.unresolvedCodepoints() returns its `fUnresolvedCodepoints`
  • `fUnresolvedCodepoints` is added by paragraph.addUnresolvedCodepoints()
  • paragraph.addUnresolvedCodepoints() is called by OneLineShaper::finish()
  • OneLineShaper::finish() <- oneLineShaper.shape() <- paragraph.shapeTextIntoEndlessLine()
  • And paragraph.shapeTextIntoEndlessLine()  is only called when `!fFontCollection->getParagraphCache()->findParagraph(this)` which means no paragraph cache is found
So when `fFontCollection` has a cache, unresolvedCodepoints will return empty because it’s not initialized by the shaping process. I think this matches the explanation from your original reply in https://groups.google.com/g/skia-discuss/c/53A-RJjNzx4/m/ySl8NY-AAwAJ which I didn’t fully understand at the time.

Unfortunately we can’t switch to FontCollection because we’re now inadvertently depending on the first bug which is new fontCollections being created every time for new builders, which allowed us to get the unresolvedCodePoints() for the same text input.

We need to call unresolvedCodePoints() for same text input because our fonts are loaded asynchronously. For any given text, unresolvedCodePoints() returning non-empty set indicates missing fonts, and it’ll be marked for re-shaping after new fonts are loaded. So we need to repeatedly check unresolvedCodePoints() until it returns empty, or a set where all code points are outside the unicode range of all our available fonts. So if it returns empty on the second call, we can't tell if we need to wait for new fonts.
Reply all
Reply to author
Forward
0 new messages