Issue 7429 in angleproject: linux-tsan-test very flaky since 1423


ynovi… via monorail

Jun 13, 2022, 1:59:39 PM
to angleproj...@googlegroups.com
Status: Available
Owner: ----
OS: Linux
Priority: Medium
Renderer: SwiftShader
Type: Defect

New issue 7429 by ynov...@chromium.org: linux-tsan-test very flaky since 1423
https://bugs.chromium.org/p/angleproject/issues/detail?id=7429

From https://ci.chromium.org/p/angle/builders/ci/linux-tsan-test/1423
to https://ci.chromium.org/p/angle/builders/ci/linux-tsan-test/1565
47 out of 142 builds failed.
Bot was fairly green before that.

The builds fail because TSAN cannot allocate memory:
==12016==ERROR: ThreadSanitizer failed to allocate 0xf70800083000 (271613732335616) bytes at address 80000013000 (errno: 12)

I'll try to see if something changed in ANGLE or SwiftShader memory allocation.
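
For scale, the numbers in the TSAN error line above can be decoded with a short Python sketch. The values are copied from the log; the variable names are illustrative only, and the errno lookup assumes a platform where errno 12 maps to ENOMEM (as on Linux).

```python
import errno

# Values copied from the TSAN error line in this report.
requested_hex = "0xf70800083000"
requested_dec = 271613732335616
errno_value = 12

size_bytes = int(requested_hex, 16)
# Confirm that the hex and decimal sizes printed in the log agree.
sizes_agree = size_bytes == requested_dec

size_tib = size_bytes / 2**40
# errno.errorcode maps numeric codes to symbolic names on the current platform.
errno_name = errno.errorcode.get(errno_value, "unknown")

print(f"requested: {size_bytes} bytes (~{size_tib:.2f} TiB), sizes agree: {sizes_agree}")
print(f"errno {errno_value}: {errno_name}")
```

The request works out to roughly 247 TiB, and errno 12 is ENOMEM.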


jmad… via monorail

Jun 13, 2022, 2:06:38 PM
to angleproj...@googlegroups.com
Updates:
Components: Infra

Comment #1 on issue 7429 by jmad...@chromium.org: linux-tsan-test very flaky since 1423
https://bugs.chromium.org/p/angleproject/issues/detail?id=7429#c1

Known flaky build. We should update our retry mechanism to handle flaky crashes.

ynovi… via monorail

Jun 13, 2022, 2:29:09 PM
to angleproj...@googlegroups.com

Comment #2 on issue 7429 by ynov...@chromium.org: linux-tsan-test very flaky since 1423
https://bugs.chromium.org/p/angleproject/issues/detail?id=7429#c2

Oh, I see.
The problem was always present.
But if you look at older builds,
https://ci.chromium.org/p/angle/builders/ci/linux-tsan-test?cursor=id%3E8817715526820005521&limit=200
only 5 out of 200 failed.

So something changed that makes the failures much more frequent now.

ynovi… via monorail

Jun 13, 2022, 2:36:27 PM
to angleproj...@googlegroups.com

Comment #3 on issue 7429 by ynov...@chromium.org: linux-tsan-test very flaky since 1423
https://bugs.chromium.org/p/angleproject/issues/detail?id=7429#c3

After build 1071 there were 29 failures out of 200 = 14.5%
https://ci.chromium.org/p/angle/builders/ci/linux-tsan-test?cursor=id%3E8814622042173074097&limit=200

And now, after 1423, there are 47 out of 142 = 33%
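
As a quick check of the arithmetic, here is a minimal Python sketch that recomputes the failure rates from the build counts quoted in this thread (including the 5 out of 200 figure from the earlier comment). The labels and the `failure_rate` helper are illustrative names, not part of any ANGLE tooling.

```python
# (label, failed builds, total builds) as reported in this thread.
samples = [
    ("older builds", 5, 200),
    ("after build 1071", 29, 200),
    ("after build 1423", 47, 142),
]


def failure_rate(failed: int, total: int) -> float:
    """Return the failure rate as a percentage."""
    return 100.0 * failed / total


rates = {label: failure_rate(failed, total) for label, failed, total in samples}
for label, rate in rates.items():
    print(f"{label}: {rate:.1f}%")
```

This gives 2.5%, 14.5%, and about 33.1% for the three windows.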

syous… via monorail

Jun 14, 2022, 3:38:58 PM
to angleproj...@googlegroups.com

Comment #4 on issue 7429 by syou...@chromium.org: linux-tsan-test very flaky since 1423
https://bugs.chromium.org/p/angleproject/issues/detail?id=7429#c4

For reference, I have this WIP change that enables flaky retries: https://chromium-review.googlesource.com/c/angle/angle/+/3696828 but in the end we decided to remove TSAN from CQ instead. It's unclear whether flaky retries work with TSAN.

ynovi… via monorail

Jun 14, 2022, 3:41:50 PM
to angleproj...@googlegroups.com

Comment #5 on issue 7429 by ynov...@chromium.org: linux-tsan-test very flaky since 1423
https://bugs.chromium.org/p/angleproject/issues/detail?id=7429#c5

I think retries should be the last resort.
Right now the frequency of failures is high enough to try to reproduce them and figure out why they are happening.

syous… via monorail

Jun 14, 2022, 4:03:13 PM
to angleproj...@googlegroups.com

Comment #6 on issue 7429 by syou...@chromium.org: linux-tsan-test very flaky since 1423
https://bugs.chromium.org/p/angleproject/issues/detail?id=7429#c6

I did try to reproduce it locally, with no luck.

jmad… via monorail

Jun 15, 2022, 10:12:33 AM
to angleproj...@googlegroups.com

Comment #7 on issue 7429 by jmad...@chromium.org: linux-tsan-test very flaky since 1423
https://bugs.chromium.org/p/angleproject/issues/detail?id=7429#c7

This is issue 1275223 - I don't think the flakiness is due to TLS conflicts like initially indicated in the issue. I think it's some other bug in TSAN. Because it seems like a bug in TSAN, and coverage is still helpful, I was suggesting we find a way to use flaky suppressions to mitigate the bug. If you think you can push the root cause fix though, Yuly, feel free to try and drive that.

jmad… via monorail

Jun 15, 2022, 10:12:41 AM
to angleproj...@googlegroups.com

Comment #8 on issue 7429 by jmad...@chromium.org: linux-tsan-test very flaky since 1423
https://bugs.chromium.org/p/angleproject/issues/detail?id=7429#c8

sorry, issue chromium:1275223

ynovi… via monorail

May 26, 2023, 9:32:35 AM
to angleproj...@googlegroups.com
Updates:
Status: Fixed

Comment #9 on issue 7429 by ynov...@chromium.org: linux-tsan-test very flaky since 1423
https://bugs.chromium.org/p/angleproject/issues/detail?id=7429#c9

The responsible TSAN bug was fixed and retries were removed in issue chromium:1275223.