Telemetry unittests timeouts.

ki...@yandex-team.ru

unread,

Feb 26, 2018, 12:26:02 PM2/26/18

to telemetry

Hello.

What to you think about supporting timeouts for telemetry tests runner?

We encounter some problems with long running telemetry tests. And seems like it is actual also for chromium.
https://bugs.chromium.org/p/chromium/issues/detail?id=570955

Problems:
1. Some test can hold a process pool worker and then the whole build timed out. If there is more then one such test, some test may not be run at all.
2. There is some long-running unit tests (like system_health stories) and there is no way to disable it automatically.

I'm added timeout support in typ runner locally.
Hope it's also needed in chromium. Or may be you have some other solution for it.
https://github.com/dpranke/typ/pull/10/files

Ned

unread,

Feb 26, 2018, 12:27:44 PM2/26/18

to ki...@yandex-team.ru, dpr...@google.com, k...@google.com, telemetry

+Dirk Pranke +Kenneth Russell

This seems like a great feature to add to typ to me.

--
You received this message because you are subscribed to the Google Groups "telemetry" group.
To unsubscribe from this group and stop receiving emails from it, send an email to telemetry+...@chromium.org.

Ned

unread,

Feb 26, 2018, 12:29:21 PM2/26/18

to ki...@yandex-team.ru, dpr...@google.com, k...@google.com, telemetry

I just look at your pull commit. I think it would be better if we can specify timeout per test. Some tests are intentionally long, so it would be great to be able to set timeout for those differently.

Dirk Pranke

unread,

Feb 26, 2018, 12:29:34 PM2/26/18

to Ned, ki...@yandex-team.ru, k...@google.com, telemetry

Yes, I think it's generally useful and needed, I've just never gotten around to implementing it.

-- Dirk

To unsubscribe from this group and stop receiving emails from it, send an email to telemetry+unsubscribe@chromium.org.

Dirk Pranke

unread,

Feb 26, 2018, 12:32:15 PM2/26/18

to Ned, ki...@yandex-team.ru, k...@google.com, telemetry

I understand the thinking, but I would resist that as much as you can, because adding different timeouts adds a whole additional layer of configuration logic and maintenance.

-- Dirk

To unsubscribe from this group and stop receiving emails from it, send an email to telemetry+unsubscribe@chromium.org.

Ned

unread,

Feb 26, 2018, 12:49:15 PM2/26/18

to Dirk Pranke, ki...@yandex-team.ru, k...@google.com, telemetry

I talked with Dirk offline. There maybe some legit use cases for specifying timeout per test level as many Telemetry test suites are integration tests, and the timeout range can be widely varied.

Having said that, I think adding adding a uniform timeout is 100% better than it's now and allow us to better deal with stuck Telemetry tests, so the idea with the pull request is LGTM. I defer the actual code review to Dirk :-)

To unsubscribe from this group and stop receiving emails from it, send an email to telemetry+...@chromium.org.

Кирилл Косарев

unread,

Feb 27, 2018, 5:21:52 AM2/27/18

to telemetry, dpr...@google.com, ki...@yandex-team.ru, k...@google.com

Of course it's necessary to support timeout per test.

Here is actual telemetry changes and simple timeout decorator (may be it should also support timeout per platform). Typ-part of this is already exists in PR.

https://gist.github.com/kirr/022c26a34ad8aeeddf6dc87e84ff3261.

I also agree that it's pretty ambiguous to apply such changes in tests runner from the outside.

I just want to promote this feature in telemetry tests :)

понедельник, 26 февраля 2018 г., 20:49:15 UTC+3 пользователь Ned Nguyen написал:

Juan Antonio Navarro Pérez

unread,

Feb 27, 2018, 5:53:09 AM2/27/18

to Кирилл Косарев, telemetry, dpr...@google.com, ki...@yandex-team.ru, k...@google.com

My thoughts:

In an ideal world, timeouts at the typ/test-runner level should not be needed. Individual methods within Telemetry that may hang (e.g. waiting for a connection to be ready or a command to finish) should have sensible timeouts of at most a minute or two. So we should err-out early and close to the thing that is actually hanging.
The world is not ideal, so I'm ok with typ having some blanket hard timeout on all tests so they don't hang and run forever. Say, kill tests at 20-30 mins or something we deem admissible. Note, though, that a test hitting this timeout should be considered a bug in Telemetry. Whatever caused us to hang for too long should be fixed.
And the world is messy, so some tests might have legitimate reasons for running longer than that. So it could be fine to have some mechanism for a single test to opt-out from the hard timeout limit. Note, though, this should be exceedingly rare. We shouldn't allow more than a handful of tests doing this.

Dirk Pranke

unread,

Feb 27, 2018, 12:28:07 PM2/27/18

to Juan Antonio Navarro Pérez, Кирилл Косарев, telemetry, ki...@yandex-team.ru, Kenneth Russell

It's entirely reasonable to have an overall timeout for the old test suite, and a timeout per test, and I think we should support that.

My point was that having the ability for two different tests to have different timeouts (e.g., 6s for a fast test and 120s for a slow test) adds an additional level of complexity that may not be warranted so I'd advise we try to start without that and see if that's good enough. (This is basically agreeing with Juan, I think).

I probably would not want to support something with as much flexibility as the decorator in the patch, but I haven't had time to really look at the patch yet. One could argue that typ as a generic framework should offer that functionality, even if we didn't want to use it in telemetry.

-- Dirk

To unsubscribe from this group and stop receiving emails from it, send an email to telemetry+unsubscribe@chromium.org.

Reply all

Reply to author

Forward