Hardware Reliability Tests: How reliable are the tests mentioned on the wiki?


Timo Sigurdsson

Nov 4, 2015, 12:39:22 PM
to linux...@googlegroups.com
Hi,

since I have an extra A20 board that doesn't have a specific purpose at the moment, I thought I'd do some experiments with it and test performance and reliability (among other things).
One of my tests involved overclocking the board at stock voltage (1.4V) to see what the board can do. So I used cpufreq-ljt-stress-test and cpuburn-a7 as mentioned on the wiki[1]. What surprised me, though, was that the stress test neither runs for very long nor loads both cores simultaneously. Still, the test results suggest that my board can do 1104MHz at 1.4V (I didn't try higher frequencies because I didn't even expect it to run stably at 1104MHz without raising the voltage).

But I'm wondering - how reliable are these tests actually? I would have assumed that for the results to be meaningful it would be best to put as much stress on the CPU as possible and do that over a prolonged period. And if you have multiple cores, to put them under load simultaneously. Is this assumption wrong? Is such extensive testing negligible in real life?

Since I didn't trust the quick test, I changed the script to run 600 iterations instead of 60. But, of course, that doesn't make it run on two cores. So, while running cpufreq-ljt-stress-test, I also ran cpuburn-a7 in the background, which put both cores under load. The board still passed the test.

What kind of tests or setups do you use to determine reliable settings?


Thanks,

Timo


[1] http://linux-sunxi.org/Hardware_Reliability_Tests

null

Nov 4, 2015, 2:07:40 PM
to linux-sunxi, public...@silentcreek.de
In my tests, an A20 with a small heatsink can run at 1GHz 24/7 at 1.275V under prolonged heavy load (emerge world, Gentoo).

Without a heatsink, it is unstable even at 0.8GHz.

My dmesg:
[cpu_freq] INF:  voltage = 1625mv        frequency = 1296MHz
[cpu_freq] INF:  voltage = 1475mv        frequency = 1200MHz
[cpu_freq] INF:  voltage = 1275mv        frequency = 1008MHz
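
For comparing operating points like these across boards, the voltage/frequency pairs can be pulled out of the log programmatically. The following is only a minimal Python sketch, assuming the log lines keep exactly the format shown above; the function name parse_operating_points is made up for illustration.

```python
import re

# Matches lines such as:
#   [cpu_freq] INF:  voltage = 1625mv        frequency = 1296MHz
# The pattern is an assumption based on the three lines quoted in this thread.
LINE_RE = re.compile(
    r"voltage\s*=\s*(\d+)\s*mv\s+frequency\s*=\s*(\d+)\s*MHz",
    re.IGNORECASE,
)


def parse_operating_points(dmesg_text):
    """Return (frequency_mhz, voltage_mv) tuples, highest frequency first."""
    points = set()
    for line in dmesg_text.splitlines():
        match = LINE_RE.search(line)
        if match:
            voltage_mv = int(match.group(1))
            freq_mhz = int(match.group(2))
            points.add((freq_mhz, voltage_mv))
    return sorted(points, reverse=True)


sample = """\
[cpu_freq] INF:  voltage = 1625mv        frequency = 1296MHz
[cpu_freq] INF:  voltage = 1475mv        frequency = 1200MHz
[cpu_freq] INF:  voltage = 1275mv        frequency = 1008MHz
"""

for freq_mhz, voltage_mv in parse_operating_points(sample):
    print(f"{freq_mhz} MHz at {voltage_mv / 1000:.3f} V")
```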

On Wednesday, November 4, 2015 at 20:39:22 UTC+3, Timo Sigurdsson wrote:

Timo Sigurdsson

Nov 4, 2015, 4:58:32 PM
to linux...@googlegroups.com
I did some more testing and can answer the main question myself now:

I reran my tests again, this time with frequencies up to 120...@1.4V.
The unchanged cpufreq-ljt-stress-test fails at 1200MHz but passes at
1152MHz.

If I extend the test by increasing the iterations to 600 and have
cpuburn-a7 running in the background while running the stress test,
the 1152MHz setting runs fine for quite a while but eventually fails on
the second core at a late stage - just a few iterations before it
would have passed. At 1104MHz it still passes the test.

So I guess it's better to take the test result of the unmodified script
with a grain of salt and test with more intensive workloads.
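
To make that reasoning explicit: a frequency should only count as stable if it passes every test variant that was tried, not just the quickest one. Here is a small Python sketch using the pass/fail results reported in this thread. The dictionary layout and the function name highest_stable_frequency are purely illustrative and not part of any tool mentioned here.

```python
# Pass/fail results at 1.4V as reported in this thread.
# Keys are frequencies in MHz; True means the test passed.
results = {
    "unmodified script (60 iterations)": {1104: True, 1152: True, 1200: False},
    "600 iterations + cpuburn-a7 in background": {1104: True, 1152: False},
}


def highest_stable_frequency(results):
    """Return the highest frequency that every test variant tried and passed.

    Frequencies that were not exercised by all variants are ignored,
    because there is no evidence for them under the harder test.
    Returns None if no frequency qualifies.
    """
    if not results:
        return None
    tried_by_all = set.intersection(*(set(r) for r in results.values()))
    passing = [f for f in tried_by_all if all(r[f] for r in results.values())]
    return max(passing) if passing else None


print(highest_stable_frequency(results))
```

With these inputs, 1152MHz is rejected because the longer two-core run failed there, even though the quick test passed.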

I'd still be interested to hear what kind of reliable test setups are
used or recommended. Maybe we can add some recommendations to the wiki
then.

Thanks,

Timo

Timo Sigurdsson

Nov 4, 2015, 8:02:31 PM
to null, linux-sunxi
Hi,

On Wednesday, 04.11.2015, at 11:07 -0800, null wrote:
> In my tests, an A20 with a small heatsink can run at 1GHz 24/7 at 1.275V
> under prolonged heavy load (emerge world, Gentoo).
>
>
> Without a heatsink, it is unstable even at 0.8GHz.
>
>
> my dmesg:
> [cpu_freq] INF: voltage = 1625mv frequency = 1296MHz
> [cpu_freq] INF: voltage = 1475mv frequency = 1200MHz
> [cpu_freq] INF: voltage = 1275mv frequency = 1008MHz

That's interesting, although it doesn't really answer my question about
how reliable the mentioned tools are. In the meantime, I found out that
their results are probably better not treated as reliable in absolute
terms. But I still think they are useful because they validate the
results of their calculations. I also compiled a kernel earlier to
put the device under heavy load, but there you lack the validation part
(other than knowing the system didn't crash). Is that different with
Gentoo/emerge? (Sorry, not a Gentoo user.)

Regards,

Timo


P.S.: Sorry. I forgot to include the mailinglist when I first sent this reply - so this is a "resend".



null

Nov 6, 2015, 1:39:04 PM
to linux-sunxi, opit...@gmail.com, public...@silentcreek.de
Gentoo is a source-based, rolling-release distro.
You constantly need to compile something to stay up to date :)

As for cpuburn-a7 - it really is a stress test.

For me it is reliable, because in the real world with real software (even on Gentoo), I can't stress the A20 as much as cpuburn-a7 can.

On Thursday, November 5, 2015 at 4:02:31 UTC+3, Timo Sigurdsson wrote: