libuv test suite documentation

ErnieOnTheRun

Jul 11, 2013, 1:46:45 AM
to li...@googlegroups.com
Is there any documentation for the libuv test suite (the unit tests run with `make test`)? I have started working with a number of the unit tests and need to understand in more detail what is tested and why the tests were designed the way they are.

Thanks.

Ben Noordhuis

Jul 11, 2013, 6:35:07 AM
to li...@googlegroups.com
There is no documentation but if you have questions, I'll be happy to
answer them.

ErnieOnTheRun

Jul 12, 2013, 7:28:56 AM
to li...@googlegroups.com
Great, thx.

For now, I am running this on Linux (CENTOS 5.5 (32 bit))

**************************************
My questions:

1) "test-fs"
Obviously, the "chown" related test results will differ depending on which user executes them (root or another user). I might have to execute this as root and am wondering if I can just skip this test.

What exactly do "uv_fs_chown" and "uv_fs_fchown" do? (I looked them up in the uv.h header but did not find any explanations.) And what is the significance of the tests in this file?

2) "test-loop"
I got an inconsistent error at L.65:
 ASSERT(prepare_called == 3);
maybe 3/5 times it was failing. Could you explain WHY the counter should be 3 here ? How can it fail "sometimes" ?

3) "signal_multiple_loops"
The test was (sometimes) timing out.
After changing the timeout setting in "run_tests" from
#define TEST_TIMEOUT  5000
to
#define TEST_TIMEOUT  50000
it passes.
What is the significance of these times and how did you come up with them ?
**************************************
Thanks a lot for your help.

Ben Noordhuis

Jul 12, 2013, 1:01:31 PM
to li...@googlegroups.com
On Fri, Jul 12, 2013 at 1:28 PM, ErnieOnTheRun
<spanger...@gmail.com> wrote:
> Great, thx.
>
> For now, I am running this on Linux (CENTOS 5.5 (32 bit))
>
> **************************************
> My questions:
>
> 1) "test-fs"
> Obviously, the "chown" related test results will differ depending on which
> user
> executes them (root OR other). I might have to execute this as root and am
> wondering if I can just
> skip this test.
>
> what exactly is the "uv_fs_chown" & "uv_fs_fchown" doing ? (I looked them up
> in the uv.h header but did not find any explanations) & what is the
> significance of the tests in this file ?

They're thin bindings to chown(2) and fchown(2), i.e. they change file
ownership through a path or file descriptor.
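
For readers who have not used the underlying system calls, here is a minimal sketch of what the two bindings wrap, written against plain POSIX rather than libuv. The helper names (chown_path_to_self, chown_fd_to_self, chown_demo) and the /tmp scratch path are made up for illustration. The sketch only changes ownership to the caller's own effective uid/gid, which an unprivileged user is allowed to do; changing a file to a different owner normally requires root, which is why chown-related results depend on who runs them.

```c
#define _POSIX_C_SOURCE 200809L /* expose POSIX declarations in strict -std= modes */
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: change ownership through a path, like chown(2). */
int chown_path_to_self(const char *path) {
  return chown(path, geteuid(), getegid());
}

/* Hypothetical helper: change ownership through an open descriptor,
 * like fchown(2). */
int chown_fd_to_self(int fd) {
  return fchown(fd, geteuid(), getegid());
}

/* Creates a scratch file, applies both calls and checks the owner.
 * Returns 0 if every step succeeded, -1 otherwise. */
int chown_demo(void) {
  char path[] = "/tmp/chown-demo-XXXXXX"; /* assumed scratch location */
  struct stat st;
  int ok;
  int fd = mkstemp(path);

  if (fd == -1)
    return -1;

  ok = chown_path_to_self(path) == 0 &&
       chown_fd_to_self(fd) == 0 &&
       fstat(fd, &st) == 0 &&
       st.st_uid == geteuid();

  close(fd);
  unlink(path);
  return ok ? 0 : -1;
}
```

Calling chown_demo() should return 0 for an ordinary user. Passing some other user's uid to chown() or fchown() would be expected to fail with EPERM unless the process is privileged.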

> 2) "test-loop"
> I got an inconsistent error at L.65:
> ASSERT(prepare_called == 3);
> maybe 3/5 times it was failing. Could you explain WHY the counter should be
> 3 here ? How can it fail "sometimes" ?

I don't know, it always passes for me. It could be a timing issue -
the test uses a timer with a 100 ms interval to force the event loop
to cycle. If the system that you're testing on is under load or
virtualized, then it may not be very reliable.

You can run the test in gdb with `gdb --args path/to/run-tests
loop_stop loop_stop` and inspect the value of prepare_called. (A
printf statement works too, of course.)

> 3) "signal_multiple_loop"
> The tests were (sometimes) timing out.
> After changing the timeout setting in "run_tests" from
> #define TEST_TIMEOUT 5000 to
> +#define TEST_TIMEOUT 50000
> it passes.
> What is the significance of these times and how did you come up with them ?

That's the upper limit but realistically speaking,
signal_multiple_loops should finish in 0.5s or less.

ErnieOnTheRun

Jul 16, 2013, 8:06:27 AM
to li...@googlegroups.com
Thank you for the answers. Yes, I am running this in a virtual machine and will look into the (possible) performance issues that you point to. I will post my results.

Just one follow-up question for now:
When you say "That's the upper limit", is that a "theoretically determined" upper limit, or how do you calculate it? Of course, I agree that it should not be necessary to increase it.

Ben Noordhuis

Jul 16, 2013, 1:42:11 PM
to li...@googlegroups.com
The reasoning behind that timeout goes something like "It's
unreasonable for a test to take more than 2 to 2.5 seconds to complete
so let's double that to be on the safe side."
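
That arithmetic (2.5 s doubled gives 5 s, i.e. the 5000 in TEST_TIMEOUT) and the general idea of a per-test time limit can be sketched as follows. This is not the libuv runner's code; run_with_timeout, quick_test and hung_test are hypothetical names, and the 10 ms polling interval is an arbitrary choice.

```c
#define _POSIX_C_SOURCE 200809L /* expose POSIX declarations in strict -std= modes */
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* 2.5 s worst case, doubled for safety: 2 * 2500 ms = 5000 ms. */
#define TEST_TIMEOUT_MS (2 * 2500)

/* Runs fn() in a child process and waits up to timeout_ms for it to exit.
 * Returns the child's exit status, -1 on error or abnormal exit,
 * and -2 if the child had to be killed because it ran too long. */
int run_with_timeout(int (*fn)(void), int timeout_ms) {
  struct timespec tick = {0, 10 * 1000 * 1000}; /* 10 ms */
  int waited_ms = 0;
  int status;
  pid_t r;
  pid_t pid = fork();

  if (pid == -1)
    return -1;
  if (pid == 0)
    _exit(fn()); /* child: run the test body and report its result */

  for (;;) {
    r = waitpid(pid, &status, WNOHANG);
    if (r == pid)
      return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
    if (r == -1)
      return -1;
    if (waited_ms >= timeout_ms)
      break;
    nanosleep(&tick, NULL);
    waited_ms += 10;
  }

  kill(pid, SIGKILL);       /* over the limit: stop the test */
  waitpid(pid, &status, 0); /* reap the child */
  return -2;
}

/* Two stand-in test bodies for illustration. */
int quick_test(void) { return 0; }
int hung_test(void) { sleep(60); return 1; }
```

With these definitions, run_with_timeout(quick_test, TEST_TIMEOUT_MS) returns 0 almost immediately, while run_with_timeout(hung_test, 100) gives up after roughly 100 ms and returns -2. Raising the limit, as in the 50000 experiment above, only changes how long a slow test is tolerated, not why it is slow.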

ErnieOnTheRun

Aug 6, 2013, 5:28:36 AM
to li...@googlegroups.com
I did follow-up experiments, and for those I prepared a new, clean PC (i.e. no longer a virtual machine) with a clean install of CentOS 5.5 (32-bit). The hardware is totally different from that in the initial trials, so the issue should not be hardware-dependent.

I am using gcc 4.1.2 to build it (from "make test").

*****************************
Results:
The results of running run-tests are somewhat different from the Virtual Machine, but I still get failures.
Run as root, over 5 repeats, I get 2 - 4 failing unit tests.

1) `spawn_setuid_setgid` failed: exit code 6
Output from process `spawn_setuid_setgid`:
exit_cb
Assertion failed in test/test-spawn.c on line 58: exit_status == 1 

This fails every time. Debugging with gdb, I can see the function "exit_cb" is getting called with "exit_status = -1".
Any idea of the root cause ?

2) 
`loop_stop` failed: exit code 6
Output from process `loop_stop`:
Assertion failed in test/test-loop-stop.c on line 65: prepare_called == 3

This fails 2/5 times. This must be a timing issue; when debugging with gdb, I can't get it to fail, I suppose because it runs too slowly. Just running it shows that from time to time, prepare_called = 4.

3) 
`tcp_close_while_connecting` failed: exit code 6
Output from process `tcp_close_while_connecting`:
Assertion failed in test/test-tcp-close-while-connecting.c on line 41: uv_last_error(req->handle->loop).code == UV_ECANCELED

This fails 1/5 times. Debugging with gdb, I see that it occasionally reports UV_ECONNREFUSED.

**********************************************************

Questions:
1) When you say "all unit tests pass" for you, what is your testing environment on unix/ linux OS ? What gcc version do you use ? Do you know of (assume) any other "pre-conditions" (like network settings, etc) for running the unit tests ?


Thanks a lot!

Ben Noordhuis

Aug 6, 2013, 6:35:10 AM
to li...@googlegroups.com
Can you retest with the latest master? The error messages suggest
you're at least a few weeks behind (unless that's the v0.10 branch.)

> Questions:
> 1) When you say "all unit tests pass" for you, what is your testing
> environment on unix/ linux OS ? What gcc version do you use ? Do you know of
> (assume) any other "pre-conditions" (like network settings, etc) for running
> the unit tests ?

I test on a number of platforms - but primarily x86_64 Linux 3.9+ and
amd64 FreeBSD 8 and 9 - and a range of gcc releases: gcc 4.2.1 to 4.8
(4.2.1 only because that's the gcc that ships with OS X and the BSDs.)

We also have Jenkins set up[1] to test on the usual suspects: FreeBSD,
Linux, OS X, SmartOS (Solaris), Windows. There's a lot of red now but
that's because the build slaves had a spot of trouble recently.

By the way, I would suggest you upgrade your compiler. 4.1 is over
six years old and has been unsupported by upstream gcc for years now.
I don't know if Red Hat still supports it but if they do, it'll be
minimal at best.

[1] http://jenkins.nodejs.org/view/libuv/

ErnieOnTheRun

Aug 6, 2013, 9:20:33 PM
to li...@googlegroups.com
Thank you for giving those details.

Background: I am working from a large, pre-existing project. This means that, as much as I would like to, I cannot change the libuv version, gcc version or OS version.
--> Yes, you are right, I am using the v0.10 branch (actually v0.10.3). Do you have any particular comments about the characteristics of this branch ? Are the unit testing failures expected ?

>> Questions:
>> 1) When you say "all unit tests pass" for you, what is your testing
>> environment on unix/ linux OS ? What gcc version do you use ? Do you know of
>> (assume) any other "pre-conditions" (like network settings, etc) for running
>> the unit tests ?
>
> I test on a number of platforms - but primarily x86_64 Linux 3.9+ and
> amd64 FreeBSD 8 and 9 - and a range of gcc releases: gcc 4.2.1 to 4.8
> (4.2.1 only because that's the gcc that ships with OS X and the BSDs.)

--> From this I understand that all your testing is done on 64-bit platforms. This might be related to the root cause of the unit test failures. Am I missing something ? I might need to do more testing on a 64-bit OS to compare ?

> We also have Jenkins set up[1] to test on the usual suspects: FreeBSD,
> Linux, OS X, SmartOS (Solaris), Windows. There's a lot of red now but
> that's because the build slaves had a spot of trouble recently.
>
> By the way, I would suggest you upgrade your compiler. 4.1 is over
> six years old and has been unsupported by upstream gcc for years now.
> I don't know if Red Hat still supports it but if they do, it'll be
> minimal at best.

--> As mentioned, I cannot update this.

> [1] http://jenkins.nodejs.org/view/libuv/

Ben Noordhuis

Aug 7, 2013, 8:29:57 AM
to li...@googlegroups.com
You should at least consider upgrading to the latest v0.10 release
(v0.10.13 as of this writing) because you're a number of bug fixes
behind.

It should be a drop-in replacement: v0.10 is a stable branch and those
are API and ABI stable. You don't even need to recompile files that
depend on libuv, just relink.

>> > Questions:
>> > 1) When you say "all unit tests pass" for you, what is your testing
>> > environment on unix/ linux OS ? What gcc version do you use ? Do you
>> > know of
>> > (assume) any other "pre-conditions" (like network settings, etc) for
>> > running
>> > the unit tests ?
>>
>> I test on a number of platforms - but primarily x86_64 Linux 3.9+ and
>> amd64 FreeBSD 8 and 9 - and a range of gcc releases: gcc 4.2.1 to 4.8
>> (4.2.1 only because that's the gcc that ships with OS X and the BSDs.)
>>
> --> From this I understand that all your testing is done on 64-bit
> platforms. This might be related to the root cause of
> the unit test failures. Am I missing something ? I might need to do more
> testing on a 64-bit OS to compare ?

Sorry, what I mean is that I _personally_ mostly test on 64 bits
platforms. Our Jenkins setup tests on a matrix of 32 and 64 bits
platforms.

Going back to the failing tests, loop_stop and
tcp_close_while_connecting are probably timing issues.

That last one we could probably address. The test sets a 50 ms timer,
then tries to connect to 1.2.3.4. What happens when you set the
timeout to zero? You can find it in
test/test-tcp-close-while-connecting.c.

I don't know why spawn_setuid_setgid is failing for you. Try running
it in gdb, the system error will be in handle->loop->last_err.

>> We also have Jenkins set up[1] to test on the usual suspects: FreeBSD,
>> Linux, OS X, SmartOS (Solaris), Windows. There's a lot of red now but
>> that's because the build slaves had a spot of trouble recently.
>>
>> By the way, I would suggest you upgrade your compiler. 4.1 is over
>> six years old and has been unsupported by upstream gcc for years now.
>> I don't know if Red Hat still supports it but if they do, it'll be
>> minimal at best.
>
> --> As mentioned, I cannot update this.

Noted, but keep in mind that we won't go out of our way to accommodate
issues with a compiler that old. If it works for you, great. If it
doesn't, you're on your own.

ErnieOnTheRun

Aug 11, 2013, 9:51:34 PM
to li...@googlegroups.com
--> Thank you for the advice. I will discuss this. I will use this mainly on Linux (and Windows). Do you have any decisive argument (a key bug fix) to help me convince others to move from v0.10.3 to v0.10.13? I looked over the notes for the different releases but it was hard to judge how critical those fixes are. I would appreciate your comment on this.

> It should be a drop-in replacement: v0.10 is a stable branch and those
> are API and ABI stable. You don't even need to recompile files that
> depend on libuv, just relink.
>
>>> > Questions:
>>> > 1) When you say "all unit tests pass" for you, what is your testing
>>> > environment on unix/ linux OS ? What gcc version do you use ? Do you
>>> > know of
>>> > (assume) any other "pre-conditions" (like network settings, etc) for
>>> > running
>>> > the unit tests ?
>>>
>>> I test on a number of platforms - but primarily x86_64 Linux 3.9+ and
>>> amd64 FreeBSD 8 and 9 - and a range of gcc releases: gcc 4.2.1 to 4.8
>>> (4.2.1 only because that's the gcc that ships with OS X and the BSDs.)
>>>
>> --> From this I understand that all your testing is done on 64-bit
>> platforms. This might be related to the root cause of
>> the unit test failures. Am I missing something ? I might need to do more
>> testing on a 64-bit OS to compare ?
>
> Sorry, what I mean is that I _personally_ mostly test on 64 bits
> platforms. Our Jenkins setup tests on a matrix of 32 and 64 bits
> platforms.

--> Thanks for the additional info. That is good to know. So I assume the timing issues must be related to some particularities of the CentOS install and hardware I am working on (?)

> Going back to the failing tests, loop_stop and
> tcp_close_while_connecting are probably timing issues.
>
> That last one we could probably address. The test sets a 50 ms timer,
> then tries to connect to 1.2.3.4. What happens when you set the
> timeout to zero? You can find it in
> test/test-tcp-close-while-connecting.c.

--> When I ran this with the timeout as 0, it always passed. What exactly is this timeout and why does it pass with the timeout being set to 0 ?

> I don't know why spawn_setuid_setgid is failing for you. Try running
> it in gdb, the system error will be in handle->loop->last_err.

--> The system error I get is: "{code = UV_EACCES, sys_errno_ = 13}", which looks like an access (permissions ?) issue. Any idea why I would see such an error if I run as root ?

>>> We also have Jenkins set up[1] to test on the usual suspects: FreeBSD,
>>> Linux, OS X, SmartOS (Solaris), Windows. There's a lot of red now but
>>> that's because the build slaves had a spot of trouble recently.
>>>
>>> By the way, I would suggest you upgrade your compiler. 4.1 is over
>>> six years old and has been unsupported by upstream gcc for years now.
>>> I don't know if Red Hat still supports it but if they do, it'll be
>>> minimal at best.
>>
>> --> As mentioned, I cannot update this.
>
> Noted, but keep in mind that we won't go out of our way to accommodate
> issues with a compiler that old. If it works for you, great. If it
> doesn't, you're on your own.

--> Sure, that is understood. Thanks.

Ben Noordhuis

Aug 12, 2013, 1:00:35 AM
to li...@googlegroups.com
On Mon, Aug 12, 2013 at 3:51 AM, ErnieOnTheRun
<spanger...@gmail.com> wrote:
>> You should at least consider upgrading to the latest v0.10 release
>> (v0.10.13 as of this writing) because you're a number of bug fixes
>> behind.
>>
> --> Thank you for the advice. I will discuss this. I will use this mainly on
> Linux (and
> Windows). Do you have any decisive argument (key bug fix) to help me
> convince others
> to move from v.0.10.3 --> v.0.10.13 ? I looked over the notes in the
> different releases but it was hard to judge how critical those fixes are. I
> would appreciate your comment on this.

There were some fixes for older Linux* kernels in v0.10.11 that you
probably want. Said kernels report errors in an unusual way and libuv
didn't handle that correctly, resulting in a busy loop.

Another busy loop when hitting the file descriptor limit while
listening for incoming connections was fixed in v0.10.6.

* I say 'Linux' but I could only reproduce it on CentOS systems. The
delta between RHEL kernels and mainline is huge so it's possible it's
some kind of RHEL/CentOS-only regression.

>> >> I test on a number of platforms - but primarily x86_64 Linux 3.9+ and
>> >> amd64 FreeBSD 8 and 9 - and a range of gcc releases: gcc 4.2.1 to 4.8
>> >> (4.2.1 only because that's the gcc that ships with OS X and the BSDs.)
>> >>
>> > --> From this I understand that all your testing is done on 64-bit
>> > platforms. This might be related to the root cause of
>> > the unit test failures. Am I missing something ? I might need to do more
>> > testing on a 64-bit OS to compare ?
>>
>> Sorry, what I mean is that I _personally_ mostly test on 64 bits
>> platforms. Our Jenkins setup tests on a matrix of 32 and 64 bits
>> platforms.
>>
> --> Thanks for the additional info. That is good to know. So I assume the
> timing issues must be related to some particularities of the CentOS &
> Hardware I am working on..... (?)

That sounds plausible.

>> Going back to the failing tests, loop_stop and
>> tcp_close_while_connecting are probably timing issues.
>>
>> That last one we could probably address. The test sets a 50 ms timer,
>> then tries to connect to 1.2.3.4. What happens when you set the
>> timeout to zero? You can find it in
>> test/test-tcp-close-while-connecting.c.
>
> --> When I ran this with the timeout as 0, it always passed. What exactly is
> this timeout and why does it pass with the timeout being set to 0 ?

The test tries to connect to a non-routable (or at least unreachable)
address. It usually takes seconds if not minutes for a connection to
time out but if you're in an environment where an upstream router or
firewall drops the connection immediately, then the timeout is hit
sooner than the test expects. That's why the zero timeout fixes it
(for certain values of 'fix.')

If you open an issue, I'll look into fixing it properly.
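
The effect described above can be observed outside libuv with a small non-blocking connect probe. This is only a sketch, assuming an IPv4 TCP socket and POSIX poll(); probe_connect and its return convention are invented for this example and are not part of the test suite.

```c
#define _POSIX_C_SOURCE 200809L /* expose POSIX declarations in strict -std= modes */
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: starts a non-blocking connect and waits up to
 * timeout_ms for an outcome. Returns
 *    0      if the connection was established,
 *   -1      if it is still pending when the timeout expires,
 *   errno   (a positive value, e.g. ECONNREFUSED) if it failed in time. */
int probe_connect(const char *ip, unsigned short port, int timeout_ms) {
  struct sockaddr_in addr;
  struct pollfd pfd;
  socklen_t len;
  int fd, err, r;

  memset(&addr, 0, sizeof(addr));
  addr.sin_family = AF_INET;
  addr.sin_port = htons(port);
  if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1)
    return EINVAL; /* not a valid dotted-quad address */

  fd = socket(AF_INET, SOCK_STREAM, 0);
  if (fd == -1)
    return errno;
  fcntl(fd, F_SETFL, fcntl(fd, F_GETFL) | O_NONBLOCK);

  err = 0;
  if (connect(fd, (struct sockaddr *) &addr, sizeof(addr)) == -1) {
    if (errno != EINPROGRESS) {
      err = errno; /* failed immediately */
    } else {
      pfd.fd = fd;
      pfd.events = POLLOUT;
      pfd.revents = 0;
      r = poll(&pfd, 1, timeout_ms);
      if (r == 0) {
        err = -1; /* no answer yet: the connect is still in flight */
      } else if (r == -1) {
        err = errno;
      } else {
        /* The socket became writable or errored: fetch the final result. */
        len = sizeof(err);
        if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) == -1)
          err = errno;
      }
    }
  }

  close(fd);
  return err;
}
```

On a network that silently drops packets to 1.2.3.4, a call such as probe_connect("1.2.3.4", 80, 50) (port 80 is an arbitrary choice here) would be expected to return -1, which is the situation the test relies on when it closes the handle. If a router or firewall rejects the connection right away, the same call returns an errno value such as ECONNREFUSED inside the 50 ms window, matching the occasional failure reported earlier.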

>> I don't know why spawn_setuid_setgid is failing for you. Try running
>> it in gdb, the system error will be in handle->loop->last_err.
>
> --> The system error I get is: "{code = UV_EACCES, sys_errno_ = 13}", which
> looks like a
> access (permissions ?) issue. Any idea why I would see such an error if I
> run as root ?

The test changes user to 'nobody', then tries to spawn
`path/to/run-tests some_helper`. I suspect that the permissions of
run-tests prevent user nobody from executing it.
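
One way to check that suspicion is to look at the "other" permission bits of the binary and of every directory on the way to it, since user nobody needs execute permission on the file and search permission on each parent directory. A rough sketch, assuming nobody is neither the owner nor in the group of any of those paths, and ignoring ACLs and security modules; others_can_execute is a hypothetical helper, not something from libuv.

```c
#define _POSIX_C_SOURCE 200809L /* expose POSIX declarations in strict -std= modes */
#include <string.h>
#include <sys/stat.h>

/* Returns 1 if "other" has x permission on abs_path and on every directory
 * leading to it, 0 if some component lacks it, and -1 if the path is not
 * absolute, is too long, or a component cannot be stat()ed. */
int others_can_execute(const char *abs_path) {
  char buf[4096];
  struct stat st;
  size_t len = strlen(abs_path);
  char *p;

  if (abs_path[0] != '/' || len >= sizeof(buf))
    return -1;
  memcpy(buf, abs_path, len + 1);

  /* The root directory itself. */
  if (stat("/", &st) == -1)
    return -1;
  if (!(st.st_mode & S_IXOTH))
    return 0;

  /* Each directory prefix in turn: "/a", "/a/b", ... */
  for (p = strchr(buf + 1, '/'); p != NULL; p = strchr(p + 1, '/')) {
    *p = '\0';
    if (stat(buf, &st) == -1)
      return -1;
    if (!(st.st_mode & S_IXOTH))
      return 0;
    *p = '/';
  }

  /* Finally the file itself. */
  if (stat(buf, &st) == -1)
    return -1;
  return (st.st_mode & S_IXOTH) ? 1 : 0;
}
```

If the build tree lives under a directory that others cannot search, for example a home directory with mode 0700, the function returns 0 for the absolute path of run-tests even when the binary itself is world-executable. That would be consistent with the EACCES seen when the test runs as root and then switches to nobody.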