GNU time is a simple program and for the most part simply reports
information provided by the kernel about the child process. It is
hard to believe that it is showing incorrect information. But
OpenSuSE may have patches which modify or extend the functionality and
so it might be somewhat different from the upstream GNU version.
Can you reproduce the problem? Can you run time under strace to show
what values were returned from the system call?
$ strace -v -o time.strace.out /usr/bin/time -v SOMEPROGRAM
That should print to the time.strace.out file the values of the rusage
structure returned from the wait4(2) system call. That is the
information that GNU time prints out. That information is direct from
the Linux kernel. But hopefully the strace will show something
interesting.
Thanks,
Bob
Thu, 30 Sep 2010 23:40:47 -0600, bob wrote:
...
> $ strace -v -o time.strace.out /usr/bin/time -v SOMEPROGRAM
...
> That should print to the time.strace.out file the values of the rusage
> structure returned from the wait4(2) system call. That is the
> information that GNU time prints out. That information is direct from
> the Linux kernel.
Here is the standard output from the line you suggested:
User time (seconds): 1203.23
System time (seconds): 0.62
Percent of CPU this job got: 99%
Elapsed (wall clock) time (h:mm:ss or m:ss): 20:04.45
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 2004096
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 3
Minor (reclaiming a frame) page faults: 125482
Voluntary context switches: 200
Involuntary context switches: 102319
Swaps: 0
File system inputs: 287616
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
Note the resident set size (RSS) of 2004096,
top and htop show only around 500000.
And the wait4 result "ru_maxrss" differs as expected:
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, {ru_utime={1203, 239197}, ru_stime={0, 620038}, ru_maxrss=501024, ru_ixrss=0, ru_idrss=0, ru_isrss=0, ru_minflt=125482, ru_majflt=3, ru_nswap=0, ru_inblock=287616, ru_oublock=0, ru_msgsnd=0, ru_msgrcv=0, ru_nsignals=0, ru_nvcsw=200, ru_nivcsw=102319}) = 9953
Who multiplies the correct "ru_maxrss" value by 4?
Some more details about my machine:
> /usr/bin/time -V
GNU time 1.7
> uname -a
Linux ... 2.6.34.7-0.3-default #1 SMP 2010-09-20 15:27:38 +0200 x86_64 x86_64 x86_64 GNU/Linux
> rpm -q -f /usr/bin/time
util-linux-2.17.2-5.3.x86_64
> cat /etc/SuSE-release
openSUSE 11.3 (x86_64)
VERSION = 11.3
Greetings
Sven
Thank you for your report. You have indeed found a bug in GNU time.
GNU time is expecting the ru_maxrss value to indicate in pages and is
converting that value from number of pages to kbytes. That is
incorrect and results in the bug you found.
> And the wait4 result "ru_maxrss" differs as expected:
> wait4(... , ru_maxrss=501024, ...
>
> Who multiplies the correct "ru_maxrss" value by 4?
GNU time does, which is incorrect.
The rusage structure is documented in getrusage(2) and says:
long ru_maxrss; /* maximum resident set size */
While other entries are documented as pages that entry is without
units attached to the comment. That is undoubtedly the source of the
error. Pages were assumed when it was really kbytes. Looking at the
libc documentation we see that it is not ambiguous:
$ info libc 'Resource Usage'
'long int ru_maxrss'
The maximum resident set size used, in kilobytes. That is,
the maximum number of kilobytes of physical memory that
PROCESSES used simultaneously.
And other older BSD documentation concurs. The value is reported in
kbytes.
This trivial patch corrects the problem. I will see about getting GNU
time updated.
diff -ru time-1.7.orig/ChangeLog time-1.7/ChangeLog
--- time-1.7.orig/ChangeLog 1996-07-11 10:37:20.000000000 -0600
+++ time-1.7/ChangeLog 2010-10-01 16:03:09.000000000 -0600
@@ -1,3 +1,10 @@
+2010-10-01 Bob Proulx <b...@proulx.com>
+
+ The struct rusage reports ru_maxrss in kbytes not pages and
+ should not be converted through pages-to-kbytes again.
+ Reported by Sven Hartrumpf.
+ * time.c (summarize): Do not call ptok on ru_maxrss.
+
diff -ru time-1.7.orig/time.c time-1.7/time.c
--- time-1.7.orig/time.c 1996-06-13 13:38:21.000000000 -0600
+++ time-1.7/time.c 2010-10-01 15:49:11.000000000 -0600
@@ -392,7 +392,7 @@
ptok ((UL) resp->ru.ru_ixrss) / MSEC_TO_TICKS (v));
break;
case 'M': /* Maximum resident set size. */
- fprintf (fp, "%lu", ptok ((UL) resp->ru.ru_maxrss));
+ fprintf (fp, "%lu", (UL) resp->ru.ru_maxrss);
break;
case 'O': /* Outputs. */
fprintf (fp, "%ld", resp->ru.ru_oublock);
Thanks again for reporting this problem and for helping to improve
GNU!
Bob
I wrote:
> rpm -q -f /usr/bin/time
> util-linux-2.17.2-5.3.x86_64
Just curious: how will your fix reach util-linux because
this is what distributions use (not GNU time directly, it seems)?
Ciao
Sven
Yes, well, we will see! But I am very hopeful. :-)
> I wrote:
> > rpm -q -f /usr/bin/time
> > util-linux-2.17.2-5.3.x86_64
>
> Just curious: how will your fix reach util-linux because
> this is what distributions use (not GNU time directly, it seems)?
It should. Since util-linux on SuSE uses GNU time as one of the
components then I expect that it would eventually flow into it. Since
time is such a small program I can understand why it would be bundled
into a larger collection of other small programs. Note however that
other distributions such as Debian and Red Hat package time into its
own package. The way that it is packaged is not universal.
Bob