RE: POE and the time

Philip Gwyn

Dec 14, 2012, 8:24:52 PM

to p...@perl.org

The previous patch does not solve the problem properly. I've been thinking
this over for a week now. I've collected my thoughts on the wiki, in my usual
rambling style of writing.

If anyone has any ideas or thoughts about POE and timing, please read and
comment here. In particular, I'd like maintainers of POE::Loop to comment on
the specifics of their implementation.

http://poe.perl.org/?POE_Documentation/Problem_with_the_clock

-Philip

Chris Fedde

Dec 15, 2012, 12:52:20 AM

to Philip Gwyn, POE Mailing List

A few comments about the wiki notes.

First, ntpd is careful about ensuring that time always moves forward. It
does this by adjusting the clock slew rather than jumping time forwards or
backwards. When ntpd first starts up and the system clock is not set
within 5 seconds relative to its ntp servers then ntpd will die with a log
message rather than jump the time. Within about an hour of starting ntpd
can have the system synchronized within a few milliseconds of its best
server.

Some admins like to run ntpdate out of cron and force a time reset at
regular intervals. It's never a good idea to force time jumps. It can cause
all kinds of trouble for databases, NFS, cryptography, process schedulers,
and even event systems written in Perl.

Most systems have a separate hardware clock that is pretty accurate. OS
time is typically synchronized with it at boot and then runs independently
of it. Some admins like to sync with hwclock out of cron. This is also a
bad idea. On "wintel" type systems the hardware clock typically reports
only down to seconds. This is much too coarse for modern timekeeping.

Virtual machines add another whole layer of complexity to the picture. A
VM will frequently see time jumps between the time slices in which it gets
to run. Things that require real-time behavior probably should not be run
in a VM. VM guests have to trust their host for an accurate representation
of time, and every issue described above applies to the host as well as to
the guest.

There are very few good reasons for administrative clock adjustments.
These days there are no good reasons based on simple calendrical causes.
Soon even the dreaded leap second will be dealt with through math and
lookup tables rather than clock adjustments. Nearly all other reasons to
adjust the clock are political and organizational rather than technical.

If we still want to deal with time shifts we have to detect them. When time
jumps occur it is hard for a program to detect them. The only way that the
program would know that time has changed is if it has two sources to
compare. If such an event has been detected then it has to decide if there
was a good administrative reason for the change or not.

If the program decides to take action because it has detected a time change
then it must have an array of logic to decide what special behaviors to
apply for all of the various ways that the change could have occurred.
Should it fire missed events? Should it recompute the offset for future
events? What if the scale of the shift is large? What if the scale of the
shift is tiny? What if time shifts occur frequently? What if they are
noisy, crossing and recrossing the same point in time? We quickly get into
pathological behaviors that either need to be accounted for or ignored.

Most developers who have cracked open this nut have eventually decided to
leave well enough alone: their system trusts that the OS knows what time
it is and that the admin had a "good reason" for mucking with the system
time, and leaves it at that. Trying to correct for bad admin behavior is
better done at the social layer than at the technical one.

Just my thoughts.
chris

Philip Gwyn

Dec 17, 2012, 12:39:34 PM

to Chris Fedde, POE Mailing List

Your post is an excellent counterpoint. I agree in theory with all your
comments. But in practice, things are broken. POE should not depend on a
stable, monotonic time(2) because there is zero guarantee of such.

On 15-Dec-2012 Chris Fedde wrote:

> The only way that the
> program would know that time has changed is if it has two sources to
> compare.

Modern systems do have two sources to compare. See clock_gettime(2),
CLOCK_REALTIME vs CLOCK_MONOTONIC (for Linux). Win32 has SetTimer et al.
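
As a rough illustration of comparing the two sources, here is a minimal
sketch in Python (chosen only for brevity; this is not POE code). The class
name SkewDetector, its check() method, and the 0.5 second tolerance are all
hypothetical. It assumes the wall clock and the monotonic clock should
advance by the same amount between two checks, and reports the difference
when they do not.

```python
import time


class SkewDetector:
    """Hypothetical helper: compare wall-clock progress with monotonic progress."""

    def __init__(self, tolerance=0.5, wall=time.time, mono=time.monotonic):
        # tolerance is an assumed threshold in seconds, not a recommended value.
        self.tolerance = tolerance
        self._wall = wall
        self._mono = mono
        self._last_wall = wall()
        self._last_mono = mono()

    def check(self):
        """Return the wall-clock jump in seconds since the previous check.

        A negative value means the wall clock moved backwards relative to
        the monotonic clock. Returns 0.0 when the difference is within
        tolerance.
        """
        now_wall = self._wall()
        now_mono = self._mono()
        skew = (now_wall - self._last_wall) - (now_mono - self._last_mono)
        # Reset the baseline so each call reports only the change since the last one.
        self._last_wall, self._last_mono = now_wall, now_mono
        return skew if abs(skew) > self.tolerance else 0.0
```

An event loop could call check() once per iteration and hand any non-zero
result to whatever code wants to react. Because the baseline is reset on
every call, slow slewing below the tolerance is silently absorbed rather
than accumulated.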

> If the program decides to take action because it has detected a time change
> then it must have an array of logic to decide what special behaviors to
> apply for all of the various ways that the change could have occurred.

This is why I propose the 'clock-skew' pseudo-signal.

> Trying to correct for bad admin behavior is better done at the social layer
> than at the technical one.

POE does not currently act usefully during clock skew. If time jumps back, then
the queue blocks until time(2) gets back to the previous value. This precludes
any events being delivered unless there is new FD activity, and user code can
not react usefully. Saying "but that shouldn't happen" is not very useful. I
agree it shouldn't happen, but it does happen, and POE must be able to handle
the situation gracefully.
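
To make the failure mode concrete, here is a small sketch (again in Python,
and not how POE's queue is actually implemented) of a delay queue whose due
times are taken from a monotonic clock. The names MonotonicQueue, delay()
and pop_due() are hypothetical. Since the due times never consult the wall
clock, setting the system time backwards cannot stall relative delays.

```python
import heapq
import itertools
import time


class MonotonicQueue:
    """Hypothetical delay queue keyed on monotonic time instead of time(2)."""

    def __init__(self, mono=time.monotonic):
        self._mono = mono
        self._heap = []
        # Tie-breaker so events with equal due times come out in insertion order.
        self._seq = itertools.count()

    def delay(self, seconds, event):
        """Schedule event to become due `seconds` from now."""
        due = self._mono() + seconds
        heapq.heappush(self._heap, (due, next(self._seq), event))

    def pop_due(self):
        """Remove and return every event whose due time has passed."""
        now = self._mono()
        ready = []
        while self._heap and self._heap[0][0] <= now:
            ready.append(heapq.heappop(self._heap)[2])
        return ready
```

This only covers relative delays. Alarms set for an absolute wall-clock time
would still have to be recomputed when a skew is detected, which is the kind
of decision a skew notification would leave to user code.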

-Philip

Jeremy Leader

Dec 17, 2012, 2:55:48 PM

to Chris Fedde, Philip Gwyn, POE Mailing List

This discussion reminded me of:

http://infiniteundo.com/post/25326999628/falsehoods-programmers-believe-about-time

--
Jeremy Leader
jle...@oversee.net