On Tue, Apr 21, 2015 at 1:01 PM, Andrius Bentkus <
toxed...@gmail.com> wrote:
>> Nice!
>>
>> When drafting the uv_timeout API someone (Ben IIRC) told me that we
>> could make the timeout a double, and thus support sub-millisecond
>> precision, where possible. AFAIK we could have sub-millisecond precision
>> if we use a timerfd instead of passing the timeout to epoll_wait/pwait.
>> We'd arm a single timerfd per loop, with the value of the closest timer,
>> and add it to the epoll set, and then we'd call epoll_wait/pwait with a
>> NULL timeout.
>
>
> This is indeed possible and probably the best way! I thought
> implementing a timerfd for every timer would be too much, since file
> descriptors are a precious resource (many systems cap them at
> small values). However, this would mean only one additional fd,
> which makes it a reasonable trade-off for real-time precision.
>
> A downside is that we'd have to make an additional system call on
> every event loop iteration to re-arm the timerfd after calculating
> when the next timer fires.
That doesn't have to be the case. Libuv would only use the timerfd
when sub-millisecond precision is needed, and only in one-shot mode.
A potential drawback is that libuv, upon returning from epoll_wait(),
may have to scan the events list for the timerfd first. If a number
of normal callbacks run first, a high-resolution timer isn't much
good.
In most cases when the timerfd expires, the list will only contain the
timerfd and nothing else, so it probably won't be too bad.
Perhaps it's an idea to batch events first instead of dispatching
callbacks straight from uv__io_poll(). That would allow for more
optimizations than just avoiding the scan; for example, it would make
it possible to coalesce a read and write event for the same file
descriptor into a single callback.
> That could be mitigated in the future, though, once that patch hits
> mainline Linux.
>
>>
>>
>> I'm not sure if there are limitations I'm not aware of here, but that
>> could be a start. The new epoll API could be supported as well, but
>> before that the user facing APIs need to support specifying
>> sub-millisecond precision.
>
>
> I actually wonder now why the timerfd was not used before.
> I mean, we obviously have suboptimal behaviour now,
> since my machine can handle 1k loop iterations per second
> but I only get 920.
I can answer that question: because it was more work and < 1 ms
precision wasn't needed at the time. :-)