OSX: Repeated return values from time.Now().UnixNano(): expected behavior or bug?

947 views
Skip to first unread message

ma...@influxdb.com

unread,
Apr 13, 2016, 8:15:46 PM4/13/16
to golang-dev, Jack Zampolin, Michael desa
Hi golang-dev,

My team was writing some code that involved producing many nanosecond-precise timestamps, and to our surprise, we found that on OSX, we would occasionally see duplicate timestamps in calls to time.Now().UnixNano().
These weren't consecutive calls, but rather a call to .UnixNano, a call to strconv.AppendInt, and some more buffer manipulation in a loop.

Here's a pretty minimal reproduction case (you'll have to copy and run it locally): http://play.golang.org/p/xhmQ7s27KP

We weren't able to reproduce the issue on a Linux machine, but we did reproduce it on multiple Macs (both with and without -race).
We have some variation in our hardware:

* Go 1.6, OSX 10.11.4, mid 2015 MacBook Pro: constantly reproducible
* Go 1.5.3, OSX 10.11.2, mid 2015 MacBook Pro: constantly reproducible
* Go 1.6, OSX 10.10.5, mid 2015 MacBook Pro: constantly reproducible
* Go 1.6.1, OSX 10.11.3, mid 2014 MacBook Pro:  reproducible but very infrequent

Here's some sample output from a machine that regularly reproduces the issue: 

Found collision at 1460590310621970000 after 609162048 iterations last collison was 2 iterations ago.
Found collision at 1460590310621973000 after 609162051 iterations last collison was 3 iterations ago.
Found collision at 1460590310621973000 after 609162052 iterations last collison was 1 iterations ago.
Found collision at 1460590310621977000 after 609162054 iterations last collison was 2 iterations ago.
Found collision at 1460590310621979000 after 609162056 iterations last collison was 2 iterations ago.
Found collision at 1460590310621982000 after 609162059 iterations last collison was 3 iterations ago.
Found collision at 1460590310621982000 after 609162060 iterations last collison was 1 iterations ago.
Found collision at 1460590310621986000 after 609162062 iterations last collison was 2 iterations ago.
Found collision at 1460590312493887000 after 616709309 iterations last collison was 7547247 iterations ago.
Found collision at 1460590312493887000 after 616709310 iterations last collison was 1 iterations ago.
Found collision at 1460590312493887000 after 616709311 iterations last collison was 1 iterations ago.
Found collision at 1460590312493887000 after 616709312 iterations last collison was 1 iterations ago.
Found collision at 1460590312493887000 after 616709313 iterations last collison was 1 iterations ago.
Found collision at 1460590312493887000 after 616709314 iterations last collison was 1 iterations ago.
Found collision at 1460590312493887000 after 616709315 iterations last collison was 1 iterations ago.
Found collision at 1460590312493887000 after 616709316 iterations last collison was 1 iterations ago.
Found collision at 1460590312493887000 after 616709317 iterations last collison was 1 iterations ago.
Found collision at 1460590312493887000 after 616709318 iterations last collison was 1 iterations ago.
Found collision at 1460590312493887000 after 616709319 iterations last collison was 1 iterations ago.
Found collision at 1460590312493904000 after 616709321 iterations last collison was 2 iterations ago.
Found collision at 1460590312493907000 after 616709324 iterations last collison was 3 iterations ago.
Found collision at 1460590312493910000 after 616709327 iterations last collison was 3 iterations ago.
Found collision at 1460590312493912000 after 616709329 iterations last collison was 2 iterations ago.
Found collision at 1460590312493915000 after 616709332 iterations last collison was 3 iterations ago.
Found collision at 1460590312493918000 after 616709335 iterations last collison was 3 iterations ago.
Found collision at 1460590312493920000 after 616709337 iterations last collison was 2 iterations ago.
Found collision at 1460590312493922000 after 616709339 iterations last collison was 2 iterations ago.
Found collision at 1460590312493925000 after 616709342 iterations last collison was 3 iterations ago.
Found collision at 1460590312494046000 after 616709345 iterations last collison was 3 iterations ago.
Found collision at 1460590312494049000 after 616709348 iterations last collison was 3 iterations ago.
Found collision at 1460590312494052000 after 616709351 iterations last collison was 3 iterations ago.
Found collision at 1460590312494054000 after 616709353 iterations last collison was 2 iterations ago.
Found collision at 1460590312494056000 after 616709355 iterations last collison was 2 iterations ago.
Found collision at 1460590312494059000 after 616709358 iterations last collison was 3 iterations ago.
Found collision at 1460590312494062000 after 616709361 iterations last collison was 3 iterations ago.
Found collision at 1460590312494064000 after 616709363 iterations last collison was 2 iterations ago.
Found collision at 1460590312494067000 after 616709366 iterations last collison was 3 iterations ago.
Found collision at 1460590312494067000 after 616709367 iterations last collison was 1 iterations ago.
Found collision at 1460590312494161000 after 616709370 iterations last collison was 3 iterations ago.
Found collision at 1460590312494164000 after 616709373 iterations last collison was 3 iterations ago.
Found collision at 1460590312494167000 after 616709376 iterations last collison was 3 iterations ago.
Found collision at 1460590312494170000 after 616709379 iterations last collison was 3 iterations ago.
Found collision at 1460590312494172000 after 616709381 iterations last collison was 2 iterations ago.
Found collision at 1460590312494175000 after 616709384 iterations last collison was 3 iterations ago.
Found collision at 1460590312494175000 after 616709385 iterations last collison was 1 iterations ago.
Found collision at 1460590312494179000 after 616709387 iterations last collison was 2 iterations ago.
Found collision at 1460590312494181000 after 616709389 iterations last collison was 2 iterations ago.
Found collision at 1460590312494184000 after 616709392 iterations last collison was 3 iterations ago.
Found collision at 1460590312494254000 after 616709395 iterations last collison was 3 iterations ago.
Found collision at 1460590312494257000 after 616709398 iterations last collison was 3 iterations ago.
Found collision at 1460590312494257000 after 616709399 iterations last collison was 1 iterations ago.

Sample output from the machine that infrequently reproduces it:

Found collision at 1460590290165280000 after 810340214 iterations last collison was 810340214 iterations ago.
Found collision at 1460590510467578000 after 1593300312 iterations last collison was 782960098 iterations ago.
Found collision at 1460590513370454000 after 1603193249 iterations last collison was 9892937 iterations ago.
Found collision at 1460590827990295000 after 2734688158 iterations last collison was 1131494909 iterations ago.
Found collision at 1460591003566309000 after 3368962405 iterations last collison was 634274247 iterations ago.
Found collision at 1460591263005088000 after 4301052067 iterations last collison was 932089662 iterations ago.
Found collision at 1460591357143260000 after 4636762475 iterations last collison was 335710408 iterations ago.

I noticed that the collisions occur when the call to UnixNano seems to return microsecond-precise numbers (i.e. the final 3 digits are all zero).
Looking at the Darwin assembly at https://golang.org/src/runtime/sys_darwin_amd64.s?h=nanotime, I wasn't able to determine if this is expected OSX/Darwin behavior or if this is an issue with Go.
I did try various Google searches with the words Darwin, OSX, nanoseconds, microseconds, golang, etc., but I wasn't able to find any information online that seemed relevant.

Is this expected behavior with OSX?
If so, what steps should I have taken to determine that for myself?

Thanks,
Mark

Andrew Gerrand

unread,
Apr 13, 2016, 8:21:30 PM4/13/16
to ma...@influxdb.com, golang-dev, Jack Zampolin, Michael desa
Doing a search for "darwin gettimeofday" reveals the man page for darwin's libc gettimeofday


which seems to indicate the kernel only provides microsecond-accurate time of day.

Andrew 

--
You received this message because you are subscribed to the Google Groups "golang-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-dev+...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Dave Cheney

unread,
Apr 13, 2016, 8:23:29 PM4/13/16
to Andrew Gerrand, ma...@influxdb.com, golang-dev, Jack Zampolin, Michael desa
Can you try this simpler example.

http://play.golang.org/p/5sJgUMiiiD

As Andrew noted, I don't think gettimeofday gives you the precision
you need. You probably want to use a monotonic clock if you have a
requirement that the value never repeats or rolls backwars.

Dave Cheney

unread,
Apr 13, 2016, 8:27:36 PM4/13/16
to Anmol Sethi, Andrew Gerrand, ma...@influxdb.com, golang-dev, Jack Zampolin, Michael desa
That's not unexpected. I don't think time.Now gives any guarantee
apart from "This is what the time is right now, to some precision"

On Thu, Apr 14, 2016 at 10:25 AM, Anmol Sethi <an...@aubble.com> wrote:
> I tested your example on OSX 10.11.5 on a mid 2015 macbook pro and I still get the collisions.

Anmol Sethi

unread,
Apr 13, 2016, 9:08:06 PM4/13/16
to Dave Cheney, Andrew Gerrand, ma...@influxdb.com, golang-dev, Jack Zampolin, Michael desa
I tested your example on OSX 10.11.5 on a mid 2015 macbook pro and I still get the collisions.

> On Apr 13, 2016, at 8:23 PM, Dave Cheney <da...@cheney.net> wrote:
>

andrey mirtchovski

unread,
Apr 13, 2016, 9:25:28 PM4/13/16
to Dave Cheney, Anmol Sethi, Andrew Gerrand, ma...@influxdb.com, golang-dev, Jack Zampolin, Michael desa
If you're looking for a monotonically increasing clock on osx (the
equivalent of linux' CLOCK_MONOTONIC) you should look for
clock_get_time with the argument SYSTEM_CLOCK.

cf:
http://stackoverflow.com/a/11681069

andrey mirtchovski

unread,
Apr 13, 2016, 9:45:48 PM4/13/16
to Dave Cheney, Anmol Sethi, Andrew Gerrand, ma...@influxdb.com, golang-dev, Jack Zampolin, Michael desa
to follow up, using the code below shows no collisions (while Dave's
original did show them), however a simple benchmark of the function
gives 2000+ ns per function call[1]. i don't know if this is a
consequence of the kernel's attempts of ensuring monotonicity or an
artifact of cgo. i have not tried to measure how expensive the call is
in C.

http://play.golang.org/p/ToDcqEmGqx

--
1: BenchmarkTime-4 500000 2551 ns/op

ma...@influxdb.com

unread,
Apr 13, 2016, 10:42:21 PM4/13/16
to golang-dev, ma...@influxdb.com, ja...@influxdb.com, mic...@influxdb.com
The gettimeofday docs do specify microsecond precision, but the kernel provides nanotime as well: https://developer.apple.com/library/mac/documentation/Darwin/Conceptual/KernelProgramming/services/services.html

AFAICT from https://golang.org/src/runtime/sys_darwin_amd64.s?h=nanotime, the runtime's implementation of now() calls the Go assembly implementation of nanotime(), which looks very similar to Apple's assembly nanotime implementation from https://opensource.apple.com/source/Libc/Libc-825.40.1/x86_64/sys/nanotime.s.

https://play.golang.org/p/S7tH8vGZd9 confirms that the majority of the timestamps generated on OSX are precise to the nanosecond (i.e. the collisions are not a result of the timestamps only being precise to the microsecond):

0 1460600769182366366
1 1460600769182371004
2 1460600769182371036
3 1460600769182371052
4 1460600769182371064
5 1460600769182371079

Running Dave's simplified example, with a small tweak to print out the result of .UnixNano(), shows the same behavior as the original, non-simple example (collisions only when the last 3 digits are zero):

collision at 1460600325432075000
collision at 1460600350328895000
collision at 1460600358888459000
collision at 1460600365100560000
collision at 1460600368499674000
collision at 1460600368502876000
collision at 1460600368506996000

I think the next step would be to try and reimplement Dave's example in C on OSX, to rule out the Go runtime and see if the collisions still occur.
I probably won't have time to dig into that until maybe next week, possibly this weekend.
If anyone else wants to investigate more before then, I'm still very interested in learning more about this behavior.

Mark

andrey mirtchovski

unread,
Apr 13, 2016, 10:49:08 PM4/13/16
to Dave Cheney, Anmol Sethi, Andrew Gerrand, Mark Rushakoff, golang-dev, Jack Zampolin, Michael desa
there's also this: "Technical Q&A QA1398: Mach Absolute Time Units"


https://developer.apple.com/library/mac/qa/qa1398/_index.html

Russ Cox

unread,
Apr 14, 2016, 12:36:41 AM4/14/16
to ma...@influxdb.com, golang-dev, Jack Zampolin, Michael desa
In general, time.Now makes no guarantee of returning nanosecond resolution. If you need a unique number in your process, it is much better to use a global counter fetched with atomic.AddInt64(&counter, +1) than to use the current nanosecond time.

On OS X, the kernel usually offers the formula for computing nanosecond resolution time to user space, in which case time.Now will use it. But not always. The kernel can rescind the formula, such as when adjtime has been called with a small negative delta, which actually stops the time-of-day clock for that duration, to preserve monotonicity. In typical operation, ntpd calls adjtime once per second to fool with the system clock. At least on my system, ntpd seems to believe the clock is always running ahead, so that it's always asking for tiny negative adjustments, which means the time-of-day clock is stopping like this for a small amount of time every second.

You can snoop on ntpd by running 'csrutil disable' from a recovery mode boot and then restarting into regular OS X and running

dtrace -n 'syscall::adjtime:entry {
    p=copyin(arg0, 16);
    t=timestamp;
    printf("%s %d.%03d %d %d", execname, t/1000000000, (t/1000000)%1000, ((int64_t*)p)[0], ((int32_t*)p)[2]);
}'

I get output like:

CPU     ID                    FUNCTION:NAME
  0    430                    adjtime:entry ntpd 4034.056 0 -9
  0    430                    adjtime:entry ntpd 4035.121 0 -9
  0    430                    adjtime:entry ntpd 4036.136 0 -8
  0    430                    adjtime:entry ntpd 4037.192 0 -9
  0    430                    adjtime:entry ntpd 4038.248 0 -9
  0    430                    adjtime:entry ntpd 4039.319 0 -9
  0    430                    adjtime:entry ntpd 4040.393 0 -8
  0    430                    adjtime:entry ntpd 4041.415 0 -9
  0    430                    adjtime:entry ntpd 4042.477 0 -9
  0    430                    adjtime:entry ntpd 4043.548 0 -8
  0    430                    adjtime:entry ntpd 4044.617 0 -9
  0    430                    adjtime:entry ntpd 4045.673 0 -9
  0    430                    adjtime:entry ntpd 4046.696 0 -8
  0    430                    adjtime:entry ntpd 4047.719 0 -9
  0    430                    adjtime:entry ntpd 4048.742 0 -9
  0    430                    adjtime:entry ntpd 4049.803 0 -8
  0    430                    adjtime:entry ntpd 4050.821 0 -9
  0    430                    adjtime:entry ntpd 4051.843 0 -9

Basically once per second ntpd is waking up and telling the kernel to turn back the clock by (well, stop it for) a few microseconds. The exact amount wobbles up and down maybe 10s of microseconds over time on my system but is always negative.

If you run 'systemsetup -setusingnetworktime off', that will kill ntpd and the adjustments and the time.Now jitter will go away. (Of course, your clock may slowly drift as well.) If you fix the time and then turn ntpd back on, it stays OK for a while but then goes back to its endless negative fixup.

That is, this makes the problem go away for at least a few minutes:

systemsetup -setusingnetworktime off
echo 0 >/var/db/ntp.drift
ntpdate -b -u $(systemsetup -getnetworktimeserver | awk '{print $4}')
systemsetup -setusingnetworktime on

But the tiny negative adjustments always comes back, at least on my system. I wonder if the clock really is just running too fast and ntpd is by heroic effort helping it stay where it should be or if ntpd is just buggy and is obsessive compulsively making adjustments when none are really needed.

Short answer: no, I didn't expect this, but I also don't see much that can be done about it.

Russ

Nick Craig-Wood

unread,
Apr 18, 2016, 8:35:50 AM4/18/16
to Russ Cox, ma...@influxdb.com, golang-dev, Jack Zampolin, Michael desa
On 14/04/16 05:36, Russ Cox wrote:
> But the tiny negative adjustments always comes back, at least on my
> system. I wonder if the clock really is just running too fast and ntpd
> is by heroic effort helping it stay where it should be or if ntpd is
> just buggy and is obsessive compulsively making adjustments when none
> are really needed.

9 uS per second is a drift of 9 PPM.

A typical quartz crystal will have an accuracy of around +/- 20 PPM so
I'd say this is well within the range of the expected accuracy.

So gold star for ntpd for its heroic efforts ;-)

--
Nick Craig-Wood <ni...@craig-wood.com> -- http://www.craig-wood.com/nick
Reply all
Reply to author
Forward
0 new messages