How do the BIG guys (Google, FB, etc.) architect DNS and NTP?


rexon gaming

Oct 13, 2020, 9:13:52 AM
Hi Folks,

I am looking for reference architectures that the big players (Google, FB) use for DNS and NTP inside their data centers.

If you have any, please share.

Terje Mathisen

Oct 14, 2020, 2:32:44 AM
Google decided a few years ago to skip all leap seconds by implementing
their own leap smear hack to a DMZ ntp server that bridges the gap
between normal NTP servers and the internal "la, la, la, la, la, I can't
hear you!" world which tries hard to avoid having to care about steps in
UTC seconds counting.

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

CRasch Net

Oct 20, 2020, 9:20:21 PM
Facebook is now using chrony; you can read about their implementation:

Building a more accurate time service at Facebook scale
https://engineering.fb.com/production-engineering/ntp-service/

William Unruh

Oct 20, 2020, 11:47:10 PM
Interesting. While I agree that chrony is more precise, I think that
their results for ntpd are worse than they should be. ntpd can
certainly do better than 1 ms scatter/accuracy, and chrony can do better
than 100 us; there is something weird about their network paths. About 10
years ago I ran a number of tests of chrony vs. ntpd and got about a factor
of 3-10 better, not 100. Interrupt latency/clock reading for chrony gave
about 1 us fluctuations.

I find this whole thing about leap second smoothing to be a real farce.
Just let the step occur instead of delivering the wrong time for hours.
Or, if you want, run your clocks on TAI, not UTC, and make the leap-second
conversion in the interpretation, as is done for time zones. Would anyone
advise leap-day smoothing every 4 years so that we do not have trouble
with our calendars?
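
To make the "handle it like time zones" idea concrete, here is a minimal sketch, assuming the system clock counts TAI and the leap-second offset is applied only when a timestamp is interpreted. The names `LEAP_TABLE` and `tai_to_utc` are hypothetical, and the table holds only the three most recent TAI-UTC offsets (35, 36 and 37 s); a real table would go back to 1972.

```python
from datetime import datetime, timedelta

# (UTC instant from which the offset applies, TAI - UTC in seconds).
# Hypothetical, truncated table: only the three most recent entries.
LEAP_TABLE = [
    (datetime(2012, 7, 1), 35),
    (datetime(2015, 7, 1), 36),
    (datetime(2017, 1, 1), 37),
]

def tai_to_utc(tai):
    """Convert a naive TAI datetime to a naive UTC datetime."""
    # Walk from the newest entry down; the first entry whose start,
    # expressed on the TAI scale, is not after `tai` supplies the offset.
    for start_utc, offset in reversed(LEAP_TABLE):
        if tai >= start_utc + timedelta(seconds=offset):
            return tai - timedelta(seconds=offset)
    raise ValueError("timestamp predates the leap-second table")

# Caveat: datetime cannot represent 23:59:60, so the inserted leap second
# itself is reported as 00:00:00 of the following day in this sketch.
print(tai_to_utc(datetime(2017, 1, 1, 0, 0, 37)))
```

The clock itself never steps here; only the lookup table changes when a new leap second is announced, much like a tzdata update.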

Miroslav Lichvar

Oct 21, 2020, 3:34:57 AM
On 2020-10-21, William Unruh <un...@invalid.ca> wrote:
> On 2020-10-21, CRasch Net <cra...@crasch.net> wrote:
>> Facebook is now using Chrony, you can read about their implementation:
>>
>> Building a more accurate time service at Facebook scale
>> https://engineering.fb.com/production-engineering/ntp-service/
>
> Interesting. While I agree that chrony is more precise, I think that
> their results for ntpd are worse than they should be. ntpd can
> certainly do better than 1 ms scatter/accuracy, and chrony can do better
> than 100 us; there is something weird about their network paths. About 10
> years ago I ran a number of tests of chrony vs. ntpd and got about a factor
> of 3-10 better, not 100. Interrupt latency/clock reading for chrony gave
> about 1 us fluctuations.

It's not clear how ntpd and chrony were configured in their tests. The
ntpq/chronyc outputs show a poll of 6, which is too long for a highly
stable synchronization in a local network. If they were using the
default minpoll 6 and maxpoll 10, a factor of 100 would not surprise me.
ntpd doesn't adjust its polling very well when it has stable
measurements, so it would effectively be comparing ntpd polling at 10 vs.
chrony polling at 6.
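
For readers not used to poll exponents: the poll value is the base-2 logarithm of the interval in seconds. A quick sketch of the arithmetic (the function name `poll_interval_seconds` is made up for illustration):

```python
def poll_interval_seconds(poll_exponent):
    """Return the NTP polling interval in seconds for a given poll exponent."""
    return 2 ** poll_exponent

short = poll_interval_seconds(6)    # poll 6
long_ = poll_interval_seconds(10)   # poll 10
print(short, long_, long_ // short)  # 64 1024 16
```

So a client sitting at poll 10 takes one measurement for every 16 taken by a client at poll 6.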

> I find this whole thing about leap second smoothing to be a real farce.
> Just let the step occur instead of delivering the wrong time for hours.
> Or, if you want, run your clocks on TAI, not UTC, and make the leap-second
> conversion in the interpretation, as is done for time zones. Would anyone
> advise leap-day smoothing every 4 years so that we do not have trouble
> with our calendars?

Well, yes. The trouble is that there are applications that break on
backward step, they need synchronized clocks, and not all NTP clients
can be configured to make a consistent slew on the leap second. So, the
easiest way to fix this is to make a slew on the server and hide the
leap second from the clients. When you internally do this everywhere and
you want to provide a public NTP service, it's easier to just serve your
internal leap-smeared time.
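
As a rough illustration of what a server-side slew means numerically, here is a minimal sketch of a linear smear. The 24-hour window is an assumption picked for the example, not something stated in this thread, and `smear_fraction` and `smear_rate_ppm` are hypothetical helper names.

```python
def smear_fraction(elapsed_s, window_s=86400.0):
    """Fraction of the leap second absorbed, elapsed_s seconds into the window.

    Clamped to 0.0 before the window starts and 1.0 after it ends.
    """
    return min(max(elapsed_s / window_s, 0.0), 1.0)

def smear_rate_ppm(window_s=86400.0):
    """Frequency offset (in ppm) needed to absorb 1 s linearly over window_s."""
    return 1.0 / window_s * 1e6

print(smear_fraction(43200.0))      # halfway through a 24 h window: 0.5
print(round(smear_rate_ppm(), 3))   # about 11.574 ppm
```

Clients see only a small, constant frequency offset during the window and never a backward step, at the cost of being up to half a second away from true UTC in the middle of it.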

--
Miroslav Lichvar

Jakob Bohm

Oct 21, 2020, 3:55:16 AM
Originally, that is what they did: they smeared the leap day over 2
solar days.

For a smeared timescale, perhaps something could be created that
distributes UT instead of UTC, it would need an input file with the
same data used to transmit UT-UTC in the standard radio protocol
(see the ITU-R standard for UTC). Plus some kind of NTP protocol
change to let clients know that the data incorporates the astronomical
wobble and may need polling at the right time to pick up the wobble.




Enjoy

Jakob
--
Jakob Bohm, CIO, Partner, WiseMo A/S. https://www.wisemo.com
Transformervej 29, 2860 Søborg, Denmark. Direct +45 31 13 16 10
This public discussion message is non-binding and may contain errors.
WiseMo - Remote Service Management for PCs, Phones and Embedded

William Unruh

Oct 21, 2020, 11:59:53 AM
Easier. It is probably even easier to forget about ntp and just free run
your server. "Easy" is not the purpose of serving ntp time.