Performance enhancement for gettimeofday()?

Gordon.S...@seagate.com

Jan 4, 2007, 7:16:32 PM

to dj...@delorie.com

This might seem like a strange place to be focusing on for performance...
but it turns out that the GNU Pth library uses gettimeofday() from inside
the "put a thread to sleep for x time" functions and that is something
which can get called from performance-critical code (and profiling tells
me is being done in the application I'm working on).

It was spending much of its time in the __dpmi_int function (no surprise),
and I figured out that better than 90% of those calls were coming from
gettimeofday(). So I started pondering whether there was a way to fix
the library routine to not have to make a real-mode transition.

Most of the data returned by the DOS int 21/2C and 21/2A functions can be
obtained directly from the CMOS, with nothing more than port I/O: that
will give us everything except the "hundredths" of a second field. For
that field, what about installing an interrupt handler to simply increment
a counter every timer tick, and reset it to zero when the counter reaches
91 (signifying five seconds have elapsed)?
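
A minimal sketch of the CMOS idea, assuming the usual PC real-time-clock register layout: seconds, minutes and hours at registers 0x00, 0x02 and 0x04, stored as packed BCD unless bit 2 of status register B (0x0B) is set. The register read is passed in as a function pointer so the decoding can be tried anywhere; under DJGPP it would presumably wrap outportb(0x70, reg) followed by inportb(0x71) from <pc.h>. The names rtc_read_fn, rtc_time, rtc_read_time and fake_cmos are hypothetical, not part of any library.

```c
/* Hypothetical callback: returns the byte stored in CMOS register `reg`.
   On DJGPP this would presumably be outportb(0x70, reg); inportb(0x71). */
typedef unsigned char (*rtc_read_fn)(unsigned char reg);

struct rtc_time { int hour, min, sec; };

/* Convert one packed-BCD byte (e.g. 0x59) to its binary value (59). */
static int bcd_to_bin(unsigned char v)
{
    return (v >> 4) * 10 + (v & 0x0F);
}

/* Read hours, minutes and seconds. Values are decoded from BCD unless
   status register B (0x0B) has bit 2 set, meaning plain binary storage. */
static struct rtc_time rtc_read_time(rtc_read_fn rd)
{
    struct rtc_time t;
    int binary = (rd(0x0B) & 0x04) != 0;
    unsigned char s = rd(0x00), m = rd(0x02), h = rd(0x04);

    t.sec  = binary ? s : bcd_to_bin(s);
    t.min  = binary ? m : bcd_to_bin(m);
    t.hour = binary ? h : bcd_to_bin(h);
    return t;
}

/* Stand-in for real port I/O, for demonstration only:
   17:16:32 stored as BCD, 24-hour mode. */
static unsigned char fake_cmos(unsigned char reg)
{
    switch (reg) {
    case 0x00: return 0x32;  /* seconds */
    case 0x02: return 0x16;  /* minutes */
    case 0x04: return 0x17;  /* hours   */
    case 0x0B: return 0x02;  /* bit 1 set: 24-hour; bit 2 clear: BCD */
    default:   return 0;
    }
}
```

Calling rtc_read_time(fake_cmos) yields hour 17, minute 16, second 32. Real code would also need to avoid reading while the RTC is mid-update (bit 7 of status register A), which this sketch leaves out.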

I can, indeed, also fix this inside of the Pth library (or for that
matter inside my application so as not to call pth_nap at all) but it
seemed like the ideal solution would be to fix the library so that
everyone wins with the performance boost.

Also did anyone have any thoughts on my ideas for improvements to pipe()?

Gordon.S...@seagate.com

Jan 5, 2007, 11:53:34 AM

to dj...@delorie.com

Gordon Schumacher/Seagate wrote on 01/04/2007 05:16:32 PM:

# Most of the data returned by the DOS int 21/2C and 21/2A functions can be
# obtained directly from the CMOS, with nothing more than port I/O: that
# will give us everything except the "hundredths" of a second field. For
# that field, what about installing an interrupt handler to simply increment
# a counter every timer tick, and reset it to zero when the counter reaches
# 91 (signifying five seconds have elapsed)?

In fact, it would not even need to be done with an interrupt handler. The
first time the function is called, the int21/2C function could be called
in order to determine the hundredths value that DOS is using. Then the
clock() function could be called to align those two values, and thereafter
the clock() function could be used instead.
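
A rough sketch of that calibrate-once idea, written so it does not depend on DJGPP: the caller supplies one slow wall-clock reading (for example from int 21/2C) together with the tick count at that moment, and later readings are derived from the tick count alone. The struct and function names are hypothetical, and the tick rate is assumed to be the usual PIT rate of about 1193182 Hz divided by 65536.

```c
#include <stdint.h>

/* Assumed PIT input clock and divisor behind the ~18.2 Hz BIOS tick. */
#define PIT_HZ       1193182ULL
#define PIT_DIVISOR  65536ULL

/* Hypothetical state kept between calls. */
struct fast_clock {
    int      valid;       /* 0 until the first calibration */
    uint64_t base_usec;   /* wall-clock microseconds at calibration */
    uint32_t base_ticks;  /* tick count at calibration */
};

/* Record a (slow) wall-clock reading together with the current tick count. */
static void fast_clock_calibrate(struct fast_clock *fc,
                                 uint64_t wall_usec, uint32_t ticks)
{
    fc->base_usec  = wall_usec;
    fc->base_ticks = ticks;
    fc->valid      = 1;
}

/* Derive the current time from the tick count alone; no DOS call needed.
   The BIOS count restarts at midnight (about 1.57 million ticks), so the
   64-bit product below cannot overflow for any one day's worth of ticks.
   The midnight restart itself is not handled: recalibrate when the tick
   count goes backwards. */
static uint64_t fast_clock_now(const struct fast_clock *fc, uint32_t ticks)
{
    uint64_t elapsed = (uint32_t)(ticks - fc->base_ticks);
    /* elapsed * 65536 / 1193182 seconds, expressed in microseconds */
    return fc->base_usec + elapsed * PIT_DIVISOR * 1000000ULL / PIT_HZ;
}
```

Recalibrating every so often bounds whatever drift remains between the tick-derived time and the DOS clock.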

Rod Pemberton

Jan 6, 2007, 1:32:07 AM

<Gordon.S...@seagate.com> wrote in message
news:OF67F16F38.763FE5DE-ON872572...@seagate.com...

>
> This might seem like a strange place to be focusing on for performance...
> but it turns out that the GNU Pth library uses gettimeofday() from inside
> the "put a thread to sleep for x time" functions and that is something
> which can get called from performance-critical code (and profiling tells
> me is being done in the application I'm working on).
>
> It was spending much of its time in the __dpmi_int function (no surprise),
> and I figured out that better than 90% of those calls were coming from
> gettimeofday().

Why so many calls? Isn't it a design problem with Pth to consume lots of
CPU to set an alarm?

> So I started pondering whether there was a way to fix
> the library routine to not have to make a real-mode transition.
>
> Most of the data returned by the DOS int 21/2C and 21/2A functions can be
> obtained directly from the CMOS, with nothing more than port I/O: that
> will give us everything except the "hundredths" of a second field.

DOS or Windows? IIRC, PM port I/O is privileged. Does anyone know if this
would affect Windows?

> For
> that field, what about installing an interrupt handler to simply increment
> a counter every timer tick, and reset it to zero when the counter reaches
> 91 (signifying five seconds have elapsed)?
>

Together with your second post,

<Gordon.S...@seagate.com> wrote in message
news:OFF86BAC20.A8BCA0CB-ON872572...@seagate.com...

> In fact, it would not even need to be done with an interrupt handler. The
> first time the function is called, the int21/2C function could be called
> in order to determine the hundredths value that DOS is using. Then the
> clock() function could be called to align those two values, and thereafter
> the clock() function could be used instead.
>

I like the basic idea. But, I think you'd need a faster timer. And, I'm
unsure of interactions with various DOS's and Windows. (I did find one XP
potential problem, explained near the end).

You didn't say which timer: IRQ 0 (int 08h) or IRQ 8 (int 70h).
IRQ 0 by default is 18.2065Hz.
IRQ 8 by default is 1024Hz.

For "hundredths" of a second, wouldn't you want a timer of 100Hz or greater?
Basically, if you used 18.2065Hz, you'd get "eighteenths" of a second or
almost "twentieths" of a second. So, the "hundredths" timer would update,
be stale for the next 4 reads, update, be stale for the next 4 reads,
update, etc.
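
To make the staleness concrete, here is a small sketch that derives a "hundredths" value from a tick count and counts how often that value can change when it is sampled 100 times over one second. It assumes the tick rate is 1193182 Hz / 65536; the function names are made up for illustration.

```c
/* Hundredths-of-a-second field implied by a tick count, assuming the
   tick rate is 1193182 Hz / 65536 (about 18.2065 Hz). */
static int tick_to_hundredths(unsigned long tick)
{
    /* tick * 65536 / 1193182 seconds, scaled to hundredths, modulo 100 */
    return (int)((unsigned long long)tick * 6553600ULL / 1193182ULL % 100ULL);
}

/* Number of distinct tick values seen when sampling once per hundredth
   of a second for one second, i.e. how often the field can change. */
static int distinct_updates_per_second(void)
{
    int h, last = -1, updates = 0;

    for (h = 0; h < 100; h++) {
        /* Tick index at time h/100 s: floor(h/100 * 1193182/65536) */
        int tick = (int)((long long)h * 1193182LL / (100LL * 65536LL));
        if (tick != last) {
            updates++;
            last = tick;
        }
    }
    return updates;
}
```

Successive ticks map to hundredths values of 0, 5, 10, 16 and so on, so 100 samples in a second see fewer than twenty different values.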

clock() gets its ticks from 40:6ch which are saved from IRQ 0 (int 08h or
18.2065Hz). clock() has the define CLOCKS_PER_SEC which is 91. I'm not sure
why that value is used or the period it generates is used, but it does
produce a period of about 5 seconds using the 18.2065Hz clock.

Getting values from memory (40:6ch) is quick, and the PM memory protection
issues were solved. Getting values from CMOS is slightly slower, and I'm
unsure of the Windows privilege issues of CMOS port I/O here. Getting
values from repeated DOS calls through the DPMI host probably is slower.

> I can, indeed, also fix this inside of the Pth library (or for that
> matter inside my application so as not to call pth_nap at all) but it
> seemed like the ideal solution would be to fix the library so that
> everyone wins with the performance boost.
>

I may not be the person to respond on this issue. I've seen a number of
posts on the intricacies of getting DJGPP to work on XP (and you didn't say
what OS). I use DOS mostly (and Win98). So, anything which affects DOS
affects me. I think a lot of others are using XP. I suspect whatever you
create needs to work with MS-DOS, DR-DOS, FreeDOS, the various versions of
Windows, etc...

For example, someone attempted to use 40:1a and 40:1c to speed up
_bios_keybrd() read for OpenWatcom:
http://groups.google.com/group/openwatcom.contributors/msg/ca944968d2a3485d?hl=en

I thought this would also be a good idea for DJGPP since I like
_bios_keybrd() and use it. But, someone else attempted to access the bios
timer at 40:6c like you intend and as is done in clock() for DJGPP. He
found out that XP doesn't update BIOS variables unless a DOS interrupt is
called:
http://groups.google.com/group/comp.os.msdos.djgpp/msg/30ac39382eb3560e?hl=en
http://groups.google.com/group/comp.os.msdos.djgpp/msg/67916cf4c1c73b7e?hl=en

> Also did anyone have any thoughts on my ideas for improvements to pipe()?
>

Sorry, not I.

I had hoped someone much more familiar with the issues you bring up would
respond to you first...

Rod Pemberton

Gordon.S...@seagate.com

Jan 8, 2007, 1:03:25 PM

to dj...@delorie.com

Rod Pemberton wrote on Sat, 6 Jan 2007 at 01:32:07 -0500:

# Why so many calls? Isn't it a design problem with Pth to consume lots of
# CPU to set an alarm?

Well, gettimeofday() is usually not a CPU-intensive function on most
platforms, so I don't think that this counts as a Pth problem per se.

# DOS or Windows? IIRC, PM port I/O is privileged. Does anyone know if this
# would affect Windows?

For my latest version I dropped the direct-from-CMOS version as
unnecessary, so no longer relevant... my latest is using only
DOS interrupt calls and clock().

# For "hundredths" of a second, wouldn't you want a timer of 100Hz or greater?
# Basically, if you used 18.2065Hz, you'd get "eighteenths" of a second or
# almost "twentieths" of a second. So, the "hundredths" timer would update,
# be stale for the next 4 reads, update, be stale for the next 4 reads,
# update, etc.

I was indeed talking about the 18.2 per second timer - which appears to be
the one that DOS is using internally. So that staleness of the hundredths
field is already an issue just using the CMOS. Here's a snippet of
documentation on this: http://www.merlyn.demon.co.uk/prog2000.htm#MSF
I've also dug through the FreeDOS source code and confirmed that this is
how FD behaves.
Use the following for proof:

#include <stdio.h>
#include <time.h>      /* uclock(), UCLOCKS_PER_SEC (DJGPP) */
#include <sys/time.h>  /* gettimeofday(), struct timeval */

int main(void)
{
    int i;
    for (i = 0; i < 400; i++) {
        struct timeval tp;

        gettimeofday(&tp, NULL);
        printf("%ld\t", (long)tp.tv_usec);
        // Uncomment if your machine is too fast...
        // uclock_t start = uclock();
        // while (uclock() < start + UCLOCKS_PER_SEC / 500);
    }
    printf("\n");
    return 0;
}

# clock() gets its ticks from 40:6ch which are saved from IRQ 0 (int 08h or
# 18.2065Hz). clock has the define CLOCKS_PER_SEC which is 91. I'm not sure
# why that value is used or the period it generates is used, but it does
# produce a period of about 5 seconds using the 18.2065Hz clock.

Looking at the libc source code, the value from the interrupt is multiplied
by five before being handed off to the user, so that the user does not have
to use floating-point math.

The clock frequency should be (1.193*10^6) / 65536 Hz, which works out to
18.203 or so. I'm modifying my code to use the more accurate calculation
starting with the crystal frequency instead of CLOCKS_PER_SEC (it means
doing floating-point math but it's still so much faster than the original
that it's still worth it).
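
As a sketch of how much the choice of constant matters, the conversions below compare the 18.2 ticks/s implied by CLOCKS_PER_SEC = 91 with a figure derived from the crystal, using integer math only. The crystal value is an assumption beyond the 1.193*10^6 figure quoted above: the nominal 14.31818... MHz (315/22 MHz) oscillator divided by 12 and then by 65536, which makes one tick exactly 5767168/105 microseconds. The function names are illustrative.

```c
#include <stdint.h>

/* Microseconds for `ticks` timer ticks, assuming exactly 18.2 ticks/s
   (the CLOCKS_PER_SEC = 91 = 5 * 18.2 convention). */
static uint64_t ticks_to_usec_approx(uint64_t ticks)
{
    return ticks * 5ULL * 1000000ULL / 91ULL;
}

/* Same, assuming a nominal 315/22 MHz crystal divided by 12 and 65536.
   The tick period then reduces to the exact fraction 5767168/105 us. */
static uint64_t ticks_to_usec_exact(uint64_t ticks)
{
    return ticks * 5767168ULL / 105ULL;
}

/* How far the 18.2 Hz assumption runs ahead of the crystal-based figure. */
static int64_t drift_usec(uint64_t ticks)
{
    return (int64_t)ticks_to_usec_approx(ticks)
         - (int64_t)ticks_to_usec_exact(ticks);
}
```

Over roughly an hour of ticks (about 65543 of them) the two conversions disagree by roughly 1.3 seconds, so the less accurate constant forces more frequent resynchronisation with the DOS clock. No floating point is needed for either conversion.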

# I may not be the person to respond on this issue. I've seen a number of
# posts on the intricacies of getting DJGPP to work on XP (and you didn't say
# what OS). I use DOS mostly (and Win98). So, anything which affects DOS
# affects me. I think a lot of others are using XP. I suspect whatever you
# create needs to work with MS-DOS, DR-DOS, FreeDOS, the various versions of
# Windows, etc...

I've tested my code under XP and FreeDOS. I don't have DR-DOS anywhere
handy, though I could install it in a VM if that should be a sufficient
test?

# I had hoped someone much more familiar with the issues you bring up would
# respond to you first...

No worries. I've also posted some actual code over on DJGPP-Workers,
since this didn't seem like the appropriate place for it... I'll link
to it though:
http://www.delorie.com/djgpp/mail-archives/browse.cgi?p=djgpp-workers/2007/01/05/17:36:42

DJ Delorie

Jan 8, 2007, 1:34:43 PM

to Gordon.S...@seagate.com, dj...@delorie.com

> # 18.2065Hz). clock has the define CLOCKS_PER_SEC which is 91. I'm not
> # sure why that value is used or the period it generates is used,

We just scale the 18.2Hz count by 5 so that CLOCKS_PER_SEC is an
integer.

> Looking at the libc source code, the value from the interrupt is
> multiplied by five before being handed off to the user, so that the
> user does not have to use floating-point math.

Yup.

Rod Pemberton

Jan 9, 2007, 3:46:03 AM

<Gordon.S...@seagate.com> wrote in message
news:OF1B107D53.1922E890-ON872572...@seagate.com...

> I was indeed talking about the 18.2 per second timer - which appears to be
> the one that DOS is using internally. So that staleness of the hundredths
> field is already an issue just using the CMOS. Here's a snippet of
> documentation on this: http://www.merlyn.demon.co.uk/prog2000.htm#MSF
> I've also dug through the FreeDOS source code and confirmed that this is
> how FD behaves.

Interesting... Thanks for that. Looks like some more good info there too.

> Use the following for proof:
>
> #include <stdio.h>
> #include <time.h>
>
> int main()
> {
>     int i;
>     for (i = 0; i < 400; i++) {
>         uclock_t start;
>         struct timeval tp;
>
>         gettimeofday(&tp, NULL);
>         printf("%ld\t", tp.tv_usec);
>         // Uncomment if your machine is too fast...
>         // start = uclock();
>         // while (uclock() < start + UCLOCKS_PER_SEC / 500);
>     }
>     printf("\n");
>     return 0;
> }
>

I'm sure I'll get around to trying it eventually. Seems like I have some
need for everything six months or so later... :-)

> The clock frequency should be (1.193*10^6) / 65536 Hz, which works out to
> 18.203 or so. I'm modifying my code to use the more accurate calculation
> starting with the crystal frequency instead of CLOCKS_PER_SEC (it means
> doing floating-point math but it's still so much faster than the original
> that it's still worth it).

I believe this is the math you'll need:

14.318Mhz=4*3.58Mhz=4*(4.5Mhz*455/572)
(4.5Mhz US TV bandwidth/channel, 455 colorburst phase changes/line, 572
total lines/frame including sync)
14.318Mhz/12=1.193182Mhz
1.193182Mhz/65536=18.2065Hz

> # I had hoped someone much more familiar with the issues you bring up would
> # respond to you first...
>
> No worries. I've also posted some actual code over on DJGPP-Workers,
> since this didn't seem like the appropriate place for it... I'll link
> to it though:
> http://www.delorie.com/djgpp/mail-archives/browse.cgi?p=djgpp-workers/2007/01/05/17:36:42
>

Yes, thanks for that too. I usually only look at mail archives if GMANE
NNTP-izes it.

Rod Pemberton

Gordon.S...@seagate.com

Jan 10, 2007, 11:11:09 AM

to dj...@delorie.com

Rod Pemberton wrote on Tue, 9 Jan 2007 at 03:46:03 -0500:

# I believe this is the math you'll need:
#
# 14.318Mhz=4*3.58Mhz=4*(4.5Mhz*455/572)
# (4.5Mhz US TV bandwidth/channel, 455 colorburst phase changes/line, 572
# total lines/frame including sync)
# 14.318Mhz/12=1.193182Mhz
# 1.193182Mhz/65536=18.2065Hz

*blink*

I came across some vague hand-waving references about this oscillator
having something to do with video signals, but nothing I could make
any sense out of much less use.

So much for "someone much more familiar"... :)

I'll put this math into the code and try it out again; hopefully this
will further reduce the drift that I'm seeing (and thus reduce how
often I need to requery the DOS clock). As it is, I've already cut
20% off the run time of my program! (It makes heavy use of
pth_nap() which uses gettimeofday() internally.)

Gordon.S...@seagate.com

Jan 10, 2007, 11:19:53 AM

to dj...@delorie.com

Rod Pemberton wrote on Tue, 9 Jan 2007 at 03:46:03 -0500:

# I believe this is the math you'll need:
#
# 14.318Mhz=4*3.58Mhz=4*(4.5Mhz*455/572)
# (4.5Mhz US TV bandwidth/channel, 455 colorburst phase changes/line, 572
# total lines/frame including sync)
# 14.318Mhz/12=1.193182Mhz

Aha, this is the step where our numbers diverge:
14.318 MHz divided by 12 is actually 1.1931666... MHz.

# 1.193182Mhz/65536=18.2065Hz

So it ends up being:

1.1931666... MHz / 65536 = 18.20627848307291666... Hz

Rod Pemberton

Jan 10, 2007, 1:25:42 PM

<Gordon.S...@seagate.com> wrote in message
news:OF77475379.7BA371D5-ON872572...@seagate.com...

> Rod Pemberton wrote on Tue, 9 Jan 2007 at 03:46:03 -0500:
>
> # I believe this is the math you'll need:
> #
> # 14.318Mhz=4*3.58Mhz=4*(4.5Mhz*455/572)
> # (4.5Mhz US TV bandwidth/channel, 455 colorburst phase changes/line, 572
> # total lines/frame including sync)
> # 14.318Mhz/12=1.193182Mhz
>
> Aha, this is the step where our numbers diverge:
> 14.318 MHz divided by 12 is actually 1.1931666... MHz.
>

Sorry, it appears I failed to type a 1 following the decimal. It's not
14.318000MHz, but 14.318181MHz. You really need to enter
4*4.5*(10^6)*455/572 to compute the 14.318MHz and work from there. IIRC
('twas 25+ years ago), it's 4 times the colorburst as calculated by the
original engineer who designed the US color TV standard. That way you won't
lose precision. Of course, a real crystal usually has a tolerance range,
but that range is usually small compared to the frequency, like +/- 100Hz or
+/-10KHz. Of course, you could go to Mouser or another electronic supplier,
and look for a crystal if you think the range would help.

Like you, I'll use ... for repeating digits. The 1 and 8 repeat for both.
I was using more decimals but rounded/truncated.

14.318181818181... Mhz / 12 = 1.193181818181... Mhz.
1.193181818181...Mhz / 65536 = 18.206509676846 Hz
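
Written out as one chain, the arithmetic above reads as follows. This is a restatement of the same figures rather than anything new; the tick period in the last line is simply the reciprocal of the tick frequency.

```latex
\begin{align*}
f_{\text{crystal}} &= 4 \times 4.5\,\text{MHz} \times \tfrac{455}{572}
                    = \tfrac{315}{22}\,\text{MHz} \approx 14.318182\,\text{MHz} \\
f_{\text{timer}}   &= \frac{f_{\text{crystal}}}{12} \approx 1.193182\,\text{MHz} \\
f_{\text{tick}}    &= \frac{f_{\text{timer}}}{65536} \approx 18.206510\,\text{Hz} \\
T_{\text{tick}}    &= \frac{1}{f_{\text{tick}}} \approx 54925.41\,\mu\text{s}
\end{align*}
```

Starting from the truncated 14.318 MHz instead gives 1.1931666... MHz and 18.206278 Hz, which is where the two sets of numbers part ways.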

Rod Pemberton

Brian Inglis

Jan 13, 2007, 1:09:46 PM

IIRC crystal frequency 157.5MHz = 9/2*7*5*1E6, /11 colour burst
14318181.8..Hz, /3 for PC clock 4772727.27..Hz, /4 for timer
1193181.8..Hz, /65536 for tick 18.2065096768465909..Hz, giving a period
of 54925.4095238..us with the last six digits repeating.
So the relevant factors here are 7*5*3/11/2^19.

--
Thanks. Take care, Brian Inglis Calgary, Alberta, Canada

Brian....@CSi.com (Brian[dot]Inglis{at}SystematicSW[dot]ab[dot]ca)
fake address use address above to reply

Rod Pemberton

Jan 13, 2007, 6:18:48 PM

"Brian Inglis" <Brian....@SystematicSW.Invalid> wrote in message
news:5t3iq2dlv4hoh2sdf...@4ax.com...

A 157.5Mhz crystal in 1980's? ROFL!

In a PC, in the 1980's? ROFL! (Where's the Kleenex, I've got to wipe
away the tears...)

It's possible that 157.5Mhz crystals are being used in PC's today. It's
plausible, but I doubt it (I haven't looked lately.). The primary factor is
cost. For PC's, the lowest cost solution usually wins. I know of only
three PC exceptions to that rule. A 157.5Mhz crystal costs more than a
14.318Mhz crystal, and only provides an advantage _if_ the PC needs
another clock closely related to 157.5Mhz. With common PC bus frequencies of
66, 100, 133, 166, 200, 215, and 225Mhz, 157.5Mhz _seems_ to be a really
_odd_ frequency choice to me.

I had schematics for a few '80's PC's during the '80's: C64, Apple II, etc.
The highest frequency in the schematic was always some multiple of the
colorburst which was used by the video circuitry. If you actually have a
real schematic showing that a 157.5Mhz crystal was in use in a 1980's era PC
of any common make (i.e., no Crays), it'd probably be worth a fortune...
Unfortunately, I wasn't actively monitoring the advancement in crystal
oscillator frequencies. However, if you're interested, this link contains
an oscillator frequency timeline:
http://www.npcamerica.com/Datasheets/KEYNOTE2.PDF

> 14318181.8..Hz, /3 for PC clock 4772727.27..Hz, /4 for timer
> 1193181.8..Hz, /65536 for tick 18.2065096768465909..Hz, giving a period
> of 54925.4095238..us with the last six digits repeating.
>
> So the relevant factors here are 7*5*3/11/2^19.
>

Sure, except if one of those factors came from the 157.5Mhz. As I showed,
the colorburst and the common 4*colorburst crystal are derived from the TV
baseband, created for the US color TV standard to be compatible with US
Black & White (BW) TV's, i.e., not from 157.5Mhz.

Rod Pemberton

Brian Inglis

Jan 14, 2007, 11:28:45 PM

On Sat, 13 Jan 2007 18:18:48 -0500 in comp.os.msdos.djgpp, "Rod
Pemberton" <do_no...@bitfoad.cmm> wrote:

My assumption was that the frequency is quoted as 157.5/11Mhz various
places and that commercial broadcast equipment would have generated and
divided down that reference oscillator (don't really know if it
could/would have been crystal in the valve/tube era, although WWII army
radios operated in the 30-40MHz range and came with 72-120 crystals) to
get an accurate, stable colour burst frequency for transmission.

>In a PC, in the 1980's? ROFL! (Where's the Kleenex, I've got to wipe
>away the tears...)

I'm well aware that TVs and PCs used a 14...MHz crystal.

>Unfortunately, I wasn't actively monitoring the advancement in crystal
>oscillator frequencies. However, if you're interested, this link contains
>an oscillator frequency timeline:
>http://www.npcamerica.com/Datasheets/KEYNOTE2.PDF

Timeline actually shows development of *CMOS* ICs for timebase
generation.

Rod Pemberton

Jan 15, 2007, 4:17:36 PM

"Brian Inglis" <Brian....@SystematicSW.Invalid> wrote in message

news:8fulq298bibfkgelg...@4ax.com...

I'll assume you meant: "with 72-120 [Mhz] crystals."

Of course, you didn't state whether they were quartz crystals, or quartz
crystal oscillators... So, it appears you want an apple vs. oranges
discussion.

Dividing or mixing down an oscillator is a common technique. But, so is
generating a harmonic or overtone from a lower frequency. Without
familiarity with broadcast equipment, I'd bet on the latter for older
equipment. Why? Because dividing down an oscillator frequency is easy with
digital circuitry, but hard without it. And, mixing down the oscillator
frequency requires intermediate frequencies which have to be generated too.
To me, it's more likely that they attempted to eliminate high frequency (and
high cost) circuitry, which means generating the high frequencies from lower
frequencies: harmonics or overtones. For new equipment, digital and high
frequencies work well so I see no reason why a video flash DAC circuit
wouldn't use 157.5Mhz.

The point was that 157.5Mhz wasn't used in PC's of the era. I know that
oscillators well above 1Ghz were available in the '80's, they just weren't
crystal oscillators.

> >In a PC, in the 1980's? ROFL! (Where's the Kleenex, I've got to wipe
> >away the tears...)
>
> I'm well aware that TVs and PCs used a 14...MHz crystal.
>
> >Unfortunately, I wasn't actively monitoring the advancement in crystal
> >oscillator frequencies. However, if you're interested, this link contains
> >an oscillator frequency timeline:
> >http://www.npcamerica.com/Datasheets/KEYNOTE2.PDF
>
> Timeline actually shows development of *CMOS* ICs for timebase
> generation.
>

No. The timeline shows the development in frequencies of crystal
oscillators, as I stated. More specifically, the timeline is for quartz
crystal oscillators using CMOS logic of different voltage levels for the
oscillator, not 100% CMOS (non-crystal) oscillators, as you've stated. The
timeline doesn't show the development in frequencies of the quartz crystal
(without integrated oscillator circuitry).

And, yes, it is very possible that the output frequencies weren't limited by
the quartz crystal, but by the ability of the CMOS oscillator circuitry to
function at different voltages and frequencies. But, then, that applies to
all CMOS IC's. CMOS circuits can't use frequencies higher than they can
generate. And, IIRC, CMOS gradually became the dominant logic family of
early CPU's. So, when one talks about CMOS cpu's with a crystal oscillator,
one shouldn't perceive a large leap of logic to assume the crystal
oscillator is using CMOS also.

1) first quote:

"Our company has been seriously involved in the
CMOS IC crystal oscillator from its birth, and is
producing various clock-signal-generation ICs for
crystal oscillators, which evolved from the quartz clock."

2) second quote: (Fig. 5. is the timeline)

"In response to the high demand, our company has
developed the CMOS ICs for the oscillator module with
a quartz crystal and an oscillator unified in a package.
This is one of the main products of our company. The
developed and future oscillator products are shown as a
roadmap in Fig.5."

Rod Pemberton