If we omit the assumption that all 365 days are equally likely
  [...]
  MONTH |  A   |  B   |  A/B  |
  -----------------------------
  Jan   | 8.08 | 8.47 | 0.954 |
  Feb   | 7.75 | 7.92 | 0.979 |
  Mar   | 8.29 | 8.47 | 0.979 |
  Apr   | 8.03 | 8.20 | 0.979 |
  May   | 8.37 | 8.47 | 0.988 |
  Jun   | 8.19 | 8.20 | 0.999 |
  Jul   | 8.87 | 8.47 | 1.047 |
  Aug   | 8.90 | 8.47 | 1.051 |
  Sep   | 8.64 | 8.20 | 1.054 |
  Oct   | 8.64 | 8.47 | 1.020 |
  Nov   | 7.95 | 8.20 | 0.970 |
  Dec   | 8.29 | 8.47 | 0.979 |
  -----------------------------
  In column A, we have the percentage of 1996 US live births
  occurring in said month.  In column B, we have the percentage
  of days of 1996 occurring in said month.  The last column is
  the ratio of the percentages in columns A and B.
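
  For anyone who wants to check the arithmetic, columns B and A/B can be recomputed from column A. This is only a sketch: the variable names are mine, and it assumes the ratio was taken against the rounded column B values, which is what reproduces the three-decimal figures above.

```python
import calendar

# Column A of the table: percentage of 1996 US live births per month.
births_pct = [8.08, 7.75, 8.29, 8.03, 8.37, 8.19,
              8.87, 8.90, 8.64, 8.64, 7.95, 8.29]

# 1996 is a leap year, so February has 29 days and the year has 366.
days_in_month = [calendar.monthrange(1996, m)[1] for m in range(1, 13)]
n_days = sum(days_in_month)

# Column B: percentage of the year's days falling in each month.
col_b = [round(100.0 * d / n_days, 2) for d in days_in_month]

# Column A/B (ASSUMPTION: computed from the rounded column B values).
ratios = [round(a / b, 3) for a, b in zip(births_pct, col_b)]

for name, a, b, r in zip(calendar.month_abbr[1:], births_pct, col_b, ratios):
    print(f"{name}   | {a:.2f} | {b:.2f} | {r:.3f} |")
```

  A ratio above 1 means the month had more births than its share of days would predict.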
  Right now I'm thinking about how to derive an approximate
  distribution curve over the 366 days of 1996.  What
  comes to mind is using a polynomial of degree <= 2 for each of
  the 12 months, p_1, p_2, ..., p_12 (p_1 for Jan, etc.), where
  \int_{0}^{1} p_1(x) dx would approximate the number of births
  occurring from 00:00 CST(?) on 1/1/1996 to 00:00 CST(?) on
  1/2/1996 (etc.), subject to:
  (1) The integral of p_1 from 0 to 31 should give the number
      of births (modulo least squares?) in January 1996 (etc.).
  (2) Some kind of "smoothness" condition such as
      (p_1)'(31) = (p_2)'(0), and so on all through the year
      [for a total of twelve equations].
  (3) A condition similar to:
      \sum_{n=1}^{12} \int_{0}^{Days_n} [(p_n)'(x)]^2 dx
      is minimal, where Days_n is the number of days in month
      number n.
  RFC:  I'd welcome comments and suggestions as to how to
        derive the "expected" number of births for each day of 1996.
data from:  http://www.cdc.gov/nchswww/releases/98news/98news/natal96.htm
                    (see Table 15)
It seems to me that the December-January transition should satisfy
the same kind of "smoothness" condition as the transition between
any other pair of months, and that what you are really looking for
is a periodic function, approximated not by polynomials but by
a partial (truncated) Fourier series.
In fact the whole thing strikes me as a signal-processing problem.
(You assume the probability density of births at any instant during
the year is a function that varies continuously with time, you sample
it by integrating it over various unequal periods, and now you want
to reconstruct the original signal from the samples.)  Then things
like Fourier transforms come to mind, but it's been a long time since
I've looked at any of that.
-- 
David A. Karr       "Groups of guitars are on the way out, Mr. Epstein."
ka...@shore.net                         --Decca executive Dick Rowe, 1962
>  MONTH |  A   |  B   |  A/B  |
>  -----------------------------
>  Jan   | 8.08 | 8.47 | 0.954 |
>  Feb   | 7.75 | 7.92 | 0.979 |
>  Mar   | 8.29 | 8.47 | 0.979 |
>  Apr   | 8.03 | 8.20 | 0.979 |
>  May   | 8.37 | 8.47 | 0.988 |
>  Jun   | 8.19 | 8.20 | 0.999 |
>  Jul   | 8.87 | 8.47 | 1.047 |
>  Aug   | 8.90 | 8.47 | 1.051 |
>  Sep   | 8.64 | 8.20 | 1.054 |
>  Oct   | 8.64 | 8.47 | 1.020 |
>  Nov   | 7.95 | 8.20 | 0.970 |
>  Dec   | 8.29 | 8.47 | 0.979 |
>  -----------------------------
>
>  In column A, we have the percentage of 1996 US live births
>  occurring in said month.  In column B, we have the percentage
>  of days of 1996 occurring in said month.  The last column is
>  the ratio of the percentages in columns A and B.
bernier posted the table above and asked how best to approximate
the birth rate over the course of that year as some sort of smooth
function.
David Karr pointed out, correctly, that since the data should
be assumed to be cyclic (assuming that 1997 births will
be more or less like 1996 births -- except in February),
Fourier methods are appropriate.
Let p(j) (j=1,...,366) be the probability of a child being born on
the jth day of the year.  Model p as a Fourier series
with, say, a total of five terms:
p(j) = a_2 cos(4 pi j/366) + a_1 cos(2 pi j/366) + a_0 + 
         b_1 sin(2 pi j/366) + b_2 sin(4 pi j/366),
and do a least-squares fit to the twelve pieces of data to find
the five unknown coefficients.  You could choose some number
of terms other than five, of course.  With 12, you'll get a perfect
fit, but that's probably ridiculously overfitting the data.
I tried this -- it's a piece of cake with Matlab -- and
found that fitting to a five-term series as above worked pretty
well.  Going up to seven terms didn't seem to improve things
too much.
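
For readers without Matlab, here is a rough NumPy sketch of a fit along these lines. The post does not say exactly which quantity was fitted, so this assumes each monthly percentage in column A is matched by the sum of p(j) over the days of that month. The names (basis, X, coeffs) are mine, and the coefficients come out in the order (a_0, a_1, b_1, a_2, b_2).

```python
import calendar
import math

import numpy as np

# Column A of the table: percentage of 1996 US live births per month.
births_pct = [8.08, 7.75, 8.29, 8.03, 8.37, 8.19,
              8.87, 8.90, 8.64, 8.64, 7.95, 8.29]
days_in_month = [calendar.monthrange(1996, m)[1] for m in range(1, 13)]
n_days = sum(days_in_month)  # 366


def basis(j):
    """The five basis functions of the model at day j (1-based)."""
    t = 2.0 * math.pi * j / n_days
    return [1.0, math.cos(t), math.sin(t), math.cos(2 * t), math.sin(2 * t)]


# ASSUMPTION: row m of the design matrix is each basis function summed over
# the days of month m, so (row m) . coeffs is the modelled monthly percentage.
rows = []
day = 0
for length in days_in_month:
    month_days = range(day + 1, day + length + 1)
    rows.append([sum(col) for col in zip(*(basis(j) for j in month_days))])
    day += length

X = np.array(rows)
y = np.array(births_pct)

# Ordinary least squares: 12 equations, 5 unknowns.
coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
a0, a1, b1, a2, b2 = coeffs

fitted = X @ coeffs
rel_err = np.abs(fitted - y) / y
print("coefficients (a0, a1, b1, a2, b2):", np.round(coeffs, 5))
print("largest relative monthly error:", rel_err.max())
```

Adding more harmonics only means appending cos(3t), sin(3t), and so on to basis(); the rest stays the same.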
The coefficients are
(a_2,a_1,a_0,b_1,b_2) = (0.00460,-0.00785,0.27323,-0.00847,-0.00217)
(probabilities given in percent).  This gives errors of between
1% and 2% in January, November, and December, and under 1% in every
other month.
That should be good enough for bernier's purpose, which (as I
understand it) is to determine how much of a difference seasonal
variations in birth rate make to the classic
probability-of-matching-birthdays puzzle.  I certainly don't feel like
taking on that part of the job -- I don't see how to do it except via
Monte Carlo methods -- so I'll let someone else take over from here.
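
For whoever picks this up, a minimal Monte Carlo sketch might look like the following. It is not the fitted Fourier model: to stay self-contained it assumes births are spread evenly over the days within each month, using column A of the table directly. The function name, trial count, and seed are arbitrary choices of mine.

```python
import calendar
import itertools
import random

# Column A of the table: percentage of 1996 US live births per month.
births_pct = [8.08, 7.75, 8.29, 8.03, 8.37, 8.19,
              8.87, 8.90, 8.64, 8.64, 7.95, 8.29]
days_in_month = [calendar.monthrange(1996, m)[1] for m in range(1, 13)]

# ASSUMPTION: within a month every day is equally likely, so each day of
# month m gets weight (monthly percentage) / (days in that month).
day_weights = []
for pct, length in zip(births_pct, days_in_month):
    day_weights.extend([pct / length] * length)
cum_weights = list(itertools.accumulate(day_weights))
days = range(len(day_weights))  # 366 day indices


def shared_birthday_prob(n_people, trials=20000, cum=None, seed=0):
    """Estimate P(at least two of n_people share a birthday).

    cum=None samples days uniformly; otherwise cum holds cumulative weights.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = rng.choices(days, cum_weights=cum, k=n_people)
        if len(set(sample)) < n_people:
            hits += 1
    return hits / trials


seasonal = shared_birthday_prob(23, cum=cum_weights)
uniform = shared_birthday_prob(23)
print("seasonal weights:", seasonal)
print("uniform days:    ", uniform)
```

With this many trials the sampling noise is a few tenths of a percentage point, so a much larger trial count would be needed to resolve a small seasonal effect.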
-Ted
>p(j) = a_2 cos(4 pi j/366) + a_1 cos(2 pi j/366) + a_0 + 
>         b_1 sin(2 pi j/366) + b_2 sin(4 pi j/366),
[...]
>The coefficients are
>
>(a_2,a_1,a_0,b_1,b_2) = (0.00460,-0.00785,0.27323,-0.00847,-0.00217)
But I think I got my notation mixed up.  The above numbers
actually have the sine coefficients first, not the cosines.
The order is
(b_2,b_1,a_0,a_1,a_2),
that is, b_2 = 0.00460, b_1 = -0.00785, a_0 = 0.27323,
a_1 = -0.00847, a_2 = -0.00217.
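
As a sanity check on that ordering, the model can be evaluated day by day and summed by month for comparison with column A of the table. This is a sketch under the assumption that j = 1 is January 1 and that the monthly figure is the sum of p(j) over the month's days; the variable names are mine.

```python
import calendar
import math

# Coefficients from the thread, read in the corrected order
# (b_2, b_1, a_0, a_1, a_2); values are in percent per day.
b2, b1, a0, a1, a2 = 0.00460, -0.00785, 0.27323, -0.00847, -0.00217
N_DAYS = 366


def p(j):
    """Modelled percentage of the year's births on day j (ASSUMED 1 = Jan 1)."""
    t = 2.0 * math.pi * j / N_DAYS
    return (a0 + a1 * math.cos(t) + b1 * math.sin(t)
            + a2 * math.cos(2 * t) + b2 * math.sin(2 * t))


# Column A of the table: percentage of 1996 US live births per month.
births_pct = [8.08, 7.75, 8.29, 8.03, 8.37, 8.19,
              8.87, 8.90, 8.64, 8.64, 7.95, 8.29]
days_in_month = [calendar.monthrange(1996, m)[1] for m in range(1, 13)]

# Sum the daily model over each month.
model_pct = []
start = 1
for length in days_in_month:
    model_pct.append(sum(p(j) for j in range(start, start + length)))
    start += length

rel_err = [abs(m - a) / a for m, a in zip(model_pct, births_pct)]
for name, a, m, e in zip(calendar.month_abbr[1:], births_pct, model_pct, rel_err):
    print(f"{name}  data {a:.2f}  model {m:.2f}  rel. error {100 * e:.1f}%")
```
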
-Ted