
vanishing surnames puzzle


jon wild

Apr 15, 2001, 11:07:33 PM
I was walking in a cemetery with some very old graves the other day. I
noticed that many of the surnames seemed colourful, and not ones I'd ever
encountered before. (This was in Boston, and many were from the 18th
century.) I looked some of them up in a phone book and on the web
afterwards, and found no surviving bearers of many of the names.

I wonder, given:
-a certain number and distribution of surnames,
-a set of simple assumptions about number of children born to each family,
-random pairing of offspring
-and a rule that children must bear the surname of the father
(or mother, makes little or no difference to the problem),
is it easy to figure out how many names are likely to vanish over a
certain period of time?

To try it out, here's a starting population of 2000 couples, with surnames
as follows:

5 surnames are each held by 100 couples
40 surnames are each held by 20 couples
100 surnames are each held by 5 couples
200 surnames are each held by 1 couple

(This means 500 couples have a very common surname, 800 have a fairly
common surname, 500 have a fairly rare surname and 200 have a very rare
surname.)

Assume each couple is equally likely to have any number of
(randomly-sexed, surviving-to-adulthood) children between zero and four
inclusively (this is unrealistic but simple, and ensures stable population
size). People of one generation can only marry people of the same
generation, and only 80% end up marrying.

If there are four generations per century, what will the distribution of
names be after 200 years?

If you'd like to figure it out using a different starting distribution, or
different assumptions, feel free, I'd just like to see how people might
attack the problem, and how quickly names are likely to disappear.

I figure, given the above numbers, that about 77 of the 200 couples with
very rare surnames will have no male children. So these are doomed right
off the bat.
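
The 77 figure can be checked with a short calculation. This is only a sketch, assuming family size is uniform on 0..4 and each child is independently male with probability 1/2; the variable names are illustrative.

```python
from fractions import Fraction

# P(no sons) = average over family sizes n = 0..4 of (1/2)^n
p_no_sons = sum(Fraction(1, 2) ** n for n in range(5)) / 5

# Expected number of the 200 single-couple surnames that end in generation 1
expected_doomed = 200 * p_no_sons

print(p_no_sons, float(expected_doomed))  # 31/80 77.5
```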

thanks for any interest --jon

David Eppstein

Apr 16, 2001, 12:41:57 AM
In article <9bdnll$k7a$1...@news.fas.harvard.edu>,
jon wild <wi...@fas.harvard.edu> wrote:

> is it easy to figure out how many names are likely to vanish over a
> certain period of time?

I don't know the answer, but this sort of thing is usually studied under
the keywords "birth-death process". Perhaps the usual search engines will
dig up something relevant.

Since you're assuming names only get passed down from the father, you can
simplify your model by completely omitting all women as irrelevant.
--
David Eppstein UC Irvine Dept. of Information & Computer Science
epps...@ics.uci.edu http://www.ics.uci.edu/~eppstein/

Bob Harris

Apr 16, 2001, 7:32:58 AM
jon wild wrote:
> I was walking in a cemetery with some very old graves the other day. I noticed
> that many of the surnames seemed colourful, and not ones I'd ever encountered
> before.

What were some of the surnames?

> Assume each couple is equally likely to have any number of (randomly-sexed,
> surviving-to-adulthood) children between zero and four inclusively (this is
> unrealistic but simple, and ensures stable population size). People of one
> generation can only marry people of the same generation, and only 80% end up
> marrying.

Doesn't that mean that we'd expect a 20% decrease in population every
generation? The 80% that end up marrying will have an average of one child
per person.

> If there are four generations per century, what will the distribution of names
> be after 200 years?

Starting population is 4,000 people. Multiplying this by .8 for 8
generations leaves about 671 people. Is this what you intended?
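
The arithmetic behind that figure, as a minimal sketch: it assumes the expected population simply shrinks by a factor of 0.8 each generation and ignores random fluctuation. The function name is made up for illustration.

```python
def population_after(generations, start=4000, survival=0.8):
    """Expected head count if each generation is `survival` times the last."""
    return start * survival ** generations

for g in range(9):
    print(g, round(population_after(g)))
```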


jon wild

Apr 16, 2001, 11:16:30 AM
Bob Harris <nit...@mindspring.com> wrote:

: Doesn't that mean that we'd expect a 20% decrease in population every
: generation? The 80% that end up marrying will have an average of one child
: per person.

Oops, thanks for noticing that. I decided to add in a factor to account
for not everyone marrying _after_ I'd written that there should be an
average of 2 kids per couple to keep the population stable, and I forgot
to correct the number of kids to compensate.

Feel free to ignore the 80% figure and assume everyone marries, if anyone
wants to help me figure it out.

Best wishes --Jon

Dennis Yelle

Apr 17, 2001, 9:42:57 PM
jon wild wrote:
>
> I was walking in a cemetery with some very old graves the other day. I
> noticed that many of the surnames seemed colourful, and not ones I'd ever
> encountered before. (This was in Boston, and many were from the 18th
> century.) I looked some of them up in a phone book and on the web
> afterwards, and found no surviving bearers of many of the names.
>
> I wonder, given:
> -a certain number and distribution of surnames,
> -a set of simple assumptions about number of children born to each family,
> -random pairing of offspring
> -and a rule that children must bear the surname of the father
> (or mother, makes little or no difference to the problem),
> is it easy to figure out how many names are likely to vanish over a
> certain period of time?
>
> To try it out, here's a starting population of 2000 couples, with surnames
> as follows:
>
I used these names:
--------------------------
> 5 surnames are each held by 100 couples A1, A2, A3, A4, A5
> 40 surnames are each held by 20 couples B01, B02, ... B40
> 100 surnames are each held by 5 couples C001, ... C100
> 200 surnames are each held by 1 couple D001, ... D200

>
> (This means 500 couples have a very common surname, 800 have a fairly
> common surname, 500 have a fairly rare surname and 200 have a very rare
> surname.)
>
> Assume each couple is equally likely to have any number of
> (randomly-sexed, surviving-to-adulthood) children between zero and four
> inclusively (this is unrealistic but simple, and ensures stable population
> size). People of one generation can only marry people of the same
> generation, and only 80% end up marrying.
>
> If there are four generations per century, what will the distribution of
> names be after 200 years?
>
> If you'd like to figure it out using a different starting distribution, or
> different assumptions, feel free, I'd just like to see how people might
> attack the problem, and how quickly names are likely to disappear.
>
> I figure, given the above numbers, that about 77 of the 200 couples with
> very rare surnames will have no male children. So these are doomed right
> off the bat.

See above for my initial name assignments.

Here are 2 simple simulation results:

First a run with 80% marrying:

Generation 0:
2000 couples with 345 different names. Most common:
100 100 100 100 100 20 20 20 20 20 20 20 20 20 20 20
A1 A2 A3 A4 A5 B01 B02 B03 B04 B05 B06 B07 B08 B09 B10 B11

Generation 1:
1608 couples with 240 different names. Most common:
95 91 85 81 77 26 24 22 21 21 21 20 19 18 18 17
A5 A3 A4 A2 A1 B29 B17 B15 B23 B31 B33 B38 B39 B08 B30 B04

Generation 2:
1305 couples with 200 different names. Most common:
82 79 70 61 59 21 19 19 18 17 17 16 16 15 15 15
A5 A3 A4 A2 A1 B39 B27 B33 B29 B15 B36 B17 B38 B04 B07 B13

Generation 3:
1050 couples with 171 different names. Most common:
68 66 65 59 35 20 16 15 15 14 14 13 13 13 13 13
A3 A4 A5 A2 A1 B33 B36 B29 B39 B07 B22 B13 B19 B21 B25 C013

Generation 4:
850 couples with 149 different names. Most common:
53 50 49 49 36 16 15 15 14 12 11 10 10 10 10 10
A3 A4 A2 A5 A1 B40 B21 B38 B36 B37 B39 B13 B16 B22 C015 C066

Generation 5:
670 couples with 128 different names. Most common:
50 39 38 35 28 16 16 13 11 11 11 10 9 9 9 9
A4 A3 A2 A5 A1 B21 B40 B26 B20 B22 C015 C022 B05 B37 B38 B39

Generation 6:
511 couples with 110 different names. Most common:
41 33 29 28 27 15 14 13 11 11 10 8 8 8 8 7
A4 A5 A1 A2 A3 B20 B21 C022 B22 B33 B40 B16 B26 B39 C015 B29

Generation 7:
424 couples with 91 different names. Most common:
35 31 28 25 21 14 12 9 8 8 8 8 7 7 7 6
A4 A3 A5 A1 A2 B21 B20 B25 B22 B33 C015 C022 C040 C069 C083 B05

Generation 8:
336 couples with 70 different names. Most common:
34 30 17 14 13 11 10 10 9 9 8 7 7 7 6 6
A4 A3 A5 A1 A2 B21 C015 C022 B20 B22 B39 B16 B25 D173 B29 C042

------------------------------

Here is a run with 100% marrying:

Generation 0:
2000 couples with 345 different names. Most common:
100 100 100 100 100 20 20 20 20 20 20 20 20 20 20 20
A1 A2 A3 A4 A5 B01 B02 B03 B04 B05 B06 B07 B08 B09 B10 B11
Producing 1993 boys, and 2027 girls

Generation 1:
1993 couples with 254 different names. Most common:
115 113 106 102 94 32 30 27 26 25 25 24 24 23 23 23
A5 A3 A2 A4 A1 B29 B23 B15 B17 B04 B33 B31 B38 B01 B02 B27
Producing 2017 boys, and 2005 girls

Generation 2:
2005 couples with 222 different names. Most common:
119 116 114 106 100 40 33 30 27 26 26 25 24 24 24 23
A5 A4 A2 A3 A1 B04 B29 B23 B24 B20 B33 B13 B02 B15 B21 B07
Producing 2006 boys, and 2051 girls

Generation 3:
2006 couples with 195 different names. Most common:
142 122 114 91 91 42 41 34 29 27 26 26 26 24 24 24
A5 A2 A4 A1 A3 B24 B23 B04 B07 B26 B15 B29 B36 B11 B12 B13
Producing 1915 boys, and 1939 girls

Generation 4:
1915 couples with 179 different names. Most common:
136 106 106 100 89 46 46 36 35 33 29 27 27 26 24 24
A5 A2 A4 A1 A3 B23 B24 B07 B36 B37 B15 B11 B20 B26 B29 B33
Producing 1954 boys, and 1899 girls

Generation 5:
1899 couples with 162 different names. Most common:
130 121 118 104 88 48 39 38 37 29 28 26 26 26 24 24
A5 A2 A4 A3 A1 B24 B36 B07 B23 B37 B26 B02 B15 B30 B03 B19
Producing 1872 boys, and 1948 girls

Generation 6:
1872 couples with 154 different names. Most common:
135 132 131 101 86 48 45 37 33 29 28 27 26 26 24 24
A4 A5 A2 A3 A1 B24 B36 B23 B07 B26 C022 B15 B21 B25 B03 B12
Producing 1930 boys, and 1937 girls

Generation 7:
1930 couples with 145 different names. Most common:
134 128 125 103 81 61 43 40 34 33 30 29 28 27 27 27
A4 A2 A5 A3 A1 B24 B36 B07 B23 B26 B25 C020 B11 B15 C008 C022
Producing 1934 boys, and 1958 girls

Generation 8:
1934 couples with 134 different names. Most common:
136 125 120 117 84 58 49 46 40 36 35 33 30 29 27 27
A4 A2 A3 A5 A1 B24 B07 B36 C020 B15 C008 B25 B29 B12 B11 B26
Producing 1865 boys, and 1883 girls

-----------------------------
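
For anyone who wants to repeat runs like these, here is one way such a simulation might be written. It is a sketch under stated assumptions, not the program that produced the output above: it assumes the number of couples in each generation is the smaller of the boy and girl counts times the marrying fraction, that only sons carry the name, and it pads every name to three digits (A001 rather than A1). All function names are illustrative.

```python
import random
from collections import Counter

def initial_names():
    """Husband's surname for each of the 2000 starting couples."""
    names = []
    for prefix, n_names, couples_each in [("A", 5, 100), ("B", 40, 20),
                                          ("C", 100, 5), ("D", 200, 1)]:
        for i in range(1, n_names + 1):
            names += [f"{prefix}{i:03d}"] * couples_each
    return names

def next_generation(couples, marry_fraction, rng):
    """Advance one generation; `couples` holds one surname per couple."""
    boys, girls = [], 0
    for name in couples:
        for _ in range(rng.randint(0, 4)):   # 0..4 children, equally likely
            if rng.random() < 0.5:
                boys.append(name)            # sons carry the surname
            else:
                girls += 1
    rng.shuffle(boys)                        # random choice of who marries
    n_couples = int(marry_fraction * min(len(boys), girls))
    return boys[:n_couples]

def simulate(generations=8, marry_fraction=1.0, seed=0):
    """Return a list of Counters (surname -> couples), one per generation."""
    rng = random.Random(seed)
    couples = initial_names()
    history = [Counter(couples)]
    for _ in range(generations):
        couples = next_generation(couples, marry_fraction, rng)
        history.append(Counter(couples))
    return history

for g, counts in enumerate(simulate()):
    print(g, sum(counts.values()), len(counts), counts.most_common(5))
```

Calling simulate(marry_fraction=0.8) gives the shrinking-population variant; changing the seed gives an independent run.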

Dennis Yelle
--
I am a computer programmer and I am looking for a job.
There is a link to my resume here:
http://table.jps.net/~vert/

Tom Collins

Apr 17, 2001, 10:08:22 PM
West Virginia: One Million People, . . . and 15 last names


"jon wild" <wi...@fas.harvard.edu> wrote in message
news:9bdnll$k7a$1...@news.fas.harvard.edu...
: I was walking in a cemetery with some very old graves the other day. I

David Burn

Apr 18, 2001, 7:06:12 AM
"David Eppstein" <epps...@ics.uci.edu> wrote in message
news:eppstein-46B487...@news.service.uci.edu...

> you can
> simplify your model by completely omitting all women as irrelevant.

And, indeed, your entire existence.

Robert Israel

Apr 19, 2001, 9:01:48 PM
In article <9bdnll$k7a$1...@news.fas.harvard.edu>,
jon wild <wi...@fas.harvard.edu> wrote:
>I was walking in a cemetery with some very old graves the other day. I
>noticed that many of the surnames seemed colourful, and not ones I'd ever
>encountered before. (This was in Boston, and many were from the 18th
>century.) I looked some of them up in a phone book and on the web
>afterwards, and found no surviving bearers of many of the names.
>
>I wonder, given:
>-a certain number and distribution of surnames,
>-a set of simple assumptions about number of children born to each family,
>-random pairing of offspring
>-and a rule that children must bear the surname of the father
> (or mother, makes little or no difference to the problem),
>is it easy to figure out how many names are likely to vanish over a
>certain period of time?

>If there are four generations per century, what will the distribution of
>names be after 200 years?

Suppose p(j) is the probability of a given male having exactly j male
children. The generating function of this is h(z) = sum_j p(j) z^j.
Let h_n be h iterated n times, i.e. h_1(z) = h(z), h_2(z) = h(h(z)), etc.
Suppose a given last name starts out with x_0 males in generation 0.
Then the probability P_n(x) of there being exactly x males with a
given last name in generation #n has the generating function
g_n(z) = h_n(z)^(x_0). The probability that the name will have
died out by generation #n is P_n(0) = g_n(0) = h_n(0)^(x_0).
h'(1) = sum_j j p(j) is the expected number of male children for each
male: if this is <= 1 with p(0) <> 0, then g_n(0) -> 1 as n -> infinity,
and any given surname dies out eventually with probability 1.

For example, suppose each male is equally likely to have 0,1,2,3 or 4
children, of which each is equally likely to be male or female. Then
h(z) = 31/80 + 13/40 z + 1/5 z^2 + 3/40 z^3 + 1/80 z^4. In generation
8 we have h_8(0) = .8163472175. That is the probability that a surname
owned by one male in generation 0 will have died out by generation 8.
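
The iteration is easy to carry out numerically. The following is a sketch under the same assumptions (family size uniform on 0..4, each child male with probability 1/2, lineages independent); the function names are illustrative, and the coefficients of h are computed from the model rather than typed in.

```python
from math import comb

def offspring_pmf(max_children=4, p_male=0.5):
    """p(j): probability of exactly j sons, family size uniform on 0..max_children."""
    sizes = range(max_children + 1)
    return [
        sum(comb(n, j) * p_male**j * (1 - p_male)**(n - j)
            for n in sizes if n >= j) / (max_children + 1)
        for j in sizes
    ]

def h(z, pmf):
    """Generating function h(z) = sum_j p(j) z^j."""
    return sum(p * z**j for j, p in enumerate(pmf))

def extinction_by(n, pmf):
    """h_n(0): probability that a name held by one male is gone by generation n."""
    q = 0.0
    for _ in range(n):
        q = h(q, pmf)
    return q

pmf = offspring_pmf()
q8 = extinction_by(8, pmf)
print(round(q8, 6))  # 0.816347

# Expected number of names still alive in generation 8, using h_8(0)^(x_0)
start = {100: 5, 20: 40, 5: 100, 1: 200}   # x_0 -> number of surnames
expected_surviving = sum(k * (1 - q8**x0) for x0, k in start.items())
print(round(expected_surviving, 1))
```

With the starting distribution from the original post this comes out at roughly 145 surviving names out of 345, in the same general range as the 134 seen in the simulated run with everyone marrying.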

Robert Israel isr...@math.ubc.ca
Department of Mathematics http://www.math.ubc.ca/~israel
University of British Columbia
Vancouver, BC, Canada V6T 1Z2

library....@saqnet.co.uk

Apr 24, 2001, 7:02:56 AM

Your problem seems original and quite interesting.

I cannot venture to give an answer. However I note that your outlined,
and necessarily simplistic, model could perhaps accommodate one more
factor.

It is childbirth outside marriage. In the past, while this eventuality
carried a stigma for the unmarried mother or for her family, it was
hardly avoidable because of the relative rarity of contraceptives, or
of their use, and also of legal abortions.

Therefore, a possibly significant number of children were born outside
wedlock and received their mother's maiden name in lieu of that of a
missing or unknown father.

I would not know myself, though, how to fit this factor into a more
refined equation that might yield the secret of the disappearing
family names.

Regards,
Thomas

