On 17/09/15 13:53, Ian Goddard wrote:
> I've thought about this from a different direction: estimating
> extent of pedigree collapse.
It's not quite the same thing, but a standard measure of pedigree
collapse is the coefficient of inbreeding.
http://www.genetic-genealogy.co.uk/Toc115570144.html
Genetically speaking, it can be interpreted as the probability of two
alleles being identical through descent, which (loosely) is the property
that gives rise to the harmful effects of inbreeding. The coefficient
is a number between 0 (entirely uninbred) and 1 (impossibly inbred), and
is often expressed as a percentage. I don't recall having seen a value
above 40%, even in the most inbred dynasties such as the Habsburgs in
the 17th century, the Ptolemies in the 1st century BC, or the Eighteenth
Dynasty in the 14th century BC.
Where it differs from your idea of pedigree collapse (as I understand
your idea) is that inbreeding is solely about how related a person's
parents are. If an individual's parents are each individually very
inbred, but are not related to each other, the child is not considered
to be inbred at all. In an extreme example, one could imagine a man
with only four great-grandparents, because his paternal grandparents
were siblings, as were his maternal grandparents. This has a huge
amount of pedigree collapse -- 50% in that generation -- but no
inbreeding, because his mother and father are not related to each other.
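To make the distinction concrete, here is a minimal sketch in Python of
the usual recursive kinship calculation, where a person's coefficient of
inbreeding is the kinship between their parents. The pedigree layout
({child: (father, mother)}), the function names and the labels such as
"pgf" or "gg1" are my own assumptions for illustration, and anyone not
listed as a child is treated as an unrelated founder.

```python
from functools import lru_cache


def make_kinship(pedigree):
    """Return a kinship function for {child: (father, mother)}.

    Anyone who is not a key of the dict is assumed to be a founder,
    unrelated to every other founder.
    """

    def parents(x):
        return pedigree.get(x, (None, None))

    @lru_cache(maxsize=None)
    def depth(x):
        # Longest known line of ancestry; founders have depth 0.
        if x is None:
            return -1
        f, m = parents(x)
        return 1 + max(depth(f), depth(m))

    @lru_cache(maxsize=None)
    def kinship(a, b):
        if a is None or b is None:
            return 0.0
        if a == b:
            f, m = parents(a)
            return 0.5 * (1.0 + kinship(f, m))
        # Expand the person with the longer ancestry: that person
        # cannot be an ancestor of the other one.
        if depth(a) < depth(b):
            a, b = b, a
        f, m = parents(a)
        return 0.5 * (kinship(f, b) + kinship(m, b))

    return kinship


def inbreeding(pedigree, person):
    """Coefficient of inbreeding = kinship between the two parents."""
    f, m = pedigree.get(person, (None, None))
    return make_kinship(pedigree)(f, m)


# Hypothetical pedigree: both pairs of grandparents are siblings,
# but the two sides of the family are unrelated.
double_sibling = {
    "man": ("father", "mother"),
    "father": ("pgf", "pgm"),
    "mother": ("mgf", "mgm"),
    "pgf": ("p1", "p2"), "pgm": ("p1", "p2"),
    "mgf": ("q1", "q2"), "mgm": ("q1", "q2"),
}

# Hypothetical pedigree: the parents are first cousins.
first_cousins = {
    "child": ("father", "mother"),
    "father": ("a", "x"),
    "mother": ("y", "b"),
    "a": ("gg1", "gg2"), "b": ("gg1", "gg2"),
}

print(inbreeding(double_sibling, "man"))     # 0.0
print(inbreeding(double_sibling, "father"))  # 0.25
print(inbreeding(first_cousins, "child"))    # 0.0625
```

The man in the first pedigree comes out at 0 even though each of his
parents is 25% inbred, while the child of first cousins comes out at
6.25%.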
> My first approach was this: one would expect, for example, 8
> ggparents. Call these roles. If there's a cousin marriage such that
> one pair of ggparents appears in different lines there will be 6
> individuals filling the 8 roles so there are 2 less individuals than
> expected. 2/8 = 25% collapse.
A nice property of this definition is that it is not a function of
which generation of ancestors you look at: two out of eight, four out of
sixteen, and eight out of thirty-two all work out at 25%.
For comparison, the coefficient of inbreeding in this case is 6.25%.
> In another situation one might have 1 ggparent marrying twice with a
> line to a child of each of these marriages. In that case there is
> only one individual less than the number of roles so the collapse is
> 1/8 = 12.5%
>
> However, like Charlie, I have the situation of the same pair
> appearing in different generations. The next version was to say
> that with 3 generations of parents one would have 2 (parents) + 4
> (grandparents) + 8 (ggparents) = 14 roles so that the cousin
> marriage above is missing 2 out of 14 giving 2/14 = ~14.3% and in
> the remarriage example it's 1/14 = ~7.1%.
An unfortunate property of this definition is that the pedigree collapse
is now a function of the number of generations considered, even when the
earlier generations contain no additional intermarriages. When
considering three generations you have 2/14 = 14%, but if you consider
another generation, you have 6/30 = 20%. As the number of generations
considered increases, the value converges on 25%.
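The drift towards 25% can be checked with a few lines of Python. This
is only a sketch of the arithmetic, assuming the one cousin marriage is
the sole intermarriage and that every ancestor of the shared couple is
known and distinct; the function and parameter names are made up for
the purpose.

```python
from fractions import Fraction


def cumulative_collapse(generations, first_dup_gen=3, dup_count=2):
    """Missing individuals / roles over the first `generations`
    generations of ancestors (parents are generation 1).

    Assumes `dup_count` ancestors are duplicated once each in
    generation `first_dup_gen`, and that each generation further back
    doubles the number of duplicated roles.
    """
    roles = sum(2 ** g for g in range(1, generations + 1))
    missing = sum(dup_count * 2 ** (g - first_dup_gen)
                  for g in range(first_dup_gen, generations + 1))
    return Fraction(missing, roles)


for n in (3, 4, 5, 10, 20):
    value = cumulative_collapse(n)
    print(n, value, f"{float(value):.4%}")
```

Three generations give 1/7 (i.e. 2/14) and four give 1/5 (i.e. 6/30);
by twenty generations the value is within a thousandth of a percent of
25%.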
> This enables us to deal with the situation where a couple appear as
> ggparents in one line and gggparents in another. If we extend the
> count to the gggparents there are now 2+4+8+16=30 roles. Assuming
> it's just one couple who are duplicated there are 28 individuals
> filling these roles and the collapse is 2/30 = ~6.7%.
>
> The general formula, therefore, is to count r, the number of roles
> for which an ancestor has been identified and i, the number of
> individuals identified as filling those roles and the pedigree
> collapse is given by (r -i)/r.
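For anyone wanting to try the (r - i)/r formula on a real tree, here is
a rough Python sketch. It assumes the pedigree is a dict of
{child: (father, mother)}, and for the examples I have numbered the
ancestors ahnentafel-style (person n has father 2n and mother 2n+1);
the helper names are hypothetical.

```python
from fractions import Fraction


def full_tree(generations):
    """A pedigree with no duplication at all, numbered so that
    person n has father 2n and mother 2n + 1 (base person is 1)."""
    return {n: (2 * n, 2 * n + 1) for n in range(1, 2 ** generations)}


def role_collapse(pedigree, person, generations):
    """(r - i) / r, where r counts ancestor roles that have been
    filled and i counts the distinct individuals filling them."""
    roles = 0
    individuals = set()
    current = [person]
    for _ in range(generations):
        following = []
        for p in current:
            for parent in pedigree.get(p, (None, None)):
                if parent is not None:
                    roles += 1
                    individuals.add(parent)
                    following.append(parent)
        current = following
    if roles == 0:
        return Fraction(0)
    return Fraction(roles - len(individuals), roles)


# Cousin marriage: grandparents 4 and 6 share both parents (8 and 9).
cousins = full_tree(3)
cousins[6] = (8, 9)

# Remarriage: grandparents 4 and 6 share only a father (8).
remarriage = full_tree(3)
remarriage[6] = (8, 13)

# A couple (8 and 9) who are ggparents in one line and gggparents
# in another, as parents of ggparent 15.
mixed = full_tree(4)
mixed[15] = (8, 9)

print(role_collapse(cousins, 1, 3))     # 1/7
print(role_collapse(remarriage, 1, 3))  # 1/14
print(role_collapse(mixed, 1, 4))       # 1/15
```

These reproduce the 2/14, 1/14 and 2/30 figures quoted above.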
An alternative, and one that I've used in other contexts, might be to
consider a graphical representation of an ancestor table as a unit
square. Divide the square into quarters. The two left quarters represent
the father and mother, while the two right quarters are themselves
quartered in the same way. This process continues with smaller and
smaller squares for increasingly distant ancestors. Each parent occupies
1/4 = 25% of the square, each grandparent 1/16 = 6.25%, each
great-grandparent 1/64 = 1.56%, etc.
If the base person's parents are first cousins, two great-grandparents
are shared. Imagine blocking out one copy of every duplicated ancestor
on the
unit square. This blocks out two great-grandparents (1/64 each), and
all their ancestors (another 2/64). In all, 6.25% of the diagram is
blocked out, which is the same as the coefficient of inbreeding.
If the duplicates are in different generations, which should you block
out? For a simple (if extreme) example, consider an individual whose
parents were uncle and niece. This means two grandparents are also
great-grandparents. Do you block out the grandparents (25%) or the
great-grandparents (6.25%)? Intuitively, the right numeric answer seems
to be the geometric mean (12.5%), which again is the coefficient of
inbreeding.
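The areas involved are easy to check. In this Python sketch an
ancestor in generation g (parents are generation 1) occupies 4**-g of
the unit square, and all of that ancestor's own ancestors together
occupy the same amount again, so blocking out one copy removes
2 * 4**-g. That assumes every ancestor of the blocked person is known
and distinct; the function name is my own.

```python
import math


def blocked_area(generation, duplicates=2):
    """Fraction of the unit square removed by blocking out one copy
    of each of `duplicates` ancestors in the given generation,
    together with all of their ancestors."""
    return duplicates * 2 * 4.0 ** -generation


# First cousins: two duplicated great-grandparents (generation 3).
print(blocked_area(3))  # 0.0625

# Uncle and niece: the shared couple are grandparents (generation 2)
# in one line and great-grandparents (generation 3) in the other.
as_grandparents = blocked_area(2)        # 0.25
as_great_grandparents = blocked_area(3)  # 0.0625
print(math.sqrt(as_grandparents * as_great_grandparents))  # 0.125
```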
It's worth noting that this method doesn't always give the coefficient
of inbreeding. Imagine a person whose paternal grandparents were
siblings. That results in two duplicate great-grandparents, or 6.25%,
but the coefficient of inbreeding is 0 because the individual's parents
are unrelated.
Richard