Common ancestors of all humans (using genetics)
- Main Sources:
- In general:
- Mitochondrial Eve:
- Y-chromosome Adam:
- Paabo, S. (1995), The Y chromosome and the origin of all of us (men). Science 268, 1141-42.
The above is a commentary on (in the same issue):
- Dorit et al. (1995), Absence of Polymorphism at the ZFY Locus on the Human Y Chromosome. Science 268, 1183-85.
There are further commentaries on this in:
- Various (1996), Estimating the age of the common ancestor of men from the ZFY intron. Science 272, 1356-62.
- Zerjal et al. (2003), The Genetic Legacy of the Mongols. American Journal of Human Genetics 72, 717-21.
- Sources yet to be consulted:
- Mitochondrial Eve:
- Cann, R. L., Stoneking, M., and Wilson, A. C. (1987), Mitochondrial DNA and human evolution. Nature 325, 31-36.
- Vigilant, L., Stoneking, M., Harpending, H., Hawkes, K., and Wilson, A. C. (1991), African populations and the evolution of human mitochondrial DNA. Science 253, 1503-07.
- Y-chromosome Adam:
- Extinction of surnames/titles as a branching process (the Galton-Watson process):
- Galton, Francis and H.W. Watson (1874), On the probability of the extinction of families. Journal of the Anthropological Institute of Great Britain and Ireland 4, 138-44.
- MRCA:
Summary
DNA studies can tell us about some interesting common ancestors (CAs),
such as Mitochondrial Eve and Y chromosome Adam,
but these CAs are much older than the most recent common ancestor (MRCA).
In fact, because genetic studies trace only the common ancestry of DNA that actually gets inherited,
all CAs found in them will be much older than the MRCA.
CAs using DNA
Take all people alive today.
Take all their mothers. This is a smaller set.
Take all the mothers of that set. This is a smaller set again.
And so on, until you get to 1 person
([Slatkin, 1999] says you must get to 1 person since mathematically this is
"a pure death process that has an absorbing state at 1").
Our most recent common ancestor in the female-female line is called
"Mitochondrial Eve",
since mitochondrial DNA passes (almost) entirely through the female line
and so may be used to estimate a date for her.
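The shrinking-set argument can be tried out in a small simulation. This is only a sketch under assumptions that are not in the text above: a constant number of women per generation (`population_size`), with each person's mother drawn uniformly at random from them. The function name is made up for illustration.

```python
import random

def generations_to_single_mother(population_size, seed=0):
    """Count generations until the set of female-line ancestors shrinks to one.

    Assumption (hypothetical model): every generation has population_size
    women, and each tracked lineage picks its mother uniformly at random.
    """
    rng = random.Random(seed)
    lineages = population_size  # distinct female-line ancestors still tracked
    generations = 0
    while lineages > 1:
        # Lineages that pick the same mother merge into one.
        mothers = {rng.randrange(population_size) for _ in range(lineages)}
        lineages = len(mothers)
        generations += 1
    return generations

if __name__ == "__main__":
    for size in (100, 1000):
        print(size, generations_to_single_mother(size))
```

Whatever the population size, the loop always ends at a single woman. Only the number of generations it takes varies from run to run.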
Contrary to a lot of confused discussion,
e.g. [Ayala, 1995],
Mitochondrial Eve's existence is not in doubt.
We can work it out from our armchair.
What is in dispute is the date,
which has been estimated at 100,000 to 200,000 years ago.
Also contrary to much confused discussion by paleontologists,
no date for Mitochondrial Eve implies any sort of
population bottleneck at that time.
Mitochondrial Eve would have co-existed
with huge numbers of male and
female relations from whom we also descend.
Indeed, [Ayala, 1995] points out that
our inheritance from Mitochondrial Eve
would be only 1 part in 400,000 of our DNA.
The rest we inherit from her contemporaries.
But he still spends half the paper attacking the idea
of a small ancestral population - an idea that no one believes.
Similarly, by studying male-only DNA,
we can try to get an estimate for
"Y chromosome Adam".
Here there is little to no variation,
and much controversy about why.
Estimates range from 15,000 to 270,000 years ago,
depending on the model used.
- Genghis Khan
(c. 1200 AD)
- Descent from Genghis Khan
- [Zerjal et al., 2003]
point out an unusually common Y-chromosome lineage in Mongolia,
found among believed male-line descendants of Genghis Khan.
This is likely Genghis Khan's Y-chromosome,
and he is possibly a future Y-chromosome Adam.
- Of course, this also makes his close male-line ancestors
(such as his great-great-grandfather)
likely to be future Y-chromosome Adams,
because of what their descendant Genghis Khan did.
- Genghis Khan is also likely a CA (in other lines) of most of Central Asia.
See summary:
"Tyler-Smith stressed that the 16 million male descendants are just those
who belong to this one patriarchal lineage, not the much greater number who are descended
in any fashion from Genghis Khan. "Virtually everybody today who lives near the Asian steppe
must have Genghis Khan somewhere in his or her family tree," speculated Cochran."
- What we expect to find in Y-chromosome / surname descent:
"Early in the last millennium, the population of the world was, speaking very roughly, 1/20 as large as it is today.
Therefore, the average man alive then has 20 descendants alive today in his direct male line.
In contrast, with about 16 million direct descendants, this one mega-ancestor was something like
800,000 times more successful than the average."
- The article "New tests: Florida professor not direct descendant of Genghis Khan after all",
about the DNA test of accounting professor Tom Robinson,
shows the confusion in thinking about this issue.
"A marker that they tested showed that I was on a different branch of the tree
than Genghis Khan was on.
He and I have a common ancestor, but I'm not descended from him."
If it is true that he descends from anyone at all in Central Asia around 1200 AD,
then computer simulations
indicate it
is highly likely that he is directly descended from Genghis Khan
(just maybe not in the male-male line).
- Genealogical DNA test
- DNA testing in genealogy
- find out mainly about 2 of your billions of lines
- your female-female line
and your male-male line (the most useful
since it correlates with surname).
- Oxford Ancestors
will tell you into which local European cluster your
female-female ancestry belongs.
See article.
- publications
- The book
The Seven Daughters of Eve
by Bryan Sykes
- The book
The Blood of the Isles
by Bryan Sykes
- Report
by the same Oxford group
showing a close DNA relationship between some modern English people
and the
hunter-gatherers who lived in England in 7000 BC
(not just pre-Norman,
pre-Anglo-Saxon, pre-Roman, pre-Celt, but pre-farming),
suggesting that the successive invasions have been invasions
of memes more than genes.
-
The Journey of Man: A Genetic Odyssey
by Spencer Wells
- Article
by Steve Olson
pointing out the incredibly narrow focus
of these DNA tests,
which ignore most of your ancestry.
All surnames (except one) will die out
Clearly if we had used
surnames strictly in the male line
since Y chromosome Adam,
then we would all now bear his surname,
despite him
having millions of male contemporaries of different surnames.
(In our thought experiment, that is.
Of course surnames did not exist back then.)
As a result of thinking about Y chromosome Adam, we can see that if
we use surnames strictly in the male line
forever into the future,
then not only will all hereditary titles die out,
but all surnames except one will die out too.
The world does not of course strictly follow that surname rule,
but the West approximately does,
and surnames do go extinct.
Without a mechanism for generating entirely new surnames from scratch
(not belonging to either parent)
the diversity of surnames can only decline.
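A toy simulation shows the decline. It is a sketch, not the Galton-Watson branching process itself: it assumes a fixed number of men per generation, each inheriting the surname of a father chosen at random from the previous generation, and no new surnames. The function and parameter names are hypothetical.

```python
import random

def distinct_surnames_over_time(num_men, generations, seed=0):
    """Return the number of distinct surnames in each generation.

    Assumption (hypothetical model): constant population of num_men, and
    every son takes the surname of a uniformly chosen father.
    """
    rng = random.Random(seed)
    surnames = list(range(num_men))  # every man starts with a unique surname
    counts = [len(set(surnames))]
    for _ in range(generations):
        # Surnames that no son happens to inherit are lost for good.
        surnames = [rng.choice(surnames) for _ in range(num_men)]
        counts.append(len(set(surnames)))
    return counts

if __name__ == "__main__":
    counts = distinct_surnames_over_time(num_men=500, generations=2000)
    print(counts[0], counts[10], counts[100], counts[-1])
```

The count can stay level or fall from one generation to the next, but it can never rise, because a lost surname has no way back.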
Neil Fraser
nicely describes it as
"a random walk - next to a cliff. The only force acting on the system is that once a name randomly stumbles to zero it is gone and can never recover."
The MRCA is much more recent than
Mitochondrial Eve or Y chromosome Adam
By following only the female-female or male-male paths,
we ignore the
billions of other ancestral paths
we could follow,
thus pushing the common ancestor
much further back into the past.
The MRCA in
any line
will be
much more recent than
Mitochondrial Eve or Y chromosome Adam.
DNA studies have a problem in telling us about the MRCA.
As
[Chang, 1999]
notes,
the MRCA will be much more recent than any MRCA
that could ever be found in DNA studies,
even if one were to study the ancestry of
every single gene.
The reason is that we are considering people who are simply ancestors,
through any route,
whether or not
any of their genes actually
survived the journey.
Random 1/2 of parent's DNA (t=1)
Consider that you only get 1/2 of your father's DNA,
1/2 of your mother's DNA (hence total size of DNA is constant).
Which bits you get is somewhat random.
For any given bit, you may not have inherited it
at all.
For any given marker, you may not possess that marker,
even though you
are your father's child in reality
(i.e. in genealogy).
Example of how evidence can be lacking in the DNA
Here's an example.
First remember that
everyone has 2 copies of the genome,
so that when you inherit a random 1/2 from your father
you get a
full genome,
rather than, say, having
gaps that the 1/2 from your mother
has to cover for.
Say for one gene, your father's two copies are AB,
your mother's are CD.
You could end up with AC, your sibling could end up with BD.
For this gene only,
there is no genetic evidence of your recent common ancestry.
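The AB / CD example can be enumerated directly. The sketch below assumes ordinary Mendelian inheritance at a single gene (each parental copy equally likely, siblings independent), and the variable names are mine.

```python
from itertools import product

father, mother = "AB", "CD"

# The four equally likely genotypes a child can have at this gene.
genotypes = [f + m for f, m in product(father, mother)]

# All ordered (you, sibling) pairs, and those sharing no copy at all.
pairs = list(product(genotypes, repeat=2))
no_shared = [(you, sib) for you, sib in pairs if not set(you) & set(sib)]

prob_no_shared = len(no_shared) / len(pairs)
print(genotypes)       # ['AC', 'AD', 'BC', 'BD']
print(prob_no_shared)  # 0.25
```

So under these assumptions, two full siblings share nothing at a given gene about one time in four.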
Random average 1/4 of grandparent's DNA (t=2)
While you
do get 1/2 of your father's DNA,
this does
not mean you get 1/4 of your grandfather's DNA.
Your father's DNA is a mix of your grandfather and grandmother's DNA.
But which bits of your father's DNA you get is somewhat random.
On average you will get a somewhat random 1/4
of your grandfather's DNA,
but you could get less or more.
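To get a feel for how much the share can vary, here is a rough simulation. It assumes, purely for illustration, that your father passes on `num_blocks` independent blocks (23 is a placeholder, and crossover is ignored), each equally likely to have come from your grandfather or your grandmother. None of these names or numbers come from a real genetic model.

```python
import random

def grandfather_share(num_blocks=23, seed=None):
    """Fraction of your genome that came from your paternal grandfather.

    Assumption (hypothetical model): the half you get from your father is
    num_blocks independent blocks, each a 50/50 pick between his parents.
    """
    rng = random.Random(seed)
    from_grandfather = sum(rng.random() < 0.5 for _ in range(num_blocks))
    # Your father supplies half your genome, so scale by 1/2.
    return 0.5 * from_grandfather / num_blocks

if __name__ == "__main__":
    shares = [grandfather_share(seed=s) for s in range(10_000)]
    print(min(shares), sum(shares) / len(shares), max(shares))
```

The average comes out close to 1/4, but individual results are spread on either side of it.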
Probability of inheriting no DNA from a grandparent
While I am sure that you can inherit
probabilistically
more or less
than 1/4 from a grandparent,
I am unsure of the details.
Here's a sketch, but I need to do more reading on this.
If you can point me to the answers,
let me know.
If there are n events at which to choose between
your father's copy from your grandfather and his copy from your grandmother,
the probability of you inheriting, through your father,
none of your grandfather's DNA (*)
is:
$(1/2)^n$
(*) If you are your father's daughter.
If you are his son, you must inherit the Y chromosome.
We will ignore the special cases of the
male-male and female-female lines.
Admittedly these are hard to ignore with grandparents,
since they are 2 of only 4 lines,
but these 2 special lines can
be ignored as we go back 10 generations or more.
[Chang, 1999, author's reply]
discusses this extreme case.
I'm not sure if n=23 here
(the no. of chromosomes).
Then the probability of all grandmother,
none from grandfather, would be
$(1/2)^{23}$ = 1 in $2^{23}$ = 1 in 8.4 million.
If we allow for
crossover,
the probability of all grandmother,
none from grandfather, is:
$(1/4)^n$
If n=23,
$(1/4)^{23}$ = 1 in $2^{46}$ = 1 in 70 trillion.
Q. Is n=23?
Random average 1/8 of great-grandparent's DNA (t=3)
Similarly, you get on average a somewhat random inheritance
of 1/8 of each great-grandparent's DNA,
but could get less (or more).
Probability of inheriting no DNA from a great-grandparent
Again I am unsure of the following.
Let's say your father's 4 grandparents had, as
chromosome 1,
$c_1$ and $d_1$ on one side,
$e_1$ and $f_1$ on the other side.
Excluding crossover,
there is a choice between
$c_1$ and $d_1$ on one side,
and a choice between
$e_1$ and $f_1$ on the other.
Your father gets the two winners,
and he passes on to you one of these.
So for any particular one of the four, there is probability 3/4 of not inheriting their copy.
Probability of you inheriting
no DNA
from a particular great-grandparent is:
$(3/4)^n$
If n=23, the probability is
$(3/4)^{23}$ = 1 in 747.
How does crossover affect this?
If one great-grandparent is c,
your father has 3/4 chance of getting either c,
or c crossed with d.
He then has 3/4 chance of passing this on,
either as is or crossed over.
So you have $(3/4)^2 = 0.56$ chance of inheriting some c,
or $1 - (3/4)^2 = 0.44$ chance of inheriting none.
So the chance of inheriting no DNA
from a great-grandparent is:
$(1 - (3/4)^2)^n$
If n=23, the probability is
$(0.44)^{23}$ = 1 in 181 million.
Q. Is n=23?
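The no-crossover figure can be checked by brute force. This sketch simply acts out the model described above (on each of n chromosomes, the great-grandparent's copy must win two fair 50/50 choices to reach you) and compares the result with (3/4)^n. The function name and the small n used in the check are my own choices.

```python
import random

def simulate_no_inheritance(n, trials=100_000, seed=0):
    """Estimate the chance of inheriting nothing from one great-grandparent.

    Assumption: the simplified no-crossover model from the text, with n
    independent chromosomes and two fair choices per chromosome.
    """
    rng = random.Random(seed)
    misses = 0
    for _ in range(trials):
        inherited_any = False
        for _chromosome in range(n):
            # The copy must reach your father, and then reach you.
            if rng.random() < 0.5 and rng.random() < 0.5:
                inherited_any = True
                break
        if not inherited_any:
            misses += 1
    return misses / trials

if __name__ == "__main__":
    n = 5
    print(simulate_no_inheritance(n), 0.75 ** n)
```

A small n is used so that the event is common enough to estimate well. With n=23 the same code works, but the event is rare (about 1 in 747) and needs many more trials.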
Random average 1 part in $2^t$
of ancestor t generations ago
In general, for an ancestor of yours
t generations ago,
you inherit
an average 1 part in $2^t$ of their DNA,
but can inherit more (or less).
Probability of inheriting no DNA from an
ancestor t generations ago
Again I am unsure of the following.
For an ancestor t generations ago,
what is the probability of inheriting
none of their DNA?
If there were no crossover this would be:
$(1 - (1/2)^{t-1})^n$
where to inherit, the DNA must get through (t-1) competitions,
where, in each, the probability of getting through is 1/2.
So the prob. of getting through all the competitions is $(1/2)^{t-1}$.
If t=2 or 3 we can see this reduces to the probabilities listed above.
This is always between 0 and 1.
With finite n,
this increases towards 1 as t (the number of generations back) increases.
If n=23, the probability depends on t.
This is equal to 1/2
for:
$(1 - (1/2)^{t-1})^{23} = 1/2$
$1 - (1/2)^{t-1} = 0.97$
$(1/2)^{t-1} = 0.03$
$2^{t-1} = 33.7$
$t - 1 \approx 5$
In other words, more than 6 generations back,
the prob. of inheriting no DNA at all from one of your
ancestors is more than 1/2.
But what about crossover?
With crossover, the probability of inheriting none of the DNA
of an ancestor at generation t is:
$(1 - (3/4)^{t-1})^n$
where to inherit, the DNA must get through (t-1) competitions,
where, in each, the probability of some getting through is 3/4.
So the prob. of getting through all the competitions is $(3/4)^{t-1}$.
If t=2 or 3 we can see this reduces to the probabilities listed above.
This is always between 0 and 1.
With finite n,
this increases towards 1 as t increases.
If n=23, the probability depends on t.
This is equal to 1/2
for:
$(1 - (3/4)^{t-1})^{23} = 1/2$
$1 - (3/4)^{t-1} = 0.97$
$(3/4)^{t-1} = 0.03$
$t - 1 \approx 12$
In other words, more than 13 generations back,
the prob. of inheriting no DNA at all from one of your
ancestors is more than 1/2.
Note that at 13 generations back (c. 1500s - 1600s)
you have
8192 ancestors.
Q. Is n=23?
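Both formulas above can be put into one small function and searched for the first generation at which the probability passes 1/2. This is only the simplified model of this page written out, with n=23 taken as a working assumption, and the function names are made up for illustration.

```python
def prob_inherit_nothing(t, n=23, crossover=False):
    """Probability of inheriting no DNA from an ancestor t generations back.

    Assumption: the simplified model in the text, where a copy survives each
    of the (t-1) competitions with chance 1/2 (no crossover) or 3/4 (crossover).
    """
    pass_prob = 0.75 if crossover else 0.5
    return (1 - pass_prob ** (t - 1)) ** n

def first_generation_above_half(n=23, crossover=False):
    """Smallest t at which the probability of inheriting nothing exceeds 1/2."""
    t = 2
    while prob_inherit_nothing(t, n, crossover) <= 0.5:
        t += 1
    return t

if __name__ == "__main__":
    print(first_generation_above_half(crossover=False))  # 7
    print(first_generation_above_half(crossover=True))   # 14
```

The results agree with "more than 6" and "more than 13" generations back respectively.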
As n goes to infinity
Q. Is n=23?
For small n, it is easier (more probable)
to not inherit from an ancestor.
With a single event (n=1), the ancestor's DNA could easily lose that event.
With a large number of events, it is unlikely to lose
them all.
For large n, it is harder to not inherit
from an ancestor.
As n goes to infinity, you must have inherited some DNA
from the ancestor.
We can see that above, for any finite t,
as n goes to infinity,
the probability of not inheriting goes to zero.
Even with high n,
probability of inheriting no DNA from ancestor
is still high in historical times
But even for quite high n,
the probability of inheriting
no DNA from an ancestor
is still high in historical times.
For example, for n = 10,000,
the probability of inheriting nothing from an ancestor
is equal to 1/2 for:
$(1 - (3/4)^{t-1})^{10000} = 1/2$
$1 - (3/4)^{t-1} = 0.99993$
$(3/4)^{t-1} = 0.00007$
$t - 1 \approx 33$
Hence, even for n=10000, once we go back more than 34 generations
(i.e. to around 1000 AD)
the probability of inheriting
no DNA at all from an ancestor
is greater than 1/2.
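The break-even generation can also be solved for directly with logarithms rather than by trial. A minimal sketch, assuming the crossover formula above. `generations_to_half` is a hypothetical helper name.

```python
import math

def generations_to_half(n, pass_prob=0.75):
    """Solve (1 - pass_prob**(t-1))**n = 1/2 for t (not rounded).

    Assumption: the simplified model in the text, with n independent events
    and a per-generation survival chance of pass_prob.
    """
    # Required value of pass_prob**(t-1) for the whole expression to equal 1/2.
    x = 1 - 0.5 ** (1 / n)
    return 1 + math.log(x) / math.log(pass_prob)

if __name__ == "__main__":
    for n in (23, 10_000, 1_000_000):
        print(n, round(generations_to_half(n), 1))
```

The answer grows only with the logarithm of n, which is why even a very large n moves the break-even point back by just a few dozen generations.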
As n goes to infinity, inheritance goes to infinity
As n goes to infinity,
not only does the probability of inheriting
something
from any ancestor go to 1,
but the
amount inherited from any ancestor
goes to infinity.
But for any finite n,
the amount inherited is still only
1 part in $2^t$
of the ancestor's DNA.
Whatever the probabilities, you still only inherit 1 part in $2^t$
I am unsure of the probabilities above.
But even if it is harder than I think
to inherit
no DNA
from an ancestor,
it is still true that,
for all n,
you inherit
on average
only 1 part in $2^t$.
Example - An MRCA with no evidence in the DNA
Imagine 16 people ($2^4$) with an MRCA
10 generations ago.
Each has inherited on average
1 part in $2^{10}$ (1/1024) of the MRCA's DNA
(and maybe less).
Quite possibly these parts don't overlap.
Take 16 samples of 1/1024 of the MRCA's genes.
These samples may not overlap at
any point.
These 16 people
are in reality descended from this MRCA,
but there is nothing in their genes to show it.
For an MRCA 30 generations ago,
you need more than $2^{30}$ people (about 1 billion people)
to be sure that their samples of
1 part in $2^{30}$ of the ancestor's DNA
must overlap.
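How likely is "quite possibly"? Here is a rough calculation, assuming (for illustration only) that the MRCA's DNA is cut into equal pieces and that each descendant holds exactly one piece chosen uniformly at random, independently of the others. This is the birthday problem, and the function name is hypothetical.

```python
def prob_no_overlap(num_people, num_pieces):
    """Chance that no two people hold the same piece of the ancestor's DNA.

    Assumption (hypothetical model): each person holds one of num_pieces
    equal pieces, picked uniformly and independently.
    """
    prob = 1.0
    for k in range(num_people):
        # The next person must avoid the k pieces already taken.
        prob *= (num_pieces - k) / num_pieces
    return prob

if __name__ == "__main__":
    print(prob_no_overlap(16, 1024))    # roughly 0.89
    print(prob_no_overlap(1025, 1024))  # 0.0, overlap is forced
```

Under these assumptions the 16 samples fail to overlap nearly nine times in ten, and overlap only becomes certain once there are more people than pieces.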
Your ancestors are related
In real life,
the issue of detecting that you have actually inherited
none of an ancestor's DNA is
made more complex by the fact that
your ancestors themselves are related to each other,
and so may share DNA,
and so it may
look like you inherited DNA from one,
when in fact you didn't.
Conclusion - Within historical times, you have ancestors
from whom you have no DNA
Even though my model is simplified,
I think the conclusion is probably true
- that
within historical times (3000 BC to 2000 AD)
you have ancestors from whom you have inherited
no DNA.
As I say, I need to do more reading on this.
I'm sure this has been discussed before.
There is some discussion of this in
[Wiuf and Hein, 1999].
The "MRCA" for every gene
So what can DNA studies tell us?
Above we looked at the "MRCA" for the
Y chromosome.
If you studied all genes in the genome,
and found an "MRCA" for each gene,
would
all of them be much further back than the (true) MRCA?
Let us define the terms:
- CA1
- Our original definition of a CA - just an ancestor,
including those who left no genes at all to the present day.
- CA2
- CAs
who have left any genes in different places on the genome
in a minority of their descendants.
A smaller number than CA1s.
- CA3
- CAs
who have left the same gene in the same place
in a minority of their descendants.
A smaller number than CA2s.
- CA4
- CAs
who have left just one
gene in the same place in all of their descendants.
A smaller number than CA3s.
This is what DNA studies will calculate as the "MRCA" for this gene.
- MMRCA
- The most recent MRCA of any gene.
So the "real" CAs (the CA1s)
outnumber the CAs of a gene (the CA4s),
but do they vastly outnumber them?
As genome size tends to infinity
(i.e. n goes to infinity)
it becomes impossible
for an actual ancestor (CA1) not to be at least
a partial genetic ancestor (CA2) as well.
So the difference between CA1 and CA2 breaks down.
I used to say on this page:
"Not just that, but as genome size tends to infinity,
the number of genes a
CA1 must pass to you goes to infinity,
and (I think) finding two in the same place in two descendants
becomes more likely.
So the difference between CA2 and CA3 breaks down.
And with a finite number of individuals,
finding two in the same place for all of them
becomes more likely.
So the difference between CA3 and CA4 breaks down."
but now we can see this is not so.
(At least I put in "(I think)" in the correct place!)
The difference between CA2 and CA3 does not break down.
For any finite n, you are getting a larger inheritance from the ancestor
alright,
but it is still only 1 part in $2^t$,
so for any 2 descendants it is quite possible
that their samples
do not overlap (for any reasonable size t).
The probability of overlap depends on t, not on n.
Conclusion - All CAs in DNA studies
will be much older than the MRCA
So my conclusion is
that even with large genomes, the CA1s (true CAs)
are not equal to the CA4s (gene study CAs).
Even if you did the MRCA for every gene,
even the
best
result (the most recent)
could be
much older than the true MRCA.
[Chang, 1999]
has a good discussion of why the true MRCA,
following
any line,
will be much more recent than any gene-tracking MRCA.
See also
[Wiuf and Hein, 1999].
Archaeology is also of limited use in telling us about the MRCA.
For instance,
even the MRCAs found in DNA studies
will have lived
much more recently than the paleontologists might imagine
looking at the fossil evidence - for the simple reason that they are
merely "statistical artefacts"
of no real importance to the overall story of human evolution.
It would be totally wrong, for example, to imagine that the CA
lived in an important or influential place or
culture.
See
[Dawkins and Jones, 1992].
For instance,
[O'Connell, 1995]
is confused about Mitochondrial Eve's relation to the fossil record
- no date for Mitochondrial Eve, no matter how recent,
could possibly contradict the fossil record studied by the paleontologists.
This is based on the error of assuming that Mitochondrial Eve is important
(see above).
One could even say that
genealogy is the pursuit of statistical artefacts.