Thursday, March 06, 2014

Adventures in Genealogy - Autosomal DNA

Most family trees have a few nuts

Several companies now offer Autosomal DNA tests for genealogical purposes. An Autosomal DNA test looks at the ancestral DNA of both parents of the person being tested. This means you see both Mama's and Daddy's parents, grandparents and those before them, and also the descendants of the great-great-great grandparent, etc.

So you don't see only a straight line of descent, but you also may have matches with people who descend from your 3rd great-grandmother's sister, whose married name you do not know, and whose 4th great-granddaughter's married name you do not know.

Autosomal DNA is the long-time genealogist's dream come true, if you know how to use it. Personally I can't rave about it enough. It's allowed me to identify the parentage of my 3rd great-grandfather George Perkins, confirm a long string of ancestors I'd done the paperwork for, and confirm that I am 1/10th Native American and that this ancestry is shared between three different grandparents. It's allowed me to confirm that my 3rd great-grandfather Levin Clark was Nanticoke Delaware, as we believed all along. I've learned that Winston Churchill was my 10th cousin, as was Franklin Delano Roosevelt. Now, along with my pirate Haymans, I have a 9th or 10th great-grandmother who was convicted as a witch during the Salem witch hysteria. And at some point in Colonial America I have a set of great-grandparents, one of whom was Native and the other African, parents from South Africa and Sub-Saharan Africa. One of my next quests is to locate that couple if possible.

Unfortunately most of the people I've encountered haven't a clue how to use Autosomal DNA. Those who have done genealogy a long time usually grasp the concept more quickly, but those who are just beginning and thought a DNA test would provide an easy trip to the genealogical heaven of royal ancestry (or whatever their goal is) have learned:

1) that autosomal DNA does not come with surnames attached.

2) the family tree is not built by the test.

3) that they should pray they have an avid genealogist in the family who has a well-researched and documented family tree, and that they can be induced to share it.

4) that it takes hard work and a good deal of time to find the match between two persons or families who are more than two generations apart.

5)  that there very well may be a "surprise" or two  (i.e. illegitimate child, adoption, or a racial heritage one did not expect) in one's tree. 

So how do you start if you have tested and don't know where to go next?

1) Register with GedMatch and upload your raw data. You'll be given a kit number when you register. Write it down.

2) Build a spreadsheet. This can be a table in a txt document or any spreadsheet program. Unless you organize your matches and keep track of them you'll soon be drowning in a sea of jumbled information.

Here are the headings I use on my spreadsheet:

chr#  beg  end  cM  snps  GM#  OCM  Name  Notes

chr# is the chromosome number the match is found on. Your matches should be put on in order beginning with chromosome 1.

The "beg" and "end" are where the matching segment begins on the chromosome and where it ends. These should be placed on the sheet in order from the lowest beginning position to the highest, so on chromosome 1 my first match is at 2,492,640 and ends at 247,174,776. The next match starts at 3,669,635 and ends at 7,490,355. The third begins at 4,058,815 and ends at 11,694,927. These are always ordered by the position of the beginning segment. The cM column holds the length of the matching segment in centimorgans.
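For anyone who keeps the sheet in a script rather than a spreadsheet program, the ordering rule above can be sketched in a few lines of Python. This is only an illustration: the `Match` class and its field names are my own invention, not anything GedMatch or the testing companies provide. The three rows are the chromosome 1 matches quoted above.

```python
from dataclasses import dataclass


@dataclass
class Match:
    """One row of the spreadsheet (hypothetical field names)."""
    chrom: int   # chr#: chromosome the segment is on
    beg: int     # where the matching segment begins
    end: int     # where the matching segment ends


def sort_matches(matches):
    """Order rows by chromosome number, then by beginning position."""
    return sorted(matches, key=lambda m: (m.chrom, m.beg))


# The three chromosome 1 matches from the text, entered out of order.
rows = [
    Match(1, 4_058_815, 11_694_927),
    Match(1, 2_492_640, 247_174_776),
    Match(1, 3_669_635, 7_490_355),
]

ordered = sort_matches(rows)
for m in ordered:
    print(m.chrom, m.beg, m.end)
```

Sorting on the pair (chromosome, beginning position) keeps every chromosome's rows together and in the order described, no matter what order the matches were typed in.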

If your relationship is a close one you will probably share segment matches on more than one chromosome so OCM stands for "Other Chromosome Match". I share segment matches on seven or eight chromosomes with several close cousins. The OCM column is how you keep track of that.  
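If you would rather not fill in the OCM column by hand, here is a rough sketch of how it could be worked out. The function name and the "Cousin A" and "Cousin B" rows are made up for illustration; each row is just a chromosome number and the name of the match.

```python
from collections import defaultdict


def other_chromosome_matches(rows):
    """rows: list of (chromosome, match_name) pairs.

    Returns (chromosome, match_name, ocm) triples, where ocm is the sorted
    list of the OTHER chromosomes on which the same person also matches.
    """
    chroms_by_name = defaultdict(set)
    for chrom, name in rows:
        chroms_by_name[name].add(chrom)
    return [
        (chrom, name, sorted(chroms_by_name[name] - {chrom}))
        for chrom, name in rows
    ]


# Hypothetical data: Cousin A matches on three chromosomes, Cousin B on one.
example = [(1, "Cousin A"), (7, "Cousin A"), (17, "Cousin A"), (7, "Cousin B")]
ocm_rows = other_chromosome_matches(example)
for row in ocm_rows:
    print(row)
```

A match with a long OCM list is usually one of the close cousins, which makes those rows a good place to start.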

In the SNPs column record the number of Single Nucleotide Polymorphisms (single units of DNA) you share. Under 500 is insignificant, unless you know the person is related. Since SNPs are shuffled with every generation you lose the larger segments in a roughly calculable order.

You share about 50% of your SNPs with each parent, 25% with each grandparent, 12.5% with each great-grandparent. In theory by the time you are back to your 3rd great-grandparent you expect to see only about 3% of their DNA in an unrecombined state and this may be too small for the tests to pick out. In practice this does not always work out. Some DNA segments seem to stick together for a longer period of time. I have a robust match with a man who shares a set of 8th great-grandparents in Bath Somerset UK in the mid 1600s. Despite intense searching we can't find any other shared ancestry. Our 7th great-grandparents were a pair of brothers.
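The halving rule of thumb is easy to work through yourself. The little sketch below simply halves the expected share once per generation (the function name is my own). It gives the theoretical average only; as the examples here show, real matches can come out well above or below it.

```python
def expected_share(generations_back):
    """Average fraction of DNA expected from one ancestor who is
    `generations_back` generations above you (1 = parent)."""
    return 0.5 ** generations_back


labels = [
    "parent",
    "grandparent",
    "great-grandparent",
    "2nd great-grandparent",
    "3rd great-grandparent",
    "4th great-grandparent",
]

# Print the expected percentage for each generation, parent first.
for generations, label in enumerate(labels, start=1):
    print(f"{label}: {expected_share(generations) * 100}%")
```

A 3rd great-grandparent is five generations back, so the average works out to a little over 3%, and a 4th great-grandparent to about 1.6%.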

On the other hand I have a very small match with a 2nd cousin I know well, whose great-grandmother was the younger sister of my grandfather. I do not match her brother at all, though my son does.

On the spreadsheet GM# stands for GedMatch number. GedMatch is so useful that I can honestly say I'd not have accomplished much of anything without it. For one thing anyone who has tested with any of the companies that do autosomal testing can upload their raw data to GedMatch which gives you a pool of several million people. GedMatch generates a fresh list of who you match every time you log on, showing you where you match them, who *both* of you match (this is called triangulation), how many generations apart you are, and about two dozen different utilities unavailable elsewhere. These are all free of charge. The site is run by volunteers, but be fair. When you sign up donate a few dollars to cover the costs of running the servers. I appreciate their services so much I donate to them on a quarterly basis.

The Name column should be self-explanatory: this is the name and e-mail address of your match.

Notes - here's where I put the relationship once I have worked it out. Usually relationship and last common ancestor(s).

Keep in mind that a match from a couple (say gg-grandparents George and Betsy) could be from either (or both) of them. To separate out whether the segment came from George or Betsy you need to find a descendant of George's parents or sibling, who was not married to a relative of Betsy's. For example when I finally confirmed the parents of my very elusive 3rd great-grandfather, George Perkins, it was through DNA matches.

First I matched segments with descendants of two men who were documented, in census records and in his will, as sons of Jacob Perkins and his wife Elizabeth Cole Perkins. Then I was able to identify which DNA belonged to Perkins by matching segments with a descendant of Jacob's great-grandfather Isaac Perkins, one who had no Cole ancestry, and I identified the Cole ancestry in the same way, by matching segments with a Cole relative of Betsy's who had no Perkins ancestry. (It probably goes without saying that I really love analyzing data.)

The spreadsheet will be "thinly populated" to start with, but as you begin to identify matches it becomes clear that people who match the same segments as your "Smith" cousins A and B are going to match the Smiths or one of their ancestors. It becomes a process of you and your match comparing trees for common ancestors.
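Finding the people who match on the same segment as an identified cousin amounts to checking for overlapping start and end positions on the same chromosome. Here is a rough sketch; the names and positions are invented for illustration. Treat an overlap as a lead rather than proof, since you still want to check that the two matches also match each other, which is the triangulation GedMatch reports.

```python
def overlaps(a_beg, a_end, b_beg, b_end):
    """True if the two segments share at least one position."""
    return a_beg <= b_end and b_beg <= a_end


def candidates_for(known, rows):
    """Rows on the same chromosome that overlap the known cousin's segment.

    known and every row are (chrom, beg, end, name) tuples.
    """
    k_chrom, k_beg, k_end, k_name = known
    return [
        row for row in rows
        if row[0] == k_chrom
        and row[3] != k_name
        and overlaps(k_beg, k_end, row[1], row[2])
    ]


# Hypothetical rows: one identified cousin and three unidentified matches.
known = (17, 10_000_000, 25_000_000, "Smith cousin A")
rows = [
    known,
    (17, 12_000_000, 20_000_000, "Unknown 1"),  # overlaps the known segment
    (17, 30_000_000, 40_000_000, "Unknown 2"),  # same chromosome, no overlap
    (3, 12_000_000, 20_000_000, "Unknown 3"),   # different chromosome
]

leads = candidates_for(known, rows)
print(leads)
```

Each lead is a person worth contacting to compare trees for the ancestors you already know the identified cousin shares with you.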

As a rule I won't work with someone who is unwilling to meet me halfway and do their share of the research. Nor will I play 20 questions with someone who won't give me a basic dropline or is unwilling to let me see their family tree. Here's an example:

Background: From sharing genomes months ago and GedMatch I know this person's ancestry lies within my father's maternal line. Our exchange went like this:

Them: Can we share genomes?

Me: We shared our genome information months ago. We match on Chromosome 17, these segments xxxxx - xxxxx. That segment matches my father's maternal line. Are any of the following surnames familiar to you? [list of surnames]

Them: I have a Kelly from Ireland. You have a Kelly on your profile. Was your Kelly Irish?

Me: I have only one Kelly in my tree. She was born in Dorset UK in 1747. I recently found their marriage certificate and learned she was the 2nd wife of my 4th great-grandfather. They were in their 60s when they married and had no children. So she couldn't be our match.

Did your ancestors live in any of these places? [I listed the places my father's maternal family lived (all in southern states) including Carroll County Arkansas.]

Them: Were your Carroll ancestors Irish? I have Irish Carrolls.

Me: I have no Carroll ancestors, that was a location, Carroll County Arkansas. Do you have a dropline or family tree I could look at?

Them: I have a tree on It's private. You can't see it.

Me: I'm sorry I can't be of any more help.

I was so frustrated by this I got up and vacuumed my floors. So it's all good. I was going to try to mop too but no one else asked me another sufficiently exasperating question, and I ran out of energy before I got to the mop bucket. Now I have to sweep again before I can mop since the "enfant terrible" has, as usual, rolled in the litterbox and carried clay litter everywhere. I don't need to do his tree to know he's got a bit of the Devil in him. 

But, if you are interested in learning how to do genomic "mapping" I'm happy to answer questions. Please feel free.


Linda P. said...

Thanks for this information. We've been frustrated in our family search for the parents of a Robert Peel in our family line. At the time of his birth, the famous Robert Peel in the U.K. was apparently convincing any Peel with a male child to name that child Robert! I just sent my daughter a link to your blog so that we could maybe think about pursuing this avenue of research.

smm said...

It all seems very complex...not sure my brain would bring the whole spreadsheet love this would take to accomplish.