Reply To: Sequencing Complete; analysis underway

jesse mcclure

Becky & Casey,

Perhaps we should be more detailed with that “location” description. It is accurate in that your dogs’ samples have moved from spit in a tube, through purified DNA in a smaller tube, and then through a sequencing machine to be turned into computerized data. However, that data, in its initial form, isn’t particularly meaningful. The “analysis underway” is a bit of an understatement.

There are several stages that the data goes through. One of the earliest, straight off the sequencers, is a file full of “reads”. These are short chunks of DNA in the range of hundreds of “letters” long. As your dogs’ DNA is made up of several *billion* letters, these are really tiny bits of the genome. These individual “reads” are initially in no particular order; we just know that a given list of letters appeared somewhere in your dog’s DNA. So this is basically the world’s worst jigsaw puzzle: putting together tens of millions of these “reads” to make a full picture of your dog’s DNA (this is called “aligning” to a reference in genomic terms, if you want to read further).
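For anyone curious what that jigsaw puzzle looks like in code, here is a toy Python sketch. It is not our actual pipeline: the function name `align_reads`, the tiny reference, and the reads are all made up for illustration, and it only handles exact matches, whereas real aligners have to cope with errors and with reads that fit in more than one place.

```python
def align_reads(reference, reads):
    """Toy 'alignment': find where each read matches the reference exactly.

    Returns a dict mapping each read to its start position in the
    reference, or -1 if the read does not appear at all.
    """
    placements = {}
    for read in reads:
        placements[read] = reference.find(read)  # str.find returns -1 on no match
    return placements


# A made-up 14-letter "genome" and three shuffled reads taken from it.
reference = "ACGTTAGCCATGGA"
reads = ["CCATG", "ACGTT", "TAGCC"]

placements = align_reads(reference, reads)

# Sorting the reads by where they landed puts the puzzle pieces in order.
ordered = sorted(reads, key=placements.get)
```

Now imagine the reference is billions of letters long and there are tens of millions of reads, and you can see why this step keeps the big computers busy.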

From there we do actually have a sequence of letters representing your dog’s DNA, but at each of those letters there is also a fair bit of uncertainty, or a margin of error. The good news is the sequencing machine tells us how sure it is about each letter. When we take this information from your dog and put it side-by-side with *many* other dog samples, we can make good inferences about which letters we should really believe and which ones might have been an error (this would be “variant calling” in genomics).
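As a rough illustration of “how sure the machine is”, sequencers commonly report a quality score for each letter on the Phred scale, where a score of Q means an error chance of 10^(-Q/10). The Python sketch below assumes the widely used encoding in which each score is stored as a single character offset by 33; the function name and the example letters are made up for illustration, and this is only the very first ingredient of variant calling, not the method itself.

```python
def error_probabilities(quality_string, offset=33):
    """Convert a string of quality characters into per-letter error chances.

    Assumes Phred scores stored as ASCII characters with the given offset
    (33 is a common convention; the exact encoding is an assumption here).
    """
    probabilities = []
    for ch in quality_string:
        q = ord(ch) - offset                    # Phred quality score
        probabilities.append(10 ** (-q / 10))   # chance this letter is wrong
    return probabilities


# Four made-up letters with four made-up quality characters.
bases = "ACGT"
quals = "I5+!"   # scores 40, 20, 10, 0

probs = error_probabilities(quals)

# Keep only the letters the machine is at least 99% sure about.
trusted = [base for base, p in zip(bases, probs) if p <= 0.01]
```

Here the first letter has a 1-in-10,000 chance of being wrong, while the last one is essentially a guess, so only the first two letters would be trusted.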

After these steps, we have the first type of data that would be meaningful … to a geneticist. It would still look like gibberish to most people, but we can turn this data into something more meaningful for you. We can infer who your dog may share ancestry with (this turns out to be what most of our participants are interested in: what breeds their dog is related to). We can also see which coat color genes your dog has (of course, you already know what color your dog is!).

Each of these stages takes a fair bit of work. Lots of big computers sure help us do it in a more practical time-frame, but we still have to get all the data running through all the right computers. The last stage, turning the genetic data into something useful for dog owners, may be the most challenging and time-consuming.

Our goal, of course, is to see what variants your dog has and pair this with what you tell us about your dog. For example, we might find little differences in the DNA of all the dogs that are most playful, or most social, or most anxious, or those prone to allergies, etc. But *those* answers will only come out once we have looked at the DNA of a *whole lot* of dogs.