FORUM

Topic: Different results? (Darwin’s Ark forum)

This topic contains 23 replies, has 9 voices, and was last updated by allison miller 1 month, 2 weeks ago.

Viewing 9 posts - 16 through 24 (of 24 total)
  • Author
    Posts
  • #8179
    jessica hekman
    Keymaster

    Folks – just wanted to clarify a few things that have been discussed on this thread.

    Where part of a dog’s ancestry is marked “unknown,” this is because our computer algorithm honestly failed to match it to a breed. This is most commonly because those parts of the ancestry are so mixed up (lots of little chunks from lots of breeds) that the computer doesn’t have enough info to match it to a breed confidently. (We are hopeful that we’ll do better here when we start using more markers – getting our v2 breed panel up and running is currently one of Kathleen’s highest priorities.)

    It is also possible that this is a breed that we don’t have in our panel yet. This is less likely for most dogs because a) we have the most common breeds in there except for American Pit Bull Terrier (which, again, will be in v2) and b) I’ve observed that when we don’t have a particular breed, but there is a large chunk of ancestry, the computer tends to match it to a closely related breed. So when you have Malinois, the computer tends to find German Shepherd. (Mal will also be in v2…)

    When you see breed results with really small percentages – under 10% – don’t trust them too much. It’s just really hard for the computer when the bit of ancestry from that breed is so small. This is why, under 5%, we group them all into an “other” category – we don’t want people putting too much stock in those breed calls.
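
    As a rough illustration of the grouping rule just described (a sketch only: the function name, the dictionary layout, and the example numbers are assumptions for illustration, not Darwin’s Ark code), folding calls under 5% into a single “other” entry might look like this:

```python
def group_small_calls(calls, other_threshold=5.0):
    """Fold breed calls below other_threshold (in percent) into one 'other' entry.

    calls maps a breed name to its percentage of the dog's ancestry.
    The 5% default mirrors the cutoff mentioned in the post; everything
    else here is a hypothetical layout chosen for the example.
    """
    reported = {}
    other = 0.0
    for breed, pct in calls.items():
        if pct < other_threshold:
            other += pct  # too small to report as its own breed call
        else:
            reported[breed] = pct
    if other > 0:
        reported["other"] = other
    return reported


# Made-up example values, not a real dog's results.
example = {"Golden Retriever": 50.0, "unknown": 45.5, "Beagle": 3.0, "Pug": 1.5}
grouped = group_small_calls(example)
```

    In this made-up example the Beagle and Pug calls are merged into one “other” entry, while the larger calls are reported unchanged.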

    I hope this helps and feel free to ask me questions!

    Jessica
    Darwin’s Ark Researcher

    #8180
    Erika-Pretorius
    Participant

    Thanks Jessica!

    Any thoughts on my post above regarding my dog, Bee? That is, generally speaking, why would Embark show Golden back to great-grandparents, and DA show so much unknown? Just curious what could cause this. (Is Embark just generalizing? But why? They also come back w/ unknowns, and also have mapped more dog breeds…)

    Best
    Erika

    #8181
    jessica hekman
    Keymaster

    Erika – good question! I don’t know if Embark rounds the numbers from their results, though it would surprise me a bit to see exact 50/50 results like that, so it’s certainly possible that they are.

    We use different animals in our breed panel than Embark does, so it’s alternatively possible that your dog is indeed 50% golden and for whatever reason, one of the golden grandparents doesn’t match to golden in our breed panel (maybe it was a golden from an unusual-for-us population – from Europe, or from a line that hasn’t mixed much with others and we don’t have an example of, or something like that).

    If this were my dog I think I’d assume the dog is in fact a 50/50 mix but I’d keep in the back of my head that there might be something else going on.

    #8182
    kathleen morrill
    Participant

    Hi Erika,

    As Jessica mentioned, it may be that the individual Goldens we have in the reference panel (12 dogs) don’t represent all lineages of Golden out there. For the new breed panel (v2) that I am constructing and testing right now, we have a different selection of 12 Golden Retrievers (who had deeper sequencing data), so the percentage calls on Bee might shift again when we re-analyze Bee’s DNA with v2 of the panel.

    In addition to a wider panel of breeds, Embark may have more than 12 individual dogs per breed represented in their reference panel. For every new purebred Golden Retriever that is genotyped through Embark, they might further expand the panel, thereby representing more variation across Goldens.

    We also use a different type of machine learning algorithm than Embark, which may introduce some differences. Our method is very similar to the approach that the human genetics test 23andMe uses. In the long term, I’d like to do some comparisons between ancestry calling methods and see what works better on mixed breed dogs.

    Best,
    Kathleen

    #8183
    Erika-Pretorius
    Participant

    Thanks both Jessica and Kathleen. (This is all very interesting!)

    For this statement:

    We also use a different type of machine learning algorithm than Embark, which may introduce some differences. Our method is very similar to the approach that the human genetics test 23andMe uses. In the long term, I’d like to do some comparisons between ancestry calling methods and see what works better on mixed breed dogs.

    Is the implication accuracy? Assessing more markers? Both? I am not a scientist and as such, my thought is: “it’s either half Golden or it’s not.”

    As I said in a previous comment – if something is super analyzed, then do you ‘find’ that, say, two generations back a breeder bought a dog that was really only 95% Golden (or similar), and so then that dog’s offspring were not 100% Golden, and so on?

    This is all out of curiosity – I couldn’t care less about the pedigree other than the fact that my southern kill shelter dog potentially came from two full-blooded dogs! (The same is partially true w/ my other dog I had tested. DA was confused by her, and Embark tells me she’s full Lab on one side – and related to dozens of purebreds on the site…) Let me know if you want to see her results.

    Best,
    Erika

    #8247
    kathleen morrill
    Participant

    Of course! These are important questions for ancestry testing in general, too.

    Yes, the implication is accuracy. It depends on multiple features: number of markers, number of breeds, number of individual dogs in the panel, and algorithm. As Darwin’s Ark and Embark differ on all of these features, a direct comparison of which feature matters the most isn’t possible.

    But! We have ways to test what features make a difference in accuracy. We perform computer simulations to make virtual mutts, where we know exactly the family tree and how much DNA they inherited from each breed. We can then vary the marker number, breed number, algorithm, etc. to see how accurate the results are compared to what they should be.
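
    To make the idea of a “virtual mutt” concrete, here is a deliberately simplified sketch. It is not the actual Darwin’s Ark simulation: the function names, the choice of four purebred grandparents, and the model of a genome as independent segments are all assumptions made for illustration. It builds a pup whose true ancestry is known, then scores a set of breed calls against that truth.

```python
import random
from collections import Counter


def simulate_pup(grandparent_breeds, n_segments=1000, seed=None):
    """Return the true ancestry fractions of a simulated pup.

    grandparent_breeds: four breed names in the order
    (sire's sire, sire's dam, dam's sire, dam's dam), all assumed purebred.
    The genome is modeled as n_segments independent segments, which is a
    strong simplification of real recombination.
    """
    rng = random.Random(seed)
    sire_sire, sire_dam, dam_sire, dam_dam = grandparent_breeds
    counts = Counter()
    for _ in range(n_segments):
        # The pup carries two copies of each segment: the copy from the sire
        # traces back to one of the sire's parents, and likewise for the dam.
        counts[rng.choice([sire_sire, sire_dam])] += 1
        counts[rng.choice([dam_sire, dam_dam])] += 1
    total = 2 * n_segments
    return {breed: n / total for breed, n in counts.items()}


def ancestry_error(true_fracs, called_fracs):
    """Fraction of the genome assigned to the wrong breed (0 = perfect)."""
    breeds = set(true_fracs) | set(called_fracs)
    diff = sum(abs(true_fracs.get(b, 0.0) - called_fracs.get(b, 0.0)) for b in breeds)
    return diff / 2


truth = simulate_pup(["Golden Retriever", "Golden Retriever", "Beagle", "Poodle"], seed=1)
```

    In a setup like this, one would run a breed-calling method on the simulated pup, pass its calls to `ancestry_error`, and repeat while varying marker count, panel size, or algorithm. The Beagle and Poodle fractions in `truth` will generally not be exactly 0.25 each, since the share inherited from each grandparent varies.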

    For grandparents, ancestry is usually estimated from the tested dog’s overall percentages, assuming a quarter from each grandparent (which is not completely true). If the pup is roughly 22-27% Golden, then it is a good guess that they have a Golden grandparent. Without testing the parents or grandparents directly, we cannot find their ancestry.
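
    The rule of thumb above can be written down as a small sketch. The 22-27% window comes from this post; the function name, the dictionary layout, and the example numbers are hypothetical, and this is a guess at one grandparent per breed rather than a real pedigree reconstruction.

```python
def likely_grandparent_breeds(calls, low=22.0, high=27.0):
    """Return breeds whose share of ancestry suggests one purebred grandparent.

    calls maps a breed name to its percentage of the tested dog's ancestry.
    A breed falling inside [low, high] percent is treated as a plausible
    single grandparent, per the rough 22-27% window quoted in the thread.
    """
    return sorted(breed for breed, pct in calls.items() if low <= pct <= high)


# Made-up example values, not a real dog's results.
guesses = likely_grandparent_breeds(
    {"Golden Retriever": 24.0, "Labrador Retriever": 50.0, "Beagle": 26.0}
)
```

    Here the Golden Retriever and Beagle calls fall inside the window, while the 50% Labrador call does not; a share that large would have to be interpreted separately (for example, as more than one ancestor of that breed).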

    If one did test the grandparents and found 95%, that could still be interpreted as full-blooded Golden. A unique lineage from the UK, for instance, might be inherently 5% different from all the U.S. Golden Retrievers in the reference panel, and that lineage difference may be responsible for the 5%. Even show populations versus working dog populations may have essential differences despite being the same breed. There is no way to catalogue every Golden Retriever lineage but the more individual dogs, the better.

    For Bee’s results, I would guess the biggest difference is the sheer number of purebred Golden Retrievers she is compared to in Embark (publicly available data plus all new Goldens in their database) versus the 12 purebred Goldens (publicly available data) she gets compared to here.

    #8634
    Erika-Pretorius
    Participant

    Thanks! I’ll wait for future screenings (V2)! Best, Erika

    #14577
    william hutton
    Participant

    Just out of curiosity: when forming your reference panel, does it consist only of dogs in your own screening, or do you also source DNA pools from outside sources?

    A large number of breed clubs out there collect DNA on their registered dogs. Each pool would have more clearly defined indicators for its respective breed. I’d think it might help, not only with highly confusing lines like the bully breeds but also with rarer breeds you might never get a sample from.

    As an example, I have a Wirehaired Pointing Griffon. Through paperwork I can determine a predominantly European background (France). 46% of her breed mix can be accounted for: 19% WPG, 7.5% GSP, 5.2% Brittany, and 13.4% other hunting breeds. The remaining 54% is unknown.

    Through breed research I have reason to believe the unknown is Cesky Fousek, a relatively unknown and rare breed in North America, and one you will probably never get enough samples of to process on your own.

    #14651
    allison miller
    Participant

    How great that you are adding more breeds! I tell everyone I know with dogs about this project.
