Enterprise & Industry

Using big data for good

A nonprofit's massive pet DNA database is cracking complex diseases by analyzing over 67,000 cats and dogs.

Deep Dive

Darwin's Ark, a community science nonprofit co-founded by data expert Charlie Lieu and geneticist Elinor K. Karlsson, is drawing on pet owners' enthusiasm to build one of the world's largest genomic databases. By sequencing the genomes of pets, including the well-known cat Petra, and collecting detailed behavioral surveys from over 67,000 pet owners, the organization has created a research platform that pairs extensive genetic data with real-world behavioral observations. This approach targets a critical bottleneck in human medical research: gathering the massive, longitudinal datasets needed to understand complex diseases while navigating privacy restrictions and participant recruitment challenges.

The platform has already yielded significant scientific insights, most notably by debunking long-held stereotypes about dog breeds. Research using Darwin's Ark data showed that breed predicts only 9% of behavioral variation in dogs, a finding with real-world implications for adoption rates and breed-specific legislation. Beyond behavior, the database serves as a proxy for studying human diseases like cancer: pets have shorter lifespans and their data carries fewer privacy constraints, so discoveries can come faster. The model shows how creative, community-driven data collection can overcome traditional research limitations and advance veterinary and human medicine together.

Key Points
  • Darwin's Ark pairs pet genome sequencing with behavioral surveys from over 67,000 pet owners, creating a massive research database
  • Research using this data showed that breed predicts only 9% of behavioral variation in dogs, debunking common stereotypes
  • The platform uses pets as proxies to study complex human diseases like cancer, overcoming data privacy and collection hurdles

Why It Matters

This model demonstrates a scalable way to gather the big data needed for medical breakthroughs, benefiting both human and animal health.