Completing the Personal Genomics Toolkit

The big news buzzing through the world of genomics this afternoon is the publication of a paper in the journal Science announcing the production of three whole-genome sequences at an average materials cost of $4,400. The work was performed by the third-generation sequencing company Complete Genomics Incorporated, along with researchers from George Church’s lab at Harvard Medical School.

The Race for the $1,000 Genome

Erika Check Hayden of Nature’s blog The Great Beyond has an excellent summary of the Complete announcement in which she also attempts to head off some of the inevitable media hype:

Complete’s $4,400 price tag doesn’t include costs for the company’s infrastructure, such as its Silicon Valley data farm and the army of analysts and technicians required to make sense of the data; the company lists more than 60 employees in this paper’s author list. The company is actually selling genomes at $20,000 apiece in minimum orders of five; costs go down as the order size increases. That puts it slightly behind the schedule it set at its launch; the $5,000 genomes won’t be available until next year.

The announcement from Complete Genomics is hardly unexpected. At its launch last fall the company promised that it would deliver $5,000 genomes (and 1,000 of them, not just 3) by the end of 2009.

From a personal genomics standpoint, there is no question that Complete is a viable contender in the race to deliver affordable, individual whole-genome sequences. Spurred by competition from the likes of IBM, Illumina, Pacific Biosciences, Oxford Nanopore and others, the $1,000 genome continues to draw closer. It is no longer a question of if but when that magic number will be attained.

But while the $1,000 genome competition makes for an exciting horserace, the real focus of today’s announcement should be not on how much a genome sequence costs, but on what you can (or cannot) do with that sequence.

Trait-o-matic: An Open-Source Genomic Tool

In the inaugural post of the Genomics Law Report’s What ELSI is New? series, Stanford Law professor Hank Greely asked how we will handle the rapidly approaching flood of personal genomic information that will accompany the $1,000 genome and concluded, quite simply, that “the age of cheap full genomes is almost upon us – and we are not close to ready for it.”

While Complete’s announcement does not constitute a flood of whole genomes – it’s probably too soon to say that there’s even a steady trickle at this point – it is a meaningful addition to the catalogue of public genomes, as well as a reminder that we are still not close to ready.

One of the three genomes produced by Complete belongs to Harvard Medical School geneticist, Personal Genome Project founder and Complete Genomics advisor George Church. The Church genome, along with genomes from J. Craig Venter, James Watson and other genomics pioneers, forms the backbone of the Trait-o-matic, an open-source genome interpretation tool recently unveiled by the Church Lab and the Personal Genome Project.

Although the name seems straight from the 1950s, the Trait-o-matic is surely a creature of today. The Trait-o-matic is designed to streamline the analysis of genetic variants by automatically pulling and associating the variations in an individual’s genome with information from Trait-o-matic’s own comprehensive database derived from OMIM, HGMD, PharmGKB, SNPedia and the PGP’s own participant data.
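For readers who want a concrete picture of what "pulling and associating" variants with database entries might look like, here is a minimal sketch in Python. It is not Trait-o-matic's actual code or data model: the function names, the record layout, and every identifier in the sample data (`var_001`, `source_1` and so on) are invented placeholders, and the matching rule (exact variant and genotype match) is an assumption made purely for illustration.

```python
from collections import defaultdict


def build_index(annotations):
    """Group annotation records by (variant_id, genotype) for fast lookup."""
    index = defaultdict(list)
    for variant_id, genotype, trait, source in annotations:
        index[(variant_id, genotype)].append((trait, source))
    return index


def annotate(genome, index):
    """Return (variant_id, genotype, trait, source) for every known match."""
    report = []
    for variant_id, genotype in sorted(genome.items()):
        # .get() avoids adding empty entries to the index for unknown variants.
        for trait, source in index.get((variant_id, genotype), []):
            report.append((variant_id, genotype, trait, source))
    return report


# Placeholder data, invented for illustration only (not real variants or traits).
ANNOTATIONS = [
    ("var_001", "AG", "placeholder trait A", "source_1"),
    ("var_001", "AG", "placeholder trait B", "source_2"),
    ("var_002", "CC", "placeholder trait C", "source_1"),
]
GENOME = {"var_001": "AG", "var_002": "CT", "var_003": "GG"}

REPORT = annotate(GENOME, build_index(ANNOTATIONS))
for row in REPORT:
    print(row)
```

Even in this toy version, a single variant can match entries from more than one source, which hints at how a report on a full genome can quickly grow long.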

Running Church’s Complete Genomics data through the Trait-o-matic produces a catalog of hundreds of potential genotype-phenotype correlations ranging from eye color to prostate cancer to susceptibility to drug addiction. The sheer magnitude of the Trait-o-matic report is a tangible reminder of the degree of difficulty associated with identifying which genetic variants within an individual’s genome are the ones that actually matter.

While falling price points for whole-genome sequences make for clear benchmarks and attention-grabbing headlines, it is increasingly apparent that much more attention must be focused on making sure that interpretive tools such as the Trait-o-matic – which is still in its infancy – are keeping pace with advances in generating the raw data of whole-genome sequences.

Completing the Personal Genomics Toolkit

Not long ago, in “Leveraging the Crowd to Understand Your Genome,” I wrote about the significant role that I expect “open-source genomics to play in the continuing expansion and democratization of personal genomic inquiry.” In that article, I also discussed the importance of developing open-source and publicly accessible interpretative tools, such as Trait-o-matic, to provide individuals with the option to invite an interested public to share in the challenge of interpreting genomic data.

Over at Common Knowledge, John Wilbanks, director of the Science Commons project at Creative Commons, has a pair of highly recommended posts (part one and part two) on the importance of moving science beyond the open-source paradigm that has worked so well in software development.

Low-cost genomic sequencing and even open-source interpretive tools such as the Trait-o-matic are necessary additions to the personal genomics toolkit. But by themselves they are not sufficient to achieve the type of scientific breakthroughs required to realize the potential of personal genomics. Wilbanks writes:

Distributed science, user-driven science, open innovation science, we need ALL of them, not a narrow idea that comes from software. It’s about hardware for science. It’s about data for science. It’s about laboratories for science. It’s about research departments and funders and promotion and tenure. It’s about paradigms, and paradigm shifts.

A $4,400 genome – even a $1,000, $100 or $0 genome – is important. But it is an improvement to an existing tool. It is not a new tool, and it is certainly not a paradigm shift. To proceed from merely inexpensive genomics to truly personal genomics, we will need to develop the rest of the toolkit. And if the progress of companies such as Complete Genomics is any indication, we will need to develop it quickly.

Want to Read More? Coverage of the Complete Genomics Announcement: