Google Cloud Platform Blog
Product updates, customer stories, and tips and tricks on Google Cloud Platform
Exploring genetic variation with Google Genomics and Tute
March 16, 2015
Today we hear from Tute Genomics on how they're using cloud-based technology and big data tools to support the scientific community and help advance genomics research. Tute has made its 8.5 billion record genetic annotation database publicly available to users of Google Genomics through Google BigQuery.
Human genome sequencing is fast becoming standard practice for both clinicians and researchers now that the cost to read all 3 billion letters of a person’s DNA has dropped to just a few thousand dollars. But what does all that information mean for medicine and health?
At Tute Genomics we’re answering that question by building a comprehensive database of all known genetic variants and what they mean for disease risk, drug response, and basic research. Because the database contains 8.5 billion records, it has been a challenge to work with on a local computer or on our own servers. That’s why we were excited to discover Google BigQuery.
With BigQuery, scientists can run sophisticated queries against the Tute database to link an individual’s genome to the wealth of information about genetic variants in general. This background information can even include large public datasets like the 1000 Genomes Project, which is already hosted on Google Cloud Platform.
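As a rough illustration of what linking a genome to an annotation database means, here is a minimal sketch in plain Python. It assumes a variant is identified by chromosome, position, reference allele, and alternate allele; the `VariantKey` type, the `annotate` helper, and the toy records are all hypothetical and are not taken from the Tute schema. In BigQuery the same idea would be expressed as a join between two tables.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple


class VariantKey(NamedTuple):
    """Hypothetical key: identifies a variant by location and change."""
    chrom: str
    pos: int
    ref: str
    alt: str


def annotate(
    sample_variants: List[VariantKey],
    annotations: Dict[VariantKey, dict],
) -> List[Tuple[VariantKey, Optional[dict]]]:
    """Pair each sample variant with its annotation, or None if unknown.

    This behaves like a left join: every sample variant is kept,
    whether or not the annotation table has an entry for it.
    """
    return [(v, annotations.get(v)) for v in sample_variants]


# Made-up placeholder records, not real genomic data.
annotations = {
    VariantKey("1", 1000, "A", "G"): {"function": "exonic"},
    VariantKey("2", 2000, "C", "T"): {"function": "intronic"},
}
sample = [
    VariantKey("1", 1000, "A", "G"),  # present in the annotation table
    VariantKey("3", 3000, "G", "A"),  # absent, so it stays unannotated
]

for variant, info in annotate(sample, annotations):
    print(variant, info)
```

A dictionary lookup works for a toy example; at billions of records the join has to be distributed, which is the part BigQuery handles.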
With this much data, tools like BigQuery are essential: the same queries take significantly longer on standard computers or VMs. For example, we're able to rapidly count variants from the 17 Platinum Genomes by function. Even with 88 GB of input data, we see results in 30 seconds for less than $1, whereas the same count would have taken many minutes or even hours without BigQuery.
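To make the example above concrete, here is a small sketch of the two pieces involved: grouping variants by function and counting them, and estimating what a scan of a given size costs. The function labels are made up, and the price of 5 US dollars per terabyte scanned is an assumption for illustration only, not a figure from this post; check current BigQuery pricing before relying on it.

```python
from collections import Counter
from typing import Dict, Iterable

# Assumption for illustration only: on-demand price per TiB scanned.
ASSUMED_USD_PER_TIB = 5.0
GIB_PER_TIB = 1024


def count_by_function(functions: Iterable[str]) -> Dict[str, int]:
    """Count how many variants carry each functional label.

    This mirrors a GROUP BY function / COUNT(*) aggregation.
    """
    return dict(Counter(functions))


def estimate_cost_usd(
    gib_scanned: float, usd_per_tib: float = ASSUMED_USD_PER_TIB
) -> float:
    """Estimate query cost as data scanned (in TiB) times price per TiB."""
    return gib_scanned / GIB_PER_TIB * usd_per_tib


# Made-up functional labels, one per variant; not real data.
toy_functions = ["exonic", "intronic", "exonic", "intergenic", "exonic"]
counts = count_by_function(toy_functions)
print(counts)

# 88 GB of input, as in the example above, under the assumed price.
print(f"estimated cost: ${estimate_cost_usd(88):.2f}")
```

Under that assumed price, scanning 88 GB works out to roughly $0.43, which is consistent with the "less than $1" figure above.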
Our initial database version includes annotations on 8.5 billion genetic variants. Sources include clinical annotations from ClinVar and the GWAS Catalog, population frequencies from the 1000 Genomes Project, gene and transcript model annotations (such as amino acid and protein substitutions), and the functional consequence of exonic variants. Additionally, the database includes conservation scores, evolutionary scores, and predictions of whether genetic variants are likely to be associated with Mendelian phenotypes.
As genome sequencing becomes a more common part of clinical care as well as basic research, accurate and comprehensive genetic variant databases will be essential to help make sense of genetic information. We find that detailed annotations of genetic variants are a natural match for big data processing with Google BigQuery. We believe in this so strongly that we’ve donated an unprecedented database to the genomics community, made available through Google Cloud Platform.
Reach out to us on the Tute Genomics discussion group with any specific questions.
Posted by Bryce Daines, Reid Robison, Chris London, Brendon Beebe, David Mittelman, and Kai Wang of Tute Genomics