Google Cloud Platform Blog
Product updates, customer stories, and tips and tricks on Google Cloud Platform
Cassandra Hits One Million Writes Per Second on Google Compute Engine
March 20, 2014
Google is known for creating scalable high performance systems. In a recent blog post, we demonstrated how Google Cloud Platform can rapidly provision and scale networking load to handle one million requests per second. A fast front end without a fast backend has limited use, so we decided to demonstrate a backend serving infrastructure that could handle the same load. We looked at popular open source building blocks for cloud applications and chose Cassandra, a NoSQL database designed for scale and simplicity.
Using 330 Google Compute Engine virtual machines, 300 1TB Persistent Disk volumes, Debian Linux, and DataStax Cassandra 2.0, we were able to construct a setup that can:
sustain one million writes per second to Cassandra with a median latency of 10.3 ms and 95% completing under 23 ms
sustain a loss of ⅓ of the instances and volumes and still maintain the 1 million writes per second (though with higher latency)
scale up and down linearly so that the configuration described can be used to create a cost effective solution
go from nothing in existence to fully configured and deployed instances hitting 1 million writes per second in just 70 minutes. A configured environment can achieve the same throughput in 20 minutes.
All data presented in this post was collected using Cassandra Quorum commit (writes must complete on at least 2 nodes), triple replication, and data encrypted at rest. With commit ONE, Compute Engine was able to sustain ~1.4M writes per second, with a median latency of 7.6ms and a 95th percentile of 21ms. However, for this post we wanted to focus on quorum commit, which improves reliability.
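The "at least 2 nodes" figure follows from the usual majority rule for a quorum. The sketch below is only an illustration of that arithmetic; the function and variable names are hypothetical and not taken from the test setup.

```python
def quorum(replication_factor: int) -> int:
    """Smallest number of replicas that forms a majority."""
    return replication_factor // 2 + 1


# Triple replication, as stated in the post.
REPLICATION_FACTOR = 3

acks_for_quorum = quorum(REPLICATION_FACTOR)  # replicas that must acknowledge a write
acks_for_one = 1                              # commit ONE waits for a single replica

# Replicas that may be slow or unavailable while a quorum write still succeeds.
tolerated_missing = REPLICATION_FACTOR - acks_for_quorum

print(f"quorum acks: {acks_for_quorum}, tolerated missing replicas: {tolerated_missing}")
```

With three replicas, a quorum write waits for one more acknowledgement than commit ONE, which is consistent with the higher latency and lower throughput reported for quorum.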
Setup
We deployed a 300 data node cluster running DataStax Cassandra 2.0 on Compute Engine as the backend. To generate the load we used 30 virtual machines that wrote 3 billion small records (170 bytes each) into Cassandra using cassandra-stress. The following depicts the setup used:
To reproduce the results, follow the setup instructions.
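To get a feel for the size of this workload, the figures above can be combined in a few lines. This is a rough sizing sketch: it assumes records are spread evenly across the load generators and ignores compression and on-disk storage overhead, neither of which the post quantifies. All names are illustrative.

```python
# Figures stated in the post.
LOAD_GENERATORS = 30
TOTAL_RECORDS = 3_000_000_000
RECORD_BYTES = 170
REPLICATION_FACTOR = 3
TARGET_WRITES_PER_SECOND = 1_000_000

# Assumption: records are split evenly across the load generators.
records_per_generator = TOTAL_RECORDS // LOAD_GENERATORS

# Raw payload only; excludes compression and storage overhead.
raw_payload_bytes = TOTAL_RECORDS * RECORD_BYTES
replicated_payload_bytes = raw_payload_bytes * REPLICATION_FACTOR

# Client payload rate and time to write everything at the target throughput.
payload_bytes_per_second = TARGET_WRITES_PER_SECOND * RECORD_BYTES
seconds_at_target_rate = TOTAL_RECORDS // TARGET_WRITES_PER_SECOND

print(f"records per load generator: {records_per_generator:,}")
print(f"raw payload: {raw_payload_bytes / 1e9:.0f} GB")
print(f"payload across replicas: {replicated_payload_bytes / 1e12:.2f} TB")
print(f"client payload rate: {payload_bytes_per_second / 1e6:.0f} MB/s")
print(f"time at target rate: {seconds_at_target_rate / 60:.0f} minutes")
```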
Results
With 15,000 concurrent clients Cassandra was able to maintain 10.5ms median latency (8.3ms with 12,000 clients), and 95th latency percentile at 23ms. Here is how the solution scales as the number of concurrent clients grows:
Below we show a graph of the throughput versus 95th percentile latency which quickly achieves very good response times after Cassandra initializes its internal state, and Java warms up its heap and memory mapped files table. This test was run longer than the minimal time required to hit over 1M writes per second in order to show the sustained throughput:
In addition to looking at top end performance we also looked at resiliency. We removed ⅓ of the cluster nodes and it remained functional, serving more than 1M writes per second. Median latency held at 13.5ms, 95th percentile at 61.8ms, and 99.9th percentile at 1,333.5ms. We consider those numbers very good for a cluster in distress, proving Compute Engine and Cassandra can handle both spiky workloads and failures.
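One way to read the degraded numbers is as a ratio against the healthy cluster. The sketch below compares them with the 15,000-client figures reported earlier; it assumes the degraded run used a comparable client load, which the post does not state, so treat the ratios as indicative only.

```python
# Healthy-cluster latencies reported for 15,000 concurrent clients (ms).
healthy_ms = {"median": 10.5, "p95": 23.0}

# Latencies reported after removing one third of the nodes (ms).
degraded_ms = {"median": 13.5, "p95": 61.8}

# Nodes left after removing one third of the 300 data nodes.
DATA_NODES = 300
surviving_nodes = DATA_NODES - DATA_NODES // 3

# Slowdown factor per percentile: degraded latency divided by healthy latency.
slowdown = {name: degraded_ms[name] / healthy_ms[name] for name in healthy_ms}

print(f"surviving data nodes: {surviving_nodes}")
for name, factor in slowdown.items():
    print(f"{name}: {factor:.2f}x the healthy latency")
```

On this reading the median moves far less than the tail, which is the pattern the figures in the paragraph above describe.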
Conclusion
Tuning the workload costs $5 per hour (on a 3 node cluster), and the minimal test required to hit one million writes per second takes 1 hour and 10 minutes at a cost of $330 USD when run in March 2014.
Putting it all together, this means the Google Cloud Platform was able to sustain one million Cassandra writes per second at a cost of $0.07 USD per million writes.
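The post does not spell out how the per-million figure was derived, so the sketch below only shows what the stated rate implies at steady state. It assumes the $0.07 rate applies to sustained throughput of exactly one million writes per second; the names are illustrative.

```python
# Figures stated in the post (March 2014 pricing).
WRITES_PER_SECOND = 1_000_000
COST_PER_MILLION_WRITES_USD = 0.07

# Writes completed in one hour of sustained throughput, in millions.
million_writes_per_hour = WRITES_PER_SECOND * 3600 / 1_000_000

# Hourly cost implied by the stated per-million rate, rounded to cents.
implied_cost_per_hour_usd = round(million_writes_per_hour * COST_PER_MILLION_WRITES_USD, 2)

print(f"{million_writes_per_hour:,.0f} million writes per hour")
print(f"implied cost: ${implied_cost_per_hour_usd:,.2f} per hour")
```

Note that the $330 quoted for the minimal test covers the full 70 minutes, including the time spent provisioning and ramping up, so it is not directly comparable to a steady-state hourly figure.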
-Posted by Ivan Santa Maria Filho, Performance Engineering Lead