Google Cloud Platform Blog
Product updates, customer stories, and tips and tricks on Google Cloud Platform
Aerospike Hits One Million Writes Per Second with just 50 Nodes on Google Compute Engine
December 4, 2014
Today’s guest blogger is Sunil Sayyaparaju, Director of Product & Technology at Aerospike, the open source, flash-optimized, in-memory NoSQL database.
What exactly is the speed of Google? We at
Aerospike
take pride in meeting our challenges of high throughput, consistently low latency, and real time processing that we know will be characteristic of tomorrow’s cloud applications. That’s why after we saw Ivan Santa Maria Filho, Performance Engineering Lead at Google, demonstrate
1 Million Writes Per Second with Cassandra on Google Compute Engine
, our team at Aerospike decided to benchmark our product’s performance on Google Compute Engine and push the boundaries of Google’s speed.
And guess what we found out. Aerospike scaled on Google Compute Engine with consistently low latency, required smaller clusters and was simpler to operate. The combined Aerospike-Google Cloud Platform solution could fuel an entirely new category of applications that must process data in real-time and at scale from the very start, enabling a new class of startups with business models that were not viable economically previously.
Our benchmark used a similar setup as the Cassandra benchmark: 100 Million records at 200 bytes each, debian 7 backports, servers on n1-standard-8 instances with data-in-memory with on-disk persistence on a 500GB non-SSD persistent disks at $0.504/hr, clients on n1-highcpu-8 instances at $0.32/hr, and followed these
steps
. In addition to pure write performance, we also documented pure read and mixed read/write performance. Our findings:
High Throughput for both Reads and Writes
1 Million Writes per Second with just 50 Aerospike servers
1 Million Reads per Second with just 10 Aerospike servers
Consistent low latency, no jitter for both Reads and Writes
7ms median latency for Writes with 83% of writes < 16ms and 96% < 32
1ms median latency for Reads with 80% of reads < 4ms and 96.5% < 16ms
Note that latencies are measured on the server (latencies on the client will be higher)
Unmatched Price / Performance for both Reads and Writes
1 Million Writes Per Second for just $41.20/hour
1 Million Reads Per Second for just $11.44/hour
Aerospike is used as a front edge operational database for a variety of purposes: a session or user context store for real-time bidding, personalization, fraud detection, and real-time analytics. These applications must read and write billions of keys and terabytes, from click-streams to sensor data. Data in Aerospike is replicated synchronously in-memory to ensure immediate consistency and written to disk asynchronously.
Here are details on our experiments with Aerospike on Google Compute Engine. Using 10 server nodes and 20 client nodes, we first examined throughput for a variety of different read and write ratios and documented latency results for those same workloads. Then, we documented how throughput scaled with cluster size, as we pushed a 100% read load and a 100% write load onto Aerospike clusters, going from a 2 nodes to 10.
High Throughput at different Read / Write ratios (10 server nodes, 20 client nodes)
For all read/write ratios, 80% of TPS in this graph is achieved with 50% of the clients (10), adding more clients only marginally improves throughput.
Disk IOPS
depend on size
. We used 500GB non-SSD persistent disks to ensure high IOPS, so the disk is not the bottleneck. For larger clusters, 500GB is a huge over-allocation and can be reduced for lower costs. To achieve this high performance, we did not need to use SSD persistent disks to get higher IOPS.
Consistent Low Latency for different Read / Write ratios (10 server nodes, 20 client nodes)
For a 100% read load, only ~20% of reads took more than 4ms and ~3.5% reads took more than 16ms. This is because reads in Aerospike are only 1 hop (network round trip) away from the client, while writes take 2 network-roundtrips for synchronous replication. Even with a 100% write load, only 16% of writes took more than 32ms. We ran Aerospike on 7 of 8 cores on each server node. Leaving 1 core idle helped latency; if all cores were busy, network latencies increased. Latencies are as measured on the server.
Linear Scalability for both Reads and Writes
This graph shows linear scalability for 100% reads and 100% writes but you can expect linear scaling for mixed workload too. For reads, a 2:1 clients:server ratio was used i.e for a 6 node cluster, we used 12 client machines to saturate the cluster. For writes, a 1:1 client:server ratio was enough because of the lower throughput of writes.
A new generation of applications with mixed read/write data access patterns sense and respond to what users do on websites and on mobile apps across the Internet. These applications perform data writes with every click and swipe, make decisions, record, and respond in real-time.
Aerospike running on Google Compute Engine showcases an example application that requires very high throughput and consistently low latency for both reads and writes. Aerospike processes 1 Million writes per second with just 50 servers, a new level of price and performance for us. You too can follow these
steps
to see the results for yourself and maybe even challenge us.
- Posted by Sunil Sayyaparaju, Director of Product & Technology at Aerospike
No comments :
Post a Comment
Free Trial
Labels
Android
Announcement
api
app engine
Atmosphere Live
bigquery
BigTable
CDN
Cloud Console
Cloud Dataflow
Cloud Datastore
cloud endpoints
Cloud Pub/Sub
Cloud SDK
cloud sql
cloud storage
Cloudera
Compute
Compute Engine
container cluster
customer
Dev Tools
developer tools
developer-insights
Developers
Developers Console
devfests
Disaster Recovery
Encryption Keys
ESG
Event
events
GA
Go Client
Google App Engine
Google Apps
Google BigQuery
Google Cloud Deployment Manager
Google Cloud Networking
Google Cloud Platform
Google Cloud Storage
Google Compute Engine
Google Container Engine
gRPC
hadoop
Hardware
Helium
how to
IO2013
iOS
Kubernetes
Levyx
Local SSD
mapreduce
Media
Nearline
networking
open source
PaaS Solution
Partner
Pricing
Research
round-up
Server
Siggraph
solutions
Startup
Tableau
TCO
Technical
Windows
Wowza
Zync
Archive
2015
Nov
Oct
Sep
Aug
Jul
Jun
May
Apr
Mar
Feb
Jan
2014
Dec
Nov
Oct
Sep
Aug
Jul
Jun
May
Apr
Mar
Feb
Jan
2013
Dec
Nov
Oct
Sep
Aug
Jul
Jun
May
Apr
Mar
Feb
Jan
2012
Dec
Nov
Oct
Sep
Aug
Jul
Jun
May
Apr
Mar
Feb
Jan
2011
Dec
Nov
Oct
Sep
Aug
Jul
Jun
May
Apr
Mar
Feb
Jan
2010
Dec
Oct
Sep
Aug
Jul
Jun
May
Apr
Mar
Feb
Jan
2009
Dec
Nov
Oct
Sep
Aug
Jul
Jun
May
Apr
Mar
Feb
Jan
2008
Dec
Nov
Oct
Sep
Aug
Jul
Jun
May
Apr
Feed
Technical questions? Check us out on
Stack Overflow
.
Subscribe to
our monthly newsletter
.
Follow @googlecloud
No comments :
Post a Comment