Google Cloud Platform Blog
Product updates, customer stories, and tips and tricks on Google Cloud Platform
Cloud Spin, Part 3: processing video using Google Cloud Platform services
October 2, 2015
Cloud Spin Part 1 and Part 2 introduced the Google Cloud Spin project, an exciting demo built for Google Cloud Platform Next, and how we built the mobile applications that orchestrated 19 Android phones to record simultaneous video.
Now the last step is to retrieve the videos from each phone, find the frame corresponding to an audio cue in each video, and compile those frames into a 180-degree animated GIF. This post explains the design decisions we made for the Cloud Spin back-end processing and how we built it.
The following figure shows a high-level view of the back-end design:
1. A mobile app running on each phone uploads the raw video to a Google Cloud Storage bucket.
Video taken by one of the cameras
2. An extractor process running on a Google Compute Engine instance finds and extracts the single frame corresponding to the audio cue.
3. A stitcher process running on an App Engine Managed VM combines the individual frames into a video that pans across a 180-degree view of an instant in time, and then generates a corresponding animated GIF.
How we built the Cloud Spin back-end services
After we’d designed what the back-end services would do, there were several challenges to solve as we built them. We had to figure out how to:
Store large amounts of video, frame, and GIF data
Extract the frame in each video that corresponds to the audio cue
Merge frames into an animated GIF
Make the video processing run quickly
Storing video, frame, and GIF data
We decided the best place to store incoming raw videos, extracted frames, and the resulting animated GIFs was Google Cloud Storage. It’s easy to use, integrates well with mobile devices, provides strong consistency, and automatically scales to handle large amounts of traffic, should our demo become popular.
We also configured the Cloud Storage buckets with Object Change Notifications that kicked off the back-end video processing when the Recording app uploaded new video from the phones.
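The notification handler itself isn't shown in the post. As a rough sketch, an Object Change Notification delivers the changed object's resource as JSON, from which a handler can build the path of the video to process (the bucket and object names below are made up for illustration):

```python
import json

def handle_notification(payload: str) -> str:
    """Parse an Object Change Notification body and return the
    gs:// path of the uploaded video to hand off for processing."""
    obj = json.loads(payload)  # body carries the changed object's resource JSON
    return "gs://{}/{}".format(obj["bucket"], obj["name"])

# Hypothetical notification body; "bucket" and "name" are fields of the
# Cloud Storage object resource.
notification = '{"bucket": "cloud-spin-videos", "name": "session42/camera07.mp4"}'
print(handle_notification(notification))
# gs://cloud-spin-videos/session42/camera07.mp4
```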
Extracting a frame that corresponds to an audio cue
Finding the frame corresponding to the audio cue, or beep, poses challenges. Audio and video are recorded with different qualities and at different sample rates, so it takes some work to match them up. We needed to find the video frame that matched the noisiest section of the audio. To do so, we grouped the audio samples into frame intervals, each interval containing the audio that roughly corresponds to a single video frame. We computed the average noise of each interval by calculating the average of the squared amplitudes of its samples. Once we identified the interval with the largest average noise, we extracted the corresponding video frame as a PNG file.
We wrote the extractor process in Python and used MoviePy, a module for video editing that uses the FFmpeg framework to handle video encoding and decoding.
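The loudest-interval search is easy to sketch with NumPy on synthetic audio samples. This is an illustrative reconstruction rather than our MoviePy-based extractor, but MoviePy can expose a clip's audio as exactly this kind of sample array:

```python
import numpy as np

def loudest_frame_index(samples, sample_rate, fps):
    """Return the index of the video frame whose audio interval has the
    highest mean squared amplitude, i.e. the frame containing the beep."""
    samples_per_frame = sample_rate // fps
    n_frames = len(samples) // samples_per_frame
    # Group audio samples into one interval per video frame.
    intervals = np.reshape(samples[:n_frames * samples_per_frame],
                           (n_frames, samples_per_frame))
    # Average noise of each interval = mean of the squared amplitudes.
    noise = np.mean(intervals.astype(float) ** 2, axis=1)
    return int(np.argmax(noise))

# Synthetic example: one second of quiet noise with a loud 1 kHz beep
# lasting exactly one video frame, starting at frame 12.
np.random.seed(0)
rate, fps = 44100, 30
spf = rate // fps                      # samples per video frame
audio = 0.01 * np.random.randn(rate)
t = np.arange(spf) / rate
audio[12 * spf:13 * spf] += np.sin(2 * np.pi * 1000 * t)
print(loudest_frame_index(audio, rate, fps))  # 12
```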
Merging frames into an animated GIF
The process of generating an animated GIF from a set of video frames can be done with only four FFmpeg commands, run by a Bash script. First we generate a video by stitching all the frames together in order, then extract a color palette, and finally use that palette to generate a lower-resolution GIF to upload to Twitter.
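The post doesn't list the exact commands, but the standard FFmpeg palette workflow looks roughly like the following sketch (frame naming, frame rate, and output width are placeholder choices):

```shell
# 1. Stitch the 19 extracted frames (frame00.png ... frame18.png) into a video.
ffmpeg -framerate 10 -i frame%02d.png -pix_fmt yuv420p pan.mp4

# 2. Generate an optimized 256-color palette from the video.
ffmpeg -i pan.mp4 -vf "scale=480:-1:flags=lanczos,palettegen" palette.png

# 3. Render a lower-resolution GIF using that palette.
ffmpeg -i pan.mp4 -i palette.png \
    -filter_complex "scale=480:-1:flags=lanczos[x];[x][1:v]paletteuse" spin.gif
```

Generating a dedicated palette first keeps the GIF's 256-color limit from visibly degrading the image, which matters for a shareable demo artifact.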
Making the video processing run quickly
Processing the videos one by one on a single machine would take longer than we wanted, forcing Next participants to wait to see their animated GIF. Cloud Spin takes 19 videos for each demo, one from each phone in the 180-degree arc. If extracting the synchronized frame from each video takes 5 seconds, and merging the frames takes 10 seconds, with serial processing the time between taking the demo shot and the final animated GIF would be 19 × 5s + 10s = 105s, almost two minutes!
We can make the process faster by parallelizing the frame extraction. If we use 19 virtual machines, one to process each video, the time between the demo shot and the animated GIF is only 15 seconds. To make this improvement work, we had to modify our design to handle synchronization of multiple machines.
Parallelizing the workload
We developed the extraction and stitching processes as independent applications. This made it easy to parallelize the frame extraction: we can run 19 extractors and one stitcher, each as a Docker container on Google Compute Engine.
But how do we make sure that each video is processed by one, and only one, extractor?
Google Cloud Pub/Sub is a messaging system that solves this problem in a performant and scalable way. Using Cloud Pub/Sub, we can create a communication channel that is loosely coupled across subscribers. This means that the extractor and stitcher applications interact through Cloud Pub/Sub with no assumptions about the underlying implementation of either application, which makes future evolutions of the infrastructure easier to implement.
The preceding diagram shows two Cloud Pub/Sub topics that act as processing queues for the extractor and stitcher applications. Each time a mobile app uploads a new video, Cloud Pub/Sub publishes a message on the videos topic. The extractors subscribe to this topic. When a new message is published, the first extractor to pull it down holds a lease on the message, during which it processes the video to extract the frame corresponding to the audio cue. If the processing completes successfully, the extractor acknowledges the videos message and publishes a new message to the frames topic. If the extractor fails, its lease on the videos message expires and Cloud Pub/Sub redelivers the message, so another extractor can handle it.
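The lease-and-acknowledge behavior is what guarantees one-and-only-one processing. A toy in-memory model (not the real Cloud Pub/Sub client) of the semantics the extractors rely on:

```python
import time

class ToyTopic:
    """Toy model of Pub/Sub pull semantics: a pulled message is invisible
    to other subscribers until it is acknowledged or its lease expires."""

    def __init__(self, lease_s=1.0):
        self.lease_s = lease_s
        self.messages = {}   # message id -> (payload, leased_until)
        self.next_id = 0

    def publish(self, payload):
        self.messages[self.next_id] = (payload, 0.0)
        self.next_id += 1

    def pull(self):
        now = time.monotonic()
        for mid, (payload, leased_until) in self.messages.items():
            if leased_until <= now:  # not currently leased by anyone
                self.messages[mid] = (payload, now + self.lease_s)
                return mid, payload
        return None

    def ack(self, mid):
        del self.messages[mid]       # acknowledged: gone for good

videos = ToyTopic(lease_s=0.1)
videos.publish("session42/camera07.mp4")
mid, video = videos.pull()   # extractor A leases the message
print(videos.pull())         # None: no other extractor can take it
time.sleep(0.2)              # extractor A crashes; its lease expires
mid, video = videos.pull()   # another extractor picks the message up
videos.ack(mid)              # processing succeeded; acknowledge
print(videos.pull())         # None
```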
When a message is published on the frames topic, the stitcher pulls it down and waits until all of the frames of a single session are ready to be stitched together into an animated GIF. For the stitcher application to detect when it has all the frames, it needs a way to check the real-time status of every frame in a session.
Managing frame status with Firebase
Part 2 discussed how we managed orchestrating the phones to take simultaneous video using a Firebase database that provides real-time synchronization.
We also used Firebase to track the process of extracting the frame from each camera that corresponds to the audio cue. To do so, we added a status field to each extracted frame in the session as shown in the following screenshot.
When an Android phone takes its video, it sets this status to RECORDING, then to UPLOADING when it uploads the video to Cloud Storage. The extractor process sets the status to READY when the frame matching the audio cue has been extracted. When all of the frames in a session are READY, the stitcher process combines the extracted frames into an animated GIF, stores the GIF in Cloud Storage, and records its path in Firebase.
Having the status stored in Firebase made it possible for us to create a dashboard that showed, in real time, each step of the processing of the video taken by the Android phones and the resulting animated GIF.
We finished development of the Cloud Spin back end in time for the Google Cloud Platform Next events and, together with the mobile apps that captured the video, ran a successful demo.
You can find more Cloud Spin animations on our Twitter feed: @googlecloudspin. We plan to release the complete demo code; watch Cloudspin on GitHub for updates.
This is the final post in the three-part series on Google Cloud Spin. I hope you had as much fun discovering this demo as we did building it. Our goal was to demonstrate the possibilities of Google Cloud Platform in a fun way, and to inspire you to build something awesome!
- Posted by Francesc Campoy Flores, Google Cloud Platform