Sync or Sink: Ensuring Real-Time Big Data Availability
July 23, 2012
By Elad Efraim, eXelate CTO
Today’s hyper-connected consumers can search options on the Web and make informed buying decisions within minutes, if not seconds. This puts greater pressure than ever on digital marketers and online advertisers to identify and deliver the most effective online ad for each individual site visitor, at any given moment and at the right price.
At the core of these real-time business decisions is data, and lots of it, measured in billions of objects and terabytes of storage. Moreover, information and insights have to be delivered in milliseconds, and downtime is not an option.
At eXelate, we understand the challenge firsthand. We serve marketers by reliably processing some 60 billion real-time transactions per month. These transactions are based on 20 billion unique data points and are actionable across 400 million online consumers worldwide.
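To put that monthly volume in perspective, a back-of-the-envelope calculation converts it into an average per-second rate. This is only a rough sketch: the 30-day month and the evenly spread load are assumptions made for illustration, and real traffic peaks well above the average.

```python
# Rough estimate only. The 30-day month and uniform traffic are assumptions.
TRANSACTIONS_PER_MONTH = 60_000_000_000  # monthly figure quoted in the article
SECONDS_PER_MONTH = 30 * 24 * 60 * 60    # assumed 30-day month


def average_rate_per_second(transactions: int, seconds: int) -> float:
    """Return the mean number of transactions handled per second."""
    return transactions / seconds


avg_tps = average_rate_per_second(TRANSACTIONS_PER_MONTH, SECONDS_PER_MONTH)
print(f"Average load: about {avg_tps:,.0f} transactions per second")
```

Under those assumptions the average works out to roughly 23,000 transactions per second, sustained around the clock.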
To ensure the real-time delivery of this actionable data to digital publishers and marketers, we have set up four data centers that serve two important purposes. First, they physically bring data closer to our customers to minimize latency. Second, they manage fully redundant data sets to ensure business continuity in the event that a data center fails.
The Big Data Challenge
One of our core decisions in deploying the data centers was which database would power our real-time transactions. Working with Internet-driven big data presents challenges that traditional SQL-based relational databases are not well suited to handle. Therefore, we focused our evaluation on a number of the NoSQL databases specifically designed to manage the combination of structured and unstructured data that comes across the Web.
As we looked at NoSQL databases, three factors played into our decision. First, we considered scale. Second, we required real-time performance, including write performance roughly equivalent to read performance. Third, we needed real-time replication across our four data centers. The Citrusleaf real-time NoSQL database with cross data center replication met all three requirements.
An Always-On Solution
Today, the Citrusleaf database is deployed on standard Intel-based systems at each of our four global data centers: three across the US and one in Europe. The database at each data center holds approximately 4 TB of structured and unstructured data, consists of a single cluster with six to eight nodes, and is a fully redundant solution.
Replication enables fast and reliable synchronization across the Citrusleaf databases in our four data centers, so we have an always-on solution. In the event of one data center’s failure, any of the other three can pick up the slack while protecting our customers against any perceptible performance slowdowns.
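The failover behavior described above can be illustrated with a minimal routing sketch: send each request to the lowest-latency data center that is currently healthy. This is not Citrusleaf's API or eXelate's actual routing logic. The function name, the data center names, and the latency figures are all hypothetical, chosen only to mirror the three-US-plus-one-Europe layout.

```python
from typing import Dict, Set


def pick_data_center(latency_ms: Dict[str, float], healthy: Set[str]) -> str:
    """Return the healthy data center with the lowest measured latency."""
    candidates = {dc: ms for dc, ms in latency_ms.items() if dc in healthy}
    if not candidates:
        raise RuntimeError("no healthy data center available")
    return min(candidates, key=candidates.get)


# Hypothetical names and latencies as seen by one client (illustrative only).
latency_ms = {"us-east": 12.0, "us-central": 28.0, "us-west": 45.0, "europe": 95.0}
all_up = set(latency_ms)

primary = pick_data_center(latency_ms, all_up)               # normal operation
fallback = pick_data_center(latency_ms, all_up - {primary})  # primary has failed
print(f"primary={primary}, fallback={fallback}")
```

Because every data center holds a fully replicated data set, the fallback choice can serve the same request without first fetching data from the failed site.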
A Winning Strategy
Deploying databases with replication in multiple data centers is a sophisticated process, but the Citrusleaf support team helped us optimize our deployment. The architecture allows us to handle business continuity and reliability across our four data centers seamlessly, and with Citrusleaf, we can expand our deployment to a new data center in less than a week.
We are now able to deliver the 24/7 real-time response that marketers and online advertisers require while maintaining the flexibility to support our expanding customer base and develop new services.
# # #
Elad Efraim, eXelate co-founder and CTO, is responsible for eXelate’s overall technology direction and for translating market demand and customer experience into technology solutions that simplify and scale data transactions. Prior to eXelate, Elad was a product manager at Oridian, where he helped build the company into a global online advertising powerhouse. Previously, he led the development of the mobile product at TopImageSystems and was the lead system engineer at eMobilis. Elad served as the head of the programming education department at the IDF Computer Studies Academy (Mamram) and holds a B.A. in logistics and computer science from Bar-Ilan University.