Electronic Ad Broker Uses NoSQL Database Engine for Instant Ad Placement

In the last five years the advertising sales market has changed radically, Dag Liodden, co-founder and CTO at Tapad, said in a Wikibon Peer Incite meeting Tuesday, November 27. Just five years ago the online ad sales market was basically static, with ads developed for specific sites based on general profiles of the user population and placed based in part on a user’s recent activity. So if you had been searching for information on new cars on Google, for instance, you might see car ads the next time you checked your Gmail account. That was the extent of the customization.

Today, Liodden said, Tapad places ads based on a number of variables, including what device the consumer is using at that moment, where the consumer is, the consumer’s demographics, and what ads the consumer has seen recently. It also takes into account the specifications of each ad campaign, such as whether that consumer has already viewed the ad more than a certain number of times in a specific time period. Based on those and other data, the system determines this consumer’s value at this moment to any number of ad campaigns, and the ad agencies behind those campaigns enter bids for the ad space based on those computations. The ad exchange then places the ad of the highest bidder.

All of this must happen within 100 milliseconds!
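The final step of that process, placing the ad of the highest bidder, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Bid` type, the campaign names, and the bid amounts are hypothetical and do not come from Tapad or any real ad exchange.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Bid:
    campaign: str   # hypothetical campaign name
    amount: float   # hypothetical bid price for this one ad space


def pick_winner(bids: List[Bid]) -> Optional[Bid]:
    """Return the highest bid, or None if no campaign bid at all."""
    if not bids:
        return None
    return max(bids, key=lambda bid: bid.amount)


# Three campaigns value the same consumer differently at this moment.
bids = [Bid("cars", 2.10), Bid("travel", 3.45), Bid("phones", 1.80)]
winner = pick_winner(bids)
print(winner.campaign)
```

In a real exchange the interesting work is in computing each bid from the consumer data; the selection itself is the simple part.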

Tapad did not invent this incredibly fast, totally automated custom ad placement environment by itself. However, it added a new degree of sophistication by being the first ad exchange to track users across multiple devices and to record, in real time, which ads a user views and what actions the user takes in response to an ad across all those devices. Ad campaigns often include limits on the number of times an ad should be shown to an individual within a given time, and Tapad is the first ad exchange that can provide that information. Other ad exchanges now use Tapad to track this data for them.
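To make the idea of a cross-device view limit concrete, here is a minimal Python sketch. It assumes that each device has already been linked to a single user and that views are counted in a sliding time window; the class, method names, and identifiers are hypothetical and are not a description of how Tapad actually implements this.

```python
from collections import defaultdict, deque


class FrequencyCap:
    """Limit views per user and campaign within a sliding time window."""

    def __init__(self, max_views, window_seconds):
        self.max_views = max_views
        self.window_seconds = window_seconds
        self.device_to_user = {}          # device id -> user id
        self.views = defaultdict(deque)   # (user, campaign) -> view timestamps

    def link_device(self, device_id, user_id):
        self.device_to_user[device_id] = user_id

    def _recent_views(self, device_id, campaign, now):
        # Unlinked devices are treated as their own user.
        user = self.device_to_user.get(device_id, device_id)
        timestamps = self.views[(user, campaign)]
        # Drop views that have aged out of the window.
        while timestamps and now - timestamps[0] >= self.window_seconds:
            timestamps.popleft()
        return timestamps

    def may_show(self, device_id, campaign, now):
        return len(self._recent_views(device_id, campaign, now)) < self.max_views

    def record_view(self, device_id, campaign, now):
        self._recent_views(device_id, campaign, now).append(now)


cap = FrequencyCap(max_views=2, window_seconds=3600)
cap.link_device("phone-1", "user-a")
cap.link_device("tablet-1", "user-a")
cap.record_view("phone-1", "cars", now=0)
cap.record_view("tablet-1", "cars", now=10)

# Both views count against the same user, so the cap is reached.
show_now = cap.may_show("phone-1", "cars", now=20)
# Over an hour later both views have expired from the window.
show_later = cap.may_show("tablet-1", "cars", now=3700)
```

Without the device-to-user mapping, the phone and the tablet would each carry a separate count and the same person could see the ad twice as often as the campaign intended.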

From an IT standpoint, one of the largest challenges, Liodden said, has been building an infrastructure that can support random, real-time reads of and writes to huge numbers of data sets, each representing an individual consumer, in millisecond time frames. Spinning disk, which, he said, is pretty efficient at sequential reads and writes, breaks down very quickly in a random-access situation such as this. RAM disk can support the speed, but has two major problems that make it impractical:

  1. Building a RAM disk large enough for the Tapad database, which is over 1.5 Tbytes and growing, is very expensive.
  2. Reloading that database from disk, if a node loses power or a new node is added, takes too long.

As a result, Tapad has turned to NAND flash in a big way. Its total NAND installation is in the range of 3.5 Tbytes, which allows mirroring so that the system can drop two nodes without losing any data. Liodden knows this because it happened recently, when someone unplugged the wrong piece of equipment.

Physical storage is part of the issue, but Tapad also needs a database that is fast enough to serve all those reads and take in all those writes in real time. This requires a stripped-down, specialized system. An RDBMS, for instance, is too encumbered by all its features to handle this specific application quickly enough. Hadoop is similarly not fast enough.

Tapad found its answer in Aerospike, a NoSQL database that keeps the keys in memory but the values on SSDs. This provides the speed that Tapad needs for reads and writes and for the specific kinds of analysis it performs.
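The general idea of keeping keys in memory and values on slower storage can be illustrated with a toy Python sketch. It is not Aerospike's API or internal design: the `HybridStore` class is hypothetical, an ordinary file stands in for the SSD, and the sketch ignores replication, deletion, and space reclamation.

```python
import os
import tempfile


class HybridStore:
    """Keys and value locations in RAM; value bytes in an append-only file."""

    def __init__(self, path):
        self.index = {}                 # key -> (offset, length), held in memory
        self.file = open(path, "a+b")   # stand-in for SSD storage

    def put(self, key, value):
        self.file.seek(0, os.SEEK_END)
        offset = self.file.tell()
        self.file.write(value)
        self.file.flush()
        # Point the key at the newest copy; older copies are simply ignored.
        self.index[key] = (offset, len(value))

    def get(self, key):
        location = self.index.get(key)
        if location is None:
            return None                 # answered from memory, no storage read
        offset, length = location
        self.file.seek(offset)
        return self.file.read(length)   # a single read at a known position

    def close(self):
        self.file.close()


path = os.path.join(tempfile.mkdtemp(), "values.dat")
store = HybridStore(path)
store.put("user-a", b"profile-v1")
store.put("user-b", b"other-profile")
store.put("user-a", b"profile-v2")
latest = store.get("user-a")
missing = store.get("nobody")
store.close()
```

Because the lookup of where a value lives never touches storage, each read costs at most one random access, which is the access pattern flash handles far better than spinning disk.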

Liodden is quick to say that Aerospike is not a “silver bullet” replacement for all other database technologies, and that SSDs are not a practical replacement for disk for all data. Big data analysis, for instance, is better done using disk, which is cost-effective for handling very large amounts of data that are not actively transactional.

As for Aerospike versus an RDBMS or Hadoop, Liodden says, “Today you cannot buy just a single system; you have to look at your use cases, your data growth, and other needs and pick the technology that best fits. The growth of NoSQL solutions is based on the proliferation of specific use cases where they work well, but you also will need RDBMS and Hadoop for applications where they provide the best solution.”

The twice-a-month Wikibon Peer Incite meetings present unique opportunities to hear top IT professionals from the user, rather than vendor, community speak about the challenges they face and the solutions they use. These frank discussions, with opportunities for attendees to ask questions and add comments, are open without charge to interested IT professionals. To receive notices of upcoming meetings, IT professionals are invited to register for a free membership at Wikibon (www.wikibon.org).