UPDATED 16:57 EDT / MAY 31 2012

Google’s F1 Brings NoSQL Scale To Relational Databases

Choosing between MySQL familiarity and NoSQL scalability seems like a binary decision. But Google’s F1 – the new relational database management system (RDBMS) underpinning several of Google’s customer-facing, business-critical advertising services – lays claim to combining the best of both worlds.

The F1 system is detailed in a paper/presentation entitled “F1 – The Fault-Tolerant Distributed RDBMS Supporting Google’s Ad Business,” co-authored by several Googlers and published earlier this month.

“F1 implements rich relational database features, including a strictly enforced schema, a powerful parallel SQL query engine, general transactions, change tracking and notification, and indexing, and is built on top of a highly distributed storage system that scales on standard hardware in Google data centers,” as the abstract puts it.

This comes at a cost of higher write latencies compared to Google’s legacy MySQL deployments. But thanks to F1’s distributed nature, it was apparently relatively simple to deploy it underneath those aforementioned ad services with no downtime. Both the simplicity and the lack of downtime are critical, given that Google’s ad business handles tens of terabytes replicated across thousands of machines over any given 24-hour period, as per the presentation.

The presentation describes the underlying architecture of F1 better than I could, but the general idea is that it was developed alongside Spanner, Google’s new low-level storage system and the descendant of BigTable. In addition to a stateless server and a pool of workers for query execution, F1 consists of sharded Spanner servers, with data stored in Google File System (GFS) and in memory.

F1 uses a relational schema that can run SQL and MapReduce in parallel. The system is replicated across five data centers to ensure availability, with those replicas at least 100ms apart in case of regional disaster.
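
To see why spreading replicas that far apart tends to raise write latency, here is a rough Python sketch. It assumes, purely for illustration, that a write commits once a simple majority of replicas has acknowledged it; the article and presentation summary above do not spell out F1's actual protocol, and the function name and round-trip figures are hypothetical.

```python
# Illustrative sketch only. The majority-quorum rule and all numbers below
# are assumptions for the example, not details taken from the F1 presentation.

def quorum_commit_latency_ms(round_trips_ms, quorum=None):
    """Return the time until `quorum` replicas have acknowledged a write.

    round_trips_ms: round-trip time from the writer to each replica, in ms.
    quorum: number of acknowledgements required (defaults to a simple majority).
    """
    if quorum is None:
        quorum = len(round_trips_ms) // 2 + 1  # simple majority
    if not 1 <= quorum <= len(round_trips_ms):
        raise ValueError("quorum must be between 1 and the number of replicas")
    # The write commits when the quorum-th fastest acknowledgement arrives.
    return sorted(round_trips_ms)[quorum - 1]


# Hypothetical round-trip times (ms) from one writer to five replicas.
rtts = [1, 100, 110, 150, 200]
print(quorum_commit_latency_ms(rtts))            # majority of five = 3 acks
print(quorum_commit_latency_ms(rtts, quorum=1))  # a single nearby replica
```

Under these made-up numbers, a majority write waits for the third-fastest replica, while a single local copy answers almost immediately. That gap is the kind of trade-off the presentation describes when it mentions higher write latencies than the legacy MySQL setup.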

The bottom line is that Google found its own compromise that made internal developers happy even as it enabled greater operational scale. Developers keep their SQL queries while gaining a level of availability and fault tolerance that MySQL can’t match.

It seems like the best of both worlds. But reading this presentation over, I suspect that if it were this easy, everybody would be doing it. Given the massive growth of the NoSQL ecosystem, my guess is that Google may have hit on an innovative solution for its own use case – but such use cases may be limited.

Of course, Google isn’t the first to hit on this kind of hybridization: Drawn To Scale has a similar, SQL-friendly big data offering, but built on a Hadoop core rather than F1’s GFS base. Rainstor has taken a similar tack. An Oracle/Cloudera partnership also facilitates a more roundabout route to the same by way of a connector between Oracle databases and Cloudera’s distribution of Hadoop.

