Drawn to Scale, the maker of the proprietary “relational-like” distributed database Spire, announced today that it has raised $925,000 in seed funding from three investors: RTP Ventures, a new $750 million fund; IA Ventures, which has its own big data fund; and SK Ventures. Until now, the company has been self-funded.
Spire, currently in private beta, aims to provide a highly scalable, real-time database built on top of Hadoop without giving up the familiarity of SQL.
CEO and co-founder Bradford Stephens, after stints as a political campaign manager and heavy metal musician, worked for Microsoft on SQL Server and later as a developer for the business intelligence vendor Visible Technology. He started Drawn to Scale about two years ago because he kept seeing how the database acted as a bottleneck for so many applications and business processes. “I was struck by how often the database is the problem: it goes down and breaks the site, or the database gets unusably slow after you add too much data.”
He was joined in his mission to build a scalable but usable database by Ryan Rawson, who was one of the primary architects of HBase and has a background at Google, Amazon.com and StumbleUpon, and by Alex Newman, who used to work for Cloudera.
Spire works like a traditional RDBMS with columns, tables and joins, but is natively distributed. Stephens says that, unlike batch processing with Hadoop, it returns queries in milliseconds instead of days. Spire sits on Hadoop and can run MapReduce jobs directly within Spire itself, so no connector is needed. But unlike HBase or other key-value stores, Spire supports a large subset of SQL as well as full-text search.
In addition to supporting SQL, Spire also diverges from many NoSQL databases, and from search engines like Lucene and Solr, by providing an indexing system that will only search the nodes that actually store the relevant information. This speeds up queries significantly.
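Spire's indexing system is proprietary and its internals are not described here, so the following is only a toy sketch of the general idea: a query that consults a routing index contacts fewer nodes than one that asks every node. The `Cluster` class, its method names and the modulo placement rule are all hypothetical and chosen purely for illustration.

```python
from collections import defaultdict


class Cluster:
    """Toy cluster (hypothetical, not Spire's design): each node holds (doc_id, text) rows."""

    def __init__(self, num_nodes):
        self.nodes = {n: [] for n in range(num_nodes)}
        # Routing index: term -> ids of the nodes that store a row containing it.
        self.term_to_nodes = defaultdict(set)

    def insert(self, doc_id, text):
        # Assumed placement rule: integer doc ids are spread across nodes by modulo.
        node_id = doc_id % len(self.nodes)
        self.nodes[node_id].append((doc_id, text))
        for term in text.lower().split():
            self.term_to_nodes[term].add(node_id)

    def _scan(self, node_id, term):
        # Stand-in for the work a single node does to answer the query.
        term = term.lower()
        return [d for d, text in self.nodes[node_id] if term in text.lower().split()]

    def search_all_nodes(self, term):
        """Scatter-gather: every node is asked, whether or not it holds a match."""
        contacted = list(self.nodes)
        hits = [d for n in contacted for d in self._scan(n, term)]
        return sorted(hits), len(contacted)

    def search_routed(self, term):
        """Routed: only the nodes the index lists for this term are asked."""
        contacted = sorted(self.term_to_nodes.get(term.lower(), ()))
        hits = [d for n in contacted for d in self._scan(n, term)]
        return sorted(hits), len(contacted)


cluster = Cluster(num_nodes=4)
cluster.insert(0, "smart grid capacity report")
cluster.insert(1, "twitter firehose sample")
cluster.insert(5, "grid load forecast")
cluster.insert(6, "media archive index")

# Each call returns (matching doc ids, number of nodes contacted).
print(cluster.search_all_nodes("grid"))
print(cluster.search_routed("grid"))
```

Both searches return the same documents; the routed version simply contacts fewer nodes, which is the kind of saving the article attributes to Spire's index.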
Stephens says that the company is contributing to the Hadoop and HBase open source projects, but is keeping its own IP proprietary.
The beta customers are using Spire for applications such as running smart grids, where large amounts of machine-generated data are analyzed to determine how to allocate capacity to different locations, and for analyzing the Twitter firehose. Stephens says there are also two large media companies using the solution to power content management systems that can quickly provide search results from massive archives.
Stephens says Drawn to Scale has tested Spire only with “several hundred” terabytes, and most customers are using just a few terabytes. The platform is therefore not yet proven at multi-petabyte scale, though it’s not clear how many customers would need that level of scale.
The company currently has five beta customers, and Stephens says it will soon have room for several more. He says the company is shooting for a general release in Q3.
Drawn to Scale is counting on its SQL support to compete with BigCouch, which will soon be merged into Apache CouchDB and provides fast indexing on a distributed database along with integrated Lucene search (see theCube’s interview with Cloudant developer Sam Bisbee for more). What’s really compelling is that, although Spire can’t work as a drop-in replacement for an RDBMS, the learning curve should be much shallower for developers and administrators, which should ultimately cut down on the need for professional services to install, configure and maintain big data solutions. Check out Zettaset CEO Jim Vogt’s interview on theCube for another vendor’s take on this tension between vendors that want to sell a product and vendors that sell a service.
Drawn to Scale will also compete with a variety of “NewSQL” databases that attempt to improve the scalability of traditional RDBMSes. RainStor recently announced a partnership with HStreaming to add real-time features to RainStor’s SQL-on-Hadoop offering. Spire could also be compared with columnar relational databases such as those from HP Vertica or EMC Greenplum, both of which are trying to harness Hadoop for massive scalability while providing fast query results and a relatively straightforward interface to data analysts. Keep watching this space for more developments.
Leave us a comment and let us know what you think about usability and the development of real-time capabilities in big data.