UPDATED 12:30 EDT / AUGUST 02 2016

BIG DATA

Attic Labs debuts a new kind of database for a Big Data world

Big Data is a big mess. Hundreds of versions of the same data, from government census data to company financial results to sports statistics, can be floating around the Internet and internal company databases at the same time. A lot of it is wrong, but it’s hard to tell what’s correct and what isn’t.

An ambitious San Francisco startup called Attic Labs is attempting to solve those issues with a new kind of database it’s announcing today. Noms is intended to be a decentralized, open-source database that provides a widely distributed repository of data for the Internet, something that other databases, from Oracle Corp.’s to countless open-source alternatives, don’t offer.

“This is a missing layer of the Internet,” Attic Labs Chief Executive Officer Aaron Boodman (pictured, left) said in an interview. The lack of such a layer is a problem for everyone from companies to government agencies to app developers that need to “publish” data such as customer records, stock trades, and retail sales reports for use by various departments or even consumers.

Future of data sharing

Noms aims to allow anyone to store lots of structured data such as numbers and dates, move it around and collaborate on it. “It’s the future of how people will share data,” Jerry Chen, a partner with Attic Labs’ main venture capital backer, Greylock Partners, said in an interview.

Let’s say a retailer publishes stats on sales and the like each week to relevant executives, salespeople and others with a need for the data. The finance team wants to derive same-store sales from the data, the procurement department wants to see how many of each dress sold, and marketing wants to know whether that 20 percent discount worked in four particular states. Currently, this often involves manual requests for data, including emailing spreadsheet files, which quickly creates multiple versions of the data and introduces mistakes.

By contrast, with Noms, the raw store sales data would be dumped into the database, and then the finance team, say, could subscribe to the data and use it to derive same-store sales without altering the data itself. That means the underlying data doesn’t get duplicated or corrupted. That’s the promise, anyway. Attic Labs will have to prove it has the goods if it wants to compete in a crowded market full of other database software.

The idea of depositing masses of data without knowing in advance how each entity might want to use it is similar in spirit to the “data lakes” used with Hadoop and other open-source Big Data software, said Chen. Noms is another database on top of Hadoop alongside Hive or MongoDB or Cloudera Impala, to name a few. But unlike those databases, Noms is intended to provide versioning and archiving, and to allow the data in it to be distributed widely across the Net.

Following Git’s lead

In 2005, Boodman created a popular Firefox browser extension called Greasemonkey that enabled people to mash up website services in a fashion that prefigured mobile apps. He later joined Google’s Chrome team. He started tinkering with the idea of a new kind of database a couple of years ago after using Camlistore, an open-source personal storage system.

He thought Camlistore could use a new kind of database that worked like Git, a software version control system popularized by the software project hosting service GitHub. So he joined with longtime friend Rafael Weinstein (right), a cofounder of AvantGo Inc. and a longtime Google staff engineer working on Chrome extensions, to make it. Weinstein is now Attic Labs’ chief technology officer.

Git quickly took over the software world, Boodman said, because its decentralized nature allowed code to move easily among various computers, groups and individuals. Likewise, he hopes, Noms will make this possible for all kinds of structured data on a much larger scale.

The database being released today is a test version, with a commercial version expected in six to nine months. Boodman said he anticipates that the company could have a business model similar to that of GitHub, which charges monthly fees for private repositories of code. Chen also said there are several possibilities like those already established by open-source software companies such as Cloudera or Red Hat, including offering the Noms software as a product and service, selling versions to large corporate enterprises or providing support.

Attic Labs has raised an $8.1 million Series A round led by Greylock and including Harrison Metal and several angel investors.

And the product’s name? “Nom,” which means “name” in French, is intended to imply that the software’s “content addressing” means every piece of data gets a unique name. “Nom” is also used to indicate an eating sound, in this case to connote that the database is an “append-only” system that only consumes data and doesn’t expel it.
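The two ideas behind the name can be illustrated with a minimal Python sketch. This is not Noms’ actual API or storage format: the `AppendOnlyStore` class, its `put` and `get` methods, the choice of SHA-256 as the naming hash, and the sample sales records are all assumptions made for illustration.

```python
import hashlib
import json


class AppendOnlyStore:
    """Toy content-addressed store (hypothetical, not the Noms API)."""

    def __init__(self):
        self._chunks = {}

    def put(self, value):
        # Serialize deterministically so equal values always get the same name.
        data = json.dumps(value, sort_keys=True).encode("utf-8")
        # Content addressing: the name is derived from the data itself.
        name = hashlib.sha256(data).hexdigest()
        # Append-only: an existing entry is never overwritten or deleted.
        self._chunks.setdefault(name, data)
        return name

    def get(self, name):
        return json.loads(self._chunks[name].decode("utf-8"))

    def __len__(self):
        return len(self._chunks)


store = AppendOnlyStore()
# Made-up sales records, purely for illustration.
v1 = store.put({"store": "SF-01", "week": 31, "sales": 1200})
# A correction is stored as a new value that points back to the old one.
v2 = store.put({"store": "SF-01", "week": 31, "sales": 1250, "parent": v1})
# Storing identical content again yields the same name and no duplicate.
same = store.put({"week": 31, "sales": 1200, "store": "SF-01"})
```

In this sketch, a corrected record does not replace the earlier one. Both versions stay retrievable by name, which is roughly how a Git-style history avoids the conflicting copies described earlier.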

Chen hinted at his interest in data collaboration near the end of a recent interview on theCUBE, owned by the same company as SiliconANGLE, at the DockerCon conference in Seattle.

Photo: Attic Labs
