UPDATED 06:00 EDT / MAY 03 2012


One NoSQL Language to Rule Them All

Last year Erik Meijer and Gavin Bierman of Microsoft Research published a paper making the case for a common language for NoSQL databases. A year later, there’s been some progress towards this goal, although not much.

UnQL, under development by Couchbase, Microsoft and SQLite, is one of the most ambitious NoSQL query language projects I know of. The project seeks to deliver a universal SQL-like query language for document databases and other relatively well-structured non-relational databases. Here’s a lucid explanation of UnQL from H Online:

The language includes SELECT, INSERT, UPDATE and DELETE commands, but unlike SQL, they do not work on tables but on “collections” of unordered sets of documents. In UnQL, a document is an object that can be described in JSON (JavaScript Object Notation). Single integer numbers, floating point numbers and strings can also be documents.
Unlike traditional relational databases, a collection may contain differently structured documents. The SQL commands CREATE and DROP TABLE become CREATE and DROP COLLECTION in UnQL. The WHERE clause of queries now refers to a document’s properties which map to the fields of the stored object.
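
To make that description concrete, here is a minimal Python sketch of the data model it outlines. It is not UnQL syntax and not how any UnQL implementation works; the function names (create_collection, insert, select) and the sample data are hypothetical, chosen only to mirror the commands mentioned above.

```python
from typing import Any, Callable

# Assumption for illustration: a "collection" is an unordered bag of
# JSON-like documents held in memory. Real document stores persist them.
collections: dict[str, list[Any]] = {}

def create_collection(name: str) -> None:
    """Rough analogue of CREATE COLLECTION."""
    collections[name] = []

def drop_collection(name: str) -> None:
    """Rough analogue of DROP COLLECTION."""
    del collections[name]

def insert(name: str, document: Any) -> None:
    """Rough analogue of INSERT: any JSON-like value is a document."""
    collections[name].append(document)

def select(name: str, where: Callable[[Any], bool] = lambda doc: True) -> list[Any]:
    """Rough analogue of SELECT ... WHERE: the predicate inspects document properties."""
    return [doc for doc in collections[name] if where(doc)]

create_collection("people")
insert("people", {"name": "Ada", "city": "London"})
insert("people", {"name": "Grace", "languages": ["COBOL"]})  # differently structured
insert("people", 42)  # a single number can also be a document

def lives_in_london(doc: Any) -> bool:
    # Documents need not share a schema, so check the shape before the property.
    return isinstance(doc, dict) and doc.get("city") == "London"

londoners = select("people", lives_in_london)
print(londoners)  # [{'name': 'Ada', 'city': 'London'}]
```

The point of the sketch is the shape check inside the predicate: because a collection has no fixed schema, a WHERE-style condition has to tolerate documents that lack the property it tests.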

Couchbase SVP of Products James Phillips says that so far there aren’t any partners other than Couchbase, Microsoft and SQLite, but anyone could build support for the language into an open source document database. So even if MongoDB, Apache CouchDB and BigCouch don’t officially support UnQL, someone could build an extension that adds UnQL support to them.

Phillips says that the project is currently focused only on document databases, not key-value stores or graph databases. That means other data stores with traction – such as Apache Cassandra, HBase and Riak – won’t be able to easily support UnQL, if at all.

There are a lot of databases that UnQL could eventually be used with, and a lot that it simply couldn’t be used with. Graph databases are so fundamentally different they require a much different approach. Gremlin is a graph traversal language that has become the standard for working with graph databases. It’s supported by DEX, InfiniGraph, Neo4j and many others.
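
A rough Python sketch may help show why graph queries don’t fit the filter-a-collection model. This is not Gremlin syntax; the graph, the vertex names and the out helper are all hypothetical and exist only to illustrate chaining traversal steps.

```python
# Assumption for illustration: the graph is a plain adjacency mapping
# from each vertex to the vertices its outgoing edges point at.
graph = {
    "alice": ["bob", "carol"],
    "bob": ["dave"],
    "carol": ["dave", "erin"],
    "dave": [],
    "erin": [],
}

def out(vertices, edges):
    """One traversal step: follow outgoing edges from every current vertex."""
    return [neighbor for v in vertices for neighbor in edges[v]]

# Two chained steps answer "who is two hops away from alice?"
friends_of_friends = sorted(set(out(out(["alice"], graph), graph)))
print(friends_of_friends)  # ['dave', 'erin']
```

The query is expressed as a path through relationships rather than a predicate over one document’s properties, which is the difference the paragraph above is pointing at.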

Meanwhile, other projects are creating their own custom-built, SQL-like domain-specific languages. Cassandra has CQL (Cassandra Query Language). Hadoop has Pig. HPCC has ECL (Enterprise Control Language). Google has GQL. But each is a bit different – knowing SQL may help you learn each of these languages, besides perhaps Gremlin, but they’re not cross-compatible.

In their paper, Meijer and Bierman made the case that Microsoft LINQ (Language Integrated Query) could be the glue between SQL and “coSQL” (their name for NoSQL). LINQ is a query system for .NET languages that has also been implemented in other languages such as Java. It can be used with Microsoft SQL Server as well as independent data sources such as XML documents and Twitter.
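
The general idea of one query shape over unrelated data sources can be sketched in Python. This is only an analogy, not LINQ: real LINQ providers can translate a query for the backend to execute, whereas this sketch pulls rows out and filters them in memory. The table, the XML document and all function names are made up for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET
from typing import Iterable, Iterator

def rows_from_sql() -> Iterator[dict]:
    """Yield user records from a throwaway in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, age INTEGER)")
    conn.executemany("INSERT INTO users VALUES (?, ?)", [("Ada", 36), ("Tim", 17)])
    for name, age in conn.execute("SELECT name, age FROM users"):
        yield {"name": name, "age": age}
    conn.close()

def rows_from_xml() -> Iterator[dict]:
    """Yield user records from a small XML document."""
    root = ET.fromstring(
        '<users><user name="Grace" age="45"/><user name="Bob" age="12"/></users>'
    )
    for el in root.iter("user"):
        yield {"name": el.get("name"), "age": int(el.get("age"))}

def adult_names(source: Iterable[dict]) -> list[str]:
    """The single 'query': identical regardless of where the rows came from."""
    return sorted(row["name"] for row in source if row["age"] >= 18)

sql_adults = adult_names(rows_from_sql())
xml_adults = adult_names(rows_from_xml())
print(sql_adults, xml_adults)  # ['Ada'] ['Grace']
```

Each source needs its own small adapter, but the query itself is written once. That separation is what would let a common language sit on top of both relational and non-relational stores.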

Meijer is referred to as the “father of LINQ,” and his involvement in UnQL suggests that the project will align with LINQ in some way. However, it seems that Microsoft is de-emphasizing LINQ. It discontinued its Hadoop competitor LINQ to HPC (formerly known as DryadLINQ) in favor of porting Hadoop to Windows and developing a JavaScript library for writing MapReduce jobs. Still, that doesn’t rule out the possibility of LINQ to Hadoop, as wished for by John Conwell, and Microsoft has made no announcement about abandoning LINQ altogether.

Magnus Mårtensson, author of a .NET client for Neo4j, wrote in 2010 that he had considered building a LINQ connector for Neo4j. There has also been work towards LINQ support for MongoDB, but it doesn’t appear either project has legs at the moment.

LINQ proves that it’s possible to create a truly cross-platform query language for databases. But it will require a great degree of openness on Microsoft’s part, and acceptance of the standard from the community. Instead we might end up with something LINQ-like. The NoSQL market has become increasingly competitive in the past year and a half, and it’s difficult to imagine 10gen and the Apache CouchDB team accepting a standard created by Couchbase and Microsoft. Demand will have to come from the community, from the ground up.
