Deciding where to store enterprise data is no longer as simple as choosing between a relational database (RDB) and a data warehouse. Data volumes have exploded, and so have the choices for storing that data. Over the past few years, however, one option has garnered an enormous amount of attention: NoSQL. Many praise the technology, but what is NoSQL? Is it a better option for enterprises to consider? How does NoSQL really compare to the tried-and-true relational database?
Back to the Basics
Let’s start by taking a step back. SQL is a programming language for interacting with relational databases, and it has existed since the early 1970s. Which raises the question, “What’s a relational database?” Relational databases are the mainstay of enterprise data storage. From open source options like MySQL to Oracle’s rack, relational databases are what most technologists envision when someone mentions a database. These well-established repositories store data organized in tables of columns and rows, like an Excel spreadsheet. Each table has a name (e.g. customer), each column represents an attribute (e.g. a prefix such as Mr., Ms., Mrs. or Dr.), and each row represents an object (e.g. customer Johnny Doe).
The relational aspect comes into play when tables are linked. Tables can be associated to represent real-world relationships, such as customers with their orders. Relational technology is well understood and can be used to store incredibly complex information.
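To make the table-and-link idea concrete, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The table names, column names and sample orders are assumptions chosen for illustration; only the customer Johnny Doe and the prefix column come from the description above.

```python
import sqlite3


def build_demo_db():
    """Create two linked tables in a throwaway in-memory database."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    # One table per kind of thing; each column is an attribute.
    conn.execute(
        "CREATE TABLE customer ("
        " id INTEGER PRIMARY KEY,"
        " prefix TEXT,"
        " name TEXT NOT NULL)"
    )
    # The customer_id column is the link back to the customer table.
    conn.execute(
        "CREATE TABLE orders ("
        " id INTEGER PRIMARY KEY,"
        " customer_id INTEGER NOT NULL REFERENCES customer(id),"
        " item TEXT NOT NULL)"
    )
    # Each row is one object: here, one customer and two of his orders.
    conn.execute(
        "INSERT INTO customer (id, prefix, name) VALUES (1, 'Mr.', 'Johnny Doe')"
    )
    conn.executemany(
        "INSERT INTO orders (customer_id, item) VALUES (?, ?)",
        [(1, "keyboard"), (1, "monitor")],
    )
    conn.commit()
    return conn


def orders_for(conn, name):
    """Follow the customer -> orders link with a JOIN."""
    rows = conn.execute(
        "SELECT o.item"
        " FROM customer AS c"
        " JOIN orders AS o ON o.customer_id = c.id"
        " WHERE c.name = ?"
        " ORDER BY o.id",
        (name,),
    ).fetchall()
    return [item for (item,) in rows]


conn = build_demo_db()
print(orders_for(conn, "Johnny Doe"))  # ['keyboard', 'monitor']
```

The JOIN is where the "relational" part shows up: the query walks from a customer row to the order rows that reference it.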
Another major characteristic of relational databases is that they support ACID (atomicity, consistency, isolation, durability) transactions, which ensure data isn’t left in an inconsistent state if something goes wrong (e.g. ACID transactions can prevent an order from being saved without the associated customer). For decades relational databases have worked reliably and provided the necessary functionality for the vast majority of business use cases.
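The order-without-a-customer scenario can be sketched in a few lines, again with Python's sqlite3 module as a stand-in for an enterprise RDB. The schema and the helper names (open_db, save_customer_with_order, save_order, counts) are hypothetical and exist only for this illustration.

```python
import sqlite3


def open_db():
    """In-memory database with a customer table and a linked orders table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforce the customer link
    conn.execute(
        "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE orders ("
        " id INTEGER PRIMARY KEY,"
        " customer_id INTEGER NOT NULL REFERENCES customer(id),"
        " item TEXT NOT NULL)"
    )
    return conn


def save_customer_with_order(conn, name, item):
    """Save a customer and a first order as one all-or-nothing unit."""
    try:
        # 'with conn' commits if the block succeeds and rolls back if it raises.
        with conn:
            cur = conn.execute("INSERT INTO customer (name) VALUES (?)", (name,))
            conn.execute(
                "INSERT INTO orders (customer_id, item) VALUES (?, ?)",
                (cur.lastrowid, item),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def save_order(conn, customer_id, item):
    """Save an order; the database rejects it if the customer does not exist."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO orders (customer_id, item) VALUES (?, ?)",
                (customer_id, item),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def counts(conn):
    """Return (number of customers, number of orders)."""
    customers = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
    orders = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
    return customers, orders


conn = open_db()
print(save_customer_with_order(conn, "Johnny Doe", "keyboard"))  # True
# The order row is invalid (item is NULL), so the customer insert is undone too.
print(save_customer_with_order(conn, "Jane Roe", None))  # False
# No customer has id 999, so this orphan order is refused.
print(save_order(conn, 999, "monitor"))  # False
print(counts(conn))  # (1, 1)
```

Neither failed attempt leaves a trace: the database ends up with exactly one customer and one order, which is the consistency guarantee the paragraph above describes.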
However, as data volumes continued to grow in the enterprise, some organizations began to rethink how they stored and managed data, and thus NoSQL was born. There are four basic categories of NoSQL repositories:
- Key-Value Stores – These use a hash table that holds a unique key and a pointer to a particular item of data. To maximize performance, the mappings usually come with cache mechanisms. This approach can give a very busy website lower latency, even under heavy usage. Because the model is so simple, development is simplified as well.
- Column Family Stores – These were created to store and process very large amounts of data distributed over many machines. There are still keys, but each key points to multiple columns, and the columns are grouped into column families.
- Document Databases – These are similar to key-value stores. The model is basically versioned documents, each of which is a collection of other key-value collections. The semi-structured documents are stored in formats like JSON. These databases are useful if you have a lot of semi-structured data, and they are a good fit for object-oriented programming models.
- Graph Databases – These are built with nodes, relationships between nodes and the properties of nodes. Instead of tables of rows and columns and the rigid structure of SQL, they use a flexible graph model that can scale across many machines.
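The four models are easier to compare side by side. The sketch below mimics the shape of each one with plain Python dictionaries and lists; it is not the API of any real NoSQL product, and every key, field and helper name in it is made up for illustration.

```python
import json

# Key-value store: a unique key maps to an opaque item of data.
kv_store = {"session:42": b"serialized-cart-bytes"}

# Column family store: a row key maps to column families,
# and each family groups related columns.
column_family_store = {
    "customer:1": {
        "profile": {"prefix": "Mr.", "name": "Johnny Doe"},
        "activity": {"last_login": "2012-10-21"},
    }
}

# Document database: a key maps to a semi-structured, JSON-like document
# made of nested key-value collections. The two orders need not share fields.
document_store = {
    "customer:1": {
        "name": "Johnny Doe",
        "orders": [
            {"item": "keyboard"},
            {"item": "monitor", "gift": True},
        ],
    }
}

# Graph database: nodes with properties, plus relationships between nodes.
nodes = {
    "c1": {"label": "Customer", "name": "Johnny Doe"},
    "o1": {"label": "Order", "item": "keyboard"},
    "o2": {"label": "Order", "item": "monitor"},
}
edges = [("c1", "PLACED", "o1"), ("c1", "PLACED", "o2")]


def neighbors(node_id, relation):
    """Return the nodes reached from node_id by following one kind of relationship."""
    return [dst for src, rel, dst in edges if src == node_id and rel == relation]


# A document survives a round trip through JSON text unchanged.
as_json = json.dumps(document_store["customer:1"])
print(json.loads(as_json) == document_store["customer:1"])  # True

print(neighbors("c1", "PLACED"))  # ['o1', 'o2']
```

Note how the second order in the document carries an extra "gift" field without any schema change, while the graph version answers "which orders did this customer place?" by following relationships rather than joining tables.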
NoSQL databases can scale out across multiple commodity servers and handle massive amounts of data. In addition, NoSQL repositories have a more flexible data model than RDBs. This allows businesses to make changes quickly and handle unstructured data without being so concerned with database schemas. What’s not to love?
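One common way to spread data across several servers is to hash each key and use the result to pick a machine. The sketch below assumes that approach purely for illustration; the article does not say how any particular NoSQL product partitions data, and the server names and the shard_for helper are hypothetical.

```python
import hashlib


def shard_for(key, servers):
    """Pick a server for a key using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then wrap around
    # the number of servers.
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


servers = ["node-a", "node-b", "node-c"]  # hypothetical commodity servers

for key in ["customer:1", "customer:2", "session:42"]:
    # The same key always lands on the same server.
    print(key, "->", shard_for(key, servers))
```

This simple modulo scheme moves most keys to a different server whenever the server list changes, so real systems tend to use more careful partitioning such as consistent hashing. The point here is only that no single machine has to hold everything.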
No technology is perfect, and that includes NoSQL. The technology and products are still in a very embryonic state, and that lack of maturity calls for caution. Most technology leaders adopting new technology want to ensure the platforms are stable and have support options available if something goes wrong – something absent from many NoSQL repositories. The market is maturing, but many of the familiar niceties from the relational world simply don’t exist for NoSQL.
Given the trade-offs, is NoSQL a must-have or a bane for enterprise data? It depends. Technology pros should understand that NoSQL means “Not Only SQL,” which implies that leveraging NoSQL isn’t an all-or-nothing decision. Neither NoSQL nor relational technology is a magic bullet for enterprise data. Organizations should not discount NoSQL because it’s new. NoSQL can be a powerful tool to have in an arsenal.
At Information On Demand 2012, IBM’s executive in charge of database software sat down with the Cube. He advised that, while NoSQL is better equipped to handle unstructured data than a relational database, SQL is still the best choice for a broad range of traditional workloads, which is why these new technologies should be considered a supplement rather than an alternative. Sounds like solid advice to me.