DataStax announced the new version of its commercial Apache Cassandra distribution today. The biggest new feature in DataStax Enterprise 2.0 is the integration of Solr on top of the Cassandra stack. Now that Cloudant has announced Cloudant Search for BigCouch and Basho has rolled out Riak Search, enterprise search is becoming a must-have for distributed NoSQL databases.
DataStax is a client of analyst Curt Monash, who notes that he “[likes] the core DataStax story — and indeed had some influence on it — but roll my eyes somewhat at the work-in-progress as to how it is phrased and told.” Here’s Monash’s no-nonsense rundown of the DataStax stack:
- Cassandra — the NoSQL DBMS, which DataStax sometimes calls “DataStax Server”.
- Hadoop MapReduce, which DataStax sometimes calls “Hadoop”.
- Sqoop — the general way to connect relational DBMS to Hadoop, which DataStax sometimes calls “RDBMS integration”.
- Solr — the search-centric Apache project, or big parts of it, which DataStax generally calls either “Solr” or “Solr compatibility”.
- log4j – an Apache project that has something or other to do with logging, or parts of it, which DataStax sometimes calls “log file integration”.
- DataStax OpsCenter — some management tools and so on around Cassandra and the rest of the product line.
The integrations that are new this time around are Solr, Sqoop and log4j. Alex Popescu notes that Lucandra could be an alternative for Solr integration on Cassandra and that flume-cassandra-plugin can be used to integrate Cassandra with Flume instead of log4j.
There are also a few new features in OpsCenter, DataStax’s proprietary set of management tools. Popescu notes what’s new:
- Multi-cluster monitoring
- Visual backup
- Search monitoring
Popescu muses on DataStax’s future:
DataStax has already showed this direction with what was called initially Brisk (or Brangelina for friends): Hadoop on top of the Cassandra cluster that became DataStax Enterprise 1.0. Solr on top of Cassandra is 2.0, but what will be the 3.0?
The newest version of DataStax Enterprise is compelling, but competition from Cloudant, Basho and Amazon Web Services’ DynamoDB puts pressure on DataStax to innovate quickly and set Cassandra apart from the pack. Monash runs through the many production uses of Cassandra, but Netflix remains the biggest success story. Cassandra has an advantage in running across geographically dispersed data centers, but it needs to start scoring more victories and telling a more convincing story, one about what Cassandra enables that couldn’t be done with other technologies. I look forward to whatever the team rolls out in 3.0.