Today Cloudant, the company that sponsors the open source, Apache CouchDB-based BigCouch project, announced a new customer: the controversial agriculture company Monsanto. Monsanto is using BigCouch to manage the analysis of its genome sequencing data, a commonly cited big data problem.
Monsanto isn’t going as far as using Cloudant’s hosted CouchDB service, but the company is radically rethinking how it manages data. Cloudant CEO Alan Hoffman told me in an interview that Monsanto is replacing a “patchwork” of old systems with a big Cloudant instance. He says part of the problem the company is trying to solve is the classic silo issue: many groups within the organization work on data in isolation, often with no knowledge of what the others are doing.
BigCouch applies distributed computing concepts from Amazon.com’s Dynamo to CouchDB’s document database model, which lets CouchDB scale across a cluster of machines. “The document-oriented nature of the open source BigCouch platform is ideal for computing over this complex and constantly changing body of data. We are building a distributed analysis platform that will automate our genome analysis and discovery efforts,” Monsanto Genome Integration Lead Ryan Richt said in Cloudant’s announcement. Hoffman said Monsanto’s operation is the most sophisticated use of a NoSQL database he’s seen.
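For readers unfamiliar with the Dynamo ideas mentioned above, here is a minimal Python sketch of the general approach: hash each document ID to a shard, keep several copies of every shard on different nodes, and require a quorum of copies to answer reads and writes. This is an illustration only, not BigCouch's actual code. The function names, the node names, and the use of MD5 are all assumptions made for the example; the parameters `q` (shards), `n` (copies), `r` (read quorum), and `w` (write quorum) follow common Dynamo-style terminology.

```python
import hashlib
from typing import List


def shard_for(doc_id: str, q: int) -> int:
    """Map a document ID to one of q shards using a stable hash.

    MD5 is used here only because it is deterministic across runs;
    it is an assumption of this sketch, not a statement about BigCouch.
    """
    digest = hashlib.md5(doc_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % q


def replicas_for(shard: int, nodes: List[str], n: int) -> List[str]:
    """Choose n distinct nodes for a shard by walking the node list as a ring."""
    if n > len(nodes):
        raise ValueError("cannot place more copies than there are nodes")
    return [nodes[(shard + i) % len(nodes)] for i in range(n)]


def quorums_overlap(n: int, r: int, w: int) -> bool:
    """True when every read quorum must intersect every write quorum."""
    return r + w > n


# Hypothetical three-node cluster with 8 shards and 3 copies of each shard.
nodes = ["node-a", "node-b", "node-c"]
shard = shard_for("genome-sample-0001", q=8)
owners = replicas_for(shard, nodes, n=3)
```

With `n=3`, `r=2`, and `w=2`, `quorums_overlap` returns `True`: any two nodes that answer a read must include at least one node that acknowledged the latest write, which is the trade-off that lets a cluster stay available while individual machines fail.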
Cloudant’s founders worked for CERN on the Large Hadron Collider project, but Cloudant co-founder and Chief Scientist Mike Miller says he has still been pleasantly surprised by the uptake of BigCouch for scientific computing. He had initially expected BigCouch to appeal mostly to companies dealing with the sort of analytics problems that the Web generates. He says the problems Monsanto faces are actually common across disciplines and industries. Monsanto is keeping its data on-premises, but other companies could easily take these sorts of problems to the cloud.