UPDATED 08:00 EDT / NOVEMBER 20 2015

200-year-old publisher finds happiness with NoSQL database

When Ted Blizzard was presented with the idea of hosting mission-critical data on a NoSQL database way back in 2005, his initial reaction was predictable.

“I said absolutely no way,” said Blizzard, CIO of the New England Journal of Medicine (NEJM), the prestigious publication of the Massachusetts Medical Society that many people consider the bible of healthcare. With 600,000 readers and a history that dates back to the James Madison administration, the NEJM was not what most people would consider to be a top candidate for cutting-edge technology.

But the Journal prides itself on running a modern IT environment, and the document-centric nature of the data NEJM manages made NoSQL worth a second look. The NEJM is best known for its signature printed publication, but the organization actually produces a much larger volume of content online, ranging from radiological images to clinical videos to online courseware to podcast interviews. Nearly all of that content has something in common: it’s unstructured. And that’s where NoSQL excels.

The technology was still immature a decade ago, however, and Blizzard was skeptical. “I knew about the need for managing unstructured data, but I was unconvinced that the money and time we’d have to spend would be worth it,” says Blizzard, a veteran of nearly 20 years at NEJM. “I thought we could index text fields in SQL Server instead.”

Limitations of SQL

It turns out you can, but not very easily. For one thing, SQL indexes need a considerable amount of planning to change, and the XML data that NEJM works with changes all the time. SQL databases also don’t know anything about the data they manage, which eliminates the ability to define semantic relationships.

Blizzard and his team decided to take another look at MarkLogic, a NoSQL database from a then-fledgling company of the same name. “Before I say ‘no,’ I like to know what I’m saying ‘no’ to, so I took the MarkLogic training course,” he said. “I realized quickly that this was not going to be a niche back-end tool.” With features like XQuery, a SQL-like query language for XML data, MarkLogic promised to retrieve unstructured data at speeds no SQL database could approach.

Equally important was that MarkLogic can understand the documents it manages because of a feature called semantics. This enables the database to infer relationships without having them explicitly stated. For example, if the database understands that John is a cardiologist and that a cardiologist is a physician, it can infer that John is a physician. SQL databases aren’t tuned to do that.
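
The cardiologist example can be sketched in a few lines of Python. This is only an illustration of the general idea of inferring facts from chained relationships; it is not MarkLogic's API or implementation, and the triple format, the "is_a" label and the function name infer_types are all assumptions made for the sketch.

```python
# Hypothetical illustration of inference over (subject, predicate, object) facts.
# This does not use or represent MarkLogic's API.

def infer_types(facts):
    """Return the given facts plus every "is_a" fact implied by chaining them."""
    known = set(facts)
    changed = True
    while changed:  # repeat until a full pass adds nothing new
        changed = False
        for (a, p1, b) in list(known):
            for (c, p2, d) in list(known):
                # If a is_a b and b is_a d, then a is_a d.
                if p1 == p2 == "is_a" and b == c and (a, "is_a", d) not in known:
                    known.add((a, "is_a", d))
                    changed = True
    return known


facts = {
    ("John", "is_a", "cardiologist"),
    ("cardiologist", "is_a", "physician"),
}

inferred = infer_types(facts)
print(("John", "is_a", "physician") in inferred)  # True
```

Only two facts were stated, yet the third ("John is a physician") falls out of the chain. A conventional relational schema would typically need that relationship stored or joined explicitly.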

For NEJM, this kind of functionality is important. “If you’re dealing with a specific disease that has a genetic element to it, MarkLogic can find other documents that have the same element and set up that relationship,” he said. “It can understand the meaning behind the documents themselves.”

The Journal initially implemented MarkLogic on a small part of its website for testing. The product’s speed and functionality quickly turned heads. “Over time it evolved into the cornerstone of our content repository,” Blizzard said. “All of the content that’s created in the company is now pumped into MarkLogic.”

The company still uses a SQL database for its financial records and even runs MongoDB, another leading NoSQL database, for special-purpose cases, but MarkLogic is the core engine for managing documents. Some of the company’s SQL programmers were able to quickly pick up the basic XQuery constructs, and MarkLogic’s training and technical support have been world-class, Blizzard said.

“The support people are excellent at what they do, and you can reach them at three in the morning,” he said. “That kind of support doesn’t come free; MarkLogic is probably one of the most expensive databases we have, but it’s also one of the most useful.”

The Journal is now eager to dig into the API economy by exposing more of its data and applications as services. It turns out MarkLogic is well suited to the task. “It’s like an API engine,” Blizzard said. “You can design and distribute from the database rather than building a lot of code around it. It’s easy to make the database accessible.”

The CIO is a believer, and NEJM’s experience demonstrates that even venerable companies can learn new tricks. “I’m not a tech fanboi. I’m a ‘Let’s get the job done as effectively as possible’ fanboi,” Blizzard said. “MarkLogic is a great way to do that.”

Photo by Phalinn Ooi via Flickr
