MemSQL 5 seeks to bridge transaction, analytics processing
MemSQL, a scalable SQL-based in-memory distributed database, is getting major performance enhancements in version 5, announced today at Strata+Hadoop World in San Jose, CA.
Most significant is the addition of LLVM-based technology to deliver low-latency query compilation for interactive data exploration. LLVM is a collection of modular and reusable compiler and toolchain technologies that Chief Marketing Officer Gary Orenstein said is “much more modern than the GCC compiler used in C++.” “We’re getting about a 10X improvement in first query speed,” he noted.
That’s important for improving the performance of interactive queries, which are created on the fly and not stored for repeated execution. The new byte code compilation architecture translates these queries into machine code upon first execution to deliver better performance. Previously, queries were initially interpreted and then compiled for subsequent use.
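The compile-on-first-execution idea can be sketched in a few lines of Python. This is a toy illustration under loud assumptions: Python's built-in `compile()` stands in for a real query compiler, the "queries" are simple filter expressions rather than SQL, and the names `run_filter`, `compiled_cache` and `compile_count` are hypothetical. Nothing here reflects MemSQL's actual internals.

```python
# Toy illustration only: Python's built-in compile() stands in for a real
# query compiler, and every name here is hypothetical (not MemSQL's API).
compiled_cache = {}   # expression text -> compiled code object
compile_count = 0     # how many times we actually had to compile


def run_filter(expression, rows):
    """Return the rows for which `expression` is true.

    The expression is compiled the first time it is seen and the compiled
    form is reused on every later call with the same text.
    """
    global compile_count
    code = compiled_cache.get(expression)
    if code is None:
        code = compile(expression, "<query>", "eval")
        compiled_cache[expression] = code
        compile_count += 1
    # Each row's columns are exposed to the expression as local names.
    return [row for row in rows if eval(code, {"__builtins__": {}}, dict(row))]


rows = [
    {"store": "A", "amount": 12.5},
    {"store": "B", "amount": 40.0},
    {"store": "A", "amount": 99.0},
]

first = run_filter("amount > 20", rows)    # compiled on this first call
second = run_filter("amount > 20", rows)   # served from the cache
```

In this sketch the ad hoc expression pays the compilation cost once, on first execution, and repeated runs reuse the cached result, which loosely mirrors the distinction Orenstein draws between first queries and queries that are run a lot.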
“If you have queries that you run a lot, then not a lot has changed,” Orenstein said. “We’ve focused on the interactive, ad hoc experience of first queries.”
The focus on first queries supports MemSQL’s overall objective of achieving what Forrester Research Inc. calls “translytical processing,” or the merging of transaction processing and analytical functions. Data analytics has traditionally meant crunching through historical data, but the emerging worlds of streaming and predictive analytics seek to analyze information as it crosses the wire.
An example would be a point-of-sale analytics engine that analyzes individual transactions and delivers promotional offers to customers while they are standing at the checkout counter. MemSQL 5 supports hybrid transaction/analytical processing (HTAP), which is a combination of online transaction processing (OLTP) and online analytical processing (OLAP). “Our goal is to turn analytics into predictive applications,” Orenstein said. “We believe this new code generation architecture will put us at the front of the pack for business intelligence code generation for the next couple of years.”
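The point-of-sale scenario can be sketched as a transactional write followed immediately by an analytical read against the same store. This is a minimal illustration, assuming Python's standard-library `sqlite3` as a stand-in database; the `sales` table, the `record_sale_and_check_offer` function and the 100.0 spending threshold are all hypothetical and say nothing about how MemSQL itself is used.

```python
import sqlite3

# Stand-in database: an in-memory SQLite instance, NOT MemSQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (customer_id INTEGER, amount REAL)")


def record_sale_and_check_offer(customer_id, amount, threshold=100.0):
    """Record one sale, then decide whether the customer earns an offer.

    The INSERT is the transactional (OLTP-style) step; the SUM over the
    customer's history is the analytical (OLAP-style) step. Both run
    against the same live data, with no batch load in between.
    """
    with conn:  # commits the insert on success, rolls back on error
        conn.execute("INSERT INTO sales VALUES (?, ?)", (customer_id, amount))
    (total,) = conn.execute(
        "SELECT SUM(amount) FROM sales WHERE customer_id = ?", (customer_id,)
    ).fetchone()
    return total >= threshold


# Two purchases by the same (hypothetical) customer at the checkout counter.
offers = [
    record_sale_and_check_offer(7, 60.0),
    record_sale_and_check_offer(7, 55.0),
]
```

Here the second purchase pushes the customer's running total past the threshold, so the offer decision is made from data that was written moments earlier.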
Another part of the equation is Streamliner, an analytics engine that integrates MemSQL with the Apache Kafka message broker and Apache Spark analytics framework. It enables users to create real-time data pipelines using a graphical interface and to eliminate the often time-consuming batch extract/transform/load (ETL) process, the company said.
“You can now push data into MemSQL directly and query it upon arrival to bring live data to the data warehouse experience,” Orenstein said. Streamliner also provides for one-button provisioning of Spark instances across multiple nodes.
MemSQL uses the Linux file system as its default data store. It stores row data in memory and column data both in memory and on disk. There’s a free community edition and an enterprise edition that is custom-priced based upon memory usage.
The San Francisco-based MemSQL has raised more than $45 million in venture funding.