MemSQL extends in-memory database with Apache Spark connector
Continuing its campaign to evangelize the virtues of the SQL query language while also embracing alternatives, MemSQL, Inc. is rolling out a connector for the popular Apache Spark framework that it says enables rapid and seamless data transfer between the platforms.
The MemSQL Spark Connector combines the in-memory processing and distributed architectures of both MemSQL and Spark for high-performance parallel throughput. MemSQL’s special sauce is that it enables both heavy-duty transaction processing and real-time analysis to be performed on unstructured data streams in the same environment, without the hassle of shuffling data back and forth. The company claims that its in-memory architecture provides unparalleled performance while preserving compatibility with the 30-year-old SQL language. The company clearly doesn’t want to be pigeonholed in the SQL niche, however. This announcement comes just a few weeks after MemSQL added open source connectors to external data sources like Hadoop and Amazon S3.
MemSQL executives pointed to similarities between the architecture of their namesake database and the Spark framework, including memory-optimized processing and a distributed design. Pairing MemSQL’s in-memory database for data caching with Spark’s memory-optimized processing makes the combination “the fastest way to operationalize anything you do with Spark today,” said Eric Frenkiel, CEO of MemSQL. The company also said it’s the simplest way to integrate Spark with an operational – rather than an analytical – database.
The connector is ideal for scenarios in which data stored in a production MemSQL database can be manipulated by Spark analytics and the results saved back to the production data store without a clunky extract/transform/load (ETL) procedure. For example, a marketer who’s interested in the behavior of customers at the outer edges of a bell curve “may build a Spark model to extract the edge, analyze it, put it in MemSQL for persistence and then give it to the data analytics team to better understand outliers,” Frenkiel said.
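To make the marketer scenario concrete, here is a minimal sketch of what "extracting the edge" of a bell curve could look like. It is an illustration only, written in plain Python rather than Spark or the MemSQL connector, whose APIs this article does not describe. The function name, the sample spend figures and the two-standard-deviation cutoff are all assumptions.

```python
from statistics import mean, pstdev


def extract_edges(values, cutoff=2.0):
    """Return the values lying more than `cutoff` standard deviations from the mean.

    Hypothetical helper: the cutoff of 2.0 is an assumed definition of
    "the outer edges of a bell curve", not something the article specifies.
    """
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        # All values are identical, so there are no outliers to extract.
        return []
    return [v for v in values if abs(v - mu) / sigma > cutoff]


# Made-up monthly spend per customer; one customer is far from the rest.
spend = [48, 52, 50, 47, 53, 51, 49, 50, 250, 52]
outliers = extract_edges(spend)
print(outliers)
```

In the workflow Frenkiel describes, a step like this would run in Spark against data read from MemSQL, and the extracted rows would be written back to MemSQL for the analytics team.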
Customers can also store data in the Hadoop File System and move it to MemSQL for production or Spark for analysis. “There’s never a single solution,” Frenkiel said. “It’s important to support both traditional and new tools.”
The connector also acknowledges the growing dominance of Spark as a flexible analytics engine and likely replacement for Hadoop’s native MapReduce programming model. As noted by SiliconANGLE last week, some people now believe that Big Data buying decisions will soon be influenced more by the choice of Spark than Hadoop.
MemSQL is hedging its bets. “The prevailing wisdom is that MapReduce has a limited life and Spark is widely viewed as the next iteration,” said Gary Orenstein, chief marketing officer of MemSQL.