UPDATED 01:39 EDT / AUGUST 06 2015

NEWS

Databricks casts its net wider with Apache Spark platform update

Just two months after launching its Apache Spark-powered cloud platform, Databricks has announced a major update with a range of new features designed to help users develop Spark apps and control access to data.

Databricks’ platform is a cloud-based Big Data processing engine based on Spark, offering a multi-user graphical interface and standard libraries like Spark SQL and MLlib. The new update offers features including access control, support for the R statistical programming language, notebook versioning and support for multiple versions of Spark. The company says the new version, called Databricks 2.0, does away with the need to contend with the operational complexities that are inevitable when using tools and systems associated with traditional data solutions.

Security enhancements are also among the new capabilities. Databricks has introduced a feature it calls Access Control, which boosts the security and manageability of the platform for large teams. The company says users can now grant and restrict access to code and data on an individual basis.

Databricks also touts a new interactive notebooks feature that’s designed to make Spark app development and management easier. These notebooks come with interfaces where developers can create and schedule Spark jobs in Python, Scala or SQL. Notebooks can also be run repeatedly as automatically executing production jobs, and developers can manage and track their codebase via version-control tools like Git.

“Not only are our users writing increasingly complex code in their notebooks, but they are also widely sharing these notebooks as an easy way to disseminate information,” said Databricks head of engineering Ali Ghodsi in a blog post. “Having a way to checkpoint progress or revert to a previous version of the code quickly became an essential productivity booster. In response to this demand, we integrated our notebooks with the popular version control system, GitHub, to enable users to easily manage different versions of their code.”

Databricks now accommodates more diverse production environments too, with the ability to deploy multiple Spark versions in the platform. This ensures that users can maintain compatibility while experimenting with the latest features.

Finally, Databricks says its platform now supports the R programming language, allowing a new category of users to take advantage of the power of Spark. Users can now use R to explore data at scale, for example with one-click visualizations and instant deployment of R code into production.

According to Ghodsi, by allowing people who are not data scientists to conduct exploratory analysis and write jobs in Databricks in R, the company is making data much more accessible throughout businesses.

“The other people in the company, not the hard-core PhDs in Big Data, not the original guys because those original guys actually were fine with probing such low-level Spark stuff directly,” Ghodsi told ZDNet. “They’re savvy, they love this stuff, they’ve been using it since the early days of Spark, some of them from even before it was a huge success. But now you have people in the organisation who want to ask queries. Some of them know SQL. But what we saw last year is that more and more people were asking about R.”

Databricks told ZDNet that it has secured more than 1,700 sign-ups since the launch of its platform into general availability six weeks ago. These include a number of significant enterprise deployments at companies including MyFitnessPal and Edmunds.com.
