UPDATED 16:45 EDT / SEPTEMBER 09 2013

R Language Tops the Charts As Most Preferred Language for Data Science and Big Data Analytics

Demonstrating the potential of big data technologies requires expertise from several areas. Data science, data mining, and big data analytics are among the disciplines that bring together the diverse skills needed to work with big data technologies, products, and services and to optimize a company's operations. Among those skills are the languages an analyst knows, so when KDnuggets released its survey of languages and skills, we reviewed the results.

Data visualization is an essential skill for every web analyst and data scientist. Data science demands a number of additional skills, most of which cannot be learned quickly. A strong general knowledge of statistics, including Bayesian methods, linear regression, and logistic regression, is required, as well as knowledge of algebra and linear algebra, natural language processing, and predictive analytics (based on machine learning). Most important is knowledge of tools and languages such as R, Python, and SQL.

KDnuggets has published its annual poll of top languages for analytics, data mining, and data science, and just as in the two prior years, R ranks as the most popular. With more than 700 voters responding, the poll shows R’s usage growing 16% compared with the 2012 poll. Python and SQL follow R in popularity.

“The most popular languages continue to be R (used by 61% of KDnuggets readers), Python (39%), and SQL (37%). SAS is stable at around 20%. The highest growth was for Pig/Hive/Hadoop-based languages, R, and SQL, while Perl, C/C++, and Unix tools declined,” says the report.

Among the most common languages, the largest relative increases in share of usage were for Pig Latin/Hive/other Hadoop-based languages, with 19% growth (from 6.7% in 2012 to 8.0% in 2013); R, with 16% growth; and SQL, with 14% growth. Conversely, the languages with the largest declines in share of usage were Lisp/Clojure (77% down), Perl (50% down), Ruby (41% down), C/C++ (35% down), UNIX shell/awk/sed (25% down), and Java (22% down).
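The growth figures above are relative changes in share, not percentage-point differences. As a minimal sketch of that arithmetic, using only the Pig Latin/Hive shares quoted above (the function names are illustrative and are not taken from the KDnuggets report):

```python
def relative_growth(old_share: float, new_share: float) -> float:
    """Return the relative change from old_share to new_share, in percent."""
    if old_share <= 0:
        raise ValueError("old_share must be positive")
    return (new_share - old_share) / old_share * 100.0


def point_change(old_share: float, new_share: float) -> float:
    """Return the absolute change in share, in percentage points."""
    return new_share - old_share


# Shares of poll respondents using Pig Latin/Hive/other Hadoop-based
# languages, as quoted in the article: 6.7% in 2012 and 8.0% in 2013.
growth = relative_growth(6.7, 8.0)
points = point_change(6.7, 8.0)

print(f"relative growth: {growth:.1f}%")           # relative growth: 19.4%
print(f"absolute change: {points:.1f} points")     # absolute change: 1.3 points
```

A rise of 1.3 percentage points therefore corresponds to the roughly 19% relative growth reported in the poll.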

Ben Podgursky, a software engineer at LiveRamp, recently shared statistics on the GitHub community showing that ActionScript is associated with the highest average household income, at $108,119.47, followed by XSLT ($106,199.19), Java ($103,179.39), Groovy ($102,650.86), Objective-C ($101,801.60), and ColdFusion ($101,536.70). Puppet ($87,589.29) and Haskell ($89,973.82) were at the bottom of the list.

Much like Linux, R has had a rather slow but steady evolution. R was created when a couple of university professors wanted an open source system that could work on big data that was being processed in parallel, and it took off in the academic community, beginning with research projects. Today, R is used with parallel processing, server clusters, Hadoop, and other cloud technologies.

The mix of skills in database query languages, statistics, predictive and advanced analytics, programming, business intelligence, and cognitive science makes R a popular language among developers. Today R can scale through Hadoop execution, in-database execution, parallelized user code, parallelized algorithms, multi-core processing, multi-threaded execution, memory management, and fast math libraries.

At the same time, Python has been used for building massive web applications and for scientific computing, data structuring, manipulation, querying, analysis, and visualization in highly quantitative domains such as finance, oil and gas, physics, and signal processing. It has powered much of Google’s internal infrastructure. According to the TIOBE Software Index, Python is the 8th most popular programming language and the third most commonly used language on the Internet’s largest code repository (GitHub), ahead of Perl, Ruby, and JavaScript.

