3 Things You Need to Know About Data Scientists

EEG visualization by Enzo Varriale

Developers may be the new kingmakers, as our pals at RedMonk have been saying for years, but they may soon be usurped by data scientists. Data scientist roles are even harder to fill than developer roles, and the growing emphasis on big data is making this new type of worker a power player in the enterprise.

A new survey from EMC, which just launched a data scientist training program, seeks to delve into the mind of the data scientist and illuminate how this new worker differs from traditional data analysts and business intelligence professionals. EMC worked with StrategyOne to survey 462 data scientists and BI professionals. Chuck Hollis, global marketing CTO of EMC, shares the details of the survey, along with some thoughts of his own, on his blog.

Here are the highlights of the survey:

1. Demand for Data Scientists May Outpace Supply

EMC survey - supply and demand

According to the survey, 83% of respondents expect to see an increased demand for data scientists. Some 63% of respondents believe that demand will outpace supply, and few (10%) believe that traditional business intelligence professionals can fill the data scientist role. According to Hollis, demand is already outpacing supply.

2. Data Scientists Definitely Put the Science in “Data Science”

EMC data science survey - life cycle of data

Why are data scientists so hard to come by? Well, for one thing, they tend to be more educated and to perform less traditional, less easily defined tasks. BI professionals often know what metrics to analyze and what data to look for. Data scientists are in a more experimental position.

“Data scientists are essentially ‘data experimenters’ vs. rote analysis — and they’re likely to be interacting with IT functions in far more positive ways than the norm,” writes Hollis. Data scientists work across the organization and have wide-ranging tasks, including finding data sources from outside the organization. This experimentation with data is what makes their work “real” science, and it requires a background in working with quantitative data.

Data scientists tend to come from more “analytically-intensive” scientific fields, such as computer science, mathematics and engineering, rather than from traditional business backgrounds. They also tend to be highly educated: 40% of data scientists have a master’s degree or higher. “I recently met one fascinating gentleman who had three PhDs in seemingly unrelated fields,” Hollis writes.

3. Organizational Structure is an Obstacle for Data Scientists

The most common problems cited in the survey were a lack of budget or resources and a lack of training and skills (32% each). Training is naturally an issue for a new profession, and budgets are a problem for everyone. But the third issue, organizational structure, reported by 14% of respondents, may run deeper: it shows the challenges that data scientists face when trying to work across departments.

EMC survey cross-organization collaboration

Data scientists work with business management, marketing, sales, HR, IT and more. This need to work across several departments may be part of why data scientists tend to prefer smaller organizations, those with fewer than 500 employees. Finding ways to make it easier for data scientists to collaborate across silos may help larger enterprises both retain data scientists and make them more effective.

Social collaboration tools like enterprise microblogging, activity streams, video conferencing and unified communications may help. Adopting strategies from the DevOps movement could also help improve relations with the IT department. Data scientists are not necessarily developers, though many do come from computer science backgrounds, but they will have similar ways of working and will make similar demands of IT. The ability of IT to provide self-service tools, deploy changes quickly and support big data clusters will be of increasing importance.

Adding It Up

EMC survey

According to the survey, data scientists want to learn more about data storage and cloud computing. That finding may be self-serving, since data storage is EMC’s historic core strength and cloud computing is the company’s current big push, but these are still worth noting as subjects on data scientists’ minds. You can bet that our frequent coverage areas, such as storage, Hadoop and other big data processing tools, enterprise collaboration and DevOps, will continue to be big areas of interest as data scientists rise in importance in more organizations.

Lead image by Enzo Varriale