Software is eating data science | #HPBigData2014
Automation is sending the data scientist the same way as the switchboard operator, according to Brian Weiss, the man Hewlett-Packard Co. appointed to help the transition along. As the head of subject matter experts for the company’s Autonomy business, he is in charge of finding specialists with deep knowledge in their respective fields and translating that insight into software that can address business challenges customers previously had no choice but to handle on their own.
Speaking on SiliconANGLE’s theCUBE at HP’s recently concluded Vertica Big Data Conference in Boston, Weiss highlighted that the large amount of manual work currently associated with turning raw information into useful intelligence leaves a lot to be desired from an operational standpoint. Data scientists nonetheless remain among the most sought-after and highly paid professionals in the industry, but he sees interest in the role gradually fizzling out as organizations move to streamline their analytics investments.
“Over the next five years, someone will say ‘why am I spending $500,000 on people to do this work, when I can do it with software?’ So in the same way you see software starting to do what people are doing,” Weiss told hosts John Furrier and Dave Vellante. “Increasingly, companies like HP are figuring out how to automate that, but we’re still at the very early stages and there’s so much exciting work to do.”
The relative immaturity of the data analytics movement has manifested itself in a great deal of confusion among technology and business leaders over exactly how they should go about tapping into their fast-growing information troves, according to Weiss. The lack of clearly defined expectations in particular had proven to be a major barrier to projects, but he said that is much less of an issue than it used to be. Now that they’ve developed cohesive requirements for what they want out of their data, companies have an entirely new set of organizational challenges to grapple with. At the top of the list is the continuous tug-of-war between risk aversion and the pursuit of benefit.
“You can break it down into two different poles, one of which is cost and risk, and you got a whole group of people who look at information from that perspective, and that’s where the GC [General Counsel] sits,” Weiss detailed. “And then there’s a group of people who see business value and money in that data, and they’re fighting.”
Yet as difficult as it is to balance the two priorities at the corporate level, the differences between the camps become all but blurred at the technological level. Weiss explained that a dataset an auditor might exploit to map out risk factors and identify fraud could also be useful for a front-end analyst seeking to gain a better understanding of employee sentiment. And often enough, the tools and techniques are similar as well.
“Being able to categorize and tag things is the same technology that I would need to do a Big Data conversation with my CMO and with the compliance folks,” he said. “So basically you get two pots of money to dig into.” That makes it easier to justify analytics projects, but the greater the demand, the bigger the implementation has to be. Unfortunately, data science is not particularly scalable when it’s a human that’s doing the heavy lifting, a barrier that Autonomy claims to address with its flagship IDOL search solution.
The software hooks into data sources such as document repositories and video feeds, creates a centralized index that eliminates the need to shuffle data across the network, and puts it all into the context of other information, Weiss said. That serves to eliminate much of the manual work involved in tasks like categorization, which can be beneficial for a wide range of use cases beyond just compliance, including patient care delivery.
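To make the idea of a shared index feeding automated categorization concrete, here is a minimal Python sketch. It is not IDOL's API or algorithm: the keyword rules, document IDs, sample texts and function names are all hypothetical, and plain keyword matching stands in for whatever analysis the product actually performs. It only illustrates how one index can serve both the compliance and the marketing camps Weiss describes.

```python
from collections import defaultdict

# Hypothetical keyword rules, invented for illustration only.
TAG_RULES = {
    "compliance": {"audit", "fraud", "risk"},
    "marketing": {"campaign", "sentiment", "brand"},
}

def tokenize(text):
    """Lowercase the text and strip surrounding punctuation from each word."""
    return [w.strip(".,;:!?\"'()").lower() for w in text.split()]

def build_index(documents):
    """Build a central index mapping each term to the ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            if term:
                index[term].add(doc_id)
    return index

def tag_documents(index, rules):
    """Assign a tag to every document that contains at least one of its keywords."""
    tags = defaultdict(set)
    for tag, keywords in rules.items():
        for keyword in keywords:
            for doc_id in index.get(keyword, ()):
                tags[doc_id].add(tag)
    return tags

# Made-up sample documents standing in for content pulled from different sources.
documents = {
    "memo-1": "Quarterly audit flagged possible fraud in vendor payments.",
    "note-2": "Employee sentiment improved after the brand campaign.",
    "log-3": "Audit of campaign spending found no risk.",
}

index = build_index(documents)
tags = tag_documents(index, TAG_RULES)
for doc_id in sorted(tags):
    print(doc_id, sorted(tags[doc_id]))
```

In this toy setup the documents are read once to build the index, and both sets of tags are then derived from that index rather than from the raw sources, which is the economy Weiss points to when he says the same technology serves the CMO and the compliance team.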
“Historically, what has happened in the medical industry is that all gets put down into one code, which is the billing code,” he detailed. “It’s the travesty of ETL in the medical world: at the end of everything I learned about you, I have to say ‘I have to bill it to the following code.’ So read a 500-page book and give me one code to describe it.”
Weiss detailed that IDOL enabled Stanford’s Lucile Packard Children’s Hospital to break the pattern and make use of previously neglected data such as doctor’s notes and patient descriptions to gain a better understanding of individual cases. The software has also made it possible for the hospital to tap into unstructured information from other institutions, he said, opening the door to a whole new dimension of medical collaboration and research.
photo credit: lauramappin via photopin cc