‘Insights are worth pennies; decisions and actions are worth dollars’ | #GuestOfTheWeek
The industry is focused more than ever on using Big Data and analytics to gather real-time intelligence that can accelerate business decision-making. To continue the discussion on Big Data, open source, data security and the ecosystem as a whole, industry experts met at Hadoop Summit and Red Hat Summit, both of which took place this week.
Joseph Sirosh, corporate VP of the Data Group at Microsoft, was a keynote speaker at both events, where he demonstrated the value Microsoft technology and open-source projects can produce. He is SiliconANGLE Media’s Guest of the Week.
Sirosh caught up with John Furrier (@furrier) and George Gilbert (@ggilbert41), cohosts of theCUBE, from the SiliconANGLE Media team, at the Hadoop Summit in San Jose, California, to discuss the past, present and future of Big Data, analytics and artificial intelligence (AI).
Data is the new currency
As the interview began, Furrier asked about the impact data has on society and why data is at a premium. Sirosh quickly challenged that premise, explaining that data alone is useless.
“I have a saying: ‘Insights are worth pennies; decisions and actions are worth dollars.’ Data is useless. The gap between data and taking intelligent action … that’s now being empowered with machine learning and AI. That machine learning and AI is deeply embedded in databases, in real-time databases, so that applications can be built that are very powerful, empowered [by] data and predictive intelligence.”
The perfect storm
Furrier spoke about the maturation of the industry and asked Sirosh for his thoughts on the development of machine learning and AI. Sirosh responded that the gathering of data has rapidly advanced both fields.
“So why has AI dramatically changed in the last few years compared to 30, 40, 50 years ago when it was started? It’s all a part of data. There is data on every kind of behavior, on every kind of creature on the planet, on every human being and on so many scenarios. So you can learn from the data, and then you can become truly intelligent. You can model the phenomenon so you can be predictive. And when you have an enormous amount of data, you can integrate all that data in the cloud, with an enormous amount of compute. And that allows you to create.”
What’s ahead
Furrier then asked Sirosh about Microsoft’s upcoming innovation strategies for data. Sirosh spoke about combining APIs to further intelligent applications.
“We are packaging intelligence into cloud-hosted APIs. For example, face-detection APIs, speech-recognition APIs, translation APIs, OCR APIs; the list is actually very long. So, on the cloud, on Microsoft Azure today, you will see cognitive APIs for a large variety of tasks, and they are ever increasing.
“So when you have this finished API, application development becomes very simple. You just glue those applications and APIs together, and you get a very powerful application, all written in the cloud, supported with SLAs of the cloud and backed by a company like Microsoft. That creates unreasonable speed in developing intelligent applications.”
Approach to security
Following up on the API discussion, Furrier asked the next obvious question: “What is Microsoft’s approach to security?”
“This is one of the areas where Microsoft differentiates itself with its Active Directory for role-based access control, so Active Directory is the way most enterprises run their Office 365 deployments and user authentication and access control. It is seamlessly integrated with Microsoft Azure, and you do have identity and verification. Then we have encryption support. We have all other types of things that we have layered on: for example, SQL Database has fraud detection. It’s intelligently scanning all types of the SQL running against it and alerting the customer about the things they should know about.”
Filling a hole in the ecosystem
Gilbert talked about a hole in the Hadoop ecosystem and asked Sirosh how Microsoft is pulling everything together. SQL Server was his answer.
“This is one of the areas we focused on in a big way with SQL Server 2016: operational analytics. Let me explain. So SQL Server 2016 has in-memory OLTP [online transaction processing] transactions. It can support up to 12 terabytes of main memory in one server, by the way, and at the same time, you have real-time updatable column stores for real-time analytics. Then we integrate R [The R Project for Statistical Computing], which is open source, with capabilities of machine learning, deep into the database so that you can have intelligent algorithms running next to aggregations created in real time off of real-time transactions that are coming in. So you can have real-time scoring of models happening in those systems.
“This all comes packaged in a database, which has great security, which [provides] great capabilities and high availability and replication, and all the things database engineers [have built] over the years. … Here’s the transformation that happened; it’s not just a database, it’s an intelligence base. Meaning intelligent models built out of R are managed [in] the database system, [a] management system that now extends to predictive intelligence as well. So when you bring all that together, that is when you have the ability to build extremely powerful applications. Operational analytics applications that can take in real-time data, can do real-time analytics and drive real-time decisions, such as quantitative trading in financial markets that customers are using.”
Analytics in the present
According to Furrier, the holy trinity of analytics is past, present and future. He asked Sirosh about the present. Sirosh explained how Microsoft has built a simplified answer to getting information in real time.
“In the cloud, for example, in the Cortana Intelligence Suite, we have this great service called Azure Stream Analytics. What [it allows] you to do is put a standing query in the flow of data, and because it is [a] SQL query, it’s actually very easy to describe so you can have complex event modeling. You don’t have to write Java code and compile it; it is just an SQL query that is standing in the flow, and it aggregates in real time as the data flows through. And those aggregates can go to dashboards … aggregates can go to all machine learning APIs and get results back, so you can do a thing like finding anomalies in the data stream, or if machines are about to fail, or the predictive analytics from the streaming data. That is an incredibly powerful capability.
“Then second, we’re in the cloud again on our HDInsight Hadoop service; you have a layer of Spark. We’re a big fan of Spark as well, and Spark supports streaming analytics. You can now combine Spark streaming with batch analytics with powerful machine-learning capabilities from our server. All of that deployed in the cloud as a service.”
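To make the "standing query" idea concrete, here is a minimal sketch in plain Python of a query that sits in a stream, aggregates events per fixed time window and passes each closed window on for anomaly checking. It is only an illustration of the concept Sirosh describes, not Azure Stream Analytics code or its query language. The event shape, the function names (`tumbling_average`, `flag_anomalies`) and the threshold are all hypothetical, and the sketch assumes events arrive in timestamp order.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical event shape: (timestamp_seconds, device_id, reading)
Event = Tuple[float, str, float]


def tumbling_average(
    events: Iterable[Event], window_seconds: float
) -> Iterator[Tuple[float, Dict[str, float]]]:
    """Yield (window_start, {device_id: mean_reading}) as each window closes.

    Assumes events arrive in timestamp order (an assumption of this sketch).
    """
    current_window = None
    sums: Dict[str, float] = defaultdict(float)
    counts: Dict[str, int] = defaultdict(int)

    for ts, device, value in events:
        # Fixed, non-overlapping windows: [0, w), [w, 2w), ...
        window = (ts // window_seconds) * window_seconds
        if current_window is not None and window != current_window:
            # The previous window is complete, so emit its aggregates.
            yield current_window, {d: sums[d] / counts[d] for d in sums}
            sums.clear()
            counts.clear()
        current_window = window
        sums[device] += value
        counts[device] += 1

    # Emit the final, still-open window once the stream ends.
    if current_window is not None:
        yield current_window, {d: sums[d] / counts[d] for d in sums}


def flag_anomalies(averages: Dict[str, float], threshold: float) -> List[str]:
    """Return device ids whose windowed mean exceeds a (hypothetical) threshold."""
    return sorted(d for d, avg in averages.items() if avg > threshold)


if __name__ == "__main__":
    stream = [(0.0, "a", 10.0), (5.0, "a", 20.0), (7.0, "b", 100.0), (12.0, "a", 30.0)]
    for start, averages in tumbling_average(stream, window_seconds=10.0):
        print(start, averages, flag_anomalies(averages, threshold=90.0))
```

In a managed streaming service, the windowing and grouping would be declared in the query itself rather than hand-written; the generator above only mimics that behavior so the flow from raw events to aggregates to alerts is visible.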