What you missed in Big Data: Machine intelligence


Machine-generated data returned to the agenda last week after industrial heavyweight PTC Inc. and ServiceMax Inc. joined forces to give organizations much-needed visibility into the growing number of connected devices in use throughout their operations. The duo is kicking off the effort with a jointly created tool that enables technicians to monitor for signs of equipment deterioration via the latter’s industrial automation platform.

The aptly named Connected Field Service displays error warnings from malfunctioning hardware in real time, along with a wealth of diagnostic information meant to simplify troubleshooting. Most of the work involved in handling the data is carried out by PTC’s device automation platform, ThingWorx, one of the more popular options for manufacturers seeking to tap the sensor output of their gear. The software is complemented by an analytics toolkit that makes it possible to scan the raw telemetry for insights using machine learning, which became a central talking point in its own right last week thanks to a new open-source contribution from Yahoo Inc.’s research division.

Yahoo released a 13.5-terabyte collection of user activity records that data science teams can use to test the performance and accuracy of their analytic models. The massive batch consists of over 110 million demographic and page interaction logs aggregated from the company’s biggest online properties, including Yahoo News and Yahoo Finance, over the course of four months. That makes it the biggest data set of its kind ever made available under a free license, a milestone that fellow web giant Baidu Inc. couldn’t leave unanswered.

Baidu last week open-sourced a machine learning library, originally developed to improve voice searches on its popular Chinese search engine, that is touted as dozens of times faster than competing alternatives. The software also makes much more efficient use of memory, according to Baidu, a combination that should enable data scientists to whip their algorithms into working shape much more quickly than they could until now.

Image via blickpixel