What you missed in Big Data: Training AIs


Last week witnessed yet another escalation in the fight to win over the hearts and minds of the machine learning community. The first shot came from Google Inc., which open-sourced a homegrown automation framework designed to simplify the deployment of the complex algorithms used in recommendation engines and ad delivery platforms.

TensorFlow Serving provides the ability to package every component of a machine learning model into a self-contained module that can be managed independently from the rest of the project. The approach removes the need to refresh the entire deployment whenever a sub-system needs to be updated, which drastically reduces the amount of work involved in the process. Organizations thus gain the ability to roll out new features and performance improvements much more frequently than traditional deployment methodologies allow.

Not wanting to leave the launch of TensorFlow Serving unanswered, Facebook Inc., another major player in the machine learning community, last Thursday published a 1.6-gigabyte training dataset that can be used to hone natural-language processing algorithms. It’s made up of classic children’s stories from Project Gutenberg that have been organized into a semi-structured format for easy parsing. The contribution is the latest in a wave of such releases that previously saw Yahoo Inc. make 13.5 terabytes of user activity logs from several of its biggest online media properties available for free.

Enterprise vendors are also starting to join the fray. Salesforce.com Inc. led the way last week with the acquisition of PredictionIO Inc., the startup behind the popular open-source algorithm development software of the same name. The financial terms of the deal weren’t disclosed, but the cloud giant did reveal that it’s planning to use the technology to improve the accuracy of the machine learning capabilities in its sales automation platform.

Image via Pixabay