UPDATED 19:51 EST / APRIL 21 2020

AI

AWS and Facebook join on new open-source projects for PyTorch

Amazon Web Services Inc. and Facebook Inc. jointly announced a couple of new open-source projects today for PyTorch, a popular open-source machine learning framework used to train artificial intelligence models.

PyTorch was created by Facebook’s AI research group as a machine learning library of functions for the Python programming language. It’s primarily designed for deep learning, a branch of machine learning that attempts to emulate the way the human brain works and that has led to major breakthroughs in areas such as language translation, image recognition and voice recognition.

PyTorch is designed to speed up the development of these kinds of AI capabilities. Facebook has previously used it to build more realistic avatars for its Oculus virtual reality headsets, and researchers at UC Berkeley have used it to accelerate their work on image-to-image translation, for example.

The new PyTorch projects announced today include TorchServe, a model-serving framework that makes it easier for developers to move new PyTorch models into production. The second is TorchElastic, a library developers can use to build fault-tolerant distributed training jobs that run on Kubernetes clusters, including Amazon’s Elastic Kubernetes Service, or on Amazon EC2 Spot Instances.

According to a blog post by Amazon, TorchServe can serve models written in standard Python as well as models exported to TorchScript, PyTorch’s serializable model format. The main benefit is that it lets developers run multiple versions of a model at the same time and even roll back to a previous version of that model.
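The announcement itself doesn’t include code, but as a rough illustration of the two model formats TorchServe accepts, a standard eager-mode PyTorch model can be exported to TorchScript with a few lines of Python before being packaged for serving. The model choice and file name below are illustrative, not taken from the companies’ blog posts:

    import torch
    import torchvision

    # Load a pretrained eager-mode (plain Python) model.
    model = torchvision.models.resnet18(pretrained=True)
    model.eval()

    # Trace it with an example input to produce a TorchScript module,
    # which can be saved to disk and later packaged for serving.
    example = torch.rand(1, 3, 224, 224)
    scripted = torch.jit.trace(model, example)
    scripted.save("resnet18.pt")

The saved TorchScript file can be loaded back with torch.jit.load, independently of the Python code that defined the model.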

As for TorchElastic, it enables users to scale the compute resources behind cloud-based AI model training up or down according to their needs without interrupting the job. It’s meant for large, distributed machine learning workloads such as natural language processing and computer vision, the companies said.

“The integration of Kubernetes and TorchElastic allows PyTorch developers to train machine learning models on a cluster of compute nodes that can dynamically change without disrupting the training job,” Facebook’s blog post reads. “The built-in fault-tolerant capabilities of TorchElastic allow training to continue even if nodes go down during the training process. This can take the form of things like server maintenance events, network issues, or the preemption of a server node.”
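Neither company published sample code alongside the announcement, but fault tolerance of this kind generally depends on workers periodically checkpointing their state so that a replacement node can pick up a job where it left off rather than starting over. The following is a minimal, generic sketch of that pattern in plain PyTorch; the checkpoint path and helper functions are assumptions for illustration, not TorchElastic’s actual API:

    import os
    import torch

    CHECKPOINT_PATH = "checkpoint.pt"  # assumed shared storage visible to all nodes

    def save_checkpoint(model, optimizer, epoch):
        # Persist everything needed to resume training after a node failure.
        torch.save({"model": model.state_dict(),
                    "optimizer": optimizer.state_dict(),
                    "epoch": epoch}, CHECKPOINT_PATH)

    def load_checkpoint(model, optimizer):
        # On a fresh start there is nothing to restore; begin at epoch 0.
        if not os.path.exists(CHECKPOINT_PATH):
            return 0
        state = torch.load(CHECKPOINT_PATH)
        model.load_state_dict(state["model"])
        optimizer.load_state_dict(state["optimizer"])
        return state["epoch"] + 1  # resume from the next epoch

The idea is that when cluster membership changes, restarted workers call a loader like this and continue from the last saved epoch instead of losing the whole run.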

Meanwhile, the PyTorch 1.5 release includes a stable C++ front-end application programming interface that reaches parity with the Python API, making it possible to author and run models in C++ as well as Python.

PyTorch 1.5 also comes with upgraded torchvision, torchtext and torchaudio libraries, the companies said. The torch_xla package that enables PyTorch to be used with Google Cloud Tensor Processing Unit chips has also been updated. Facebook first added support for Google Cloud TPUs at its annual PyTorch developer conference in San Francisco last October.

Image: Facebook
