Nvidia debuts speedier data center platform for AI-driven services
Nvidia Corp.’s graphics processing unit chips have become the foundation of the burgeoning field of machine learning, which uses software that roughly emulates parts of how the brain works to enable computers to learn on their own. GPUs’ ability to run many tasks in parallel has led to recent breakthroughs in speech and image recognition, movie recommendations and autonomous cars.
Late Wednesday, the chipmaker doubled down on its machine learning products with a new data center chip and software aimed at speeding up those services and enabling new ones such as more natural-language interactions between humans and machines.
In particular, the new platform, called the TensorRT Hyperscale Inference Platform, is focused on “inferencing,” the process of running trained deep learning neural network models, which infer conclusions from new data they’re presented with in order to carry out tasks. Distinct from training the models, which generally requires even more processing horsepower, inferencing has often been done on servers with standard central processing units inside.
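To make the distinction concrete, here is a minimal sketch in PyTorch of training versus inference; the toy model, shapes and framework choice are illustrative assumptions, not anything specific to Nvidia’s stack. Training repeatedly backpropagates gradients to update weights, while inference is a single forward pass over new data:

```python
import torch
import torch.nn as nn

# Illustrative toy model; any trained network would do.
model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))

# --- Training: forward pass, backpropagation, weight update ---
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
inputs = torch.randn(32, 128)                 # a batch of training examples
labels = torch.randint(0, 10, (32,))          # their target classes
loss = nn.functional.cross_entropy(model(inputs), labels)
loss.backward()                               # the compute-heavy gradient step
optimizer.step()

# --- Inference: forward pass only, no gradient bookkeeping ---
model.eval()
with torch.no_grad():
    prediction = model(torch.randn(1, 128)).argmax(dim=1)
```

Because inference skips the backward pass entirely, it is far lighter per request, which is why it has traditionally been feasible on CPUs and why Nvidia sees room to accelerate it on GPUs.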
At Nvidia’s GPU Technology Conference running Thursday in Tokyo, Chief Executive Jensen Huang (pictured) and his executives introduced several new products. For one, he revealed a new, relatively small, low-power chip called the Tesla T4, with so-called Turing Tensor Cores designed for inferencing. A successor to the current Tesla P4, the T4 has 2,560 cores and can run up to 260 trillion operations per second, or TOPS.
Huang also announced a refresh of Nvidia’s TensorRT software, which can speed processing up to 40 times compared with CPUs. It includes an inference optimizer, TensorRT 5, and the TensorRT Inference Server, a microservice delivered in a software “container” that can run popular AI software frameworks and integrates with the Kubernetes container orchestrator and Docker. The latter is available on Nvidia’s GPU Cloud.
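For a rough picture of how such a containerized inference microservice is consumed, here is a hedged Python sketch of a client calling it over HTTP; the URL, route, model name and JSON schema are assumptions made up for illustration, not the server’s documented API:

```python
import requests

# Hypothetical address of a containerized inference server.
SERVER = "http://localhost:8000"

def classify(features: list) -> int:
    """Send one example's features to the server, return the predicted class."""
    resp = requests.post(
        f"{SERVER}/v1/models/resnet50/infer",  # assumed route and model name
        json={"inputs": [features]},           # assumed request schema
        timeout=5.0,
    )
    resp.raise_for_status()
    return resp.json()["outputs"][0]           # assumed response schema
```

Because the server is packaged as a container, the same client code works whether the service runs locally under Docker or is scaled out across a cluster by Kubernetes.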
Ian Buck, vice president and general manager of Nvidia’s accelerated computing business, explained that data centers today run different software for different kinds of tasks, such as image recognition, search and natural language processing, which makes them less efficient. With Nvidia’s new inferencing platform, he said, those applications can be sped up on a single architecture and software stack. Google LLC, for one, will add the T4 to its public cloud, and major server makers said they would use it as well.
Nvidia claimed the use of GPUs for inferencing has already, for example, helped Microsoft Corp.’s Bing search engine cut latency by a factor of 60 and enabled SAP SE to deliver real-time brand-impact information to advertisers 40 times faster.
Also announced at the event was what Nvidia claimed is the first AI computing platform for autonomous machines, from cars to robots to drones. Specifically, there’s a new AGX line of embedded AI high-performance computing systems, part of the family that includes the DGX series for the data center and the HGX line for so-called hyperscale companies such as Google LLC and Facebook Inc.
Another new product is the Jetson AGX Xavier, a developer kit that Rob Csongor, Nvidia’s vice president for autonomous machines, said is the first AI computer for applications such as robotics. Among the partners announced for it are Komatsu Ltd. in construction, Yamaha Motor Co. Ltd. in autonomous marine and drone vehicles and Canon Inc. in factory automation vision systems. “This is our next big market and we believe it’ll be transformational,” Csongor said.
The company also put a seemingly reasonable number on the market for AI inference: $20 billion over the next five years. That could help Nvidia continue its long run of generally better-than-expected earnings results for some time to come.
Photo: Nvidia/livestream