UPDATED 20:26 EDT / FEBRUARY 13 2024

AI

With Nvidia’s Chat with RTX, users can create personalized chatbots that run locally on PCs

by Mike Wheatley

Nvidia Corp. is pioneering yet another innovation in artificial intelligence with the launch of a new feature called Chat with RTX, which gives users the ability to create their own personal AI assistant that resides on their laptop or personal computer, rather than in the cloud.

The company announced Chat with RTX as a free technology demonstration today, saying it allows users to tap into personalized AI capabilities hosted on their device. The offering uses retrieval-augmented generation, or RAG, techniques and Nvidia’s TensorRT-LLM software. Even so, it’s said to go easy on computing resources, so users shouldn’t notice any decrease in their machine’s performance.

Moreover, because Chat with RTX runs on the user’s machine, chats stay on the device rather than being sent to a remote server, which keeps conversations with the chatbot private. Until now, generative AI chatbots such as ChatGPT have largely been restricted to the cloud, running on centralized servers powered by Nvidia’s graphics processing units.

That changes with Chat with RTX, which enables generative AI to run locally using the computing power of the GPU that sits inside the computer. To take advantage of it, users will need a laptop or PC that’s fitted with a GeForce RTX 30 Series GPU or a later model, such as the newly announced RTX 2000 Ada Generation GPU. They’ll also need to have at least 8 gigabytes of video random-access memory, or VRAM.

The main advantage of having a local chat assistant is that users can personalize it to their liking by deciding what sort of content it’s allowed to access to generate its responses. There are also the aforementioned privacy benefits, and it will generate responses faster too, as there’s none of the latency associated with the cloud.

Chat with RTX uses RAG techniques that enable it to augment its basic knowledge with additional data sources, including local files stored on the computer, while TensorRT-LLM and Nvidia RTX acceleration speed up response generation. In addition, Nvidia said users can choose from a range of underlying open-source large language models, including Llama 2 and Mistral.
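To make the RAG idea concrete, here is a minimal sketch in Python of the retrieval step: it scores a handful of local documents against a question, picks the best match, and folds it into a prompt for a language model. It is not Nvidia's implementation. The simple word-overlap scoring stands in for whatever retrieval method Chat with RTX actually uses, and the file names, file contents and function names are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, documents, top_k=2):
    """Return up to top_k (filename, text) pairs that best match the query."""
    query_vec = Counter(tokenize(query))
    scored = [
        (cosine_similarity(query_vec, Counter(tokenize(text))), name, text)
        for name, text in documents.items()
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    # Drop documents that share no words with the query at all.
    return [(name, text) for score, name, text in scored[:top_k] if score > 0]


def build_prompt(query, documents, top_k=2):
    """Assemble an augmented prompt plus the list of source files used."""
    hits = retrieve(query, documents, top_k)
    context = "\n".join(f"[{name}] {text}" for name, text in hits)
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
    sources = [name for name, _ in hits]
    return prompt, sources


# Hypothetical local files, inlined so the sketch is self-contained.
LOCAL_FILES = {
    "trip_notes.txt": "My partner recommended the restaurant Luna in Las Vegas.",
    "budget.txt": "Quarterly budget figures for the home renovation project.",
    "recipes.txt": "Pasta recipe with tomato and basil.",
}

prompt, sources = build_prompt(
    "Which restaurant was recommended in Las Vegas?", LOCAL_FILES, top_k=1
)
print(sources)  # ['trip_notes.txt']
```

In a real system the assembled prompt would be passed to a locally hosted model, and the returned file names are the kind of information an assistant could use to point back to its sources.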

Nvidia said the personalized assistants will be able to handle the same kinds of queries that people normally use ChatGPT for, such as asking for restaurant recommendations. They will also provide context for their responses when necessary, linking to the file from which the information was sourced.

Besides accessing local files, Chat with RTX users will also be able to specify which sources they want the chatbot to use on services such as YouTube. So they can ask their personal chat assistant to provide travel recommendations based on the content of their favorite YouTubers only, for example.

In addition to the GPU and VRAM requirements, users will need to be running Windows 10 or Windows 11 and have the latest Nvidia GPU drivers installed on their device.
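Taken together, the requirements reported here amount to a short checklist. The Python sketch below expresses it as a function. The `Machine` class and its field names are hypothetical, the GPU generation is supplied by the caller rather than detected, and the driver version is not checked at all.

```python
from dataclasses import dataclass

MIN_VRAM_GB = 8                              # minimum VRAM cited in the article
SUPPORTED_OS = {"Windows 10", "Windows 11"}  # operating systems cited in the article


@dataclass
class Machine:
    """Hypothetical description of a PC; values are supplied by the caller."""
    os_name: str
    vram_gb: float
    gpu_is_rtx_30_or_later: bool  # assumption: determined elsewhere, not detected here


def unmet_requirements(machine):
    """Return a list of human-readable requirements the machine fails to meet."""
    problems = []
    if machine.os_name not in SUPPORTED_OS:
        problems.append("operating system must be Windows 10 or Windows 11")
    if machine.vram_gb < MIN_VRAM_GB:
        problems.append(f"needs at least {MIN_VRAM_GB} GB of VRAM")
    if not machine.gpu_is_rtx_30_or_later:
        problems.append("needs a GeForce RTX 30 Series GPU or later")
    return problems


print(unmet_requirements(Machine("Windows 11", 12, True)))  # []
```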

Developers will also be able to experiment with Chat with RTX via the TensorRT-LLM RAG reference project on GitHub. The company is currently running a Generative AI on Nvidia RTX contest for developers, inviting them to submit applications that leverage the technology. Prizes include a GeForce RTX 4090 GPU and an invitation to the 2024 Nvidia GTC conference that’s slated to take place in March.

With the launch of Chat with RTX, Nvidia is moving away from the cloud and data center and looking to become a software platform for PCs, said Holger Mueller of Constellation Research Inc. “It provides the key benefits of privacy, flexibility and performance for generative AI applications that can run locally on the machine,” he explained. “For Nvidia, this is primarily about developer adoption, and that is a smart move as the biggest winners in the AI race will be the software platforms that have the most developers using them.”

Image: Nvidia
