UPDATED 18:15 EDT / MARCH 09 2023


Computer vision a crucial bridge between AI and human intelligence, says Roboflow CEO

The ultimate aim of modern computing advancements, such as artificial intelligence and machine learning, is to make as much of the human experience as possible programmable.

And with the advances in generative AI being led by companies such as Roboflow Inc., we might be witnessing the maturing of computer vision and a broad expansion of what modern software can do.

“Roboflow exists to really make the world programmable,” said Joseph Nelson (pictured), co-founder and chief executive officer of Roboflow. “And our North Star is enabling developers predominantly to build that future. But the limiting reactant is how to enable computers and machines to understand things as well as people can. And, in many ways, computer vision is that missing element that enables anything you see to become software. If software is eating the world, computer vision makes the aperture infinitely wide.”

Nelson spoke with theCUBE industry analyst John Furrier at the AWS Startup Showcase: “Top Startups Building Generative AI on AWS” event, during an exclusive broadcast on theCUBE, SiliconANGLE Media’s livestreaming studio. They discussed the current state of AI and how the playing field has advanced from just a few years ago. (* Disclosure below.)

LLMs and their impact on the AI landscape

Everyone’s talking about large language models, such as ChatGPT and Bard, and taking advantage of their vast spectrum of functions. However, even these super-capable tools have a notable deficiency, according to Nelson.

“The rise of large language models is showing what’s possible, especially with text,” he explained. “Although there’s this core missing element of understanding. The rise of large language models creates this new area of generative AI. In the context of computer vision, it is a lot of creating video and image assets and content. There’s also this whole surface area to understanding what’s already created — basically digitizing physical, real-world things.”

In essence, computer vision links virtual, AI-driven experiences to the physical ones with which we interact in our everyday lives. And mirroring these experiences will be crucial in cases such as the budding metaverse, Nelson added.

“The metaverse can’t be built if we don’t know how to mirror, create or identify the objects that we wanna interact with in our everyday lives,” he said. “Where computer vision comes to play, especially with what we’ve seen at Roboflow, is a little over 100,000 developers now have built with our tools over 10,000 pre-trained models using more than 100 million labeled open-source images.”

Human intuition and decision-making, as advanced as they are, remain fallible. Generative AI, as expressed in these LLMs, imbues computers with the logic, reasoning and critical thinking to fully understand visual and auditory input cues and compensate for human shortcomings, Nelson concluded.

Computer vision today vs. a few years ago

Computer vision describes a set of processes by which machines and computers gain the ability to act on visual data as effectively as humans do. Typically, these capabilities have seen immense use in tasks such as object identification, classification and manipulation.

“Then you have key point detection, which is where you see athletes on screen and each of their joints is outlined,” Nelson explained. “This is another more traditional type of problem in signal processing and computer vision.”

The subfield is reimagining what’s possible within artificial intelligence, setting the course for far greater precision and accuracy in carrying out tasks. This has already occurred in the case of Rivian Automotive Inc., an electric car company and Roboflow customer.

“One of our customers Rivian, in tandem with AWS, is tackling visual quality assurance and manufacturing in production processes,” Nelson explained. “Now, only Rivian knows what a Rivian is supposed to look like. Only they know the imagery of what their goods that are gonna be produced are. And then between those long tails of proprietary data with highly specific things in the center of the curve, you have a whole kind of messy middle type of problem.”

Machine learning model requirements are only going to become more complex. And as that happens, companies are going to rely on techniques such as computer vision to efficiently and effectively feed those models with the most important resource of all: data.

“My mental model for how computer vision advances is this: You have that bell curve, and you have increasingly powerful models that eat outward,” Nelson stated. “And multimodality has a role to play in that; larger models also have a role to play in that. The existence of more compute and data also has a role to play in that.”

Here’s the complete video interview, part of SiliconANGLE’s and theCUBE’s coverage of the AWS Startup Showcase: “Top Startups Building Generative AI on AWS” event:

(* Disclosure: Roboflow Inc. sponsored this segment of theCUBE. Neither Roboflow nor other sponsors have editorial control over content on theCUBE or SiliconANGLE.)

Photo: SiliconANGLE
