UPDATED 13:30 EST / NOVEMBER 24 2025


Kneron’s new AI chip brings LLM performance out of the cloud and onto devices

San Diego-based startup Kneron Inc., an artificial intelligence company pioneering neural processing units for the edge, today announced the launch of its next-generation KL1140 chip.

Founded in 2015, Kneron is a leader in edge AI that designs reconfigurable chips and software to process large language models and AI algorithms directly on devices rather than in the cloud.

“The KL1140 is our response to the challenges of scaling LLMs in the cloud alone,” said founder and Chief Executive Albert Liu. “By running advanced models at the edge, we’re achieving a technical milestone that opens up entirely new applications for everyday devices, putting the power of LLMs directly into the hands of users.”

According to the company, the KL1140 is the first neural processing unit chip capable of running full transformer networks at the edge. That capability brings full LLMs out of cloud data centers and into portable, locally controlled devices. Four chips working in combination can deliver performance comparable to a graphics processing unit when running models of up to 120 billion parameters.

Kneron said the same configuration consumes one-third to one-half of the total power and reduces hardware costs by as much as tenfold compared to existing cloud solutions.

LLMs have traditionally required significant compute resources and constant network connectivity, which restricts them to cloud data centers. Edge NPUs matter because they make these models both practical and private on consumer and enterprise devices. By executing inference locally, NPUs reduce latency, cut cloud costs, eliminate the need to send sensitive data off-device, and allow AI features to work even without an internet connection.

The KL1140 is designed to support real-time natural language processing, voice interfaces, intelligent vision, robotics and more. Developers can use the chip to deploy AI applications locally and securely without the need to offload processing to the cloud.

Real-world practical uses for local AI

For example, a company might build a security system that understands natural language commands, monitors video feeds and reports on complex situations. In automotive applications, the chip could run sophisticated AI models for voice commands and decision-making entirely in the car.

A private enterprise could offload its AI processing needs to a small edge server in an office, completely avoiding the potential vulnerabilities involved in sending information outside the corporate firewall.

“The arrival of the KL1140 is more than just another chip launch; it’s a tipping point in the journey toward practical, high-performance and sustainable AI,” said Liu.

Kneron enters a rapidly intensifying market for edge AI accelerators, where several established players are racing to support larger and more capable models on-device.

Qualcomm Inc. and MediaTek Inc. have each advanced their mobile NPUs with faster inference and improved transformer acceleration. Apple Inc. is continuing to expand its Neural Engine across the iPhone, iPad and Mac lineups. Google LLC’s Edge tensor processing unit is focused on vision and lightweight models. Startups such as Hailo Technologies Ltd. and Mythic Inc. are targeting low-power embedded systems with custom analog or specialized digital architectures.

Few competitors directly claim the ability to run full transformer networks or LLM-scale workloads at the edge. By positioning the KL1140 as capable of multichip scaling and 100 billion-parameter-class inference, Kneron is attempting to carve out a unique niche between mobile NPUs and data center GPUs.

Kneron has raised more than $200 million to date, including a $49 million funding round in 2023 from Hon Hai Precision Industry Co. Ltd., better known as Foxconn, the world’s largest contract electronics manufacturer, along with several other investors.

The company has rapidly evolved from an edge chip designer into a full-stack solution and AI infrastructure provider. It now delivers local AI capabilities to hospitals, universities and government agencies, particularly those requiring strict privacy or regulatory compliance.

It’s currently building out its edge AI ecosystem with the recent launch of the KNEO Pi developer platform, which the company says has attracted more than 28,000 developers worldwide.
