INFRA
XCENA Inc., a startup with a memory device designed to speed up artificial intelligence clusters, today announced that it has raised $135 million in funding.
The Series B round was led by Korean funds Atinum Investment and IMM Investment. XCENA says that the raise also included contributions from more than half a dozen other institutional backers. The company is now valued at $570 million.
XCENA was founded in 2022 by former employees of Samsung Electronics Co. and SK hynix Inc., the world’s top suppliers of memory for graphics cards. Its flagship product is a device called the MX1 that it describes as a computational memory controller. It’s designed to speed up the data management tasks involved in running AI inference workloads.
Large language models use a data structure called a KV cache to interpret user prompts. When the KV cache can’t fit in a graphics card’s built-in memory, it has to be offloaded to slower external DRAM, which creates processing delays. A similar issue affects the vector databases that many LLMs use to store information.
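To give a sense of why a KV cache can outgrow a graphics card's memory, the sketch below applies the standard sizing arithmetic for a transformer: one key and one value vector per layer, per attention head, per token. The model dimensions in the usage note are illustrative assumptions, not figures from XCENA or any specific LLM, and the function name is hypothetical.

```python
def kv_cache_bytes(
    num_layers: int,
    num_kv_heads: int,
    head_dim: int,
    seq_len: int,
    batch_size: int = 1,
    bytes_per_value: int = 2,  # assumes 16-bit values such as FP16
) -> int:
    """Estimate the size of a transformer KV cache in bytes."""
    # The factor of 2 covers the separate key and value tensors.
    per_token = 2 * num_layers * num_kv_heads * head_dim * bytes_per_value
    return per_token * seq_len * batch_size


def fits_in_memory(cache_bytes: int, memory_bytes: int) -> bool:
    """Return True if the cache fits in the given memory budget."""
    return cache_bytes <= memory_bytes
```

For an assumed model with 32 layers, 32 KV heads and a head dimension of 128, `kv_cache_bytes(32, 32, 128, seq_len=32768)` works out to 16 GiB for a single 32,768-token request, before counting the model's weights. Several concurrent requests of that length would quickly exceed the memory of a single accelerator.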
XCENA says the MX1 addresses the challenge. The device combines up to two terabytes of DRAM with several thousand central processing unit cores. It can hold an LLM’s KV cache and vector databases without the performance issues that affect traditional memory devices. The result is an increase in inference performance.
Another way the device accelerates AI workloads is by reducing the need for duplicate calculations. Many LLMs rebuild their KV cache after every user request, even when consecutive requests share much of the same input. The MX1 makes it possible to reuse the same KV cache across requests and thereby reduce processing overhead.
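The idea of reusing a KV cache across requests can be illustrated with a toy prefix cache. This is a minimal sketch under stated assumptions, not XCENA's API or implementation: the class and method names are hypothetical, and a string placeholder stands in for the key and value tensors a real model would compute.

```python
from typing import Dict, List, Tuple


class PrefixKVCache:
    """Toy stand-in for a KV cache shared across requests (hypothetical)."""

    def __init__(self) -> None:
        # Maps a token prefix to the per-token entries computed for it.
        self._store: Dict[Tuple[int, ...], List[str]] = {}
        # Counts how many per-token computations were actually performed.
        self.tokens_computed = 0

    def _compute_entry(self, token: int, position: int) -> str:
        # Placeholder for the expensive key/value computation.
        self.tokens_computed += 1
        return f"kv({token}@{position})"

    def process(self, tokens: List[int]) -> List[str]:
        # Find the longest prefix of this request that is already cached.
        cached_len = 0
        for n in range(len(tokens), 0, -1):
            if tuple(tokens[:n]) in self._store:
                cached_len = n
                break
        entries = list(self._store[tuple(tokens[:cached_len])]) if cached_len else []
        # Compute entries only for the tokens past the cached prefix.
        for pos in range(cached_len, len(tokens)):
            entries.append(self._compute_entry(tokens[pos], pos))
            self._store[tuple(tokens[: pos + 1])] = list(entries)
        return entries
```

If a first request with tokens `[1, 2, 3, 4]` is followed by a second with `[1, 2, 3, 9]`, the second call to `process` computes only one new entry because the three-token prefix is already stored. The toy version stores a copy of the entries for every prefix, which is wasteful; it is meant only to show where the savings come from.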
The company says the chip can also accelerate analytics applications such as Apache Spark. Such workloads regularly move data between the CPUs on which they run and the memory they use to hold data. The MX1’s memory pool and CPU cores are closer to each other than the components of a standard server, which reduces data travel times.
The device’s CPU cores are based on the open-source RISC-V architecture. They’re organized into four-core clusters that each have a dedicated L1 cache, a type of high-speed memory. The four-core clusters are organized into larger clusters that likewise have an integrated memory pool.
XCENA provides application programming interfaces that enable developers to port their AI workloads to the MX1 without major code changes. According to the company, customers with more advanced requirements have access to a second set of APIs that can be used to make low-level performance optimizations. XCENA also provides a simulation tool that eases software reliability testing.
The company plans to make the MX1 using Samsung’s four-nanometer chip manufacturing process. According to TechCrunch, it will begin mass production by the end of the year and expects to start generating revenue in 2027.
The company will use the proceeds from its funding round to develop new computational memory products. In addition, it plans to accelerate its go-to-market efforts and establish partnerships with key industry players such as hyperscalers.