d-Matrix Secures $275M to Scale AI Inference Chips
Silicon Valley chipmaker d-Matrix raised $275M in Series C funding with backing from the Qatar Investment Authority to scale its in-memory AI inference platform. The company aims to cut latency and power costs as global demand shifts from training to large-scale AI deployment.
Silicon Valley-based chip designer d-Matrix is positioning itself to tackle the escalating costs of AI compute, leveraging significant backing from the Qatar Investment Authority (QIA) to challenge the dominance of traditional GPUs in the rapidly expanding inference market.
With the artificial intelligence sector pivoting from model training to large-scale deployment, d-Matrix CEO Sid Sheth argues that the industry has hit a "memory wall" that legacy hardware cannot efficiently breach. Speaking at Web Summit Qatar, Sheth outlined how his company’s "in-memory computing" architecture aims to drastically reduce latency and power consumption for generative AI workloads, a critical requirement as data centers in the Middle East and globally scramble to meet soaring demand.
The Shift to Inference
While the last decade of AI development focused heavily on training (teaching models how to process data), the next phase is defined by inference, the application of that knowledge. According to Sheth, inference already accounts for roughly 60% to 70% of total AI compute costs, a figure projected to rise as enterprises deploy reasoning models and real-time applications.
"Training is the first 20 years. Inference is the next 40," Sheth explained, drawing an analogy to human education and subsequent career application. "If you were to ask anybody which one is the bigger opportunity, it's obviously the next 40, where you monetize and learn."
The problem, he noted, is that the current infrastructure relies on GPUs designed primarily for training. Sheth described this approach as "using a sledgehammer to crack a nut," highlighting that retrofitting training hardware for inference results in massive inefficiencies in cost and energy.
Breaking the Memory Wall
The core technical bottleneck lies in the separation of compute and memory in traditional chip architectures. Data must constantly travel between the two, consuming time and power—a phenomenon known as the memory wall.
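A rough back-of-envelope calculation illustrates the imbalance. The figures below are assumed round numbers for a hypothetical accelerator and a hypothetical 70-billion-parameter model, not measurements of any specific GPU or of d-Matrix hardware; the point is only that, per generated token, streaming the model's weights out of memory takes far longer than the arithmetic performed on them.

```python
# Illustrative back-of-envelope for the "memory wall" in LLM inference.
# All figures are assumed round numbers for a hypothetical accelerator,
# not specs of any real GPU or of d-Matrix hardware.

PARAMS = 70e9          # assumed model size: 70B parameters
BYTES_PER_PARAM = 1    # assumed 8-bit weights
PEAK_FLOPS = 1e15      # assumed peak compute: 1 PFLOP/s
MEM_BANDWIDTH = 3e12   # assumed off-chip memory bandwidth: 3 TB/s

# One autoregressive decode step touches every weight once (~2 FLOPs per
# weight for a multiply-accumulate) but also has to stream every weight
# from memory to the compute units.
compute_time = (2 * PARAMS) / PEAK_FLOPS                   # seconds doing math
memory_time = (PARAMS * BYTES_PER_PARAM) / MEM_BANDWIDTH   # seconds moving data

print(f"compute-bound time per token: {compute_time * 1e3:.2f} ms")
print(f"memory-bound time per token:  {memory_time * 1e3:.2f} ms")
# With these assumptions, data movement takes ~23 ms versus ~0.14 ms of
# arithmetic, so the chip mostly idles waiting on memory -- the memory wall.
```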
d-Matrix’s solution, the "Corsair" platform, integrates memory directly onto the compute substrate. Sheth compared this architecture to urban planning in space-constrained environments. "It's like building a skyscraper," he said, describing how d-Matrix stacks memory vertically on top of the compute layer. "All the data is sitting on the top... and the data is just dropping, raining down into the compute."
This approach allows for significantly higher memory bandwidth and capacity, which Sheth claims are essential for the next generation of "agentic" AI and real-time video generation. These applications require immediate responsiveness that current GPU setups struggle to deliver efficiently at scale.
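The same imbalance shows up in power. Moving a byte from off-chip memory to the processor is commonly estimated to cost orders of magnitude more energy than the multiply-accumulate performed on it, which is the efficiency argument for stacking memory directly on the compute die. The sketch below uses assumed, illustrative per-operation energies, not figures published for Corsair or for any particular GPU.

```python
# Illustrative energy comparison: arithmetic vs. data movement.
# The per-operation energies below are rough, order-of-magnitude assumptions
# in the spirit of published circuit surveys; they are NOT measurements of
# d-Matrix's Corsair or of any specific GPU.

E_MAC_PJ = 0.5           # assumed energy of one 8-bit multiply-accumulate (picojoules)
E_ONCHIP_BYTE_PJ = 5     # assumed energy to read one byte from nearby on-chip memory
E_OFFCHIP_BYTE_PJ = 150  # assumed energy to read one byte from off-chip DRAM

PARAMS = 70e9            # hypothetical 70B-parameter model, 8-bit weights

# One decode step: roughly one MAC and one byte of weight traffic per parameter.
compute_energy = PARAMS * E_MAC_PJ
offchip_energy = PARAMS * E_OFFCHIP_BYTE_PJ   # weights streamed from off-chip DRAM
onchip_energy = PARAMS * E_ONCHIP_BYTE_PJ     # weights already sitting on the compute die

PJ_TO_J = 1e-12
print(f"arithmetic:            {compute_energy * PJ_TO_J:7.3f} J per token")
print(f"off-chip weight reads: {offchip_energy * PJ_TO_J:7.3f} J per token")
print(f"on-chip weight reads:  {onchip_energy * PJ_TO_J:7.3f} J per token")
# Under these assumptions, fetching weights from off-chip memory costs
# roughly 300x the energy of the math itself; keeping weights on the
# compute die cuts that overhead by well over an order of magnitude.
```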
Strategic Regional Alignment
The company’s recent $275 million Series C funding round, which included participation from QIA, underscores the strategic importance of the Middle East in the global AI infrastructure build-out. The region is aggressively expanding its data center capacity, driven by abundant energy resources and a push for digital sovereignty.
"Energy is the constraint in the US," Sheth noted, contrasting this with the Middle East's ability to support power-hungry facilities. He also pointed to the region's geographic advantage, serving as a connectivity hub between Europe, Africa, and Asia via extensive subsea cable networks. "Why would Netflix put a data center in the US? They will put a data center in Saudi Arabia, or they put it in Qatar... and they can serve their African customers, they can serve their European customers."
Future Outlook
As the market matures, d-Matrix predicts a fragmentation of the hardware landscape. While GPUs will remain dominant for training, Sheth envisions a heterogeneous environment where specialized chips handle specific inference workloads.
"It's not a one-size-fits-all problem," Sheth concluded. "It's going to be a world where it's GPUs plus solutions like d-Matrix... and that is the best way to really address the speed problem."
With a valuation now touching $2 billion and substantial sovereign backing, d-Matrix is betting that the future of AI will be defined not just by how smart the models are, but by how efficiently they can think.