2026-03-23 16:00 UTC

Startup Gimlet Labs is solving the AI inference bottleneck in a surprisingly elegant way

Gimlet Labs just raised an $80 million Series A for tech that lets AI run across NVIDIA, AMD, Intel, ARM, Cerebras and d-Matrix chips, simultaneously.

Julie Bort
9:00 AM PDT · March 23, 2026

Stanford adjunct professor and successfully exited founder Zain Asgar just raised an $80 million Series A for a startup that solves the AI inference bottleneck in an astute way.

The round was led by Menlo Ventures.

The company, Gimlet Labs, has created what it claims is the first and only “multi-silicon inference cloud”: software that allows a single AI workload to run simultaneously across diverse types of hardware.

It can split an AI app’s work across both traditional CPUs and AI-tuned GPUs, as well as high-memory systems.

“We basically run across whatever different hardware that’s available,” Asgar told TechCrunch.

A single agent may chain together multiple steps, and each “requires different hardware: Inference is compute-bound; decode is memory-bound; and tool calls are network-bound,” writes lead investor, Menlo’s Tim Tully, in a blog post about the funding.
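To make Tully's point concrete, here is a minimal sketch of how a scheduler might route each step of an agent to a different class of hardware based on its bottleneck. It is an illustration only, not Gimlet Labs' actual software: the `Stage` class, the `place` function, the hardware class names, and the bottleneck-to-hardware pairings are all hypothetical assumptions made for this example.

```python
from dataclasses import dataclass

# Hypothetical pairing of a step's bottleneck with a hardware class.
# These pairings are assumptions for illustration, not a real placement policy.
PREFERRED_HARDWARE = {
    "compute": "gpu",          # compute-bound steps go to accelerators
    "memory": "high_memory",   # memory-bound steps go to high-memory systems
    "network": "cpu",          # network-bound steps go to ordinary CPUs
}


@dataclass
class Stage:
    name: str
    bottleneck: str  # "compute", "memory", or "network"


def place(stages, available):
    """Assign each stage to its preferred hardware class.

    If the preferred class is not available, fall back to the
    alphabetically first available class so the result is deterministic.
    """
    if not available:
        raise ValueError("no hardware available")
    fallback = sorted(available)[0]
    plan = {}
    for stage in stages:
        preferred = PREFERRED_HARDWARE.get(stage.bottleneck)
        plan[stage.name] = preferred if preferred in available else fallback
    return plan


# The three step types named in the quote above.
agent_steps = [
    Stage("inference", "compute"),
    Stage("decode", "memory"),
    Stage("tool_call", "network"),
]

plan = place(agent_steps, {"gpu", "high_memory", "cpu"})
for step, hardware in plan.items():
    print(f"{step} -> {hardware}")
```

With all three hardware classes available, each step lands on a different class. If only CPUs are available, every step falls back to them, which loosely mirrors Asgar's description of running across whatever hardware is on hand.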
