GenAI research startup Bud Ecosystem has launched Bud Runtime, a platform that enables generative AI deployment on CPU-based infrastructure, reducing costs and improving accessibility.
Aimed at tackling the rising financial and environmental costs of generative AI, Bud Runtime allows organisations to deploy AI models using their existing hardware, bypassing expensive and often scarce GPUs.
This platform supports CPU inference alongside GPUs, HPUs, TPUs, and NPUs from leading vendors such as Nvidia, Intel, AMD, and Huawei. Its standout feature—heterogeneous cluster parallelism—lets companies deploy AI workloads across mixed hardware environments, enabling smoother scalability and alleviating GPU supply constraints.
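Bud Ecosystem has not published the internals of its scheduler, but the idea behind heterogeneous cluster parallelism can be sketched in a few lines of Python. The hypothetical routine below (all device names, thresholds, and load figures are invented for illustration, not Bud Runtime's actual API) routes each inference request to the least-loaded device, preferring accelerators and spilling over to CPUs once they saturate, which is roughly the behaviour that lets mixed hardware absorb GPU shortages.

```python
from dataclasses import dataclass

@dataclass
class Device:
    name: str          # e.g. "nvidia-a100", "intel-xeon" (invented labels)
    kind: str          # "gpu", "hpu", "npu", or "cpu"
    load: float = 0.0  # current utilisation, 0.0 to 1.0

def pick_device(devices, prefer=("gpu", "hpu", "npu", "cpu")):
    """Route a request to the least-loaded device, preferring
    accelerators and spilling over to CPUs once they fill up."""
    for kind in prefer:
        candidates = [d for d in devices if d.kind == kind and d.load < 0.9]
        if candidates:
            return min(candidates, key=lambda d: d.load)
    raise RuntimeError("no capacity available in the cluster")

# A mixed cluster: the GPU is nearly saturated, so requests
# spill over to the Gaudi accelerator and then to the CPU.
cluster = [
    Device("nvidia-a100", "gpu", load=0.95),
    Device("intel-gaudi", "hpu", load=0.50),
    Device("intel-xeon", "cpu", load=0.10),
]

for _ in range(3):
    device = pick_device(cluster)
    device.load += 0.2  # pretend each request adds load
    print(f"request routed to {device.name} ({device.kind})")
```

In practice, a production scheduler would weigh throughput, memory, and model placement rather than a single load figure, but the spill-over pattern is the same.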
“We began our GenAI journey in early 2023 and quickly encountered the high cost of GPUs,” said Jithin VG, CEO, Bud Ecosystem.
To address this, the team developed the initial version of Bud Runtime, designed to run smaller models on their existing infrastructure. They later added support for mid-sized models on CPUs and ensured compatibility with hardware from Nvidia, AMD, Intel, and Huawei.
With Bud Runtime, companies can kickstart generative AI initiatives for as little as $200 per month, a fraction of the traditional cost. This makes it particularly valuable for startups and research institutions often priced out of the AI landscape.
The launch builds on Bud Ecosystem’s ongoing collaboration with major technology players, including Intel, Microsoft, Infosys, and LTIM. For the past 18 months, the company has been working with Intel to optimise generative AI inference on Intel Xeon CPUs and Gaudi accelerators.
“Our mission is to democratise GenAI at scale by commoditising it,” said Linson Joseph, chief strategy officer, Bud Ecosystem. “This is only possible if we use commodity hardware for GenAI at scale.”
Founded with a focus on fundamental AI research, Bud Ecosystem has made significant strides in transformer architectures for low-resource environments, hybrid inference models, and decentralised AI systems.
The company has published multiple research papers and released over 20 open-source models. Notably, it remains the only Indian startup to have topped Hugging Face's Open LLM Leaderboard with a model on par with GPT-3.5.