Amazon and Cerebras Systems have announced a partnership to integrate their computing chips into a new artificial intelligence inference service hosted on Amazon Web Services (AWS). The service aims to accelerate AI workloads such as chatbots, coding assistants, and other generative AI applications.
Under the agreement, Cerebras processors will be installed inside AWS data centers and connected with Amazon’s Trainium3 custom AI chips through proprietary networking technology. The companies did not disclose the financial terms of the partnership.
Cerebras, valued at roughly $23.1 billion, has developed a wafer-scale AI chip architecture designed to compete with Nvidia’s processors. Unlike many AI accelerators, Cerebras chips do not rely on high-bandwidth memory, a costly component used in Nvidia’s flagship GPUs.
Cerebras Chief Executive Andrew Feldman said the partnership could expand access to the company’s technology through AWS’s large customer base.
“Every customer large or small is on AWS, from individual developers to the largest banks in the world,” Feldman said. “This will make it easy as a click to get on Cerebras.”
Divide-and-Conquer Inference Architecture
The joint service will focus on inference, the stage where trained AI models generate responses to user prompts. Instead of running the entire inference workload on a single chip architecture, Amazon and Cerebras plan to split the process between two specialized systems.
Amazon’s Trainium3 chips will handle the “prefill” stage, which processes the tokens of a user’s prompt and builds the context the model needs before it can respond. Cerebras processors will manage the “decode” stage, in which the model generates output tokens one at a time to produce its response.
According to Feldman, this “divide and conquer” approach is intended to improve speed and efficiency for AI inference workloads.
The strategy reflects a broader industry trend toward specialized AI infrastructure where different processors handle distinct parts of AI pipelines to improve performance and cost efficiency.
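To make the split concrete, the sketch below shows, in simplified Python, how a disaggregated inference pipeline of this general kind hands work from a prefill worker to a decode worker. It is a hypothetical illustration only: the names ToyModel, PrefillWorker, and DecodeWorker are invented for this example and are not part of any AWS, Trainium, or Cerebras software.

```python
# A minimal, hypothetical sketch of "divide and conquer" inference: one worker runs the
# compute-heavy prefill pass over the whole prompt, a second worker runs the
# token-by-token decode pass. All class names here are illustrative assumptions.
import random

class ToyModel:
    """Stand-in for a transformer; a real system would hold weights and attention state."""
    def __init__(self, vocab_size=100):
        self.vocab_size = vocab_size

    def prefill(self, prompt_tokens):
        # Process every prompt token in one parallel pass and return the cached
        # context (here just the token list) that the decode stage will reuse.
        return list(prompt_tokens)

    def decode_step(self, cache):
        # Generate exactly one new token, conditioned on the cached context.
        next_token = random.randrange(self.vocab_size)
        cache.append(next_token)
        return next_token

class PrefillWorker:   # in the article's scheme, this role would map to Trainium3
    def __init__(self, model):
        self.model = model
    def run(self, prompt_tokens):
        return self.model.prefill(prompt_tokens)

class DecodeWorker:    # and this role would map to the Cerebras processors
    def __init__(self, model):
        self.model = model
    def run(self, cache, max_new_tokens=8):
        return [self.model.decode_step(cache) for _ in range(max_new_tokens)]

if __name__ == "__main__":
    model = ToyModel()
    prompt = [3, 14, 15, 9, 26]                  # pretend these are tokenized user input
    cache = PrefillWorker(model).run(prompt)     # stage 1: prefill builds the context
    output = DecodeWorker(model).run(cache)      # stage 2: decode generates the response
    print("generated tokens:", output)
```

In a real deployment the handoff between the two workers would carry the model's cached attention state over the network, which is where the proprietary interconnect between Trainium3 and Cerebras hardware would come into play.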
Competition in AI Infrastructure
The partnership comes amid intensifying competition in AI hardware and infrastructure. Nvidia currently dominates the AI accelerator market, but several companies are attempting to develop alternatives to its GPU-based architecture.
Analysts expect Nvidia to unveil a similar multi-chip strategy soon. Reports indicate the company may combine its GPUs with chips from startup Groq, which Nvidia acquired in a deal valued at $17 billion late last year.
Amazon said its service, expected to launch in the second half of the year, could offer better price-performance compared with GPU-based solutions.
The company added that its Trainium chip roadmap, including the upcoming Trainium4 processor, is designed to provide an alternative to third-party AI hardware by optimizing cost and efficiency for AWS customers.