OpenAI and Broadcom Unveil Jalapeño, a Custom AI Inference Chip Nine Months in the Making

Main Takeaway
OpenAI and Broadcom revealed Jalapeño, their first custom AI inference chip, after a nine-month design sprint; early samples are now being tested in OpenAI's labs.
What Jalapeño is and why it exists
OpenAI revealed its first custom artificial intelligence chip on Wednesday, a processor called Jalapeño that it developed with Broadcom. The chip is purpose-built for inference, the computationally intensive work of running trained AI models like ChatGPT and Codex to answer queries and generate code. OpenAI's own AI models assisted in the chip's development, according to TechCrunch. Early testing indicates Jalapeño delivers better performance per watt than existing state-of-the-art chips, though the company has not released specific benchmarks.
The announcement arrives eight months after OpenAI and Broadcom first disclosed their custom chip partnership in October 2025. At that time, the companies framed the collaboration around deploying 10 gigawatts of OpenAI-designed AI accelerators. Jalapeño represents the first tangible output of that agreement, with Broadcom handling manufacturing and Celestica contributing to the production pipeline. The chip has already taped out and early samples are running in OpenAI's laboratory, with deployment planned by year-end.
How Jalapeño fits into OpenAI's infrastructure strategy
Custom silicon is becoming a strategic imperative for AI labs struggling to secure enough computing power. OpenAI and competitors like Anthropic face persistent shortages of the high-end chips needed to run their most capable models. By designing its own processor, OpenAI gains control over the hardware-software interface and can optimize specifically for its model architectures rather than relying on general-purpose designs. This vertical integration mirrors strategies pursued by Google with its TPU line and Amazon with Trainium and Inferentia.
Bloomberg reports that the companies claim Jalapeño can cut costs by 50%, a figure that would dramatically reshape OpenAI's economics if it holds at scale. The chip targets inference workloads that already power ChatGPT, Codex, and OpenAI's API services. These represent the highest-volume, most cost-sensitive operations in OpenAI's business. Training new models remains enormously expensive, but inference costs dominate at scale and directly affect pricing and margins for consumer and enterprise products.
The competitive signal for Nvidia and the chip market
Jalapeño's arrival marks another crack in Nvidia's near-monopoly on AI infrastructure. Ars Technica noted that industry observers view the move toward custom solutions as a direct response to Nvidia's market dominance, which has concentrated pricing power and supply chain leverage in a single vendor. OpenAI is not alone in this pivot. Meta, Google, Amazon, and Microsoft have all invested heavily in custom or semi-custom AI chips, though most still rely on Nvidia GPUs for training and significant portions of their inference workloads.
Broadcom's role as manufacturing partner is notable. The company has built a substantial business designing custom chips for hyperscalers, including Google's TPUs, and brings deep expertise in high-volume semiconductor production. For OpenAI, partnering with Broadcom rather than attempting to build manufacturing capability internally avoids the capital intensity and technical risk of fab ownership. The nine-month design-to-tape-out timeline, reported by Digg, is aggressive by semiconductor standards and suggests the companies prioritized speed to market over maximum customization.
What happens next for deployment and scaling
OpenAI has not disclosed which foundry will manufacture Jalapeño at volume, though TSMC is the most probable candidate given Broadcom's existing relationships and the advanced process nodes required for competitive AI chips. The companies plan deployment by the end of 2026, according to Constellationr, a timeline that would require rapid yield learning and qualification if volume production has not already begun. Early samples running in OpenAI's lab represent a critical milestone, but the gap between first silicon and scaled deployment typically spans several quarters.
The 50% cost reduction claim will face scrutiny as Jalapeño moves from controlled testing to production workloads. Real-world inference efficiency depends on model architecture, batch sizes, latency requirements, and software stack optimization, all of which can diverge significantly from lab conditions. OpenAI's vertical integration strategy, outlined in its announcement, envisions a full stack from chip design through model training to consumer and enterprise applications. Whether Jalapeño becomes a meaningful volume product or remains a strategic hedge and bargaining chip depends on how these early metrics translate at data center scale over the next 18 months.
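To make the gap between lab and production conditions concrete, here is a rough back-of-the-envelope sketch in Python. It is not based on any disclosed Jalapeño figures: the `Accelerator` class, the `cost_per_million_tokens` and `relative_saving` helpers, and every price, wattage, throughput, and utilization number are hypothetical and chosen only for illustration. The model also assumes, as a simplification, that a chip draws the same power regardless of how busy it is.

```python
from dataclasses import dataclass

HOURS_PER_YEAR = 24 * 365


@dataclass(frozen=True)
class Accelerator:
    """Hypothetical accelerator profile; all values are illustrative."""
    capex_usd: float               # purchase price per chip
    lifetime_years: float          # straight-line depreciation period
    power_watts: float             # assumed constant draw while deployed
    peak_tokens_per_second: float  # throughput at 100% utilization


def cost_per_million_tokens(chip: Accelerator, utilization: float,
                            usd_per_kwh: float = 0.08) -> float:
    """Amortized hardware cost plus energy cost per one million tokens."""
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    hardware_usd_per_hour = chip.capex_usd / (chip.lifetime_years * HOURS_PER_YEAR)
    energy_usd_per_hour = chip.power_watts / 1000.0 * usd_per_kwh
    tokens_per_hour = chip.peak_tokens_per_second * utilization * 3600.0
    return (hardware_usd_per_hour + energy_usd_per_hour) / tokens_per_hour * 1_000_000


def relative_saving(baseline_cost: float, custom_cost: float) -> float:
    """Fraction by which the custom chip undercuts the baseline."""
    return 1.0 - custom_cost / baseline_cost


# Made-up profiles: a general-purpose GPU and a cheaper, lower-power custom part.
baseline = Accelerator(capex_usd=30_000, lifetime_years=4,
                       power_watts=700, peak_tokens_per_second=2_000)
custom = Accelerator(capex_usd=15_000, lifetime_years=4,
                     power_watts=500, peak_tokens_per_second=2_000)

baseline_cost = cost_per_million_tokens(baseline, utilization=0.8)

# "Lab" case: both chips kept equally busy.
lab_saving = relative_saving(
    baseline_cost, cost_per_million_tokens(custom, utilization=0.8))

# "Production" case: the custom chip is harder to keep busy, for example
# because of smaller batches or tighter latency targets.
prod_saving = relative_saving(
    baseline_cost, cost_per_million_tokens(custom, utilization=0.5))

print(f"saving at matched utilization: {lab_saving:.0%}")
print(f"saving at lower utilization:   {prod_saving:.0%}")
```

Under these invented inputs, the saving measured at matched utilization shrinks noticeably once the custom chip runs less busy than the baseline, which is one reason a headline cost figure measured in the lab may not carry over unchanged to production workloads.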
Broader implications for AI infrastructure economics
The Jalapeño announcement accelerates a structural shift in how AI compute is procured and optimized. As models grow larger and inference demand expands, the economics of generic versus custom silicon tip toward specialization. This creates both opportunity and risk for the ecosystem. Companies like Celestica, which participated in Jalapeño's development, and other contract manufacturers and design services firms stand to benefit from increased custom chip activity. Meanwhile, Nvidia's growth trajectory depends in part on maintaining sufficient performance and ecosystem advantages to justify premium pricing against these tailored alternatives.
For OpenAI specifically, custom silicon offers potential relief from one of its largest cost centers. The company has reportedly spent billions on compute infrastructure and faces continued pressure to improve unit economics as it scales. If Jalapeño delivers on its early promises, it could improve margins, enable more aggressive pricing, or fund expanded capacity. The chip also serves as a tangible demonstration of technical depth ahead of OpenAI's anticipated IPO, which CNBC reported the company has confidentially filed for. Custom silicon capability signals engineering maturity to investors evaluating long-term competitive position.
Key Points
OpenAI and Broadcom revealed Jalapeño, their first custom AI inference chip, on June 24, 2026
Jalapeño taped out after a nine-month design sprint with Broadcom and Celestica manufacturing support
Early testing shows better performance per watt than existing state-of-the-art chips
Companies claim the chip can reduce inference costs by 50% compared to current solutions
OpenAI's own AI models assisted in designing and developing the processor
Questions Answered
Jalapeño is a custom AI inference processor developed by OpenAI in partnership with Broadcom, with Celestica also contributing to production. The chip is specifically designed to run large language model inference workloads for OpenAI's products including ChatGPT and Codex.
OpenAI states that early testing shows Jalapeño delivers better performance per watt than current state-of-the-art chips, and the companies claim it can cut costs by 50%. However, Jalapeño is specialized for inference rather than training, and Nvidia remains dominant in AI training infrastructure.
OpenAI and Broadcom plan to deploy Jalapeño by the end of 2026. Early samples are already running in OpenAI's laboratory, though the timeline from first silicon to scaled data center deployment typically spans multiple quarters.
Custom silicon allows OpenAI to optimize hardware specifically for its model architectures, potentially reducing costs and improving efficiency. The move also reduces dependence on a single vendor and supports OpenAI's strategy of vertical integration across the AI stack.
OpenAI stated that its own AI models assisted in developing the chip, though the company has not detailed the specific nature of this contribution. This represents one of the first instances of AI systems participating in the design of specialized hardware for running AI workloads.