VentureBeat · Jun 24, 2026 15:14 UTC

OpenAI unveils first custom AI inference chip, Jalapeño, with Broadcom, and its development was sped up with OpenAI's own models

Summary

<a href="https://openai.com/index/openai-broadcom-jalapeno-inference-chip/">OpenAI</a> and <a href="https://investors.broadcom.com/news-releases/news-release-details/openai-and-broadcom-unveil-llm-optimized-intelligence-processor">Broadcom</a> this morning unveiled their first custom AI accelerator chip, named "Jalapeño," positioning it as a purpose-built processor for large language model (LLM) inference rather than a more general-purpose GPU of the kind offered by Nvidia or AMD.
According to its creators, Jalapeño is designed to support the workloads behind ChatGPT, Codex, the API and future agentic products. Notably, though, both <a href="https://openai.com/index/openai-broadcom-jalapeno-inference-chip/">OpenAI</a>'s and <a href="https://investors.broadcom.com/news-releases/news-release-details/openai-and-broadcom-unveil-llm-optimized-intelligence-processor">Broadcom's news releases</a> position it as a product that could be made available to external AI firms as well: "built from the ground up for current and future LLMs across the industry." [Emphasis mine.]
Jalapeño <a href="https://x.com/BloombergTV/status/2069791304318427137">reportedly cuts inference costs</a> by about 50%, according to Bloomberg. Recall that inference is the stage at which a finished AI model is served to end users; training, research and development carry separate, still-high costs.
Jalapeño's engineering timeline set a blistering pace for the semiconductor industry, moving from early schematics to fabrication readiness within nine months, when new processor development cycles are typically <a href="https://research.contrary.com/foundations-and-frontiers/evolution-of-chips">measured in years</a>. Indeed, the <a href="https://openai.com/index/openai-and-broadcom-announce-strategic-collaboration/">OpenAI and Broadcom partnership itself was only publicly announced</a> in October 2025. 
The companies attributed this speed to a deep software-hardware co-development process that used OpenAI’s own models to accelerate parts of the chip design. Sources close to the firms told VentureBeat the development process relied on prior-generation OpenAI models, though an OpenAI spokesperson declined to specify which ones when asked by VentureBeat.
After receiving an early physical model on Wednesday, OpenAI outlined plans to begin rolling out these processors across active data centers by the end of this year. OpenAI says it has already begun running at least one of its prior-generation models, <a href="https://openai.com/index/introducing-gpt-5-3-codex-spark/">GPT‑5.3‑Codex‑Spark</a>, on the chips under a production workload, though in a test environment.
The release marks a major strategic expansion for the ChatGPT creator as it attempts to build the full computational stack required to make advanced AI faster, more reliable, and more accessible. There remain, of course, <a href="https://x.com/IamEmily2050/status/2069789441984753707">many outstanding questions</a>, including how the new Jalapeño chip performs compared to direct competitors, what it costs, and whether it can be manufactured at scale. Sources close to the company described the initial performance itself as, ironically, "outstanding."
Greg Brockman, OpenAI's president and co-founder, <a href="https://www.cnbc.com/video/2026/06/24/openai-president-greg-brockman-on-new-chip-this-is-a-real-performance-improvement.html">appeared on CNBC</a> alongside Broadcom CEO Hock Tan this morning to discuss the news. Brockman said in the interview that "this is a real performance improvement...on performance per watt and performance per dollar." In a <a href="https://x.com/gdb/status/2069809298612621629">separate post on X, Brockman wrote</a> that "Perf[ormance] per watt looking incredible." 
<h2>Why OpenAI Built an ASIC</h2>To understand why OpenAI is moving into chip design, it helps to look at the architecture. Jalapeño is an Application-Specific Integrated Circuit, or ASIC. Unlike a GPU, which can handle many types of workloads, an ASIC is tuned for narrower uses, as <a href="https://medium.com/@danny_54172/asic-inference-vs-non-inference-ai-chips-a5f1a5f05183">industry experts note</a>. That narrower focus can make it cheaper and more efficient for specific AI tasks, though less adaptable than Nvidia-style GPUs.
In Jalapeño’s case, OpenAI is starting from a clean design focused on modern LLM serving, instead of adapting a broader accelerator to fit its needs. The company says the architecture is shaped by its experience running large-scale AI products and is meant to reduce unnecessary data movement while better matching compute, memory and networking resources.
Broadcom is contributing core silicon implementation and networking technology, including Tomahawk networking silicon, while Celestica is helping with board, rack and system integration. The goal is to move the chip closer to its practical performance ceiling in real workloads, not just to improve theoretical benchmarks.
However, OpenAI's pivot into proprietary hardware is not just a quest for technical supremacy: it may also make the company's core unit economics far more sustainable. Audited financial <a href="https://www.wheresyoured.at/exclusive-openai-financials/">documents posted recently by AI critic and AI public relations specialist Ed Zitron</a> revealed that while OpenAI generated an impressive $13.07 billion in revenue throughout 2025, its total operational expenses for the year ballooned to $34 billion, resulting in an operating loss of nearly $20.92 billion. The primary culprit behind this cash hemorrhage was compute, though more of it is likely attributable to training than to inference. 
In 2025 alone, research and development costs, driven largely by the infrastructure required to train and serve massive language models, accounted for $19.18 billion, or approximately 56 percent of the company's total spending. Furthermore, OpenAI reportedly paid Microsoft over $10.59 billion just for R&D and compute infrastructure last year.
Still, as OpenAI lays the groundwork for a heavily anticipated public offering in 2026, the Jalapeño inference chip may offer some reassurance to private investors and public markets that OpenAI has a plan for digging itself out of its financial hole and moving toward profitability. If it can drive down the cost of AI inference, then it may be able to recoup some of the losses incurred on costly training runs.
"By designing more of the stack ourselves, we can serve more intelligence with greater efficiency and keep pushing advanced AI toward broader access," said Brockman in a statement included in <a href="https://investors.broadcom.com/news-releases/news-release-details/openai-and-broadcom-unveil-llm-optimized-intelligence-processor">Broadcom's release</a>.<h2>What Does This Mean for Nvidia and All of OpenAI's Other Chip Providers?</h2>The introduction of Jalapeño immediately raises questions about OpenAI's strategic positioning within the fiercely competitive semiconductor and GPU market. 
Since kicking off the generative AI boom in late 2022, OpenAI has remained one of the largest customers for GPU market leader Nvidia's premium products, but it has also <a href="https://nvidianews.nvidia.com/news/openai-and-nvidia-announce-strategic-partnership-to-deploy-10gw-of-nvidia-systems">taken billions in investment dollars from the firm</a> (engendering <a href="https://www.theguardian.com/business/2025/oct/08/openai-multibillion-dollar-deals-exuberance-circular-nvidia-amd">accusations of "circular dealing"</a>) and expanded to work with rival chipmakers to fuel its appetite for compute.<ul><li>Nvidia: In February 2026, <a href="https://openai.com/index/scaling-ai-for-everyone/">Nvidia finalized a $30 billion direct investment into OpenAI</a> as part of a massive $110 billion funding round. This deal secured an agreement to deploy 10 gigawatts of computing systems (including 3 gigawatts of dedicated inference capacity and 2 gigawatts of training capacity) using Nvidia's next-generation Vera Rubin platform. Sources close to the companies tell VentureBeat that Nvidia will remain central to OpenAI, particularly on the model training and development side.</li><li>Amazon Web Services (AWS): As part of the same February 2026 funding round, <a href="https://openai.com/index/amazon-partnership/">Amazon invested $50 billion into OpenAI</a>. This deal included a commitment for OpenAI to consume approximately two gigawatts of AWS's proprietary Trainium computing capacity over the next eight years.</li><li>Advanced Micro Devices (AMD): OpenAI signed agreements with <a href="https://openai.com/index/openai-amd-strategic-partnership/">Nvidia's chief hardware rival, AMD</a>, to use AMD Instinct™ MI450 Series GPUs. 
</li><li>Cerebras: The company also struck a <a href="https://openai.com/index/cerebras-partnership/">pact with Cerebras</a>, an AI chipmaker that executed its initial public offering in May 2026.</li></ul>Sources with knowledge of these deals said they currently remain in place, unaltered. <h2>The Global Silicon Arms Race: OpenAI Joins AI Infrastructure Heavyweights</h2>Before the introduction of Jalapeño, OpenAI operated at a distinct structural disadvantage compared to the world's vertically integrated technology empires. Tech giants like Google and Amazon have for years used their own mature custom silicon programs (Google's Tensor Processing Units, or TPUs, and Amazon's Trainium lines) to serve massive computational workloads at drastically lower cost.
Microsoft, OpenAI's primary cloud provider and single biggest financial backer, aggressively entered the bespoke silicon market by launching the <a href="https://news.microsoft.com/source/features/ai/in-house-chips-silicon-to-service-to-meet-ai-demand/">Azure Maia 100 accelerator in late 2023</a>. Microsoft subsequently escalated this effort in <a href="https://blogs.microsoft.com/blog/2026/01/26/maia-200-the-ai-accelerator-built-for-inference/">January 2026 by introducing the Maia 200</a>, an inference powerhouse built on TSMC's 3-nanometer process that already powers OpenAI's GPT-5.2 models within Azure data centers.
Similarly, Meta has aggressively expanded its Meta Training and Inference Accelerator (MTIA) portfolio in recent years, debuting the <a href="https://ai.meta.com/blog/meta-mtia-scale-ai-chips-for-billions/">MTIA 300, 400, 450, and 500 series</a> to power its recommendation engines and generative artificial intelligence features without relying solely on Nvidia.
Jalapeño gives OpenAI the opportunity to match and offset the hyperscaler advantage. 
By baking its software architecture directly into a proprietary processor, OpenAI has the chance to replicate, at least in part, the playbook used by Google, Amazon, Microsoft, and Meta, transitioning from a captive cloud customer into a more independent AI infrastructure provider.
The timing is ripe amid a rapidly escalating global silicon arms race. Driven in part by United States export restrictions, Chinese tech heavyweights are pursuing more of their own custom AI chip hardware, too:<ul><li>In May, Alibaba's semiconductor division, T-Head, unveiled the <a href="https://www.cnbc.com/2026/05/19/alibaba-reveals-more-powerful-zhenwu-ai-chip-new-llm.html">Zhenwu M890</a>, a proprietary processor expressly engineered for autonomous AI agents that require massive memory bandwidth and long-running context windows.</li><li>Huawei is reportedly gearing up to release its new <a href="https://www.huaweicentral.com/huawei-confirms-ascend-950dt-ai-chip-to-debut-in-august/">Ascend 950DT</a> chip next month.</li><li>ByteDance, the corporate parent of TikTok, <a href="https://finance.yahoo.com/technology/articles/qualcomm-explores-custom-chip-partnership-105336403.html">reportedly entered active negotiations with Qualcomm in June 2026</a> to design custom application-specific integrated circuits for its data centers and reduce its dependence on third parties.</li></ul>By finalizing the Jalapeño design, OpenAI is seeking to move beyond the traditional confines of a software laboratory and stand shoulder-to-shoulder with international cloud and infrastructure titans. <h2>The Gigawatt Future</h2>This sprawling web of vendor agreements highlights the sheer scale of OpenAI's infrastructural ambitions. 
The ultimate goal of the OpenAI and Broadcom partnership is to deploy gigawatt-scale data centers with Microsoft and other partners beginning in 2026: that is, data centers with compute <a href="https://www.reddit.com/r/technology/comments/1fqnmfp/openai_reportedly_wants_to_build_five_to_seven_5/">requiring energy on the order of cities</a>.
For Broadcom, the partnership acts as a massive reputational catalyst. The company has been among the biggest beneficiaries of the generative AI boom, helping hyperscalers and frontier labs engineer custom silicon.
Broadcom shares reflect this momentum, showing an <a href="https://investors.broadcom.com/news-releases/news-release-details/broadcom-inc-announces-second-quarter-fiscal-year-2026-financial">18% year-over-year increase in the first part of 2026</a> and a nearly 7X boost since the end of 2022, according to <a href="https://www.cnbc.com/2026/06/24/openai-and-broadcom-reveal-jalapeno-first-ai-chip-in-partnership.html">CNBC</a>.
Ultimately, Jalapeño confirms that OpenAI believes it is ready to move beyond software and code into the realm of real-world, custom hardware. By controlling the physics of its inference pipeline, while simultaneously leveraging the capital and hardware of Nvidia, Amazon, AMD, and Cerebras, OpenAI is attempting to rapidly rewrite the future unit economics of its AI.
