NVDA’s $1 Trillion Era Begins

We spent the past few years teaching AI how to “think” (training), but 2026 is now officially the year we show it how to “work” (inferencing). 

On Monday at this week’s GTC event in San Jose, Nvidia founder and CEO Jensen Huang (wearing his signature leather jacket, of course) unveiled a sweeping new initiative declaring that we’re entering the “Inference Era.”

Huang kicked things off by introducing Nvidia’s new Vera Rubin platform, a groundbreaking architecture that moves Nvidia from being a pure GPU play to a “full-stack” hardware supplier. Going forward, Nvidia isn’t just selling chips; it’s selling entire “AI Factories” optimized for generating tokens on an industrial scale.

Bottom line: Nvidia is now selling whole data centers as “AI Factories,” complete with the chips, software, and everything in between needed to produce AI tokens at scale. By some estimates, these new systems will generate $1 trillion in revenue for Nvidia through 2027.

Multi-Chip Merchants 

Nvidia now claims a portfolio of chips purpose-built for AI inferencing, leaving its old “single-chip wonder” days behind. The company announced seven different chips in the new platform, a show of confidence it has every right to make.

The new line is headlined by the Nvidia Groq 3 LPU (Language Processing Unit). The chip is the product of Nvidia’s $20 billion purchase of Groq and its team earlier this year, an acquisition aimed squarely at the upstart whose chips were widely believed to inference AI models faster than Nvidia’s traditional GPUs ever could.

Here’s a breakdown of the chips Nvidia sees as pillars of its new “AI Factory.”

Rubin GPU: Does the heavy lifting. Up to 10x more inference throughput than Nvidia’s current-generation Blackwell GPU.

Vera CPU: Acts as the orchestrator. 88 custom “Olympus” cores designed specifically for “agent logic.”

Groq 3 LPU: Optimized for speed. Ultra-low latency that enables fully interactive agent loops.

BlueField-4 DPU: Handles “north-south data flow” and efficiently manages something Nvidia calls “context memory.”

Together, Nvidia says the Rubin GPU + Groq 3 LPU can deliver 35x higher inference throughput per megawatt of power. Basically, every megawatt buys you far more inferencing speed on Nvidia’s new hardware, so the same work costs less to run.

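If “throughput per megawatt” sounds abstract, here’s a back-of-the-envelope sketch in Python. The absolute token rates and facility size below are hypothetical placeholders we made up for illustration; the only number taken from the announcement is the 35x ratio.

```python
# Back-of-the-envelope math for Nvidia's "35x inference throughput per
# megawatt" claim. All absolute figures are hypothetical placeholders;
# only the 35x ratio comes from the keynote.

BASELINE_TOKENS_PER_SEC_PER_MW = 1_000_000   # hypothetical baseline system
NEW_TOKENS_PER_SEC_PER_MW = BASELINE_TOKENS_PER_SEC_PER_MW * 35  # claimed gain

FACILITY_MW = 100                            # hypothetical "AI factory" footprint

baseline_tps = BASELINE_TOKENS_PER_SEC_PER_MW * FACILITY_MW
new_tps = NEW_TOKENS_PER_SEC_PER_MW * FACILITY_MW

print(f"Baseline stack: {baseline_tps:,} tokens/sec at {FACILITY_MW} MW")
print(f"Rubin + Groq 3: {new_tps:,} tokens/sec at {FACILITY_MW} MW")

# Flip it around: matching the baseline's token volume takes 1/35th the power.
print(f"Power to match baseline volume: {FACILITY_MW / 35:.1f} MW")
```
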
On the software side, Nvidia is doubling down on agents: bots you hand a task and then watch autonomously carry it out. Agents aren’t necessarily reading you the answer to your question like ChatGPT does; they’re opening spreadsheets and booking flights. It’s a little spooky and a lot useful.

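If “agent” still sounds fuzzy, the core pattern is just a loop: a model picks a tool, the runtime executes it, and the result gets fed back into the next model call. Here’s a minimal toy sketch; the function names and the canned fake_model are hypothetical stand-ins, not Nvidia’s (or anyone’s) actual agent framework.

```python
# A toy agent loop: the model picks a tool, the runtime executes it, and the
# result is fed back into the next model call. Hypothetical sketch only.

TOOLS = {
    "open_spreadsheet": lambda args: f"opened {args['path']}",
    "book_flight": lambda args: f"booked {args['route']}",
}

def fake_model(history):
    """Stand-in for an LLM call. A real agent would send `history` to an
    inference endpoint; this canned version books one flight, then stops."""
    if len(history) == 1:
        return {"tool": "book_flight", "args": {"route": "AUS -> SJC"}}
    return {"tool": "done", "result": history[-1]["content"]}

def run_agent(goal, model=fake_model, max_steps=10):
    history = [{"role": "user", "content": goal}]
    for _ in range(max_steps):              # hard cap: agents can loop forever
        action = model(history)
        if action["tool"] == "done":        # model declares the task finished
            return action["result"]
        result = TOOLS[action["tool"]](action["args"])        # execute the tool
        history.append({"role": "tool", "content": result})   # feed it back in
    return "gave up: step budget exhausted"

print(run_agent("book me a flight to GTC"))   # -> "booked AUS -> SJC"
```
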
One clue that agents are serious business: Nvidia plans to deploy them primarily through OpenClaw (formerly Moltbot), an open-source project it helped seed. After all, if these “claws” are going to run loose around your company’s servers, you’ll want serious security guardrails around them. Enter NemoClaw, Nvidia’s commercial version of OpenClaw, which adds the “security guardrails” and “OpenShell” sandboxing needed to safely deploy these AI agents on your internal network.

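What do those guardrails look like in practice? At a minimum, an allowlist and an audit log wrapped around every tool call. Here’s a hypothetical sketch of the idea; to be clear, this is not NemoClaw’s or OpenClaw’s real API.

```python
# Toy guardrails: an allowlist plus an audit log wrapped around every agent
# tool call. Hypothetical sketch only; not NemoClaw's or OpenClaw's interface.
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("guardrail")

ALLOWED_TOOLS = {"open_spreadsheet", "book_flight"}   # everything else: denied
BLOCKED_PREFIXES = ("/etc", "/root", "~/.ssh")        # keep agents out of these

def guarded_call(tool, args, tools):
    """Run a tool call only if it passes the allowlist and path checks."""
    if tool not in ALLOWED_TOOLS:
        log.warning("DENIED tool: %s", tool)
        raise PermissionError(f"tool {tool!r} is not on the allowlist")
    path = str(args.get("path", ""))
    if path.startswith(BLOCKED_PREFIXES):
        log.warning("DENIED path: %s", path)
        raise PermissionError(f"path {path!r} is off-limits to agents")
    log.info("executing %s(%s)", tool, args)          # audit trail
    return tools[tool](args)

tools = {"open_spreadsheet": lambda a: f"opened {a['path']}"}
print(guarded_call("open_spreadsheet", {"path": "/data/q3.xlsx"}, tools))
```
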
The Proof in the Hardware Pudding 

How big does Nvidia think these new inference engines can get? The company is projecting $1 trillion in revenue from these machines through 2027. It’s a number so enormous it practically can’t be questioned... until someone does.

Think about how hyperscale data centers are buying CPUs today: Nvidia is gunning for Intel and AMD with its new Vera CPU.

The theory goes that as agents get more complex, inferencing work moves beyond the GPU (which does the number crunching) and onto the CPU (which runs the branching “agent logic” that strings those computations together).

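To make that division of labor concrete, here’s a hypothetical sketch of the split: the CPU side runs the cheap, branchy control flow, and the GPU (stubbed out below) only ever sees the expensive model calls. All names here are illustrative, not any Nvidia API.

```python
# Illustrative split of one agent step: branchy "agent logic" stays on the
# CPU; the GPU stub below stands in for the heavy model calls.

def gpu_infer(prompt):
    """Stand-in for the GPU's job: the matrix math behind a model call."""
    return f"[model output for: {prompt}]"

def plan_step(state):
    """The CPU's job: branching, bookkeeping, deciding what to ask for next."""
    if not state["facts"]:
        return "gather context"
    if state["retries"] > 3:
        return "escalate to a human"
    return "draft the answer"

def agent_tick(state):
    next_move = plan_step(state)                  # cheap, latency-sensitive
    state["facts"].append(gpu_infer(next_move))   # expensive tensor work
    return state

state = {"facts": [], "retries": 0}
state = agent_tick(state)    # CPU chose "gather context"
state = agent_tick(state)    # now it asks the GPU to draft the answer
print(state["facts"])
```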

If Nvidia can convince hyperscalers that building these AI agents requires a perfectly balanced Nvidia CPU, GPU, and LPU “stack,” it corners the market, not just on AI but on the modern data center itself.

Nvidia is eating its own lunch, in other words, and investors are loving it. The company is clearly aware that its reputation as a “general purpose” player doesn’t move fast enough for modern AI. By building the specialized tools investors believe will own the next phase of AI, Nvidia is boarding a bandwagon that hasn’t even arrived yet, and it isn’t waiting around for the competition to catch up.