Dwarkesh Podcast · 2026-04-15

Jensen Huang on Nvidia's AI Compute Strategy, TPU Competition, and China Chip Policy

Hosts: Dwarkesh

Guests: Jensen Huang

AI hardware · Accelerated computing · CUDA ecosystem · Supply chain · TPU competition · China chip policy · AI compute market

Why it matters

Nvidia focuses on accelerated computing, supporting diverse workloads beyond AI tensor operations, differentiating from TPU and ASIC approaches.

Key claims

Nvidia focuses on accelerated computing, supporting diverse workloads beyond AI tensor operations, differentiating from TPU and ASIC approaches.
The CUDA ecosystem and programmability are core competitive advantages, enabling rapid innovation and broad adoption across industries and clouds.
Nvidia secures supply chain advantages through large purchase commitments and close collaboration with suppliers like TSMC, memory makers, and packaging firms.
Bottlenecks in chip manufacturing and supply are temporary (2-3 years) and manageable through investment and ecosystem shaping.

Episode summary

In this episode of the Dwarkesh Podcast, Jensen Huang, CEO of Nvidia, discusses the company's unique position in the AI hardware ecosystem, emphasizing Nvidia's role in accelerated computing beyond just AI tensor processing. Huang explains Nvidia's strategy of building a broad ecosystem across the AI stack, focusing on programmability, software (CUDA), and supply chain commitments to maintain leadership. He addresses competition from Google's TPUs, highlighting Nvidia's flexibility and extensive market reach compared to ASICs.

Huang also delves into Nvidia's supply chain moat, describing how large purchase commitments and close partnerships with suppliers like TSMC and memory manufacturers secure scarce components critical for scaling AI compute. He acknowledges bottlenecks but is confident they can be managed within a few years. On the topic of selling chips to China, Huang argues that China already has significant compute capacity and AI talent, and that engagement and open ecosystems are preferable to isolation. He stresses the importance of maintaining U.S. leadership in AI hardware while supporting a global developer ecosystem built on American technology.

Throughout, Huang emphasizes Nvidia's philosophy of doing "as much as necessary, as little as possible," supporting AI startups broadly without picking winners, and enabling a diverse cloud ecosystem rather than becoming a cloud provider itself. He also discusses the evolving AI compute market, including segmentation by response time and token pricing, and the broader applications of accelerated computing beyond AI.

Nvidia supports AI startups and new cloud providers by investing and enabling their growth but avoids becoming a cloud provider itself.
China already possesses significant AI compute capacity, talent, and chip manufacturing; Nvidia advocates engagement and open ecosystems rather than isolation.
Maintaining U.S. leadership in AI hardware requires balancing export controls with global ecosystem participation to avoid ceding markets and influence.
Nvidia is innovating in AI inference segmentation, offering different response time and throughput trade-offs to meet diverse customer needs.

Source material

Transcript

Dwarkesh: We've seen the valuations of a bunch of software companies crash because people are expecting AI to commoditize software. And there's a potentially naive way of thinking about things, which is: look, Nvidia sends a GDSII file to TSMC. TSMC builds the logic dies, it builds the switches, then it packages them with the HBM that SK Hynix, Micron, and Samsung make, then it sends it to an ODM in Taiwan where they assemble the racks. So Nvidia is fundamentally making software that other people are manufacturing. And if software gets commoditized, does Nvidia get commoditized?

Jensen Huang: Well, in the end something has to transform electrons to tokens. The transformation of electrons to tokens, and making those tokens more valuable over time, I think that's hard to completely commoditize. The transformation from electrons to tokens is such an incredible journey. It's like making one molecule more valuable than another molecule. Making one token more valuable than another: the amount of artistry, engineering, science, and invention that goes into making that token valuable, obviously we're watching it happen in real time. So the transformation, the manufacturing, all of the science that goes in there is far from deeply understood, and the journey is far from over. So I doubt that it will happen. We're going to make it more efficient, of course. In fact, the way that you framed the question is my mental model of our company. The input is electrons, the output is tokens, and in the middle is Nvidia. Our job is to do as much as necessary, as little as possible, to enable that transformation to be done at incredible capabilities. What I mean by as little as possible: whatever I don't need to do, I partner with somebody and make it part of my ecosystem. If you look at Nvidia today, we probably have the largest ecosystem of partners, both in supply chain upstream and supply chain downstream: all of the computer companies, all the application developers, all the model makers. AI is a five-layer cake, if you will, and we have ecosystems across all five layers. So we try to do as little as possible, but the part that we have to do, it turns out, is insanely hard. And I don't think that gets commoditized.

In fact, I also don't think that the enterprise software companies get commoditized. Most of the software companies today are tool makers. Some of them are not; some of them are workflow codification systems. But a lot of companies are tool makers. For example, Excel is a tool, PowerPoint is a tool, Cadence makes tools, Synopsys makes tools. I actually see the opposite of what people see. I think the number of agents is going to grow exponentially, the number of tool users is going to grow exponentially, and it's very likely that the number of instances of all these tools is going to skyrocket. It is very likely that the number of instances of Synopsys Design Compiler is going to skyrocket, along with the number of agents using the floor planners and all of our layout tools and design rule checkers. Today we're limited by the number of engineers. Tomorrow those engineers are going to be supported by a bunch of agents, and we're going to be exploring the design space like you've never seen it explored before, using the tools that we use today. So I think tool use is going to cause the software companies to skyrocket. The reason it hasn't happened yet is that the agents aren't good enough at using the tools yet. So either these companies are going to build the agents themselves, or agents are going to get good enough to use those tools. I think it's going to be a combination of both.

Dwarkesh: I think in your latest filings you had almost a hundred billion dollars in purchase commitments with foundries, memory, and packaging. And SemiAnalysis has reported that you will have two hundred and fifty billion dollars of these kinds of purchase commitments. So one interpretation is that Nvidia's moat is really that you've locked up many years of these scarce components. Somebody else might have an accelerator, but can they actually get the memory to build it? Can they actually get the logic to build it? And this is really Nvidia's big moat for the next few years.

Jensen Huang: Well, it's one of the things that we can do that is hard for someone else to do. We've made enormous commitments upstream. Some of it is explicit, these commitments that you mentioned. Some of it is implicit. For example, a lot of the upstream investments are made by our supply chain because I said to the CEOs: let me tell you how big this industry is going to be, let me explain why, let me reason through it with you, and let me show you what I see. As a result of that process of informing, inspiring, and aligning with CEOs of all different industries upstream, they're willing to make the investments. Now why are they willing to make the investments for me and not someone else?

And the reason for that is because they know that I have the capacity to buy their supply and sell it through my downstream. Because Nvidia's downstream supply chain and downstream demand are so large, they're willing to make the investment upstream. If you look at GTC, people marvel at the scale of GTC and the people that go. It's 360 degrees, the entire universe of AI all in one place, and they're all in one place because they need to see each other. I bring them together so that the downstream can see the upstream, the upstream can see the downstream, and all of them can see all the advances in AI. And very importantly, they can all meet the AI natives and all the AI startups that are being built and all the amazing things that are happening, so that they can see firsthand all the things that I tell them. So I spend a lot of my time informing, directly or indirectly, our supply chain, our partners, and our ecosystem about the opportunity that's in front of us. Some people always say, you know, Jensen, most keynotes are one announcement after another announcement after another. In our keynotes there's always a part that's a little torturous, in the sense that it almost comes across like an education. And in fact, that's exactly what's on my mind. I need to make sure that the entire supply chain, upstream and downstream, the ecosystem, understands what is coming at us, why it's coming, when it's coming, and how big it's going to be, and is able to reason about it systematically, just like I reason about it.

And so the moat, as you describe it: we're able to build for a future. If our next several years is a trillion dollars in scale, we have the supply chain to do it. Without our reach and the velocity of our business, it wouldn't happen. Just as there's cash flow, there's supply chain flow; there are turns. Nobody's going to build a supply chain for an architecture if the business turns are low. So our ability to sustain the scale is only because our downstream demand is so great, and they see it, they all hear about it, they see it all coming. That allows us to do the things that we're able to do at the scale we're able to do them.

Dwarkesh: I do want to understand more concretely whether the upstream can keep up. For many years now you've been 2x-ing revenue year over year. You've been more than tripling the amount of flops you're providing to the world year over year. You're about 2x-ing at this scale now, which is really, really...

Jensen Huang: Exactly.

Dwarkesh: So then you look at logic, say. You're the biggest customer on TSMC's N3 node, and you're one of the biggest on N2. AI as a whole is going to be 60% of N3 this year and 86% next year, according to some analysis. How do you 2x if you're the majority? And how do you do that year over year?

So are we in a regime now where the growth rate in AI compute has to slow? Upstream, do you see a way to get around these constraints? How do we build 2x more fabs year over year, ultimately?

Jensen Huang: Yeah, at some level the instantaneous demand is greater than the supply, upstream and downstream, in the world. At any instant we could be limited by the number of plumbers, which actually happens.

Dwarkesh: The plumbers are invited to next year's GTC.

Jensen Huang: By the way, great idea. But that's a good condition. You want a market, an industry, where the instantaneous demand is greater than the total supply of the industry. The opposite is obviously less good. If one particular component falls too far behind, the industry obviously swarms it. For example, notice people aren't talking very much about CoWoS anymore. The reason is that for two years we swarmed the living daylights out of it. We doubled on several doubles, and now I think we're in fairly good shape. TSMC now knows that CoWoS supply has to keep up with the rest of the logic demand and the memory demand. So they're scaling CoWoS, and they're scaling future packaging technologies, at the same level as they scale logic, which is terrific. For a long time CoWoS was rather specialty and HBM memory was rather specialty. But they're not specialties anymore. People now realize they're mainstream computing technology.

And of course, we're now much more able to influence a larger scope of our supply chain than in the past. All the things that I say now, I was saying five years ago, at the beginning of the AI revolution. Some people believed it and invested in it. For example, Sanjay and the Micron team. I still remember the meeting really well, where I was clear about exactly what was going to happen and why it was going to happen, the predictions that led to today. They really doubled down on it, and we partnered with them across LPDDR and HBM memories. They really invested in it, and it has obviously been tremendous for the company. Some people came a little bit later, but now they're all here.

Each one of these bottlenecks gets a great deal of attention, and now we're prefetching the bottlenecks years in advance. For example, with the investments that we've made with Lumentum and Coherent and the whole silicon photonics ecosystem over the last several years, we really reshaped the ecosystem and the supply chain. For silicon photonics, we built up an entire supply chain around TSMC. We partnered with them on COUPE, invented a whole bunch of technology, and licensed those patents to the supply chain to keep it nice and open. So we're preparing the supply chain through the invention of new technologies, new workflows, new testing equipment, double-sided probing, investing in companies, and helping them scale up their capacity. You can see that we're trying to shape the ecosystem, the supply chain, so that it's ready to support the scale.

Dwarkesh: It seems like some bottlenecks are easier than others. Scaling up CoWoS versus scaling up...

Jensen Huang: I went to the hardest one, by the way, which is plumbers.

Dwarkesh: Yeah.

Jensen Huang: I actually went to the hardest one. Plumbers and electricians. And this is one of the concerns that I have about the doomers describing the end of work and the killing of jobs. If we discourage people from being software engineers, we're going to run short of software engineers. The same kind of prediction was made 10 years ago. Some of the doomers were telling people, whatever you do, don't be a radiologist. Some of those videos are still on the web: radiology is going to be the first career to go, the world's not going to need any more radiologists. Guess what we're short of? Radiologists.

Dwarkesh: Okay, so going back to this point. Some things you can scale. But other things, like how do you actually manufacture 2x the amount of logic year over year? Ultimately that's the bottleneck. Memory and logic are the bottleneck. How do you get to 2x as many EUV machines a year, year over year?

Jensen Huang: None of that is impossible to scale quickly. All of that is easy to do within two or three years. You just need the demand signal. Once you can build one, you can build ten. And once you can build ten, you can build a million. These things are not hard to replicate.

Dwarkesh: How far down the supply chain do you go? Do you go to ASML and say, hey, if I look out three years from now, for Nvidia to be generating two trillion a year in revenue, we need way more EUV machines?

Jensen Huang: Some of them I have to do directly, some of them indirectly. And if I can convince TSMC, ASML will be convinced. We have to think about the critical pinch points. But if TSMC is convinced, you'll have plenty of EUV machines in a few years. My point is that none of the bottlenecks last longer than two or three years. None of them. Meanwhile, we're improving computing efficiency by 10x, 20x, and in the case of Hopper to Blackwell, some 30x to 50x. We're coming up with new algorithms. Because CUDA is so flexible, we're developing all kinds of new techniques so that we drive efficiency in addition to increasing capacity. So none of that worries me. It's the stuff that's downstream from us, like energy policies that prevent energy from growing. You can't create an industry without energy. You can't create a whole new manufacturing industry without energy. We want to re-industrialize the United States. We want to bring back chip manufacturing and computer manufacturing and packaging. We want to build new things like EVs and robots, and we want to build AI factories. You can't build any of these things without energy, and those things take a long time. But more chip capacity? That's a two-to-three-year problem. More CoWoS capacity? A two-to-three-year problem.

Dwarkesh: Interesting. I feel like I have guests tell me the exact opposite things sometimes. In this case I just don't have the technical knowledge to adjudicate.

Jensen Huang: Well, the beautiful thing is you're talking to the expert.

Dwarkesh: That's true. Okay, I want to ask about your competitors. If you look at TPU, arguably two out of the top three models in the world, Claude and Gemini, were trained on TPU. What does that mean for Nvidia going forward?

Jensen Huang: Well, we build a very different thing. What Nvidia built is accelerated computing, not a tensor processing unit. Accelerated computing is used for all kinds of things: molecular dynamics and quantum chromodynamics. It's used for data processing: data frames, structured data, unstructured data. It's used for fluid dynamics and particle physics. And in addition, we use it for AI. So accelerated computing is much more diverse. Although AI is the conversation today, and is obviously very important and impactful, computing is much broader than that. What Nvidia has done is reinvent the way computing is done, from general-purpose computing to accelerated computing. Our market reach is far greater than any TPU or any ASIC could possibly have. If you look at our position, we're the only company that accelerates applications of all kinds. We have a gigantic ecosystem, so all kinds of frameworks and algorithms run on Nvidia. And because our computers are designed to be operated by other people, anyone who's an operator can buy our systems. With most of these home-built systems, you have to be your own operator, because they were never designed to be flexible enough for other people to operate. As a result of the fact that anybody can operate our systems, we're in every cloud, including Google Cloud, Amazon, Azure, and OCI. If you want to operate it to rent, you'd better have a large ecosystem of customers across many industries to be the off-takers. If you want to operate it for yourself, we obviously have the ability to help you do that, like, for example, for Elon with xAI. And because we can enable operators in any company in any industry, you could use it to build a supercomputer for scientific research and drug discovery at Lilly. We can help them operate their own supercomputer and use it for the entire diversity of drug discovery and biological sciences that we accelerate. So there are a whole bunch of applications that we can address that you can't with TPUs. Nvidia built CUDA to be a fantastic tensor processing unit as well, but it also handles every part of the life cycle of data processing and computing and AI and so on. So our market opportunity is just a lot larger, and our reach is a lot greater. And because we support basically every application in the world now, you can build Nvidia systems anywhere and know that there will be customers for them. It's a very different thing.

Dwarkesh: This is going to be sort of a long question. You have spectacular revenue, and you're not making 60 billion a quarter from pharma and quantum. You're making it because AI is an unprecedented technology that is growing unprecedentedly fast. So then the question is what is best for AI specifically. I'm not in the details, but I talk to my researcher friends and they say: look, a TPU is this big systolic array that's perfect for doing matrix multiplies, whereas a GPU is very flexible. It's great when you have lots of branching, when you have irregular memory access. But what is AI? Just these very predictable matrix multiplies again and again. And you don't have to give up any die area for warp schedulers to switch between threads and memory banks. So the TPU is really optimized for the bulk of this growth in revenue and use cases for compute that is coming online right now. I wonder how you react to that.

Jensen Huang: Matrix multiply is an important part of AI, but it's not the only part of AI. If you want to come up with a new attention mechanism, or disaggregate in a different way, or come up with a whole new type of architecture altogether, for example a hybrid SSM, or create a model that fuses diffusion and autoregressive somehow, you want an architecture that's just generally programmable. And we run everything you can imagine. That's the advantage: it allows for the invention of new algorithms a lot more easily, because it's a programmable system. And the ability to invent new algorithms is really what makes AI advance so quickly. TPUs, like anything else, are impacted by Moore's Law, and we know that Moore's Law is increasing about 25% per year. So the only way to really get 10x leaps, 100x leaps, is to fundamentally change the algorithm and how it's computed every single year. That's Nvidia's fundamental advantage. When I first announced that Blackwell was going to be 35 times more energy efficient than Hopper, nobody believed it. And then Dylan wrote an article saying that in fact I sandbagged, and it's actually 50 times. You can't reasonably do that with just Moore's Law. The way we solved that problem is new models, MoEs, parallelized and disaggregated and distributed across a computing system. Without the ability to really get down and come up with new kernels with CUDA, that's really hard to do. So it's the combination of the programmability of our architecture and the fact that Nvidia is an extreme co-design company, where we can even offload some of the computation into the fabric itself, into NVLink for example, or into the network with Spectrum-X, and that we can effect change across the processors, the system, the fabric, the libraries, and the algorithms. All of that was done simultaneously. Without CUDA to do that, I wouldn't even know where to start.

Dwarkesh: My sponsor Crusoe was among the first clouds to offer Nvidia's Blackwell and Blackwell Ultra platforms, and they just announced their Nvidia Vera Rubin deployment, scheduled for later this year. But state-of-the-art hardware is only part of the story. For example, most inference engines already do KV caching for a single user's forward passes, but Crusoe does it across users and GPUs. So if 1,000 agents are running on the same system prompt, Crusoe only has to compute the KV cache once for it to become available to every single GPU in the cluster. This is especially important as systems get more agentic and require much longer prefixes in order to use tools and access files. In a recent benchmark, Crusoe was able to deliver up to 10 times faster time to first token and up to 5 times better throughput than vLLM. This is just one among many reasons that you should run your inference workload with Crusoe. And if you need GPUs for training, you don't need to switch clouds; Crusoe has got you covered there too. Go to crusoe.ai/dwarkesh to learn more.

So this gets to an interesting question about Nvidia's clientele, where 60% of your revenue is coming from these big five hyperscalers. In a different era, with different customers, let's say professors running experiments, they are helped a bunch by CUDA. They need it. They can't use another accelerator. They need to just run PyTorch with CUDA and have everything optimized. But these hyperscalers have the resources to write their own kernels. In fact, they have to, to get that extra last 5% that they need for their specific architecture. Anthropic and Google are mostly running their own accelerators, or running TPUs and Trainium. But even OpenAI, using GPUs, has Triton, where they said, we need our own kernels. They've gone down to the CUDA C++ level, and instead of using cuBLAS and NCCL and everything, they've got their own stack, which compiles to other accelerators as well. So if most of your customers can and do make replacements for CUDA, to what extent is CUDA really the thing that is going to make frontier AI happen on Nvidia?

Jensen Huang: CUDA is a rich ecosystem. If you want to build on any computer first, building on CUDA first is incredibly smart. Because the ecosystem is so rich, we support every framework. If you want to create custom kernels, for example, we contribute enormously to Triton, so the back end of Triton has huge amounts of Nvidia technology. We're delighted to help every framework become as great as it's going to be. And there are lots and lots of frameworks. There's Triton, there's vLLM, and there's more. Now there's a whole bunch of new reinforcement learning frameworks coming out. You've got veRL, you've got NeMo RL, you've got a whole bunch of new ones. With post-training and reinforcement learning, the whole area is just exploding. So if you want to build on an architecture, building on CUDA makes a ton of sense, because you know that the ecosystem is great. You know that if something goes wrong, it's more likely in your code and not in the mountain of CUDA underneath. Don't forget the amount of code that you're dealing with when you're building these systems. When something doesn't work, was it you or was it the computer? You would like it always to be you, and to be able to trust the computer. Obviously we still have lots and lots of bugs ourselves, but our system is so well wrung out that you can at least build on top of the foundation. So that's number one: the richness of the ecosystem, the programmability of it, the capability of it.

The second thing is, if you're a developer building anything at all, the single most important thing you want is install base. You want the software that you write to run on a whole bunch of other computers. You're not building software just for yourself. You're building software for your fleet, or for everybody else's fleet, because you're a framework builder. Nvidia's CUDA ecosystem is ultimately its great treasure. We are now at, I don't know how many, several hundred million GPUs. Every cloud has it. It goes back to A10, A100, H100, H200, the L series, the P series. There's a whole bunch of them, in all kinds of sizes and shapes. And if you're a robotics company, you want that CUDA stack to actually run in the robot itself. We're literally everywhere. The install base means that once you develop the software, once you develop the model, it's going to be useful everywhere. So the install base is just incredibly valuable.

And then lastly, the fact that we're in every single cloud makes us genuinely unique. If you're an AI company, an AI developer, you're not exactly sure which CSP you're going to partner with and where you would like to run. We run everywhere, including on-prem for you if you like. So the richness of the ecosystem, the expansiveness of the install base, and the versatility of where we are: that combination makes CUDA invaluable.

Dwarkesh: That makes a lot of sense. I guess the thing I'm curious about is whether those advantages matter a lot to your main customers. There are many people for whom they might matter, but the kind of customer who can actually build their own software stack makes up most of your revenue. Especially if you go to a world where AI is getting especially good at the things which have tight verification loops, where you can run RL on them. And this question of how you write a kernel that does attention or an MLP most efficiently across a scale-up domain is a very verifiable sort of feedback loop. So can all the hyperscalers write these custom kernels for themselves? Nvidia still has great price performance, so they might still prefer to use Nvidia. But then does it just become a question of who is offering the best specs, the best flops and memory and memory bandwidth for a given dollar? Historically, Nvidia has had, and still has, the best margins in all of AI across hardware and software, 70% plus, because of this CUDA moat. And the question is, can you sustain those margins if most of your customers can actually afford to build instead of relying on the CUDA moat?

Jensen Huang: The number of engineers we have assigned to these AI labs... the reason for that is that nobody knows our architecture better than we do. And these architectures are not as general-purpose as a CPU. A CPU is kind of like a Cadillac. It's a nice cruiser. It never goes too fast, everybody drives it pretty well, it's got cruise control, everything is easy. But in a lot of ways Nvidia's GPUs, our accelerators, are kind of like F1 racers. I could imagine everybody being able to drive one at 100 miles an hour, but it takes quite a bit of expertise to push it to the limit. We use a ton of AI to create the kernels that we have, and I'm pretty sure we're still going to be needed for quite some time. Our expertise helps our AI lab partners get another 2x out of their stack, easily, oftentimes. It's not unusual that by the time we're done optimizing their stack or a particular kernel, their model is sped up by 3x, 2x, 50%. That's a huge number, especially when you're talking about the install base of the fleet that they have, all the Hoppers and Blackwells. When you increase it by a factor of 2, that directly translates to revenues. Nvidia's computing stack has the best performance per TCO in the world, bar none. Nobody can demonstrate to me that any single platform in the world today has a better performance-to-TCO ratio. Not one company. And in fact, the benchmarks are out there. Dylan's InferenceMAX is sitting out there for everybody to use, and not one TPU will come, Trainium won't come. I encourage them to use InferenceMAX and demonstrate their incredible inference cost. It's really, really hard. Nobody wants to show up at MLPerf. I would welcome Trainium to demonstrate the 40% that they claim all the time. I would love to see them demonstrate the cost advantage of TPUs. It makes no sense in my mind. On first principles, it makes no sense. So I think the reason we're so successful is simply that our TCO is so great.

There's a second thing. You say 60% of our revenue comes from the top five, but most of that business is external. For example, most of Nvidia in AWS is for external customers, not internal use. At Azure, obviously all of our customers are external. All of our customers at OCI are external, not internal use. The reason they favor us is that our reach is so great. We can bring them all of the great customers in the world, who are all built on Nvidia. And the reason all these companies are built on Nvidia is that our reach and our versatility are so great. So I think the flywheel is really install base, the programmability of our architecture, the richness of our ecosystem, and the fact that there are so many AI companies in the world. There are tens of thousands of them now. And if you were one of those AI startups, what architecture would you choose?

You would choose the architecture that's most abundant. We're the most abundant in the world. The one that has the largest install base. We're the largest install base. And one that has a rich ecosystem. And so that's the flywheel. That's the reason why. It's the combination of, one, our perf per dollar is so great that they have the lowest-cost tokens. Second, our perf per watt is the highest in the world. And so if one of our partners built a one gigawatt data center, that one gigawatt data center had better deliver the maximum amount of revenues, that number of tokens, which directly translates to revenues. You want it to generate as many tokens as possible, to maximize the revenues for that data center. We're the highest tokens-per-watt architecture in the world. And then lastly, if your goal is to rent out the infrastructure, we have the most customers in the world. And so that's the reason why the flywheel works.

Interesting. I guess the question comes down to, what is the actual market structure here?

Because even if there are other companies... There could have been a world where there are tens of thousands of AI companies that have roughly equal shares of compute. But even through these five hyperscalers, really the people on Amazon using the compute are Anthropic and these big foundation labs, who can themselves afford, and have the ability, to make different accelerators work.

I think your premise is wrong.

Let me ask you a slightly different question.

Come back and make me correct your premise.

Let me just ask you a different question, which is...

Make sure to make me come back and fix it, because it's just too important to AI. It's too important to the future of science. It's too important to the future of the industry. That premise.

Let me just first... So if all these things are true about price performance and performance per watt...

It's all true.

Why do you think it is the case that, say, Anthropic, for example, just announced a couple of days ago that they have a multi-gigawatt deal with Broadcom and Google for TPUs? And the majority of their compute, obviously, for Google, is TPUs. But look at these big AI companies. It seems like there was some point where they were all on Nvidia, and now they're not. And so I'm curious how to square it: if these things are true on paper, why are they going with other accelerators?

Yeah. Anthropic is a unique instance and not a trend. Without Anthropic, why would there be any TPU growth at all?

It's 100% Anthropic. Without Anthropic, why would there be any Trainium growth at all?

It's 100% Anthropic. I think that's fairly well known and well understood. It's not that there's an abundance of ASIC opportunities. There's only one Anthropic.

But OpenAI has deals with AMD. They're building their own Titan accelerator.

Yeah, but I think we could all acknowledge they're vastly Nvidia. And we're going to still do a lot of work together. And I'm not offended by other people using something else and trying things. If they don't try these other things, how would they know how good ours is?

You know, and sometimes you've got to be reminded of it. And we have to continuously earn the position that we're in. There are always big claims. And look at the number of ASICs that have been cancelled. Just because you're going to build an ASIC, you still have to build something better than Nvidia. And it's not that easy building something better than Nvidia. It's not sensible, actually. You know, Nvidia's got to be missing something seriously. And because of our scale, our velocity, we're the only company in the world that's cranking it out every single year. Big leaps every single year.

I guess their logic is that, hey, it doesn't need to be better. It just needs to be not more than 70% worse, because they're paying you 70% margins.

No, no, don't forget, even ASIC margins are really quite high. Nvidia's margin is 70%, let's say, but the ASIC margin is 65%. What are you really saving?

You mean from Broadcom?

Yeah, sure. You've got to pay somebody. And so I think the ASIC margins are incredibly good, from what I can tell. And they believe so too. And so they're quite proud of their incredible ASIC margins. And so you ask the question why. A long time ago, we just didn't have the ability to do it. And at the time, I didn't deeply internalize how difficult it would be to build a foundation AI lab like OpenAI and Anthropic, and the fact that they need huge investments from their suppliers themselves. We just weren't in a position to make the multibillion-dollar investment into Anthropic so that they could use our compute. But Google and AWS were, and they put in huge investments in the beginning so that Anthropic in return used their compute. We just weren't in a position to do so at the time. I would say my mistake is I didn't deeply internalize that they really had no other options, that a VC would never put five or ten billion dollars of investment into an AI lab with the hopes of it turning out to be Anthropic. And so that was my miss. But even if I had understood it, I don't think we would have been in a position to do that at the time. But I'm not going to make that same mistake again. And I'm delighted to invest in OpenAI, and I'm delighted to help them scale, and I believe it's essential to do so. And then when I was able to, when Anthropic came to us, I was delighted to be an investor, delighted to help them scale. But we just weren't able to do so at the time. If I could rewind everything, and Nvidia could have been as big back then as we are now, I would have been more than happy to do it.

This is actually quite interesting, which is, for many years Nvidia has been the company in AI making money, making lots of money. And now you're investing it. It's been reported that you've done up to 30 billion in OpenAI and 10 billion in Anthropic. But now their valuations have increased, and I'm sure they'll continue to increase. And so over these many years, you were giving them the compute, you saw where it was headed, and they were worth like one tenth what they are now a couple of years ago, or even a year ago in some cases. And you had all this cash. There's a world where either Nvidia itself becomes a foundation lab and does the huge investment to make that possible, or makes the deals you've made now, at current valuations, much earlier on. And you had the cash to do it. So I'm curious, actually, why not have done it earlier?

We did it as soon as we could have. And if I could have, I would have done it even earlier. At the time that Anthropic needed us to do it, we just weren't in a position to do it. It wasn't in our sensibility to do so.

How? Is it like a cash thing, or just...?

Yeah, the level of investment. We never invested outside the company at the time, and not that much. And we didn't realize we needed to. You know, I always thought that they could just go raise from VCs, for God's sake, like all companies do. But what they were trying to do couldn't have been done through VCs. What OpenAI wanted to do couldn't have been done through VCs. And I recognize that now. I didn't know it then. You know, but that's their genius. That's why they're smart. And so they realized they had to do something like that, and I'm delighted that they did. And even though we caused Anthropic to have to go to somebody else, I'm still happy that it happened. Anthropic's existence is great for the world. I'm delighted for it.

I guess you still are making a ton of money.

And we're making way more money, quarter after quarter. It's still okay to have regrets.

So the question still arises: okay, well, now that we're here, you have all this money that you keep making. What should Nvidia be doing with it? And there's one answer, which is: look, there's this whole middleman ecosystem that has popped up for converting capex into opex for these labs so they can rent compute. Because the chips are really expensive. They make a lot of money over their lifetime, because the AI models are getting better and the value of the tokens they generate is increasing, but they're expensive to set up. And Nvidia has the money to do the capex. And in fact, it's been reported you're backstopping CoreWeave up to 6.3 billion and have invested 2 billion. But yeah, why doesn't Nvidia become a cloud itself? Why doesn't it become a hyperscaler itself and rent out this compute? You have all this cash to do it.

This is a philosophy of the company, and I think it's wise: we should do as much as needed, as little as possible. And what that means is, the work that we do with building our computing platform, if we don't do it, I genuinely believe it doesn't get done. If we didn't take the risk that we take, if we didn't build NVLink, if we didn't build the whole stack, if we didn't create the ecosystem the way we did it, if we didn't dedicate ourselves to 20 years of CUDA while losing money most of that time, nobody else would have done it. If we didn't create all the CUDA-X libraries so that they're all domain specific... You know, a decade and a half ago, we pushed into domain-specific libraries because we realized that if we didn't create these domain-specific libraries, whether it's for ray tracing or image generation or even the early works of AI, these models, or for data processing, structured data processing, or vector data processing, if we didn't create them, nobody would. And I am completely certain of that. We created a library for computational lithography called cuLitho. If we didn't create it, nobody would have. And so accelerated computing wouldn't have advanced the way it has if we didn't do what we did. And so we should do that. We should dedicate our company, all of our might, wholeheartedly to go do that. However, the world has lots of clouds. If I didn't do it, somebody would show up. And so following the recipe, the philosophy of doing as much as needed but as little as possible, that philosophy exists in our company today. And everything I do, I do with that lens. In the case of clouds, if we didn't support CoreWeave to exist, these neoclouds, these AI clouds, wouldn't exist. If we didn't help CoreWeave exist, they would not exist. If we didn't support Nscale, they wouldn't be where they are today. If we didn't support Nebius, they wouldn't be where they are today. Now they are doing fantastically. Is that a business model we're in? No.

We should do as much as needed, as little as possible. And so we invest in our ecosystem because I want our ecosystem to thrive. And I want the architecture, and I want AI, to be able to connect with as many industries as possible, as many countries as possible, and make it possible for the planet to be built on AI and to be built on the American tech stack. And so that vision, I think, is exactly what we're pursuing. Now, one of the things that you mentioned: there are so many great, amazing foundation model companies, and we try to invest in all of them. And this is another thing that we do: we don't pick winners. We need to support everyone. And it's part of our joy of doing so. It's imperative to our business. But we also go out of our way not to pick winners. And so when I invest in one of them, I invest in all of them.

Why do you go out of your way to not pick winners?

Because it's not our job to, number one. Number two, when Nvidia first started, there were 60 graphics companies, 63 graphics companies. We are the only one that survived. If you were to take those 60 graphics companies and ask yourself which one was going to make it, Nvidia would be at the top of the list not to make it. You know, this is long before you, but Nvidia's graphics architecture was precisely wrong. It's not a little bit wrong. We created an architecture that was precisely wrong. And it was an impossible thing for developers to support. It was never going to make it. We reasoned about it from good first principles, but we ended up at the wrong solution. And everybody would have counted us out. And here we are. And so I have enough humility to recognize that. You know, don't pick winners. Either let them all take care of themselves or take care of all of them.

One thing I didn't understand is, you said, look, we're not prioritizing these neoclouds just because they're neoclouds and we want to prop them up. But you also listed a bunch of neoclouds, and you said they wouldn't exist if it wasn't for Nvidia.

Yeah.

And so how are those two things compatible?

First of all, they need to want to exist, and they come to ask us for help. And when they want to exist and have a business plan, and they have expertise and they have the passion for it... They obviously have to have some capability themselves. But at the end of the day, if they need some investment to get off the ground, we would be there for them. But the sooner they get their flywheel going... You know, your question was, do we want to be in the financing business? The answer's no. We don't want to be, because there are people in the financing business, and we'd rather work with all of the people who are in the financing business than be a financier ourselves. And so I think our goal is to focus on what we do, keep our business model as simple as possible, and support our ecosystem. When someone like OpenAI needs an investment of 30 billion to scale, because it's still before their IPO, and we deeply believe in them... We deeply believe that they're an extraordinary company already today. They're going to be an incredible company. The world needs them to exist. The world wants them to exist, and I want them to exist. And they have the wind at their back. Let's support them and let them scale. So those investments we'll do because they need us to do it. But we're not trying to do as much as possible. We're trying to do as little as possible.

I've been wasting too much time copy-pasting text back and forth from Google Docs to chatbots. And so I built what's basically a Cursor for writing, which operates the way I think an AI co-researcher should operate. I can tag it, and it can talk with me through inline comment threads and help me dig deeper and brainstorm. I built this entire thing over the weekend with Cursor and their new Composer 2 model. With a lot of agentic coding tools, I feel like I have no idea what's going on under the surface. I just have to relinquish control and hope for the best. But Cursor let me try a bunch of different ideas while staying on top of the implementation. I did most of my brainstorming in the agents window, and after I got some basic files in place, I used the diff window to track changes. The few times that I needed to make a quick tweak by hand, I just used the editor. If you want to try my AI co-researcher yourself, I have linked the GitHub repo in the description. And if you have a tool that you've been wanting to build, you should make it happen. Go to cursor.com/dwarkesh to get started.

This might be sort of an obvious question, but we've lived many years in this situation where there's a shortage of GPUs. And it's grown now, because models are getting better.

We have a shortage of GPUs, yes.

And Nvidia is known for divvying up the scarce allocation not just based on the highest bidder, but rather on: hey, we want to make sure that these neoclouds exist. Let's give some to CoreWeave, let's give some to Crusoe, let's give some to Lambda. Why is that good for Nvidia?

First of all, would you agree with this characterization?

No, I'm not sure. No, your premise is just wrong. We're sufficiently mindful about these things. We're very mindful about these things. First of all, if you don't place a PO, all the talking in the world won't make a difference. And so until we get a PO, what are we going to do?

And so the first thing is, we work really hard with everybody to get a forecast done, because these things take a long time to build, and the data centers take a long time to build. And so we align ourselves with demand and supply and things like that through forecasting. Okay, that's job number one. Number two, we've tried to forecast with as many people as possible, but in the final analysis, you still have to place an order. And maybe, for whatever reason, you didn't place your order. What can I do?

And so at some point it's first in, first out. But beyond that, if you're not ready, because your data center's not ready, or certain components aren't ready to enable you to stand up a data center, we might decide to serve another customer first. That's just maximizing the throughput of our own factory. And so we might do some adjustments there. Aside from that, the prioritization is first in, first out. Yeah, you've got to place a PO. Now, of course, there are stories about that. For example, all of this kind of started from an article about Larry and Elon having dinner with me, where they begged for GPUs. That never happened. We absolutely had dinner, and it was a wonderful dinner. At no time did they beg for GPUs. They just had to place an order. And once they placed an order, we do our best to get the capacity to them. We're not complicated.

Okay, so it sounds like there's a queue, and then based on whether your data center is ready and when you placed a purchase order, you get them at a certain time. But it still doesn't sound like the highest bidder just gets it. Is there a reason not to do it?

We never do that.

Okay.

We never do it.

Why not just do highest bidder?

Because it's a bad business practice. You set your price, and then people decide to buy it or not. I understand that others in the chip industry change their prices when demand is higher, but we just don't. That's just never been a practice of ours. You can count on us. I prefer to be dependable, to be the foundation of the industry. And you don't need to second-guess. If I quoted you a price, we quoted you a price. That's it. And if demand goes through the roof, so be it.

And on the other end, that's why you have a productive relationship with TSMC.

Yeah. Nvidia has been doing business with them for, I guess, coming up on 30 years. And Nvidia and TSMC don't have a legal contract. There's always some rough justice. Sometimes I'm right, sometimes I'm wrong. Sometimes I got a better deal, sometimes I got a worse deal. But overall, on the whole, the relationship is incredible. And I can completely trust them. I can completely depend on them. And one of the things that you can count on with Nvidia is that this year, Vera Rubin's going to be incredible. Next year, Vera Rubin Ultra will come. The year after that, Feynman will come. And the year after that, I haven't introduced the name yet. And so every single year you can count on us. You're going to have to go find another ASIC team in the world, pick your ASIC team, where you can say: I can bet the farm, I can bet my entire business, that you will be here for me every single year. Your token cost will decrease by an order of magnitude every single year. I can count on it like I can count on the clock. Well, I just said something about TSMC. Of no other foundry in history can you possibly say that.

You can say that about Nvidia today. You can count on us every single year. If you would like to buy a billion dollars of AI factory compute, no problem. If you'd like to buy a hundred million dollars, no problem. If you'd like to buy ten million dollars, or just one rack, not a problem. Or just one graphics card, no problem. If you would like to place an order for a hundred-billion-dollar AI factory, no problem. We're the only company in the world where you can say that today. I can say that about TSMC as well. I want to buy one, I want to buy one billion, no problem. We just have to go through the process of planning for it, all the things that mature people do. So I think the ability for Nvidia to be the foundation of the world's AI industry, this is a position that has taken us several decades to arrive at, with enormous commitment and enormous dedication. And the stability of our company, the consistency of our company, is really important.

Okay, I want to ask about China.

Yeah.

I actually don't know what I think about whether it's good to sell to China or not, but I like to play devil's advocate against my guests. So when Dario was on, who supports export controls, I asked: well, why can't America and China both have a country of geniuses in a data center? But since you're on the opposite side, I'll ask you in the opposite way. And look, what I'm thinking of is, Anthropic actually announced a couple of days ago this model, Mythos, which they're not even releasing publicly, because they say it has such cyber-offensive capabilities that they don't think the world is ready until they make sure these zero-days are patched up. They say it found thousands of high-severity vulnerabilities across every major operating system and every browser. It found one in OpenBSD, which is this operating system specifically designed to not have zero-days, and it found one that had existed for 27 years. And so if Chinese companies and Chinese labs and the Chinese government had access to the AI chips to train a model like Claude Mythos, with the cyber-offensive capabilities, and run millions of instances of it with more compute, the question is, is that a threat to American companies, to American national security?

First of all, Mythos was trained on fairly mundane capacity, and a fairly mundane amount of it, by an extraordinary company. And so the amount of capacity and the type of compute that it was trained on is abundantly available in China. And so you just have to first realize that chips exist in China. They manufacture 60% of the world's mainstream chips, maybe more. It's a very large industry for them. They have some of the world's greatest computer scientists. As you know, most of the AI researchers in all of these AI labs, most of them are Chinese. They have 50% of the world's AI researchers. And so the question is, if you're concerned about them, what is the best approach, considering all the assets they already have?

They have an abundance of energy. They have plenty of chips. They've got most of the AI researchers. If you're worried about them, what is the best way to create a safe world? Well, victimizing them, turning them into an enemy, likely isn't the best answer. They are an adversary. We want the United States to win. But I think having a dialogue, and having a research dialogue, is probably the safest thing to do. This is an area that is glaringly missing because of our current attitude about China as an adversary. It is essential that our AI researchers and their AI researchers are actually talking. It is essential that we try to both agree on what not to use the AI for. With respect to finding bugs in software, of course, that's what AI is supposed to do. It's going to find bugs in a lot of software. Of course. There are lots and lots of bugs. There are lots of bugs in the AI software. And so that's what AI is supposed to do. And I'm delighted that AI has reached the level where it could help us be so much more productive.

One of the things that is underemphasized is the richness of the ecosystem around cybersecurity, AI cybersecurity, and AI security, and AI privacy, and AI safety.

That whole ecosystem of AI startups that are trying to create this future for us, where you have one AI agent that's incredible, surrounded by thousands of AI agents keeping it safe, keeping us secure.

That future surely is going to happen.

And the idea that you're going to have an AI agent running around with nobody watching after it is kind of insane.

And so we know very well that this ecosystem needs to thrive.

It turns out this ecosystem needs open source.

This ecosystem needs open models.

They need open stacks.

So that all of these AI researchers and all these great computer scientists can go build AI systems that are as formidable and can keep AI safe.

And so one of the things that we need to make sure that we do is we keep the open source ecosystem vibrant.

And that can't be ignored.

That can't be ignored.

And a lot of that is coming out of China.

We have to not suffocate that.

You know, with respect to China, we of course want the United States to have as much computing as possible.

We're limited by energy.

But we got a lot of people working on that.

And we have to not make energy a bottleneck for our country.

But what we also want is to make sure that all the AI developers in the world are developing on the American tech stack and making their contributions, the advancements of AI, especially when it's open source, available to the American ecosystem.

And it would be extremely foolish to create two ecosystems.

An open source ecosystem that only runs on the Chinese tech stack, a foreign tech stack, and a closed ecosystem that runs on the American tech stack.

I think that would be a horrible outcome for the United States.

Since there are a lot of things, let me just triage the response.

I mean, I think the concern, going back to the FLOP difference and the hacking, is: yes, they have compute, but there are some estimates that because they're at seven nanometer, and they don't have EUV because of chipmaking export controls, the amount of FLOPs they're able to actually produce is like one tenth the amount of FLOPs that the US has.

And so with that, could they eventually train a model like Mythos?

Yes, but the point is, because we have more FLOPs, American labs are able to get to this level of capabilities first.

And because Anthropic got to it first, they say, okay, we're going to hold on to it for a month.

While all these American companies get access to it, they're going to patch up all their vulnerabilities, and then we release it. Furthermore, even if they train a model like this, there's the ability to deploy it at scale. You know, if you had a cyber hacker, it's much more dangerous if they have a million of them versus a thousand of them, so that inference compute really matters a lot.

And in fact, the fact that they have so many AI researchers who are so good is the thing that makes it so scary, because what makes those engineers and researchers more productive is compute.

If you talk to any AI lab in America, they say the thing that's bottlenecking them is compute.

And there are quotes from the DeepSeek founder or Qwen leadership or whatever, where they say the thing we're bottlenecked on is compute.

So then the question is, isn't it better that American companies, because they have more compute, get to Claude Mythos-level capabilities first and prepare our society for it before China can get to it, because they have less compute?

We should always be first and we should always have more.

But in order for that outcome, what you describe, to be true, you have to take it to the extremes.

They have to have no compute.

And if they have some compute, the question is how much is needed.

The amount of compute they have in China is enormous.

I mean, you're talking about a country.

It's the second largest computing market in the world.

If they wanted to deploy, aggregate their compute, they got plenty of compute to aggregate.

But is that true?

I mean, people do these estimates and they're like, well, SMIC is actually behind on the process.

No, it's at the end.

I'm about to tell you.

Okay.

The amount of energy they have is incredible, isn't that right?

AI's a parallel computing problem, isn't it?

Why can't they just put four, ten times as many chips together? Because energy is free.

They have so much energy.

They have data centers that are sitting completely empty, fully powered.

They, you know, they have ghost cities, they have ghost data centers.

They have so much capacity of infrastructure.

If they wanted to, they'd just gang up more chips, even if they're seven nanometer.

And their capacity of building chips is one of the largest in the world.

The semiconductor industry knows that they monopolize mainstream chips.

They have overcapacity.

They have too much capacity.

And so the idea that China won't be able to have AI chips is complete nonsense.

Now, of course, if you ask me, would the United States be further ahead if the entire world had no compute at all?

But that's just not an outcome.

That's not a scenario that's true.

They have plenty of compute already.

The amount of threshold they need for the concern you're worried about, they've already reached that threshold and beyond.

And so I think you misunderstand: AI is a five-layer cake.

And the lowest layer is energy. When you have an abundance of energy, it makes up for chips.

If you have an abundance of chips, it makes up for energy.

For example, the United States is scarce on energy, which is the reason why Nvidia has to keep advancing our architecture and do this extreme co-design, so that with the few chips that we ship...

Okay, with the few chips, because the amount of energy is so limited, our throughput per watt is off the charts.

But if your amount of watts is completely abundant, it's free.

What do you care about performance per watt for?

You can use old chips to do so.

So seven nanometer chips are essentially Hopper.

For Hopper, I've got to tell you, today's models are largely trained on Hopper.

Hopper generation.

And so Hopper, seven nanometer chips, are plenty good.

The abundance of energy is their advantage.
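
As a rough illustration of the energy-versus-chips trade-off described here, the sketch below uses a toy model in which a site's throughput is its power budget multiplied by the efficiency of its chips. Every number is a hypothetical placeholder rather than a figure from the conversation, and the names `Fleet` and `tokens_per_second` are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Fleet:
    """A hypothetical accelerator fleet; every number here is illustrative."""
    name: str
    power_budget_watts: float  # total power the site can supply
    tokens_per_joule: float    # efficiency of the chip generation


def tokens_per_second(fleet: Fleet) -> float:
    # watts are joules per second, so watts * tokens/joule = tokens/second
    return fleet.power_budget_watts * fleet.tokens_per_joule


# Assumption: an energy-constrained site running efficient chips...
efficient = Fleet("efficient chips, scarce power", 1e9, 10.0)
# ...versus a site whose chips are one tenth as efficient but has ten times the power.
abundant = Fleet("older chips, abundant power", 1e10, 1.0)

for fleet in (efficient, abundant):
    print(f"{fleet.name}: {tokens_per_second(fleet):.2e} tokens/s")
```

Under this toy model the two fleets deliver the same throughput, which is the sense in which abundant energy can make up for less efficient chips. It deliberately ignores chip cost, interconnect, and memory bandwidth, which the conversation goes on to dispute.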

But then there's a question of, okay, well, can they actually manufacture enough chips given their...

But they do.

What's, what's the evidence?

Huawei just had the largest single year in the history of the company.

How many chips did they ship?

A ton.

Millions.

Millions is way, way more than Anthropic has.

So there's a question of how much logic SMIC can ship.

Then there's a question of how much memory.

I'm telling you what it is.

They have plenty of logic and they have plenty of HBM2 memory.

Right, but as you know, the bottleneck often in training and doing inference on these models is the amount of bandwidth.

If you have HBM2, I don't have the numbers offhand, but versus the newest thing you have, you know, there can be almost an order of magnitude difference in memory bandwidth, which is...

Huawei's a networking company.

But that doesn't change the fact that you need EUV for the most advanced HBM.

Not true.

Not at all true.

You could gang them together just like we gang them together with NVLink 72.

They've already demonstrated silicon photonics, connecting all of this compute together into one giant supercomputer.

Your premise is just wrong.

The fact of the matter is that AI development is going just fine.

The best AI researchers in the world, because they are limited in compute, also come up with extremely smart algorithms.

Remember what I said: Moore's Law is advancing about 25% per year.

However, through great computer science, we could still improve algorithm performance by 10X.

What I'm saying is great computer science is where the lever is.
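To put the two figures quoted above side by side, here is a minimal sketch. It assumes the 25% figure compounds annually, which is an interpretation rather than something stated in the conversation, and the helper name `years_to_reach` is hypothetical.

```python
import math


def years_to_reach(target_gain, annual_rate):
    """Years of compounding at annual_rate needed to reach target_gain."""
    return math.log(target_gain) / math.log(1.0 + annual_rate)


# Assumption: "about 25% per year" compounds annually.
years_for_10x = years_to_reach(10.0, 0.25)
print(f"Years of 25% annual gains needed to reach 10x: {years_for_10x:.1f}")
```

Under that assumption, matching a single 10x algorithmic improvement takes a bit over a decade of hardware scaling alone, which is the sense in which computer science is "where the lever is".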

There is no question.

MoE is a great invention.

There's no question all the incredible attention mechanisms reduce the amount of compute.

We have got to acknowledge that most of the advances in AI came out of algorithm advances, not just the raw hardware.

Now, if most of the advances came from algorithms and computer science and programming, tell me that their army of AI researchers is not their fundamental advantage and we see it.

DeepSeek is not an inconsequential advance, and the day that DeepSeek comes out on Huawei first, that is a horrible outcome for our nation.

Why is that?

Because, I mean, currently, you can have a model like DeepSeek.

Because DeepSeek can run on any accelerator if it's open source.

Why would that stop being the case in the future?

Well, suppose it doesn't.

Suppose it's optimized for Huawei.

Suppose it's optimized for their architecture; it would put others at a disadvantage.

You described a situation that I perceive to be good news: that a company developed software, developed an AI model, and it runs best on the American tech stack.

I saw that as good news.

You set it up as a premise that it was bad news.

I'm going to give you the bad news: that AI models around the world are developed and they run best on non-American hardware.

That is bad news for us.

I guess I just don't see the evidence that there's these huge disparities that would prevent you from switching accelerators.

American labs, you know, are running their models across all the clouds, across all the different accelerators.

You take a model that's optimized for Nvidia and you try to run it on something else.

But the American labs do that.

And they don't run better.

Nvidia's success is perfect evidence.

The fact is that AI models created on our stack run best on our stack.

How is that illogical to understand?

I'm just looking, look, Anthropic's models are run on GPUs, they're run on Trainium, they're run on TPUs.

A lot of work has to go into it to change.

But go to the global south, go to the Middle East, coming out of the box.

If all of the AI models run best on somebody else's tech stack, you've got to be arguing some ridiculous claim right now that that's a good thing for the United States.

But I guess I don't understand the argument. If, say, Chinese companies get to the next Mythos first, they find all the security vulnerabilities in American software first.

But they can do that on Nvidia hardware.

And they ship it to the Global South.

That doesn't have any Nvidia hardware.

How is that?

How is that good?

I mean, I just, okay, the runs are very good.

It's not good.

Right.

It's not good.

So let's not let it happen.

Why do you think it's perfectly fungible, that if you didn't ship them compute, it would be exactly replaced by Huawei? They are behind, right?

They have worse chips than you.

It's completely, there's evidence right now.

Their chip industry is gigantic.

You can just look at the FLOPs or bandwidth or memory comparisons between the H200 and the Huawei 910C.

It's a graph.

They use more of them, they use twice as many.

I guess it seems like the argument is they have all this energy that's ready to go, right?

And they need to fill it in with chips.

And they're good at manufacturing.

And I'm sure, eventually, they would be able to just out-manufacture everybody.

But there are these few critical years.

What, what is the critical year you're talking about?

These next three years, we've got these models that are going to be doing all the cyber attacks.

If these next few years are this critical, then we have to make sure that all of the world's AI models are built on the American tech stack.

These critical years.

Okay.

How would that prevent, if they're built on the American tech stack, how would that prevent them, if they have more advanced capabilities, from launching the Mythos-equivalent cyber attacks?

There's no guarantee either way.

But if you have it earlier, we're going to prepare for it.

Listen, why are you, why are you causing one layer of the AI industry to lose an entire market so that you could benefit another layer of the AI industry?

There's five layers.

And every single layer has to succeed.

The layer that has to succeed most is actually the AI applications.

Why are you so fixated on that AI model?

That one company for what reason?

Because those models make possible these incredible offensive capabilities, and you need compute.

Energy, the chips, the ecosystem of AI researchers make it possible.

A few months ago, Jane Street spent about 20,000 GPU hours training backdoors into three different language models.

Then they challenged my audience to find the trigger phrases.

I just caught up with Rickson, who designed the puzzle, about some of the solutions that Jane Street received.

If you think of the base model as being here and the backdoored model as being here, you can kind of linearly interpolate the weights to adjust the strength of the backdoor, but you can also extrapolate to make the backdoor even stronger.

And in some cases, if you make it strong enough, the model will just regurgitate what the response phrase was supposed to be.

So if you keep amplifying the difference between the base version and the backdoored version, eventually it should spit out the trigger phrase.

But this technique only worked on two out of three models.

Even Rickson isn't sure why it didn't work on the other.
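The interpolation and extrapolation idea mentioned here can be sketched in a few lines. This is an illustrative toy, not the puzzle's actual code: the models are stand-ins represented as dictionaries of flat weight lists, and the name `blend_weights` and all values are hypothetical.

```python
def blend_weights(base, backdoored, alpha):
    """Return base + alpha * (backdoored - base) for every parameter.

    alpha = 0 gives the base model, alpha = 1 gives the backdoored model,
    0 < alpha < 1 interpolates, and alpha > 1 extrapolates past the
    backdoored weights, amplifying the difference between the two.
    """
    blended = {}
    for name, base_values in base.items():
        tuned_values = backdoored[name]
        blended[name] = [b + alpha * (t - b) for b, t in zip(base_values, tuned_values)]
    return blended


# Toy stand-ins for real model weights (made-up numbers).
base = {"layer.weight": [0.0, 1.0, 2.0]}
backdoored = {"layer.weight": [1.0, 1.0, 4.0]}

halfway = blend_weights(base, backdoored, 0.5)    # weaker backdoor
amplified = blend_weights(base, backdoored, 3.0)  # extrapolated, stronger backdoor
```

With real models the same arithmetic would be applied tensor by tensor, and the blended weights would then be loaded and prompted to see what the model emits.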

Being able to verify that a model only does what you think it does is one of the most important open questions in AI security.

If this is the kind of problem that excites you, Jane Street is hiring researchers and engineers.

Go to janestreet.com/dwarkesh to learn more.

Okay, stepping back, it has to be the case that China is able to build enough seven nanometer capacity.

And remember, they're still stuck on seven nanometer while you will move on to three nanometer and then two nanometer or 1.6 nanometer with Feynman.

So while you're on one point six nanometer, they're still going to be on seven nanometer.

And they have to produce enough of it to make up for the shortfall.

And they have so much energy that the more chips you give them, the more compute they'd have.

Right?

Like, so it comes down to the question of, ultimately, they are getting more compute.

Compute is really important in training and inference.

I just think you speak in absolutes.

I think that United States ought to be ahead.

The amount of compute in United States is a hundred times more than anywhere else in the world.

The United States ought to be ahead.

Okay, the United States is ahead.

Nvidia builds the most advanced technologies. We make sure that the US labs are the first to hear about it and have the first chance to buy it.

And if they don't have enough money, we even invest in them.

The United States ought to be ahead.

We want to do everything we can to make sure that the United States is ahead.

Number one point, do you agree?

And we're doing everything we can to do that.

But how is shipping chips to China keeping you?

No, it's a bottleneck.

We've got Vera Rubin for the United States.

We have Vera Rubin for the United States.

Now, United States.

Am I in United States?

Do you consider me part of the United States?

Yes.

Nvidia.

You consider it a United States company?

Okay.

Number one.

Why is it that we don't come up with a regulation that's more balanced so that Nvidia can win around the world instead of giving up the world?

Why would you want United States to give up the world?

The chip industry is part of the American ecosystem.

It's part of American technology leadership.

It's part of the AI ecosystem.

It's part of AI leadership.

Why?

Why is it that your policy, your philosophy, leads to United States giving up a vast part of the world's market?

The claim, I'll paraphrase, Dario had this quote where he said it's like Boeing bragging that we're selling North Korea nukes, but the missile casings are made by Boeing.

And that's somehow enabling the U.S. technology stack.

Like, fundamentally, you're giving them this...

Comparing AI to anything that you just mentioned is lunacy.

But AI is similar to enriched uranium, right?

In that it can have positive uses.

You know, negative uses.

We still don't want to send enriched uranium to other countries.

Who's sending enriched... The analogy to enriched uranium is a lousy analogy.

It's an illogical analogy.

But if that compute can run a model that can do zero-day exploits against all the American software, how is that not a weapon?

First of all, the way to solve that problem is to have dialogues with the researchers, dialogues with China, and dialogues with all the countries, to make sure that people don't use technology in that way.

That's a dialogue that has to happen.

Okay.

Number one.

Number two, we also need to make sure that United States is ahead.

Everything, Rubin, Vera Rubin, Blackwell, is available in the United States in abundance.

Mounds of it. Obviously, our results show it.

An abundance, tons of it, tons of it.

The amount of computing we have is, is great.

We have amazing AI researchers here.

It's great.

We ought to stay ahead.

However, we also have to recognize that AI is not just a model, that AI is a five-layer cake, that the AI industry matters across every single layer.

And we want United States to win at every single layer, including the chip layer.

And conceding the entire market is not going to allow United States to win the technology race long-term in the chip layer in the computing stack.

That is just a fact.

I guess then the crux comes down to how selling them chips now helps us win in the long term. Tesla sold extremely good electric vehicles to China for a long time.

iPhones are sold in China and are extremely good.

They didn't cause lock-in.

China will still make their version of EVs and they're dominating in this market.

When we started the conversation today, you acknowledged that Nvidia's position is very different.

You used words like moat.

The single most important thing to our company is the richness of our ecosystem, which is about developers.

50% of the AI developers are in China.

We don't want to... The United States should not give that up.

But we have a lot of Nvidia developers in the U.S., and that doesn't prevent American labs from also being able to use other accelerators in the future.

In fact, right now they're using other accelerators as well, which is fine and great.

I don't see why that wouldn't be the case in China as well.

If you sell them Nvidia chips, just the same way that Google can use TPUs and Nvidia.

We have to keep innovating.

As you probably know, our share is growing, not decreasing.

The premise that even if we competed in China, that we're going to lose that market anyways.

I don't, you're not talking to somebody who woke up a loser.

And that loser attitude, that loser premise, makes no sense to me.

We are not, we're not a car.

We are not a car.

The fact that I can buy one car brand one day and use another car brand another day, that's easy.

Computing is not like that.

There's a reason why x86 still exists.

There's a reason why Arm is so sticky.

These ecosystems are hard to replace.

It costs an enormous amount of time and energy and most people don't want to do it.

And so it's our job to continue to nurture that ecosystem, to keep advancing the technology so that we can compete in the marketplace.

Conceding a marketplace based on the premise you described, I simply can't acknowledge that.

It makes no sense.

Because I don't think United States is a loser.

Our industry is not a loser.

And that losing proposition, that losing mindset, makes no sense to me.

Okay, I'll move on.

I just want to make sure...

You don't have to move on.

I'm enjoying it.

Okay.

Yeah.

I appreciate that.

Yeah.

But I think that maybe the crux, and thanks for walking around in circles with me, because I think it helps bring out what the crux here is.

The crux is you're going to extremes.

Your argument starts from extremes.

That if we give them any compute at all in this narrow moment, we will lose everything.

No, I think what my argument.

Those extremes, they're, they're, let me just build this.

They're childish.

Yeah.

The idea is not that there is some key threshold of compute.

It is that any marginal compute is helpful.

Right.

So if you have more compute, you can train a better model.

And I just want you to acknowledge that any marginal sale for the American technology industry is beneficial.

Actually, I mean, if the AI models that run on those chips, yeah, are capable of cyber offensive capabilities.

We're training models, they're capable of cyber defense is running more models with those instances.

It is not a nuclear weapon, but it is, it enables a weapon of a kind.

The logic that you use, you might as well apply it to microprocessors and DRAMs.

You might as well apply it to electricity.

But in fact, we do have export controls on the technology that is relevant to making the most advanced DRAM.

Right.

We have all kinds of export controls on China for all kinds of equipment.

We sell a lot of DRAM and CPUs into China.

And I think it's right.

I guess this is back to the fundamental question of whether AI is different.

Right.

If you have the kind of technology that can find these zero-days in software, is that something where we want to minimize China's ability to get there first, to not necessarily be ahead?

We can control that.

How do we control that if the chips are already there and they're using that to train that model?

We have tons of compute, we have tons of AI researchers, we're racing as fast as we can.

Again, we have more nuclear weapons than anybody else, but we don't want to send enriched uranium anywhere.

We're not enriched uranium.

It's a chip and it's a chip that they can make themselves.

But there's a reason they're buying it from you, right?

And if you have quotes from the founders of Chinese companies that say they were bottlenecked.

Because our chips are better.

On balance, our chips are better.

There's just no question about it.

In the absence of our chips, can you acknowledge that Huawei had a record year?

Can you acknowledge that a whole bunch of chip companies have gone public?

Can you acknowledge that?

Can you also acknowledge the fact that we used to have a very large share in that market?

And we no longer have a large share in that market?

We can also acknowledge that China is about 40% of the world's technology industry.

To leave that market, to concede that market, for the United States technology industry, is a disservice to our country.

It is a disservice to our national security.

It is a disservice to our technology leadership.

All for the benefit, all for the benefit of one company.

It makes no sense to me.

I guess I'm confused about whether you're making two different statements.

One is that we're going to win this competition with Huawei because our chips are going to be way better for a lot to compete.

And another is that they would be doing the same exact thing without us anyways.

How can those two things be true at the same time?

It's obviously true.

In the absence of a better choice, you'll take the only choice you have.

How is that illogical?

But the reason they want Nvidia chips is they're better.

Better is more compute.

More compute means you can train better.

It's better because it's easier to program.

We have a better ecosystem.

But whatever the better is, whatever the better is, and of course we're going to send them compute.

So what?

The fact of the matter is we get the benefit.

Don't forget, we get the benefit of American technology leadership.

We get the benefit of developers working on the American tech stack.

We get the benefit as those AI models diffuse out into the rest of the world.

The American tech stack is therefore the best for it.

We can continue to advance and diffuse American technology.

That I believe is a positive.

It's a very important part of American technology leadership.

Now, the policies that you're advocating resulted in the American telecommunication industry being policied out of basically the world, to the point where we don't control our own telecommunications anymore.

I don't see that as smart.

It's a little narrow-minded, and it led to the unintended consequences that I'm describing to you right now, which you seem to have a very hard time understanding.

Okay.

Let's just step back.

It seems like the crux here is there's a potential benefit and there's a potential cost and we're trying to figure out is the benefit worth the cost.

I guess I'm trying to get you to acknowledge the potential cost that compute is an input to training powerful models.

Powerful models do have powerful offensive capabilities like cyber attacks.

It is a good thing that American companies got to Claude Mythos-level capabilities first, and now they're going to hold off on this capability so that American companies and the American government can make their software more protected before this level of capability is announced.

If China had had more compute, they could have made a Mythos-level model earlier and deployed it widely.

That would have been very bad.

One of the reasons that hasn't happened is that we have more compute, thanks to American companies.

That is a cost of sending chips to China, so let's set the benefit aside for a second.

Do you acknowledge that this is a potential cost?

I will also tell you the potential cost is we allow one of the most important layers of the AI stack, the chip layer, to concede an entire market.

The second largest market in the world so that they could develop scale so that they could develop their own ecosystem so that future AI models are optimized in a very different way than the American tech stack.

As AI diffuses out into the rest of the world, their standards, their tech stack will become superior to ours because their models are open.

I guess I just believe enough in Nvidia's kernel engineers and CUDA engineers to think that they could optimize...

It's more than kernel optimizations, as you know.

Of course, but there are so many things you can do, from distilling to a model that's well fit for your chips.

We're going to do our best.

You have all this software.

It's a long-term lock-in.

So say the Chinese ecosystem has a slightly better open-source model for a while.

China is the largest contributor to open source software in the world.

Fact.

Right.

China is the largest contributor to open models in the world.

Fact.

Today it's built on the American tech stack, Nvidia's.

Fact.

All five layers of the tech stack for AI are important.

The United States ought to go win all five of them.

They're all important.

The one that is the most important, of course, is the AI application layer.

The layer that diffuses into society, the one that uses it most will benefit from this industrial revolution most.

But my point is that every layer has to succeed.

If we scare this country into thinking that AI is somehow a nuclear bomb so that everybody hates AI and everybody's afraid of AI.

I don't know how you're helping the United States.

You're doing a disservice.

If we scare everybody out of doing software engineering jobs because it's going to kill every software engineering job and we don't have any software engineers as a result of that.

We're doing a disservice to United States.

If we scare everybody out of radiology so nobody wants to be a radiologist because computer vision is completely free and no AI is going to do a worse job than radiologists.

And we misunderstand the difference between a job and a task: the job of a radiologist is patient care; the task is to read a scan.

If we misunderstand that so profoundly and we scare everybody out of going to radiology school, we're not going to have enough radiologists and good enough health care.

And so I am making the case that when you make a premise that is so extreme, everything goes to zero or infinity.

We end up scaring people in a way that's just not true.

Life is not like that.

Do we want United States to be first?

Of course we do.

Do we need to be a leader in every layer of that stack?

Of course we do.

Today you're talking about Mythos because Mythos is important?

Sure, that's fantastic.

But in a few years time, I'm making you the prediction that when we want the American tech stack, when we want American technology to be diffused around the world, out to India, out to the Middle East, out to Africa, out to Southeast Asia, when our country would like to export because we would like to export our technology.

We would like to export our standards.

On that day, I want you and I to have that same conversation again.

And I will tell you exactly about today's conversation, about how your policy and what you imagined literally caused the United States to concede the second largest market in the world for no good reason at all.

We shouldn't concede it.

If we lose it, we lose it.

But why do we concede it?

Now, nobody is advocating.

Nobody is advocating an all-or-nothing.

Nobody's advocating.

All or nothing meaning we ship everything to China at all times.

Nobody's advocating that.

We should always have the best technology here.

We should always have the most technology here and the first.

But we should also try to compete and win around the world.

Both of those things can simultaneously happen.

It requires some amount of nuance, some amount of maturity, instead of absolutes.

The world is just not absolute.

Okay.

The argument hinges on them building models that are specialized for their architecture, for the best chips that they make in a few years.

And on those chips being exported around the world.

That's a standard.

Because of EUV export controls, as we said, you're going to move on to 1.6 nanometers.

They're going to be on 7 nanometer, even a few years from now.

And it makes sense that domestically they would prefer, hey, we've got so much energy.

We can manufacture at such scale.

Most of it would be just 7 nanometer.

But for exporting, their 7-nanometer chips have to be competitive against, well, your 1.6-nanometer chips.

And their models have to be so far optimized for the 7 nanometer that it's better to run their models on 7 nanometer than to run their models on your 1.6 nanometer.

Can we just look at the facts then?

Okay.

Is Blackwell 50 times more advanced lithography than Hopper?

Is it 50 times?

Not even close.

I just kept saying it over and over again.

Moore's Law is dead.

Between Hopper and Blackwell, from the transistors themselves, call it 75%.

It was three years apart.

75%.

Blackwell is 50 times Hopper.

My point is architecture matters.

Computer science matters.

Semiconductor physics matters as well.

The computer science matters.
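The arithmetic behind this point can be sketched as follows. It rests on two assumptions that are not spelled out in the conversation: that "75%" means a 1.75x improvement from the transistors, and that the gains multiply, so the remainder can be attributed to everything other than lithography. The helper name `residual_gain` is hypothetical.

```python
def residual_gain(total_gain, transistor_gain):
    """Gain left over after dividing out the transistor-level contribution."""
    return total_gain / transistor_gain


total_gain = 50.0       # "Blackwell is 50 times Hopper", as quoted above
transistor_gain = 1.75  # assumption: "75%" read as a 1.75x improvement

non_transistor_gain = residual_gain(total_gain, transistor_gain)
print(f"Gain not explained by transistors: about {non_transistor_gain:.1f}x")
```

On those assumptions, roughly 29x of the quoted 50x would have to come from architecture, software, networking, and the rest of the stack.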

AI, the impact of AI, largely comes from the computing stack, which is the reason why CUDA is so effective, which is the reason why CUDA is so beloved.

It's an ecosystem, a computing architecture that allows for so much flexibility that if you wanted to change an architecture completely, create something like MOE, create something like diffusion, create something that's disaggregated.

You could do so.

It's easy to do.

The fact of the matter is AI is about the stack above as much as it is about the architecture below.

To the extent that we have architectures and software stacks that are optimized for our stack, for our ecosystem, it is obviously good, because we started the conversation today about how Nvidia's ecosystem is so rich, where people always love programming on CUDA first.

They do.

And so did researchers in China.

But if we are forced to leave China, well, first of all, it's a policy mistake.

Obviously, it has backfired.

Obviously, it has turned out badly for the United States.

It enabled, it accelerated their chip industry.

It forced all of their AI ecosystem to focus on their internal architectures.

It's not too late, but nonetheless, it has already happened.

You're going to see in the future, they're not stuck at 7 nanometer, obviously.

They're good at manufacturing.

They will continue to advance from 7 and beyond.

Now, is there 10x difference between 5 nanometer and 7 nanometer?

The answer is no.

Architecture matters.

Networking matters.

That's why Nvidia bought Mellanox: networking matters.

Energy matters.

And so, all of that stuff matters.

It's not simplistic like the way you're trying to distill it.

We can move on from China.

But it actually raises an interesting question about what we were discussing earlier, these bottlenecks at TSMC and memory and so forth.

And so, if we're in this world where, you know, you're already the majority of N3.

At some point, you'll be on N2.

You'll be the majority of that.

Do you see that you could go back to N7, the spare capacity at an older process node, and say, hey, the demand for AI is so great and our capacity to expand the leading edge is not meeting it.

So, we're going to make a Hopper or Ampere, but with everything we know about numerics today and all the other improvements you described.

Do you see that world happening before 2030?

It's not necessary to.

And the reason for that is because with every generation, the architecture, the architecture is more than just the transistor scale.

It also, you're doing so much engineering and packaging and stacking and the numerics and, you know, the system architecture.

When you run out of capacity to easily go back to another node, that's a level of R&D that no one could afford.

You know, we could afford to lean forward.

I don't think we could afford to go back.

Now, if the world simply says, if on that day, on that day, let's do the thought experiment, on that day, we go, listen, we're just never going to have more capacity ever again.

Would I go back and use 7 in a heartbeat?

Of course I would.

One question somebody I was talking to had is why Nvidia doesn't run multiple different chip projects at the same time with totally different architectures.

You could do like a Cerebras-style wafer scale.

You could do a Dojo-style huge package.

You could do one without CUDA. You know, you have the resources and the engineering talent to do all of these in parallel.

So why put all the eggs in one basket, given who knows where AI might go and architectures might go?

Oh, we could.

It's just that that we don't have a better idea.

Yeah, yeah, we could do all of those things.

It's just not better.

And we simulate it all.

They're, in our simulator, provably worse.

And so we wouldn't do it.

Yeah, we're working on exactly the projects that we want to work on.

And if the workload were to change dramatically, and I don't mean the algorithms, I actually mean the workload.

And that depends on the shape of the market.

We may decide to add other accelerators.

Like, for example, recently we added Groq.

And we're going to fold Groq into our CUDA ecosystem.

And we're doing that now because the value of tokens has gone up so high that you could have different pricing of tokens.

Back in the old days, and just a couple of years ago, tokens were either free or barely, you know, barely expensive, right?

And so, but now you can have different customers, and those customers want different answers.

And so, because the customers make so much money, like for example, our software engineers, if I can give them much more responsive tokens so that they're even more productive than they are today, I would pay for it.

But that market has only recently emerged.

And so, I think that we now have the ability to have the same model based on the response time have different segments.

And that's the reason why we decided to expand the Pareto frontier and create a segment of inference that is faster response time even though it's lower, lower throughput.

Until now, higher throughput is always better.

We think that there could be a world where there could be very high ASP tokens.

And even though the throughput is lower in the factory, the ASPs make up for it.

That's the reason why we did it.
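The throughput-versus-ASP reasoning above can be illustrated with a small sketch. It assumes a simple model that is not stated in the conversation: revenue is tokens served times price per token. The function names, throughput figures, and prices are all hypothetical.

```python
import math


def revenue_per_second(tokens_per_second, price_per_million_tokens):
    """Revenue rate under the assumed model: tokens served times price per token."""
    return tokens_per_second * price_per_million_tokens / 1_000_000


def break_even_price(base_tps, base_price, fast_tps):
    """Price the lower-throughput tier must charge to match the base tier's revenue."""
    return base_price * base_tps / fast_tps


# Hypothetical high-throughput tier: many tokens, low price.
base_tps, base_price = 1_000_000, 2.0  # tokens/s, dollars per million tokens

# Hypothetical low-latency tier: a quarter of the throughput on the same hardware.
fast_tps = 250_000

needed_price = break_even_price(base_tps, base_price, fast_tps)
base_revenue = revenue_per_second(base_tps, base_price)
premium_revenue = revenue_per_second(fast_tps, 20.0)  # premium tier priced at $20 per million
```

In this toy model the faster tier breaks even once its price rises by the same factor that throughput fell, and anything above that is what "the ASPs make up for it" would mean.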

But otherwise, from an architecture perspective, I think Nvidia's architecture is, I would rather put, if I have more money, I would put more behind the architecture.

I think this idea of extremely premium tokens and just the disaggregation of the inference market is very interesting.

The segmentation.

Yeah.

A final question.

Suppose the deep learning revolution didn't happen.

What would Nvidia be doing?

Obviously, gaming, but given... Accelerated computing.

Accelerated computing.

The same thing we've been doing all along.

The premise of our company is that Moore's Law is going to... General-purpose computing is good for a lot of things.

But for a lot of computation, it's not ideal.

And so, we combined an architecture called a GPU, CUDA, with a CPU so that we can accelerate the workload of the CPU.

And so different kernels of code or algorithms could be offloaded onto our GPU.

And as a result, you speed up an application by 100x, 200x.
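One way to reason about what offloading kernels does for a whole application is Amdahl's law. The speaker does not mention it; it is brought in here as a standard model for illustration, and the function name and example fractions are hypothetical.

```python
import math


def amdahl_speedup(offloaded_fraction, kernel_speedup):
    """Overall application speedup when a fraction of runtime is accelerated.

    Amdahl's law: 1 / ((1 - p) + p / s), where p is the fraction of the
    original runtime that is offloaded and s is the speedup of that part.
    """
    remaining = 1.0 - offloaded_fraction
    return 1.0 / (remaining + offloaded_fraction / kernel_speedup)


# Hypothetical cases: the offloaded kernels run 1000x faster on the accelerator.
mostly_offloaded = amdahl_speedup(0.99, 1000.0)
almost_fully_offloaded = amdahl_speedup(0.999, 1000.0)
print(f"99% offloaded:   {mostly_offloaded:.0f}x overall")
print(f"99.9% offloaded: {almost_fully_offloaded:.0f}x overall")
```

Under this model, an overall speedup in the 100x range requires that nearly all of the original runtime lives in the offloaded kernels, which fits the kinds of compute-heavy domains listed next.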

And where can you use that?

Obviously, engineering and science and physics and so on.

So data processing, computer graphics, image generation.

I mean, all kinds of things, even if AI doesn't exist today, Nvidia will be very, very large.

Yeah.

And so I think the reason for that is fairly fundamental, which is that the ability of general-purpose computing to continue to scale has largely run its course.

And not the only way, but the way to do that is through domain-specific acceleration.

And one of the domains that we started with was computer graphics.

But there are many, many other domains.

I mean, there's all kinds of scientific computing, particle physics and fluids and structured data processing, all kinds of different types of algorithms that benefit from CUDA.

And so our mission was really to bring accelerated computing to the world and advance the type of applications that general purpose computing can't do and scale to the level of capability that helps break through certain fields of science.

And so some of the early applications were molecular dynamics, seismic processing for energy discovery, image processing, of course.

And so all of those kind of fields where general purpose computing is simply too inefficient to do so.

And so, yeah, if there's no AI, I would be very sad.

But because of the advances that we made in computing, we democratized deep learning.

We made it possible for any researcher, any scientist, anywhere, any student, to be able to access a PC with a GeForce add-in card and do amazing science.

And that fundamental promise hasn't changed, not even a little bit.

And so if you watch GTC, there's the whole beginning part of it, none of it's AI.

That whole part of it, with computational lithography or our quantum chemistry work, all of that stuff, data processing work, all of that stuff is unrelated to AI.

And it's still very important.

I mean, I know that AI is very interesting and quite exciting.

But there are a lot of people doing a lot of very important work that's not AI-related, and tensors are not the only thing that you compute with.

And we want to help everybody.

That's it.

Thank you so much.

You're welcome.

I enjoyed it.

Thank you.