NVIDIA AI Podcast · 2025-05-07

NVIDIA's Rama Akkiraju on Building the Right AI Infrastructure for Enterprise Success

Hosts: Noah Kravitz

Guests: Rama Akkiraju

enterprise AI infrastructure · AI platform architect · agentic AI · RAG and vector databases · LLM observability and evaluation · AI-native enterprise architecture · domain-specific models · AI in software development · NVIDIA NIM and Dynamo · NVIDIA AgentIQ/AIQ

Why it matters

NVIDIA VP: enterprise AI needs dedicated platform architects as SaaS fades

Key claims

NVIDIA frames enterprise AI as requiring a dedicated 'AI platform architect' role that is both technical expert and strategic thinker, separate from traditional platform architects.
Akkiraju says the gap from perception AI to generative AI was ~25–30 years, but generative to agentic AI took only ~2 years, accelerating enterprise transformation.
She argues generative AI fundamentally rewires enterprise processes, claiming 'SaaS may be dead' as business logic shifts into AI layers rather than staying in traditional SaaS apps.
The enterprise AI stack she describes includes vector DBs, RAG pipelines, GPU-optimized inference (NVIDIA NIM), LLM gateways, LLM observability, auto-evaluation, agentic frameworks (LangChain, LangGraph, NVIDIA AgentIQ/AIQ), content security, and data flywheel/fine-tuning pipelines.

Episode summary

Summary

In Episode 255 of the NVIDIA AI Podcast, host Noah Kravitz speaks with Rama Akkiraju, VP of IT for AI and ML at NVIDIA and former IBM Fellow who worked on Watson, about the emerging role of the AI platform architect and what it takes to operationalize AI inside large enterprises. Akkiraju frames her team's mission as driving AI adoption across NVIDIA itself, building chatbots, copilots, AI agents, and an enterprise-wide generative AI platform that other business units consume via APIs and low-code/no-code interfaces.

She traces the evolution of enterprise AI from perception/classic ML through the generative AI era and into the current agentic AI phase, noting that the leap from generative to agentic took roughly two years versus the 25–30 years from perception to generative AI. She argues the impact on the enterprise is not just incremental automation but a fundamental rethinking of business processes and software—suggesting even that "SaaS may be dead" as AI logic moves into the business layers themselves. She walks through the complexity of building enterprise AI: data ingestion pipelines, RAG over vector databases, role-based access control, content security, model selection, evaluation, observability, and continuous improvement via data flywheels.

Akkiraju lays out the components of a full enterprise AI stack: enterprise data management, content security, vector databases and retrievers, compute/infra, GPU-optimized inference serving (mentioning NVIDIA NIM), LLM gateways, LLM-level observability, auto-evaluation frameworks, agentic frameworks (LangChain, LangGraph, and NVIDIA's own AgentIQ/AIQ released at GTC), low-code/no-code tooling, and fine-tuning/data flywheel pipelines. Looking ahead, she highlights three trends: unification of AI and traditional enterprise architecture, rise of domain-specific and smaller models alongside specialized hardware (including edge/mobile/browser), and the proliferation of more autonomous agentic systems requiring long-term memory, context management, and workflow chaining, with NVIDIA's Dynamo and related hyperscale compute stack positioned as the infrastructure response. She points listeners to ai.nvidia.com for workflows, blueprints, and code samples.

NVIDIA's IT team 'drinks its own champagne'—building on early NVIDIA hardware/software to create zero-reference enterprise solutions—giving product teams feedback.
Three forward-looking trends: (1) AI-native enterprise architecture unifying traditional stacks, (2) domain-specific smaller models plus specialized hardware for edge/mobile/browser, (3) more autonomous multi-step agentic systems needing long-term memory and workflow orchestration, supported by NVIDIA's Dynamo stack.
AI is reshaping software development itself—not just coding but planning, debugging, testing, and PRDs—requiring GPU-backed infrastructure, RAG, and text-to-SQL pipelines as a new layer in the dev stack.
Listeners directed to ai.nvidia.com for NVIDIA's models, workflows, blueprints, and downloadable code samples.

Source material

Transcript

[Music] Hello and welcome to the NVIDIA AI Podcast.

I'm your host, Noah Kravitz.

As AI continues to evolve, organizations need to think carefully about how best to develop and integrate scalable AI systems within their existing infrastructure.

This is where the AI platform architect becomes so important.

It's a role that's part technical expert, part strategic thinker, and critical to driving AI innovation and transformation within a company.

With us to explore the importance of the AI platform architect and AI infrastructure in the enterprise more broadly is Rama Akkiraju.

Rama is VP of IT for AI and ML at NVIDIA, where she leads AI and ML initiatives for enterprise use cases.

She's a former IBM Fellow who worked on the Watson artificial intelligence project as part of her two-plus decades at IBM.

Rama has also been honored as one of the top 20 women in AI by Forbes and named to the A-Team for AI by Fortune magazine, among numerous other accolades.

Rama, it's a pleasure to have you here.

Welcome to the NVIDIA AI Podcast.

Thank you so much Noah.

Thanks for having me here.

Thank you for taking the time.

So maybe you could set the stage for the audience by talking a little bit about what your role is.

What do you oversee?

What are your teams working on?

And what are some of these enterprise use cases that you're leading the delivery of?

Sure.

I lead the enterprise AI and automation team at NVIDIA, where I drive AI adoption across the company, enhancing developer productivity, IT operations, and enterprise workflows.

So as part of this effort, my team and I build chatbots, copilots, AI agents for improving our own employee productivity and developer productivity.

We also build AI platforms that are enterprise grade for everyone in the company to use.

So let me give you a couple of examples of the kind of things that we develop.

There is one that sits on our intranet, called NV info, a chatbot that answers employees' questions about many things related to the company: internal documentation, company policies, financial information, and so on.

And then we build chatbots and copilots that sit within our developer productivity tools like our bug management tools.

And we also work with our sales, supply chain, marketing, finance, and other teams to help them build their own AI transformation projects using the platforms that our team builds.

We built a generative AI platform in the company, and everybody can use it, through both APIs and low-code/no-code interfaces, to build their own generative AI solutions and deploy them for various other use cases.

And you've been with NVIDIA just about three years now?

Not yet, two and a half.

Two and a half.

And so prior to that, you were at IBM for quite a while and worked on some amazing things.

Can we take just a second and maybe speak to your background a little bit and what led you to come to NVIDIA?

Sure.

Well, I worked at IBM Research and IBM Product Divisions for several years.

And throughout my career, I have been actually applying AI for solving different kinds of problems.

More recently, just before coming to NVIDIA, I led AIOps, which is applying AI to IT operations management: the product suite development in the IBM Watson division.

And prior to that, I led a lot of the pre-ChatGPT natural language understanding suite of services that were part of IBM's Watson.

My passion and interests have always been about solving real world problems with AI.

Then the opportunity came from NVIDIA to look at all of the enterprise-related use cases.

And, you know, in the new era, ChatGPT wasn't there yet; it came out within a few months after I joined, but the previous versions of GPT were all there.

So we were at this inflection point where AI could really now deliver on the promise that it had all along and that people talked about, but now it was about to happen.

And this role, this opportunity to look at pretty much all the use cases in a company and apply AI to solving them and transforming them, to rethink the way problems are solved and business processes are done and implemented, was just too exciting to pass up.

Right.

It sounds like it hit that sweet spot of the leading edge research, but being applied, as you said, to solving real problems.

That's great.

So you alluded to this a little bit, just talking about your own background, but can you talk a little bit about the evolution of AI kind of, you know, talking about machine learning and generative AI, as you alluded to, and some of the things going on now in the enterprise, but with that lens of the enterprise, what this evolution of AI over time has meant and means for the enterprise.

Yeah, it's been fascinating.

Actually, you know, I've been in this space since before natural language processing and understanding really took off. From grad school days on, we were looking at perception AI, where computer vision models were running on whatever compute was available at the time. AI was mostly applied to structured data, with data mining as the main discipline in vogue, and statistical machine learning was slowly starting to make an impact in the enterprise, being actively applied to supply chain and those types of use cases.

So that's the era of classic ML or perception AI, if you will, computer vision was there, but you know, not enough data, not enough compute and all of that.

Right.

Cut now to the generative AI era that happened with the generative pre-trained transformers, the GPT models, where language understanding really took off, even with multimodal inputs: not only text, but images and all of that.

So that really unleashed a whole slew of other use cases in the enterprise, where you can now start to tap into the unstructured data, which is 80% of enterprise data and was pretty much off limits for any kind of automated insight derivation before.

So now you can go beyond data mining on structured data, which constitutes 20% of enterprise data, and open it up to all data, which includes the unstructured data.

Can you give an example kind of from the enterprise of the difference between structured and unstructured data?

Yeah, sure.

So structured data is, you know, everything that you are capturing in your, you know, database tables, transactions, your purchases and your sales data and all of that, right?

Whereas unstructured data is, you know, you have all these, hey, this is how my company runs, this is a documentation about my product.

And these are my notes from this meeting.

And all of this information sits in, say, Google Drive or SharePoint or Confluence pages. It's sitting there, but humans have to read it, process it, and then create some kind of structured tables if they want to derive insights from it.

And that was a laborious and long process, because we couldn't tap into it.

And we couldn't question it, we couldn't get insights from it easily.

It was all kind of manual effort.

And people spend a lot of time reading it, you know, if you have to go to do a customer presentation, you would have to read so many documents, you have to go find where all that information is in the enterprise and pull all of them and create some succinct summary that pertains to what you need to prepare for this particular customer meeting.

Right.

So all of that you could now imagine doing with tools like ChatGPT or even DeepSeek models. You upload a few of the documents and say, create me a nice summary for this particular customer meeting, given that these were the meeting minutes from the last meeting. Your productivity shoots up significantly.

And that's because you are now able to tap into all of that information. That's one example from this unstructured data.

Great.

Yeah.

So we went from classic ML, or perception AI, to generative AI, which is where the majority of the use cases are now being explored, with large language models and multimodal inputs, where you can also include PDF documents and PPTs and those sorts of things as input. And now we're already in the agentic era, where people are talking about AI systems and models that can reason, plan, and act autonomously, and that can integrate into various tools and workflows so you can automate things.

So for example, you could say: I have these five what-if scenarios to run in my supply chain planning. Run them and tell me, if there is a disruption to my supply chain from Hong Kong this week because of whatever geopolitical situation is happening, what changes do I need to make? You would start with those kinds of what-if scenarios, and the system would do the planning and figure out all of those things.

In fact, you could even go further upstream and ask what what-if scenarios you should run in order to get to the best allocation for your demand, or to ensure that your supply meets the demand, given the particular set of situations unfolding around you among your supply network partners.

So even before you run the what-if scenarios, these AI systems can help you plan and reason.

So that's an agentic AI example in the enterprise.

Okay.

And then of course, there's physical AI, with more sensors and digital-twin types of things. If you're modeling your data centers, or even your inventory warehouse, you can monitor everything, from racks to shelves to the things sitting on them, and have full monitoring.

So physical AI takes it to the next level, combined with agentic AI and generative AI.

So that is the AI evolution: while it took maybe 25 to 30 years to get from perception AI to generative AI, going from generative AI to agentic AI took only two years.

And work on physical AI is already underway.

So it's having a huge impact on the enterprise.

And, you know, if I have to summarize the impact on the enterprise, it's not just summarization.

And, you know, doing things the same way, but automating things a little bit more, it's no longer that it's actually enabling us to fundamentally rethink the way we do things and write business logic, because you can now do more personalization, you can now do more reasoning and planning.

So even things that you weren't able to do before, or were doing with multiple business processes, can now all be combined into a single process that's a lot more efficient.

So that's why people are even saying, you know, SaaS may be dead: the business applications need to be rethought, because the logic now starts to go into the AI layers, and the AI capabilities start to expand into all of the business layers, right?

So we need to rethink. The enterprises are up for a huge transformation, and we're only at the beginning of that rethinking journey.

And it requires not only reformatting the use cases as they are known, but rethinking the processes fundamentally, and also upskilling and reskilling in the enterprise, and looking across all of the business functions: HR, IT, finance, legal, marketing, sales, everything. All of those functions are up for transformation.

And in doing so, to enable everybody to get there faster, exploration, experimentation, and building platforms is going to be a very critical aspect of it, because the platforms are what enable you to quickly leverage these capabilities, since the stacks are still very complex to build and to test and all of that.

So the more they're fully baked and tested, that makes it easy for everybody to really leverage the capabilities to transform their use cases.

Right.

So much has happened, and I was thinking of it as you were speaking.

I didn't want to interrupt, and I also always hesitate to speak to somebody who's in the thick of the technical building, the things that we're talking about, the technologies, to say, "Well, from the outside, it seems like it's moving so fast."

But I mean, you said it: after a 30-year gap from perception to generative, two years, give or take, to agentic really seems like a blur.

I want to ask you to get into some of the details about developing agentic AI applications for the enterprise.

But first, and this is kind of going up a level again, so if it's not a fair question to ask, that's fine, but can you kind of quantify this for the business user, the technology leader at an organization who has been hearing about agentic AI and understands, "Oh, it can reason, it can plan and do these more complex tasks."

Is there a way to talk about how big a step forward, how big a leap, it is to go from entering a prompt into a chatbot with an LLM behind it, going step by step like that, to using agentic AI, and to talk about some of the workflows and applications that you're working with in the enterprise?

Yeah, it's a non-trivial process, definitely, and we need more tools and automations to simplify that to get there.

But enterprise data is complex, right?

Let's start there first.

If you look at enterprise data, structured and unstructured data are the two sets that we talked about.

There is something that's a combination, semi-structured data, and this is all human-generated data.

Then there is machine-generated data, like the logs, tickets, alerts, and metrics that all of the IT systems and infrastructure and others generate.

So first of all, enterprise data is complex, and moreover, the data tends to be kind of very distributed in different places, and there is access control permissions that one has to deal with and all of that.

So to build, going from prompt, you put something in a prompt, to really getting value out of it really requires a full-blown multi-layered stack together, starting with enterprise data ingestion pipelines, which ingest all of the enterprise data so that we can enable this fresh enterprise data and make it available to LLMs to use.

Because LLMs are trained on public domain data, they don't know anything about your enterprise data.

So we need to leverage all of the enterprise data and still leverage the goodness of LLMs.

So the way to do that is there are techniques like retrieval augmented generation where you essentially load all of your enterprise data into these vector databases where all this unstructured data is vectorized, and structured data can still stay in the structured tables.
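To make the retrieval step concrete, here is a minimal sketch of retrieval augmented generation in Python. It is not NVIDIA's pipeline: the bag-of-words `embed` function stands in for a real embedding model, `ToyVectorStore` stands in for a real vector database, and the three documents are made-up sample data.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: word counts. A real pipeline would call an embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class ToyVectorStore:
    """In-memory stand-in for a vector database."""

    def __init__(self):
        self.rows = []  # list of (vector, original text)

    def add(self, text: str) -> None:
        self.rows.append((embed(text), text))

    def search(self, query: str, k: int = 2) -> list:
        q = embed(query)
        ranked = sorted(self.rows, key=lambda row: cosine(q, row[0]), reverse=True)
        return [text for _, text in ranked[:k]]


def build_prompt(question: str, chunks: list) -> str:
    """Augment the user question with retrieved enterprise context."""
    context = "\n".join(f"- {c}" for c in chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


store = ToyVectorStore()
store.add("Employees accrue 20 vacation days per year.")  # made-up sample documents
store.add("The cafeteria opens at 8 am on weekdays.")
store.add("Expense reports are due by the 5th of each month.")

top = store.search("how many vacation days do employees get", k=1)
prompt = build_prompt("How many vacation days do employees get?", top)
```

The resulting `prompt` is what would be sent to the LLM, so the model answers from fresh enterprise data rather than from its public training data alone.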

And then you put in these AI pipelines, which, by means of inferencing, go retrieve the relevant documents for a given question or an insight that you need to generate, either from the unstructured data, through the vector database pipeline, or from the structured database, which could be through text-to-SQL types of pipelines that generate SQL.

So that is one way where you can make these pipelines have access to fresh enterprise data.
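The structured side of that pipeline can be sketched in a similar way. The example below uses Python's built-in `sqlite3` module; `generate_sql` is a hypothetical stand-in for the LLM call that would translate a question into SQL, the `sales` table and its rows are invented for illustration, and the SELECT-only check is an assumed guardrail rather than something described in the episode.

```python
import sqlite3


def generate_sql(question: str) -> str:
    """Hypothetical stand-in for an LLM that turns a question into SQL.

    A real pipeline would prompt a model with the table schema and the question.
    """
    return "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"


def run_read_only(conn: sqlite3.Connection, sql: str) -> list:
    """Execute generated SQL, refusing anything that is not a SELECT."""
    if not sql.lstrip().lower().startswith("select"):
        raise ValueError("only SELECT statements are allowed")
    return conn.execute(sql).fetchall()


# Made-up structured data standing in for an enterprise sales table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("EMEA", 100.0), ("APAC", 50.0), ("EMEA", 25.0)],
)

rows = run_read_only(conn, generate_sql("What are total sales by region?"))
```

Here the structured data stays in its tables, as described above, and only the query result is handed back to the model or the user.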

So first you have to solve for all of the things related to that, which includes continuous ingestion of your data into the right kind of databases, vector databases, and so on, so that you can do the information retrieval and insights generation from it.

And that requires role-based access control management and all of that.

So that whole layer needs to be managed and built.
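One simple way to picture role-based access control in a retrieval layer is to tag every chunk with the roles allowed to read it and filter before anything reaches the model. This is a sketch under that assumption; the `Chunk` type, role names, and sample texts are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    text: str
    allowed_roles: frozenset  # roles permitted to see this chunk


def visible_chunks(chunks: list, user_roles: set) -> list:
    """Keep only chunks that share at least one role with the user."""
    roles = set(user_roles)
    return [c for c in chunks if c.allowed_roles & roles]


# Made-up corpus with per-chunk access tags.
corpus = [
    Chunk("Holiday calendar for all staff.", frozenset({"employee"})),
    Chunk("Quarterly revenue forecast draft.", frozenset({"finance"})),
    Chunk("Travel policy update.", frozenset({"employee", "finance"})),
]

engineer_view = visible_chunks(corpus, {"employee"})
finance_view = visible_chunks(corpus, {"finance"})
```

Filtering before retrieval results are assembled into a prompt means a user can never receive an answer built from a document they could not have opened themselves.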

Then the accuracy of insights generation from unstructured data is in itself still very much an art, where you have to fine-tune for various parameters: how do you process the data, how do you chunk it, how good is the retrieval relevancy, how are you going to re-rank the results that you get, what embedding models do you use, are the LLMs doing the right thing or hallucinating, which model is the right model for me, and all of that.

So that whole layer needs to be put together.
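Two of those tuning knobs, chunking and re-ranking, can be illustrated with a short sketch. The chunk size and overlap are the parameters one would experiment with; the term-overlap `rerank` function is a deliberately simple stand-in for a real re-ranking model, and all names here are illustrative.

```python
def chunk_words(text: str, size: int = 50, overlap: int = 10) -> list:
    """Split text into word windows of `size`, each sharing `overlap` words with the previous one."""
    if size <= 0 or overlap < 0 or overlap >= size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def rerank(query: str, chunks: list, top_n: int = 3) -> list:
    """Order chunks by how many distinct query terms they contain (toy re-ranker)."""
    terms = set(query.lower().split())

    def score(chunk: str) -> int:
        return len(terms & set(chunk.lower().split()))

    return sorted(chunks, key=score, reverse=True)[:top_n]


pieces = chunk_words("a b c d e f g h i j", size=4, overlap=1)
best = rerank("g h", pieces, top_n=1)
```

Changing `size` and `overlap` changes which facts end up together in a chunk, which in turn changes what retrieval and re-ranking can find; that interaction is part of why this layer takes experimentation.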

Then once you have the initial set, you need to make sure that you have the full stack of testing, validation, pipelines, and all of that built out.

Then you need to make sure that no overly shared sensitive documents are lying around in the company, which could accidentally be revealing some sensitive information.

So you have to solve for that with enterprise content security kinds of platforms.

So you have to build out this whole stack of platforms, and only then are you able to build your agentic workflows on top of this stack, with all of the enterprise controls that pass your security and with the right kind of guardrails and everything.

So the stack is pretty complex, and that's where the architecture and the platform story really comes in: the need to build that out.

So pretty much right now, every company is having to either build that platform or get it from some kind of vendor who supplies that platform, but they still have to hook it up with all their enterprise data, because nobody can do that for you.

You have to find the right kind of hoses and plug points to connect to them so that the data starts flowing into the stack.

And that's when you start building the workflows, and then you have to test them.

The accuracy has to be good.

That's when you do the pilots, then deploy, and then start to observe the usage.

Sometimes what you build needs to be embedded as copilots in the development environments, or whatever environment the users are doing their work in, for them to really be able to access these tools without friction.

Right?

If something changes their workflow significantly, they won't use it, because they have to go out of their way.

So you have to think about all those deployment-related aspects, and then ensure that people are using it: monitor the usage metrics, and based on that compute the productivity gains that you're getting or not getting and why, and take the feedback and continuously improve the models.

And for that you also need a whole platform stack, which is the data flywheel: continuously improving from user feedback data, improving the prompts if need be, improving your models by fine-tuning them if need be, or improving the retrieval relevance accuracy or the embedding model if need be.

So there are many control points along the way in each one of these layers in the stack that need to be carefully looked at for continuous improvements as well.
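As a rough illustration of a data flywheel, the sketch below tallies negative user feedback by the kind of failure and suggests which control point to look at first. The issue labels and their mapping to control points are assumptions made for this example, not a taxonomy from the episode, and the feedback records are invented.

```python
from collections import Counter

# Assumed mapping from a tagged failure type to the control point it implicates.
CONTROL_POINTS = {
    "bad_retrieval": "retrieval relevance or embedding model",
    "bad_instructions": "prompt",
    "wrong_answer": "model fine-tuning",
}


def next_improvement(feedback: list):
    """Return the control point behind the most common negative feedback, or None."""
    negatives = Counter(
        f["issue"]
        for f in feedback
        if not f["helpful"] and f.get("issue") in CONTROL_POINTS
    )
    if not negatives:
        return None
    issue, _count = negatives.most_common(1)[0]
    return CONTROL_POINTS[issue]


# Made-up feedback records, e.g. from thumbs-up/down buttons in a chatbot UI.
feedback_log = [
    {"helpful": True},
    {"helpful": False, "issue": "bad_retrieval"},
    {"helpful": False, "issue": "bad_retrieval"},
    {"helpful": False, "issue": "wrong_answer"},
]

suggestion = next_improvement(feedback_log)
```

In practice the tagging itself is the hard part; the point of the sketch is only that feedback has to be routed to a specific layer before it can improve anything.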

So it's a non-trivial process to go from a use case that you think about to making it really work in the enterprise and platforms will play a significant role in...

Right.

And this is where the role of the architect comes in.

Exactly.

I mean, the role of architects and any kind of a platform architect in an enterprise IT has always been there.

Right.

But what is now kind of taken to the next level is this opportunity where you can really derive insights from all of your enterprise data as opposed to what was originally only structured data.

Right.

And the opportunity to now significantly rethink the way the business processes are done by...

Because now you can plan and reason and automate a lot of things and workflow automations and all.

So to build all of those things out, on top of your existing stack, which has always been there and will continue to be there, you need these whole new levels of platform stacks, if you will, that are specific to AI, generative AI.

And that's where this role becomes very important.

Like for example, vector database management. Container orchestration has been there, but now you are talking about GPU-level container orchestration.

Okay.

Right.

And for that, you need GPU optimizations, quantization, and you are talking about new microservices that you now have to manage.

Then you are observing, you already have maybe application performance monitoring and observability platforms in your IT, but now you have to monitor the LLM.

So you need the next level of LLM observability.

The pipelines will write logs about, this is the prompt that came in.

This is the retrieved chunks.

This is how I reranked it.

And this is how the new prompt got constructed to send to an LLM.

This is how the citations were generated.

There could be mistakes anywhere in the pipeline.

So how do you know?

You have to have the LLM level observability at that level.
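The kind of per-stage logging described here can be sketched as a small trace object that records one structured event per pipeline stage under a shared request ID. The class name, stage names, and payload fields below are illustrative assumptions, not the format of any particular observability product.

```python
import json
import time
import uuid


class PipelineTrace:
    """Collects one structured event per pipeline stage for a single request."""

    def __init__(self, request_id=None):
        self.request_id = request_id or str(uuid.uuid4())
        self.events = []

    def log(self, stage: str, **payload) -> None:
        """Record a stage name, a timestamp, and any JSON-serializable details."""
        self.events.append(
            {"request_id": self.request_id, "stage": stage, "ts": time.time(), **payload}
        )

    def to_jsonl(self) -> str:
        """Serialize events as JSON Lines, one event per line."""
        return "\n".join(json.dumps(event) for event in self.events)


trace = PipelineTrace()
trace.log("prompt_received", prompt="What is our travel policy?")
trace.log("chunks_retrieved", chunk_ids=["doc7#2", "doc3#1", "doc9#4"])
trace.log("reranked", chunk_ids=["doc3#1", "doc7#2"])
trace.log("prompt_constructed", prompt_chars=812)
trace.log("citations_generated", citations=["doc3"])
```

With every stage tied to the same `request_id`, a wrong answer can be traced back to the step that introduced the mistake, whether that was retrieval, re-ranking, prompt construction, or citation generation.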

You have to have an auto-evaluation framework.

And if you're calling external LLMs, you have to have an LLM gateway so that you're monitoring the cost and subscriptions and all of that.

And also, carefully making sure that no sensitive data is going out.
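A gateway of this kind can be sketched as a thin wrapper around the external call that checks the prompt for sensitive content and tracks spend per team. Everything here is hypothetical: the regex patterns are crude placeholders for a real content security check, the word count is a rough proxy for tokens, and `fake_external_llm` stands in for a real provider client.

```python
import re
from typing import Callable, Dict


class LLMGateway:
    """Hypothetical gateway between internal callers and an external LLM."""

    SENSITIVE_PATTERNS = [
        re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),            # looks like a US SSN
        re.compile(r"\bconfidential\b", re.IGNORECASE),  # crude keyword check
    ]

    def __init__(self, backend: Callable[[str], str], price_per_1k_tokens: float, budget_per_team: float):
        self.backend = backend
        self.price_per_1k_tokens = price_per_1k_tokens
        self.budget_per_team = budget_per_team
        self.spend: Dict[str, float] = {}

    def complete(self, team: str, prompt: str) -> str:
        # Block the call before any data leaves the company.
        if any(p.search(prompt) for p in self.SENSITIVE_PATTERNS):
            raise PermissionError("prompt blocked: possible sensitive data")
        tokens = len(prompt.split())  # rough proxy; real gateways use a tokenizer
        cost = tokens / 1000 * self.price_per_1k_tokens
        if self.spend.get(team, 0.0) + cost > self.budget_per_team:
            raise RuntimeError(f"budget exceeded for team {team!r}")
        self.spend[team] = self.spend.get(team, 0.0) + cost
        return self.backend(prompt)


def fake_external_llm(prompt: str) -> str:
    """Stand-in for a call to an external model provider."""
    return f"echo: {prompt}"


gateway = LLMGateway(fake_external_llm, price_per_1k_tokens=0.5, budget_per_team=1.0)
reply = gateway.complete("sales", "Summarize our public product brochure")
```

Routing every external call through one choke point is what makes both concerns tractable: cost is tracked in one place, and outbound prompts are inspected in one place.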

You need different kinds of storage for storing your training data and even the checkpoints of the models that you may be fine tuning.

And you may need agentic frameworks like LangChain, LangGraph, all those kinds of things.

So it's a whole slew of new things that you now have to have in your platform to manage the AI workflows that you might be building.

As I was listening, I kind of went from thinking, so is there a sort of difference in, I don't know, obviously in skill set, it's continual, you know, upskilling and staying on top of things.

But in terms of, like, personality and strategic approach, between a sort of pre-AI platform architect and an AI platform architect like we were talking about.

But then by the end of what you were just saying, I thought, oh, we can just take what Rama is saying and unpack it afterwards, run it through an LLM perhaps, and build out a blueprint for all the things that this role should cover.

So it's terrific.

But is it a matter of an IT professional who's a platform architect?

It sounds like there's AI, I mean, we know this, that AI and everything you've been talking about has brought us to this moment of real transformation for, as you said, how businesses do business, how we think and rethink our processes, even how we rethink software.

So more than an evolution of the platform architect, is it a sidestep into a whole new world?

It definitely is a whole new set of skills. Either the same existing architects upskill themselves to learn all these things and get there, or new roles are emerging. If I look at my own team, for example, we have this unique program within NVIDIA where we, you know, kind of drink our own champagne.

So we test a lot of NVIDIA's own hardware and software technologies to give early feedback to our product teams, but also to really build out enterprise-grade solutions on our own stack and provide that to the market as a zero reference: this is how you can build things.

So as part of that, our team had to be pretty sophisticated in terms of really understanding and very quickly taking very early technology that product teams are putting out to test them out.

So in our case, yes, we ended up actually creating a machine learning team who deeply understands all of these technologies, both from an engineering perspective and also from a data science perspectives in terms of how best to evaluate the models, how best to improve the accuracy, how to construct the pipelines, what are the control points and all of that.

A company that may not have this kind of drink-your-own-champagne program, or that isn't building actual LLMs and the full-stack set of software the way NVIDIA does, may not have the luxury of hiring sophisticated ML teams to build this out.

So that's why it becomes all the more important to build platforms that are as easy as possible to operate, so that an engineer who is skilled in some of the core technologies, cloud maybe, and managing Kubernetes and those sorts of things, can upskill themselves to leverage these platforms to build things out.

I do want to add, from my own observation, that at least most medium to large companies have some machine learning teams that are actively exploring, experimenting, and building things out, kind of paving the way, defining and deriving the recipes within their own companies to help other teams come along.

So there is some central AI/ML team that either builds the platforms or tests and derives the best practices and creates those recipes to enable others to run fast.

That has to happen for the foreseeable future, until these tools and platforms become so easy that anybody can operate them.

Right.

Makes sense.

What are the key components that go into an AI stack these days?

And how does an IT leader, a platform architect, go about deciding between cloud-based and on-premises solutions for the organization?

Yeah.

So if you look at the stack, I mentioned a few of those things before.

Say for example, enterprise data management, first you have to have that, either for structured data or unstructured data.

Where does my enterprise data sit?

Is it properly protected with role-based access control?

And are there right kind of APIs available for me to fetch that data, load it into either my vector databases or something else?

So that's enterprise data management.

It's something that every company has to have and that has to be the foundational starting point for anything.

Then some of the other things like enterprise content security, again, ensuring that you cannot bypass any of those things.

You have to make sure that any data that is being used for any particular use case is access control protected and also sensitive data is protected in the company because sometimes people put confidential documents out in the open by mistake.

These powerful tools can now go and search all of that data and could accidentally expose sensitive information.

So that has to be there as well.

So enterprise data management, content security, then comes the rest of the AI stack.

Like for example, you need a vector database or you're tapping into unstructured data and you need a retriever that sits on top of the vector database to allow you to operate with it with embedding models and all vectorize the data and get the insights and the chunks or documents from it.

Then you need of course the full compute and infrastructure that needs to be set up.

If you're going with SaaS applications, then of course the SaaS vendors will provide all of that for you as a services.

But if you're setting up on-prem for whatever reason, could be a data sensitive and cannot move to cloud or whatever those reasons may be.

And some companies have regulatory requirements and such and they need to be on-prem within the country and different kinds of restrictions.

And if that is the case, you have to have the full compute and infra management layers for setting up the data centers, the operating systems, the container platforms, the storage and all of that.

So those are all part of the stack.

Again, if you're doing it on-prem, there is the AI model inference serving.

How do you serve the GPU-optimized versions of the inference models, whatever you pick, either a public open-access one like the Llama models that are, for example, available through NVIDIA as NVIDIA inference microservices?

Or if you're calling some external ones, OpenAI's GPT-4 or Claude or anything else, then you may want to go through an LLM gateway.

So an LLM gateway should be part of the stack.
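One way to picture an LLM gateway is as a single entry point that dispatches each request to a registered backend and records usage along the way. The sketch below assumes that shape; `LLMGateway`, the model names, and the two fake backends are all hypothetical and do not call any real provider API.

```python
from typing import Callable, Dict


class LLMGateway:
    """Hypothetical gateway: one entry point, many model backends, usage counted per model."""

    def __init__(self):
        self._backends: Dict[str, Callable[[str], str]] = {}
        self.call_counts: Dict[str, int] = {}

    def register(self, model, backend):
        self._backends[model] = backend
        self.call_counts[model] = 0

    def complete(self, model, prompt):
        if model not in self._backends:
            raise KeyError(f"no backend registered for model {model!r}")
        self.call_counts[model] += 1  # central place for metering, logging, or policy checks
        return self._backends[model](prompt)


# Fake backends for illustration only; a real one would call a model server or vendor API.
def fake_local_model(prompt):
    return f"[local] {prompt}"


def fake_external_model(prompt):
    return f"[external] {prompt}"


gateway = LLMGateway()
gateway.register("local-llama", fake_local_model)
gateway.register("external-api", fake_external_model)
reply = gateway.complete("local-llama", "hello")
```

Because every call passes through `complete`, that method is the natural place to add the access checks and observability hooks discussed elsewhere in this conversation.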

Then as I mentioned, LLM observability tools need to be there or you build them.

And then the basic core container and application performance monitoring has to be there.

An auto-evaluation framework is a critical part of that stack because whenever you build something, you want to automatically evaluate it for accuracy, how helpful the answer is, whether the citations are good, maybe use an LLM itself as a judge to compute all of this, and then measure the latency.

So all of that automated evaluation framework should be part of the CI/CD, part of the development process.
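A minimal sketch of such an evaluation step, as it might run inside a CI job, is shown below. It makes two simplifying assumptions that should be read as placeholders: a substring check replaces the LLM-as-judge scoring mentioned above, and `toy_bot` is a canned stand-in for the system under test. The `evaluate` function and the 0.5 threshold are hypothetical.

```python
import time


def evaluate(answer_fn, cases):
    """Run each (question, expected) case and report accuracy plus worst-case latency."""
    correct = 0
    latencies = []
    for question, expected in cases:
        start = time.perf_counter()
        answer = answer_fn(question)
        latencies.append(time.perf_counter() - start)
        # Substring match stands in for a real judge (human rubric or LLM-as-judge).
        if expected.lower() in answer.lower():
            correct += 1
    return {"accuracy": correct / len(cases), "max_latency_s": max(latencies)}


def toy_bot(question):
    """Canned stand-in for the chatbot being evaluated."""
    canned = {"capital of france?": "The capital is Paris."}
    return canned.get(question.lower(), "I don't know.")


cases = [("Capital of France?", "Paris"), ("Capital of Peru?", "Lima")]
report = evaluate(toy_bot, cases)

# A CI gate would fail the build when the report falls below an agreed threshold.
passed = report["accuracy"] >= 0.5
```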

Then any kind of agentic frameworks: you have LangChain, LangGraph, and with NVIDIA, we now have AgentIQ, or AIQ, which we released at the GTC conference.

So these are agentic frameworks that sit on top, allow you to quickly configure the agents and tools so that you can set up things for planning and reasoning.

Then, if you're talking about an enterprise, you probably want to have some kind of a user experience so that not everybody has to go build it and figure it out.

So there is a UI that's built into this platform that comes ready.

If you want to build a chatbot, you get a chatbot with the feedback gathering mechanism, all of those things baked in, the platform should have a basic one and people can choose to build their own on top of it or refine it, but there should be a basic one in there.

And then when everybody is building all these agents and workflows and all that, you need some way to monitor them all and to make sure that you have usage tracking, dashboards and other things so that you know what kind of engagement is happening with which workflow, which AI tool, which chatbot or copilot, whatnot.

So you need a repository where all these things are actually available and set up for some kind of automation, and if people want to build on it, it would be good to have some low-code, no-code kind of GUI on top for people to quickly build their agentic flows.

And then last but not least, I would say there has to be something for model fine tuning and refinement because no AI model would start with 100% accuracy.

Very likely it will be somewhere between the upper 70s and 80s, maybe closer to 90s.

Maybe that's kind of where you start, and then you have to slowly use the feedback, put it in production, then improve and so on.

So there has to be the full pipeline and frameworks for model fine-tuning and data flywheel management.
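The feedback loop described here can be sketched as a step that turns logged user feedback into candidate fine-tuning examples. Everything in the block is an assumption made for illustration: the `Feedback` record, the thumbs-up signal, and the rule that a corrected answer replaces a rejected one are hypothetical choices, not a description of NVIDIA's pipeline.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Feedback:
    prompt: str
    response: str
    thumbs_up: bool
    correction: Optional[str] = None  # what the answer should have been, if a reviewer supplied it


def fine_tuning_candidates(log):
    """Keep approved answers as-is, swap in corrections for rejected ones, drop the rest."""
    examples = []
    for fb in log:
        if fb.thumbs_up:
            examples.append({"prompt": fb.prompt, "completion": fb.response})
        elif fb.correction:
            examples.append({"prompt": fb.prompt, "completion": fb.correction})
    return examples


log = [
    Feedback("reset vpn?", "Open the VPN portal.", True),
    Feedback("badge lost?", "Ask your manager.", False, "File a ticket with security."),
    Feedback("lunch menu?", "Unknown.", False),
]
dataset = fine_tuning_candidates(log)
```

Rejected answers without a correction are dropped rather than guessed at, which keeps the resulting dataset limited to examples someone has actually validated.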

So these are all the full set of things that are included in a stack.

Yeah.

Talking about all of this software and talking about no code, low code tools, building platforms that developers can build agentic applications on everything, what are your thoughts on the role of AI in software development itself and kind of specifically, how does that pertain to infrastructure?

What are you doing to prepare infrastructure for this increased role of AI in developing the software?

AI for software development.

Wow.

So AI is fundamentally reshaping software development, not just how we code, but how we plan, how we debug, how we test, and even how we think about it.

And how we write our product requirements documents and everything in the entire product lifecycle management.

So first we are seeing AI-assisted development, of course: copilots, code generation, test generation and automation and all of that.

This boosts developer productivity and velocity in building code quickly.

So from writing code, you are basically getting to a point of reviewing code and guiding the AI generated code and improved productivity.

Next part of it is that AI is becoming a part of application logic in itself.

And this is what I was saying that AI now starts to go down into the business layers.

So we're building more and more as an industry, more AI native apps.

So the components like retrieval, personalization, LLMs and agents, these will all be inside of the business processes and they may need to be completely rethought.

Traditional backend systems may need to integrate more deeply with ML components.

So AI is fundamentally reshaping software development.

And we need to build out the multimodal pipelines for this and we need to scale out the GPU backed infrastructure.

We need to build out the RAG pipelines.

We need to build out the talk-to-your-data, text-to-SQL or text-to-code generation pipelines and whatever models are needed; if you're using vendor platforms, the vendors need to build all of these internally.

So ultimately treat AI like a new layer in the development stack, which is fundamentally reshaping the way we write software.

Right.

Not to hit you with one more kind of big picture question, particularly as we start to wind down here, looking ahead to say the next five years, but feel free to adjust that timeframe to something more appropriate.

What trends, what are some of the, if you can look out that far in this rapidly evolving world we're talking about, what are the trends that you foresee shaping AI infrastructure in the next few years?

And is there anything you can do now, listeners can do now to prepare for them?

Yeah.

Well, AI infrastructure is evolving very rapidly.

Yeah.

We're audio only, but I've just been nodding my head constantly listening to you because I didn't want to interrupt, let alone take you off course, but there's so much happening.

Yeah.

Maybe we can break it down into three kind of major trends that will emerge from this.

Okay.

One is over time, what is now a specialized AI architecture should become more of a native integrated enterprise architecture.

So that will happen because this AI will now become a very integral part of building any kind of a business process.

So that is something that we see, which means that shared orchestration layers, GPU-aware schedulers, unified logging, observability, vector databases and everything that we talked about should all become part of AI-native platforms.

Then, second, specialization of hardware and models is likely to happen, where domain-specific models, smaller, faster, private models for specific use cases, emerge alongside the general-purpose LLMs. That may happen because the cost economics eventually have to work out, and smaller models that are fine-tuned may do the job just as well.

So we may start to see more of these domain-specific models emerging, and also hardware that is very specialized, and LLMs tuned for edge, mobile or even browser-based AI kinds of things may start to emerge.

So again, different hardware models for which we need to have the right kind of infrastructure and hyper computing stack around parameter efficient tuning, quantization and smart model routing, all of these things have to happen.

And that is where NVIDIA is building out a lot of the stuff related to Dynamo and others that are coming up.
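To make "smart model routing" a little more concrete, here is a deliberately simple sketch: short, routine prompts go to a small domain-specific model, while long prompts or ones flagged as needing reasoning go to a larger general-purpose model. The function name, the model labels, and the 50-word cutoff are all hypothetical; this is not how NVIDIA Dynamo or any particular router works.

```python
def choose_model(prompt, needs_reasoning=False, small_limit_words=50):
    """Pick a model tier with a toy heuristic based on prompt length and task type."""
    if needs_reasoning or len(prompt.split()) > small_limit_words:
        return "general-large"  # hypothetical label for a general-purpose LLM
    return "domain-small"  # hypothetical label for a small fine-tuned model


short = choose_model("Summarize this ticket")
hard = choose_model("Plan a data center migration", needs_reasoning=True)
```

A production router would more likely use a classifier or cost and latency budgets instead of a word count, but the economic idea is the same: send each request to the cheapest model that can do the job.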

Then I would say the third one is agentic systems, which are becoming increasingly autonomous, right?

So we will start to see more enterprise use cases that go beyond chatbots to multi-step agents; reasoning, planning, action-taking and decision-making types of systems will start to become more prominent.

And for that, from a platform perspective, one has to be prepared with more long-term memory management, context management and workflow chaining and all of those kinds of things will start to emerge.
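Context management for a multi-step agent often comes down to deciding which past turns still fit into the model's input. The sketch below keeps the most recent turns within a word budget. It is an assumption-heavy simplification: `trim_context` is a hypothetical helper, words stand in for tokens, and real platforms typically pair this with summaries or a long-term memory store.

```python
def trim_context(turns, max_words):
    """Keep the most recent turns whose combined word count fits within max_words."""
    kept = []
    used = 0
    for turn in reversed(turns):  # walk from newest to oldest
        n = len(turn.split())
        if used + n > max_words:
            break
        kept.append(turn)
        used += n
    return list(reversed(kept))  # restore chronological order


history = [
    "user: book a flight to Austin",
    "agent: which date works",
    "user: next Tuesday morning",
]
window = trim_context(history, max_words=9)
```

With a budget of nine words, the oldest turn is dropped and the two most recent turns are kept, which is exactly the kind of loss that a long-term memory layer is meant to compensate for.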

So I would say those three things: first, AI stacks becoming more native, so unification of traditional and AI stacks becomes more commonplace in the enterprise; then specialized models, domain-specific models, and maybe specialized hardware and models for edge, mobile and those kinds of things.

And then finally, the proliferation of agentic AI use cases and more autonomous use cases will require the platforms to really have long-term memory, context management, workflow management and all of that.

And AI infrastructure and the platforms have to evolve to really take advantage of the advancements happening in the field of AI.

Rama, you've covered a lot of ground in a short time, but we're literally talking about building the foundation, I mean, not just for AI success, but I think for the future of how we work, how the enterprise does work, how we all rethink the way that we're doing everything from processes to developing software and everything in between.

It's fantastic.

Thank you so much for taking the time to share.

For listeners who want to go deeper, want to learn more about anything related to AI and the enterprise and platform architecture and everything, is there a good place online on the NVIDIA website, maybe on social media, that you would direct listeners to start?

I think if you just go to AI.Nvidia.com, you'll see a lot of these models as well as workflows and blueprints that NVIDIA is releasing available for developers to try them out right there.

That's a great place to start.

And there are many example workflows that are given along with example code samples and everything.

You can just simply download them into your environments that are also provided right there to try them out quickly and then replicate them in your own example.

So that's a good place to start.

Perfect.

Rama Akkiraju, thank you again.

It was a pleasure, and all the best in all the work that you and your teams are doing.

Thank you so much.

It's a pleasure to be on your podcast.

Noah, thanks for having me.

Thank you.
