Latent Space · 2025-12-31

OpenAI Post-Training: From GPT-4.1 to 5.1

Hosts: Alessio Fanelli, Swyx

Guests: Josh McGrath

post-training · RLVR · RLHF · GRPO · GPT-5 · GPT-4.1 · thinking models · token efficiency · agent training · context windows · shopping model · personality tuning

Why it matters

New shopping model introduces interruptible chain-of-thought UX where users can see and redirect product searches in real time; framed as a precursor to merging such paradigms into unified models.

Key claims

  • New shopping model introduces interruptible chain-of-thought UX where users can see and redirect product searches in real time; framed as a precursor to merging such paradigms into unified models.
  • Token efficiency is positioned as a first-class metric: OpenAI tracks 2D plots of eval performance vs tokens used, and McGrath reports that from 5 to 5.1 the tokens needed dropped sharply even where headline scores only bumped slightly.
  • RLHF vs RLVR reframed as a spectrum of signal quality (human preference vs verifiable answers), not a methodological split — both are policy gradients differing mainly in reward data trustworthiness.
  • GRPO, which surfaced in the DeepSeek math paper, is credited with broader industry impact than initially appreciated because the math reward signal is unusually trustworthy, shifting how labs think about RL environments.
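The GRPO and signal-quality claims above can be made concrete with a small sketch. It assumes the group-relative advantage commonly associated with GRPO: sample several answers to one prompt, score each one, then normalize every reward against the mean and standard deviation of its own group. The exact-match checker, the sample answers, and all function names are hypothetical illustrations, not anything described in the episode.

```python
from statistics import mean, pstdev


def verifiable_reward(answer: str, reference: str) -> float:
    """Hypothetical checker: 1.0 if the final answer matches exactly, else 0.0."""
    return 1.0 if answer.strip() == reference.strip() else 0.0


def group_relative_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Normalize each reward against the mean and std of its own sample group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps keeps the division defined when every sample gets the same reward.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled answers to one math prompt whose reference answer is "42"
# (made-up values for illustration only).
samples = ["42", "41", "42", "7"]
rewards = [verifiable_reward(s, "42") for s in samples]
advantages = group_relative_advantages(rewards)
```

Swapping `verifiable_reward` for a learned preference score would leave the rest of the computation unchanged, which is one way to read the point that the methods differ mainly in how much the reward can be trusted.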

Episode summary

Summary

Josh McGrath, a post-training researcher at OpenAI working on thinking models and recent search efforts, joins Latent Space to survey the year in post-training since his last appearance for GPT-4.1. He frames the shift from RLHF-era debates (PPO vs DPO) to today's RLVR and agent-specific RL, arguing the real innovation is in the data and reward signals rather than in the optimization papers that tend to dominate NeurIPS submissions. He notes that frontier labs are increasingly talking about infrastructure pain rather than method differences, since the bottlenecks rotate unpredictably between systems and ML within any given project.

McGrath highlights three concrete product threads. First, the new shopping model released into Black Friday introduces an interruptible chain-of-thought UI where users can see what products the model is considering and inject corrections mid-search, with the intent that such paradigms eventually merge into a single model. Second, on personality, OpenAI is exposing more user-controlled toggles rather than locking one tone, anticipating both 'Anton' (purely utilitarian) and 'Clippy' (warm) preferences. Third, he discusses Deep Research versus GPT-5 thinking on high reasoning, saying published evals look on par or better for the latter, though some users prefer the quirks of the dedicated model.

The most substantive technical thread is token efficiency. McGrath describes OpenAI's internal 2D plots (eval performance versus tokens consumed) as central to tracking model progress, noting that from 5 to 5.1 the tokens needed fell sharply even where headline evals only nudged up. He connects this to long-horizon agents (30+ hours, measured in tokens rather than wall-clock), the explicit GPT-5 router versus implicit reasoning effort, and the eventual goal of a single model that 'just knows how long to think.' On long context, he points to the Graphwalks evals as still climbing and rejects the idea of a ceiling, while acknowledging that simple grep-style agentic search often beats stuffing everything into the window. He declines to comment on context compaction but signals it as part of an evolving interface.
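A minimal sketch of how a score-versus-tokens comparison like the one described above could be computed. The model names and numbers are invented placeholders, not published results, and the `EvalPoint`, `dominates`, and `pareto_frontier` names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalPoint:
    name: str
    mean_tokens: float  # average tokens spent per task
    score: float        # eval accuracy in [0, 1]


def dominates(a: EvalPoint, b: EvalPoint) -> bool:
    """True if a is no worse than b on both axes and strictly better on one."""
    no_worse = a.mean_tokens <= b.mean_tokens and a.score >= b.score
    strictly_better = a.mean_tokens < b.mean_tokens or a.score > b.score
    return no_worse and strictly_better


def pareto_frontier(points: list[EvalPoint]) -> list[EvalPoint]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Placeholder values for illustration only; not real measurements.
old = EvalPoint("model_a", mean_tokens=12000, score=0.71)
new = EvalPoint("model_b", mean_tokens=7000, score=0.73)

frontier = pareto_frontier([old, new])
token_saving = 1 - new.mean_tokens / old.mean_tokens  # fraction of tokens saved
```

On a plot like this a small score bump can hide a large efficiency gain: here the placeholder `model_b` scores only two points higher but spends roughly 42% fewer tokens, so it dominates `model_a`.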

  • GPT-5's explicit router and reasoning-effort knobs are described as transitional — McGrath expects eventual convergence to a single model that autonomously decides how long to think.
  • On long context: Graphwalks-style evaluations continue climbing with no clear ceiling, though McGrath notes agentic grep/search remains 'unreasonably effective,' making trillion-token windows less urgent than systems engineering.
  • Hiring pain point: OpenAI and the broader field struggle to find engineers fluent in both distributed systems and ML research, since the frontier bottleneck rotates unpredictably between the two.
  • The 'pre-training is dead' meme is rejected: the hosts cite the Grok 4 chart showing post-training compute scaled to roughly match pre-training, and McGrath says neither is dead and expects progress to be spiky rather than settled.

Source material

Transcript

[MUSIC] >> Well, here is Josh from OpenAI, welcome.

How would you introduce yourself?

>> Yeah, I work on a bunch of the thinking models at OpenAI.

Like recently, I've been focused on doing search related stuff.

But yeah, just a post training researcher at OpenAI.

>> Yeah, and you were on with us for GPT-4.1, when we were talking with Michelle, who's on maternity leave, I didn't know that.

Now we're 5.1, it's been a whole generation.

>> Yeah, it's been wild and like 4.1 was a non-thinking model.

Then since then, we switched into doing- >> Was that your last?

What was your last?

>> No, we still are releasing non-thinking models.

But that one was the one that we did that was like API specific non-thinking.

Focus has shifted a little.

>> Yeah.

How did you get into post training?

>> So previously before OpenAI, I was doing pre-training data curation stuff.

I think what I was seeing from the news and looking at papers is like, "Oh, it seems like a lot of that."

Not pre-training is dead, but I was like, "Oh, there's going to be so much interesting stuff in post training."

At that point, I was like, "I really want to make some contributions there."

I mean, it's not even necessarily that pre-training was dead, but it was definitely changing.

Do I want to make compute efficiency wins of like 3 percent, or do I want to change the behavior by 40 percent?

Honestly, it just seemed more exciting to go to post training, and many late nights later, that's definitely true.

>> It's a different data and engineering discipline too.

It's very strange.

The work that you need in, especially RL, scaling it.

>> Yeah, definitely.

I think, for example, the number of moving parts in an RL run is just a lot higher.

In some ways- >> You could do order of magnitude or- >> I don't know if I could do order of magnitude, but if you think about pre-training, you're moving tokens to many machines, and then you're getting basically a scalar from them, and then you're back-propping.

>> Yeah.

>> The issue with RL is you're doing tasks, and each task could have a different grading setup, and each one of those different grading setups, that's more infrastructure.

So when I'm staying up late trying to figure out what's going on with a run, it could be in way more things than there is in a pre-training run generally.

>> Yeah.

Does it matter if you own the code of the task, or is it an outsourced third-party person?

My sense of it, and the external sense of it, obviously I don't see it up close, is that you work with a lot of external partners, and I'm sure you also have some internal stuff, but which is better?

>> Honestly, I don't think I'll comment too much on how many external partners.

>> There are some, and there are some internal.

>> Yeah.

We do like- >> The technical trade-off of like, well, shit, I don't own this code.

>> Okay.

So when it comes to I don't own this code, actually when I'm babysitting a run or something, it doesn't really matter if it's internal, external, whatever.

Do I understand the system that's going underneath?

I think you end up having to jump into a lot more code that you're like, "I actually don't know what this does."

Because I'll be watching the, I work on my pieces of a run, and then there's also other people working on it.

Do I understand what their code is doing?

So that way at 12:30 in the morning when I'm like, "Something looks wrong," and I'm looking at this code, can I get context fast enough to understand- >> Do you use Codex on it?

>> Wrong.

I use Codex so much.

It's really changed my work.

I feel like there's a degree to which sometimes I feel trapped by Codex, because if I spend like 30, 40 minutes writing something that looks like a design doc or something, Codex can do more work in 15 minutes than I could do in a few hours.

But then what do I do during those 15 minutes after?

It's actually just really changed how the flow of my day goes, because I have to somehow now manage these 40-minute sessions with 15 minutes where I could do something, but it's actually not nearly as effective as this new flow to the day.

So I think I'm still getting used to that, honestly.

>> Yeah.

I think it should be interesting also just for codebase understanding when you're encountering unfamiliar code.

>> Absolutely.

>> So you briefly, before we started, talked a little bit about the shopping model, which is the latest hottest thing.

Obviously, we're just recording this right after Black Friday, Cyber Monday.

First of all, any interesting findings from basically releasing shopping in ChatGPT right into that period?

>> Okay.

Well, I think the first thing is, I don't know why I would say in a meeting in August or so, like, "Oh, hey, Black Friday is coming up."

Maybe we could do a release by then.

In hindsight, like, wait, why would I say something like that?

Yes.

Now you own it.

>> Yeah.

Exactly.

I guess the most interesting thing to me is the new interruptibility and the qualitative experience of using it.

The same thing happens with Codex.

You write a prompt and you can press escape and say, "Oh, I messed something up."

We actually did the same thing in the shopping model.

So it shows you its chain of thought with what products it's looking at.

You can write it new messages saying, "Oh, I actually wanted this."

>> Getting here.

Yeah.

>> "I wanted USB-C on this," or whatever it is.

I think that's a really new interesting interaction paradigm that we have in a couple of our different services.

I'm excited to see how people use it and if they enjoy it.

>> Yeah.

Why did it have to be its own model and not just a new tool?

>> Stay tuned.

I think there's no reason that we couldn't do it in the same model eventually.

But I think if we want to try out new things, sometimes it makes sense to make a new model.

I think it just made sense to this time say, "Can we do a deep research style model for shopping where it's going to look really hard all across the Internet for different things?"

I think if you look at deep research, the original one, and GPT-5 thinking on high reasoning today, I think you'll see that eventually the models all converge in their capabilities.

>> Yeah.

Would you say that this is a discussion that also a little spicy that I've kicked off in the community?

There's still maybe 30 percent of the community is still using deep research.

A lot of them have moved over to just using five thinking as deep research.

Is that the spiritual successor?

Are they direct replacements?

Are there things that we lose in the original deep research model if we do that?

>> I think if you look at our published evals, they look basically on par if it's not better.

That's personally what I do.

I use thinking on high versus using the deep research model.

But I think as we've learned over the past few months, sometimes people prefer the quirks of one model over another.

So people like the deep research model, more power to them.

>> People like 4o?

Anything special in the 4o post-training that people are really responding to? Personality, is that a differentiator that people really care about?

It's a part of your job to care about personality.

>> Yeah.

Definitely people care quite a bit about personality.

I think over the past few months, we've been working a lot on giving users more choice over what personality they want.

>> Right.

Which is the toggles.

>> Yeah.

So now we have those toggles.

>> What's your favorite toggle?

>> Honestly, custom instruction for I personally want my model to be a tool.

So I don't necessarily want the warmth or anything.

I just want some answers because I'm mostly using it at work.

>> Yeah.

So I call this the Anton versus Clippy divide.

So Anton is the Silicon Valley HBO.

>> Okay.

>> It's a machine.

It only does work.

It doesn't try to be helpful or friendly or anything.

It tries to be helpful but doesn't try to be cheery or it's Clippy tries to be cheery.

I'm like, "Well, stop smiling at me.

I'm having a problem."

It's like, "Okay."

>> So it sounds like you also come down on the side of using it.

>> Anton.

I think a lot of developers want Anton.

They're just like, "It just quietly does its work and when it's done, it shuts up."

>> Yeah.

Well, I think we're doing a lot of work to provide both people Anton's and Clippy's and I hope they all like it.

>> Yeah.

So just generally, I was thinking about like, well, what can we update people on post-training?

What do we know today at NeurIPS 2025 that we didn't know at NeurIPS 2024?

I would say for a lot of people at the time, there was still this whole PPO versus DPO discussion.

That was a whole era.

>> Yeah.

>> Since then, we've moved on to RLVR.

I think a lot of agent-specific RL training.

I guess, am I missing any large chunks of the post-training debates that are going on?

>> Yeah.

So not necessarily internal debates, but my read personally from looking at different papers that are coming out is that when you look at an RLVR paper or an RLHF paper, they read more like an optimization paper.

To me, the interesting thing that's going on is we have this spectrum of how high quality a signal is.

So really at the end of the day, like RLHF, RLVR, they're both policy gradient methods, but what's different is just the input data.

It's always interesting to me that we call RLHF non-verifiable, because we've trained this model to be good at predicting human feedback.

So in some sense, that's like verification.

>> But obviously, it's human preference rather than truth.

>> Yeah.

But if your value of truth is, does the user like this more?

There's something strange in that I think we haven't looked at that axis of, okay, well, how clean is this signal?

How much do I trust it?

I totally agree that you don't necessarily trust the RLHF signal as much as, is this the solution to this polynomial?

But I think there's a whole spectrum of how high quality is a signal?

What's going to happen when I do a lot of optimization against it?

>> That's very different than I think worrying about the variance of different gradients, which I think is what you end up seeing in a lot of the papers that are currently coming out rather than being very data-centric.

They're pretty optimization-centric, even though I think the innovation really is where the data is coming from.

>> Yeah.

Before, I want to go broad before I go deep.

>> Yeah.

>> Any other discussions that maybe you're having at NeurIPS or around this time on post-training debates, like when you meet your peers at Anthropic and DeepMind, what do you talk about?

>> Well, Anthropic and DeepMind, we're all saying I'm working on stuff and things.

I think it's more so talking a lot more broadly with my friends there, or we're just talking about, man, the infra is so hard to keep up.

We're not necessarily talking too much about methods directly.

Because on one level, it doesn't matter.

>> Yeah.

I think also there's something that's very different about academic work where what really matters is how narrativeizable it is.

I think that's one of the reasons you see a lot of optimization papers come out is a lot of the data work, there's a less clear narrative around it.

I think the data and the scaling is actually more important than a specific.

>> Yeah.

But it doesn't have necessarily the same narrative that you get out of some of the papers that you see here.

There becomes more of a given a specific vertical, how do I understand that?

I wish there was actually more papers on it here, but I think it can sometimes be harder to wrap up into a clean story.

>> Yeah.

That's also something that we're actually having a lot of conversations about with other folks as well.

What's next?

What do you go from here?

Now that we have some roadmap.

I think what's interesting also for me is, I guess the innovations that are exposed by the Chinese models are maybe copies or discussions of what's going on in the labs.

I think obviously GRPO, you mentioned a lot of these RL optimizations.

They present themselves as optimizations.

GRPO came out in the DeepSeek math paper, which when it came out, I read it and I was like, okay, this is cool, it's a little bit cheaper.

But it does seem to have more broad impacts, I think on the industry as a whole than was initially appreciated.

I just want to, I don't feel like we've processed that enough.

>> Yeah, definitely.

I mean, as you said, it came out in the DeepSeek math paper and it's an interesting optimization method, but it's the more interesting thing that they have a new reward signal that we can really trust.

When you find the answer to a math problem, it's a lot less debatable than like, "Oh, well, is this thing that the human preferred actually what we want to do?"

You want to be right at math.

>> Yeah.

>> So I think in some ways it's underappreciated in, I'd say what's getting published.

Yeah.

>> Yeah.

Let's talk about, I guess, long horizon.

>> Yeah.

>> What do people consider in terms of very long horizon?

We're talking 30 hours more than a day of autonomy.

Is it just more of the same or is there anything qualitatively different?

>> Okay.

So first off, what I would first say is I tend to think more in terms of actual number of tokens than time because I think, yeah, the human in the loop can take a while.

>> Yeah.

Well, and also it gives you a different measure to optimize against.

Like as I was saying earlier, when I use Codex, it does something that would take me much longer; it does in 10 minutes what would take me like four hours.

What we can actually push on there is token efficiency.

>> Yeah.

>> That's a huge, huge research area.

>> Yeah.

So you can see from 5 to 5.1, our overall evals, we bumped some.

But if you look at a 2D plot of how many tokens it takes for us to get that, it went way down.

So I think that's like a different factor when you had that.

I know it's such a great chart.

>> Dude, I live by those charts.

Like that, I went there as your chart.

>> Okay.

>> Not necessarily that, but like that shape of chart.

I think that's something that we think about a lot just because it contributes so much to your experience.

Like how long does it take to do this task?

>> Yeah.

>> I think the other thing is as you're pushing that token efficiency, it changes how many tool calls can I make and like how many different things can the agent do in a reasonable number of tokens that we can actually serve.

>> Yeah.

>> So I personally think in terms of tokens.

>> I think the interesting thing or the hard to understand thing from the outside is having explicit router in GPT-5, but then also basically having an implicit router in terms of the thinking, the spending thing, it conflates a little bit.

Like at some point, you do need to merge them or else you're just going to get these weird bumps where sometimes the router at the top decides something and it's wrong.

Actually, if you just handed it to GPT-5, you would have figured it out.

>> Yeah.

I think we'll figure out the correct abstractions over time.

I think there's a- >> Is the intention still to merge?

Because that's what it was said in the paper.

>> Yeah.

I think eventually we'll have AGI and you're not going to have to worry too much about how hard to think directly.

It'll just, we'll have one tool that you always go to and it knows how long to think for and things like that.

I think the abstractions and the way that we drive these things today, it'll change.

I think even the amount that we've changed from having a non-thinking model that you can choose between two and like now we can route and how hard you want to think.

We're adding lots of knobs and eventually it'll probably simplify.

>> Yeah.

Another super interesting knob that everyone is doing is context compaction or memory compaction.

What's going on there?

>> Nothing to share.

>> Nothing to share.

Clearly an important feature, clearly inspired by Codex usage as well, obviously.

But I think from the engineer's point of view, it feels like I used to do that as part of my harness and now it's the model doing it for me.

I don't know how to think about that in terms of, I guess I'm used to having more control and now I have less.

>> Yeah.

Is there a specific question?

>> There's a specific question.

I'm just getting feedback on like, well, is this a trend that we need?

It's basically a permanent fact of life from here on out.

>> Oh, I see.

I don't know.

I worked on long contexts.

That was why I was on last was for 4.1 where we, I think 10X the context window for 4.1.

So there'll always be some dance of like, well, if we want to push as much as what we can do, not only should we increase the length of the context window, but we should also have strategies for keeping that context window available for as long as possible.

I'm guessing that both things will sort of happen just because we want to put as much power into the models as possible.

>> Yeah.

>> Yeah.

I think we're still in a period where we should all be expecting changes in the interfaces that all of the models give to us.

That way we can improve the models.

If we lock the interface, I think what would be sad from my perspective is if we lock the interface, if we discover something new about models, we might sort of trap that improvement under an interface that needs to change.

>> Got it.

Talking about long contexts as well, there is some discussion about I guess context rot or like the utilization of the context.

Even if you gave us like a million token context, probably wouldn't use all of it.

What's the recommendation there?

Where are things going?

Are we going to have, I guess, perfect context by next year?

Is that an impossible dream?

I don't know.

>> No, it's not an impossible dream.

>> I think I'll give a shout out to some of the evals that we did for 4.1, called Graphwalks.

>> I love Graphwalks.

We covered this in the podcast.

>> Yeah, we did.

I think if you look over time, all of those evals are still climbing.

I think one of the interesting things about that is you have to do complicated transformations across the entire context window.

That's sort of the issue with those heat map plots of the different models.

>> Needle in a haystack.

>> Yeah, but the problem is if you only have to sample from one point in the context window, it's sort of easy.

Whereas with those Graphwalks problems, you're having to do multiple transformations across the entire context window.

I think keep watching those.

I think they've been climbing.

They'll continue to climb.

I would say that that's definitely a temporary issue that we are climbing on over time.

>> Yeah.

Then is 10 million tokens realistic?

Is 100 million?

Is there a natural end or there's no end?

We just are going as far as the eye can see.

>> Oh gosh.

I don't know.

What do you think?

Yeah.

>> I feel like, okay, there are use cases that require billions.

There are use cases that require many, many billions, maybe trillions.

>> Yeah.

Out of curiosity, what would be billions of tokens?

>> We just had a context engineering discussion about a knowledge base of support issues for a company.

It was 100,000 documents, totaling about 8 billion tokens.

You can't stick that in the context window for now.

>> That's fair.

I guess that, I would still say I don't know, but I think I've been really surprised.

It reminds me of when I was doing more information retrieval stuff and like BM25 and these very simple n-gram indexes were just super hard to beat.

I think the agents with grep, they feel really similar to me where it's just unreasonably effective.

>> Yeah.

>> So then I will not use your 10 million token context window, even if you gave it.

>> Maybe, but what if we're using that context window, in service of some larger goal that just has a lot of sub-search calls?

Which is why I'm saying I just don't know.

I think that's what makes it so exciting.

>> Yeah.

I would say also that other modalities like video would eat up a lot.

Then obviously the hard sciences have proteins and all that, which a lot of information is just encoded in physics.

>> I have mixed feelings about it just because I'm like, "Well, this will never scale, not with full attention, and we probably just need to invest in systems anyway," which means we're good with what we have.

Get your Graphwalks up.

But I don't know if we need 10, 100X when actually maybe we need to figure out ways to 1,000, 1 million X.

These are just different slopes.

>> I'm glad that you're happy with the current context windows.

I think my dream would be to push it and see what happens anyway.

But engineers, the engineer's incentive is always to say, "Well, the systems matter more than the models."

The researcher's incentive is to say, "Well, screw your systems, we'll just push the models."

>> Oh, no.

It is so differently.

Yeah.

I think that's one of the most beautiful things about post-training at OpenAI is everyone.

>> Co-design.

>> Yeah.

It's also co-design.

I spend a lot of time just doing our systems stuff and I also do lots of stuff, like where I'm making Graphwalks and I'm doing a lot more things on the learning side.

I think it's a great culture to have a place where people just move seamlessly between the two.

>> Yeah.

What are you guys hiring for?

Presumably, you're hiring.

What are you guys hiring for that is hard to hire?

What is the skill set that is like, "We really need this, can't find it, please everyone go skill up on this."

>> This is definitely my personal opinion here: I think we're still having trouble, not at OpenAI, but I think as a whole, producing lots of people that want to do lots of both systems work and ML work.

I think if you're trying to push the frontier, you don't know which place is currently bottlenecking the frontier and it changes all the time.

I mean, even within one project, it might change multiple times where the current bottleneck is.

But I think the education system we have right now isn't really optimized for that.

I personally, I studied math and then I was very, very lucky to have some great mentors after school that taught me to be a good software engineer.

But it seems like if we're going to be in this place for a while, and I think we will be, we should probably be producing more students that are great at doing both distributed systems and a lot of core engineering, as well as the statistics and other things that are required to be a good machine learning researcher.

>> If we were to throw Codex at it, obviously, we can't do Codex at everything, that's why it's still, let's say which will progress faster, which is more solvable by LLMs?

>> That is a spicy question.

>> You can't say they're both equally hard.

I don't know.

Maybe they are.

I mean, they're differently hard.

I think one is more hillclimbable than the other, which is it because then we can go do it.

>> Okay.

I think one thing that's slightly simpler about some of the ML research, like ML research is also distributed systems to be clear.

But some of the things that I would say get traditionally called ML research, are things that you can treat a bit more of as a black box, whereas the environment to train on building these different systems is actually just a complicated engineering problem.

So theoretically, I would say that they're probably roughly equal.

But I think there's some amount of effort, I feel like to making the environments for it.

>> But they're requiring GPUs in themselves as well.

>> Yeah.

I guess they both would, but that would be my guess.

But I don't have much confidence in it.

>> So a lot of people are building these AI scientists.

That automated research, you guys have your benchmark, PaperBench.

That's the one area that, for example, at Cognition we've just decided to not do, because it's so hard.

Any other people on a post-training team that you're going to shout out, have done interesting work this year.

They should get more attention, but they're not getting credit.

>> Well, okay.

For sure, everyone on the shopping team that I was just working with, so like Andrew Hoyle, Manuka Strata, John Holman, all great people.

Yeah, Isa Fulford, obviously the manager for it.

>> Then she was the original deep research.

>> Deep research.

Yeah.

>> Yeah.

>> Person.

Yeah.

>> There's like three of them.

>> Yeah.

>> So definitely that part of the team.

But I mean, everyone is so great.

I think it's hard to think about a list.

It's a really fun time on post-training right now.

It's exciting every day.

Yeah, it feels like we're all enjoying our diet cokes together in the office late at night.

>> Yeah.

I did want to squeeze this in before we end.

Nobody is actually seriously saying that pre-training is dead.

It's just a meme.

There's a lot of work going on in pre-training.

In fact, actually a lot of my researcher friends are saying too much money is going to post-training.

That's also spicy.

I don't know.

One of the charts I hold in memory from this year is the Grok 4 chart.

I don't know if you've seen it.

But it's basically saying, well, we scaled pre-training to here and about this level of compute.

Now we're spending the same level of compute on post-training as well.

That's very controversial, I guess, to me because we're all used to post-training taking orders of magnitude less data, compute, whatever.

Obviously, we're scaling that up now.

Do we get to a point where they're equal?

I don't know.

But it's a topic for conversation.

I think how much do we invest in this versus more different pre-training?

Yeah.

First off, neither one of those is dead.

I think it's really interesting to be living through something that all of my other historical or technological revolutions are things that I read about in history books.

This one's live as this happened.

Yeah, this one's live.

We don't know the end yet.

There's this almost fog of war where I'm like, oh, did people think that we got the steam engine and they would have the factories?

I don't know if you know this, but the factories, they used to be very linear because you had to drive one motor across an entire room.

It made it so when electricity got developed, they just tried to do the same thing.

They're like, "Oh, this isn't all that useful."

It took, I think, a couple of decades before they realized, "Wait, if we have electricity, we can move the little stations in whatever is most ergonomic."

Then manufacturing was transformed by electricity.

I think it really gives me no confidence in being like, "Oh, this thing is dead."

Yeah, our timelines are so short.

Yeah.

But basically, the way good ideas get experimented and funded and propagated, actually, they're still a human timeline.

It's not an AI timeline.

Yeah.

I think things will maybe be dormant, but it'll be spiky.

There'll be also some whoop.

Yeah.

Then we'll all feel different.

It's like, "What's the meme?

It's so over.

We're so back."

It's going to be that many times.

I think having some emotional stabilizing to it is probably going to be good for everyone's sanity.

Yeah.

More sanity.

Well, thank you so much for joining.

Thanks for all the great post-training this year.

Yeah, thank you.

Yeah.

Continue giving feedback.

I love to hear what you think.

Yeah.

Awesome.

Yeah.