“The Age of Giant AI Models is Already Over” says Sam Altman, CEO of OpenAI

This statement by Sam Altman is provocative…

…there seems to be an implication that giant AI models are no longer useful…

…but this is not what Sam means.

Approximate Transcript:

Hi, this video is about something that sounds really profound that Sam Altman, the OpenAI CEO, said recently: that the age of giant AI models is already over. I think this statement, taken out of context, is a bit misleading. I clicked on a smaller headline that made it seem even more salacious, as if he were saying that ChatGPT is done, like it's not good anymore. That's not what he's saying, but that was my first reading of it.
It's like, oh, we're not going to use them anymore. No, they're going to keep using the large language models. What he really means is that they can't keep improving them just by making them bigger. That's the short answer. There's a little more context I want to add, though, which is that this has been OpenAI's philosophy from the beginning.

Andrej Karpathy, very famous in the AI world, I believe he was the head of AI at Tesla, and I think he's at OpenAI now. I've watched several of his videos, and one of the things he talked about was that, number one, the code for these AI models, basically since 2017 when Google released their transformers paper, is very short and really hasn't changed a whole lot. It's something like 500 lines, which for code is very, very small. And the strategy, the way to improve it, I believe it was him who said this, is just to keep making it bigger: add more parameters. Parameters are sort of like neurons.

To give context, they show it here in this article: GPT-2 had 1.5 billion parameters. (Funny tagline in the article, by the way, "generated by artificial intelligence"; I wonder if that's an AI movie or a series about AI. Anyway.) So, 1.5 billion, and then GPT-3 had 175 billion parameters, and it was way, way better; that was a large reason for the improvement. And then GPT-4, they didn't announce how many parameters it has, but it's supposed to be much bigger.

So what he's saying is that adding more parameters, or neurons, isn't going to keep improving the model. There are diminishing returns in that area, and past some point it's just not going to give you more. Another way of looking at this is that more data doesn't necessarily improve the quality of the model either. Just in general, from a data-analysis standpoint, more data isn't always better and doesn't always improve things.

A real quick aside, if you're thinking, "Why should I believe you about data?": for basically the last 20 years I've done data work from a theoretical and a practical standpoint. I have a master's degree in Industrial Engineering, which is actually closer to data science than it is to engineering, with lots of statistics and analysis of huge, weird datasets. Then I worked at a semiconductor factory for about six years, where there's a lot of complicated data, spreadsheets with tens of thousands of rows and dozens of columns. And for the last 11 years I've done SEO, which is another kind of practical data analysis, very different from semiconductors, but still data. It's been my jam for a very long time. And it makes sense: sometimes more data doesn't add a clearer picture of the situation.

So this actually shouldn't come as a surprise, even though the headline is kind of a "whoa." It's been talked about for a while that, number one, they're going to run out of data to crawl. I'll come back to why that's not entirely accurate in a second.
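First, though, a quick sketch to make the diminishing-returns point concrete. This is a minimal illustration assuming the kind of power-law relationship between parameter count and loss that the scaling-laws literature describes; the constants, and the idea of plugging in GPT-like sizes, are my own example, not OpenAI's numbers.

```python
# Illustrative only: treat validation loss as a power law in parameter
# count, in the spirit of published scaling laws. The constants below
# are made up for the example; they are not OpenAI's real numbers.

def loss(n_params: float, a: float = 100.0, alpha: float = 0.076) -> float:
    """Hypothetical loss for a model with n_params parameters."""
    return a * n_params ** -alpha

previous = None
for n in [1.5e9, 15e9, 150e9, 1.5e12, 15e12]:  # GPT-2-ish size and up
    current = loss(n)
    gain = "" if previous is None else f"  (gain {previous - current:.2f})"
    print(f"{n:10.1e} params -> loss {current:.2f}{gain}")
    previous = current

# Every 10x in size buys a smaller absolute improvement than the last
# 10x did: that's the diminishing-returns curve Altman is pointing at.
```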
So, about running out of data: it's not entirely accurate, because more data is being created every day, and the rate at which new data appears is itself increasing over time. But it certainly hasn't been increasing at the rate at which they've grown their models. And additionally, more data doesn't necessarily help clarify the situation.

I think I've got a reasonable analogy. Imagine you're trying to draw a 3D picture, and you can only do it with dots. You start with a handful of dots, and you can just make out the outline of, say, a guy on a motorcycle, so you kind of know what it is. Then you put in a bunch more dots and you get a lot more clarity: you can see his facial expression, and you can see that he's got a bandage on his leg or whatever. Then you put in more dots and you get a very clear picture. Now, when you add even more dots to the picture, there's no additional clarity, or the clarity that's added is very minor. I think this metaphor works for how they're dealing with the data and the parameters of GPT-4 and beyond. It does a lot of things really well right now, and adding more data doesn't necessarily improve that; it can actually set it back.

Also, as you grow the dataset, each new addition is a smaller percentage of the total. Keep adding more and more and you get close to a kind of baseline, where each addition is a drop in the ocean. There's only so big it can get, and only so many conclusions that can really be drawn from the data at some point.

But there's a flip side, which is that sometimes more data can actually be bad. It's not just about raw data; it's also about the right data, and about processing and interpreting the data. So you could potentially have a smaller model that's better than GPT-4. That is definitely possible, and I think they'll get there.

So what does he actually say? He says they'll make the models better in other ways. And this shouldn't come as a surprise if you've been listening to him. I recommend the Lex Fridman interview, two and a half hours with Sam Altman, riveting to me and hopefully to you as well, where he alluded to this already. And there's been a lot of talk about how they're going to run out of data somewhere around GPT-4's time. So this isn't surprising.

But there is a big implication here. They've been making the model better in other ways than adding data, but the main thrust of the improvement was coming from more data. So this might actually substantially slow down the development of these AI models, because now they're going to have to find new ways to improve them, and it might take another five or ten years to find that new way. Or maybe GPT-4, which is pretty excellent by the way, can only get minor improvements for quite some time.
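To put rough numbers on that drop-in-the-ocean point, here's a toy calculation; it's my own illustration, not anything from the video or the article.

```python
# Toy arithmetic: adding a fixed slice of new data to an ever-larger
# corpus matters less and less as a share of the total.

corpus = 1.0         # arbitrary units of training data
new_per_step = 1.0   # fixed amount of fresh data added each step

for step in range(1, 6):
    corpus += new_per_step
    share = new_per_step / corpus
    print(f"step {step}: new data is {share:.1%} of the total corpus")

# Prints 50.0%, 33.3%, 25.0%, 20.0%, 16.7%: the same addition keeps
# shrinking as a share of everything the model has already seen.
```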
It is a little bit disconcerting to think that maybe they're actually at a wall right now, like it's already here. That's possibly why he's saying this: he may be alluding to what's happening inside the company, that they're realizing, holy shit, this thing isn't improving anymore, or it's improving very marginally for a huge cost. He's said that building GPT-4 cost over $100 million, and if you listen to how he says it, it sounds like well over $100 million. So the idea of building a GPT-5 that's that much bigger could cost over a billion dollars, maybe more, if they could even do it. And they might not even be able to do it right now.

This also implies, to some degree, that when people talk about AGI and the speed of change recently, that change might actually slow down quite a bit, and that we might still be quite far out from a superintelligent AI. In general, certain types of broad, can-do-all-things AI models may be off limits for quite some time.

The good news is that if you're looking to be in the AI space, there's still a lot of opportunity even without this, by building what I think people are calling narrow models. The use case for them is narrowed down, which means you need way less data to get a good result, because the situations are much leaner, much smaller, and much more controllable. In that way, there's still a lot of room to grow. If you had a GPT-4-sized model for one specific thing, let's say an AI surgeon, I don't know, I'm just putting that out there, then that could probably be really, really freaking amazing, way better than GPT-4 is at any one specific thing. There's a little sketch of what that narrow-model idea looks like in code below.

So the conclusion is basically that even if OpenAI's development stalls and we don't see GPT-5 for, like, seven years, that doesn't mean the AI space is stuck or that there's nothing more that can be done. It implies more that some of the big, ambitious, broad, super-AGI-type things might actually be further away, because we might need a new technological development. We might need something new to come along: something that's not a transformer, or maybe a next-level transformer, or maybe another piece of technology that connects into the transformer and supports it and amplifies it, something like that. There are a lot of different possibilities. But the problem is that they don't know what that is yet. So this strategy has kind of come to an end. That's what he's saying here.
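Circling back to the narrow-model idea, here's a minimal sketch of what that can look like in practice: fine-tuning a small pretrained model on one specific task. I'm using Hugging Face's transformers and datasets libraries, with DistilBERT and the IMDB sentiment dataset purely as illustrative choices; none of this is from the video.

```python
# Minimal sketch of a "narrow model": a small pretrained transformer
# fine-tuned on one specific task (here, sentiment classification).
# Model and dataset choices are illustrative, not from the video.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2)  # ~66M parameters, not 100B+

def tokenize(batch):
    # Truncate long reviews so every example fits the model's context.
    return tokenizer(batch["text"], truncation=True, padding="max_length")

# A narrow task needs far less data: 2,000 examples, not the whole web.
train = (load_dataset("imdb")["train"]
         .shuffle(seed=42)
         .select(range(2000))
         .map(tokenize, batched=True))

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="narrow-model", num_train_epochs=1,
                           per_device_train_batch_size=8),
    train_dataset=train,
)
trainer.train()
```

The point isn't the specific libraries; it's that a model with tens of millions of parameters plus a couple of thousand task-specific examples can be genuinely useful on one narrow job, which is a very different scaling story from "make the general model bigger."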
And when he says it's come to an end, that's probably being taken out of context too. What it really should say is that the strategy of building bigger and bigger models is over for him, for OpenAI. Maybe not for other companies, but that was the strategy that got them to where they are today.

Anyway, thank you for watching. Let me know in the comments whether you agree with me or disagree with me; I'll put a link to this in the comments. Like this video if you enjoyed it, and subscribe for more awesome AI videos. Thanks. Have a great day. Bye.