The Implications of Large Language Models (LLMs) Hitting the Wall

Recently, Sam Altman said, “The Age of Giant AI Models is Over.”

What he meant by that was, “our strategy to improve AI models by making them much bigger is providing diminishing returns.”

So, I thought it would be interesting to explore what happens if LLMs hit the wall and improvements dramatically slow.

Approximate Transcript:
Hi, this video is about large language models (LLMs) hitting the wall and the implications of that. In case you haven’t heard, I shot a separate video about this, but Sam Altman recently stated that the age of giant models is over, which I think is a bit misleading. Basically, what he was saying is that you can’t improve any more just by adding more data and more parameters. And this makes sense; it’s something some people predicted was coming, because GPT-4 already captures so much of the available data. OpenAI didn’t release the numbers, but GPT-2 had 1.5 billion parameters, which is sort of like the number of neurons, or the number of different factors the model considers, and GPT-3 had 175 billion. We don’t know how many GPT-4 has, since they didn’t release that, but estimates are that it’s a big leap over GPT-3. It’s also possible they’re kind of out of data. Now, more data is being created every day, so it’s not that they’re out of data completely, but perhaps there isn’t enough for another exponential leap. I also think he implied, and this makes sense, that more data isn’t necessarily better; getting more data doesn’t necessarily give you a better answer. I elaborate on that in my other recent video. So let’s assume, for the sake of argument, that large language models, OpenAI included, hit a huge wall: they’re maybe not unable to move forward, but their progress has slowed dramatically, and we don’t see anything like what people think GPT-5 should be for five or ten years, because maybe another technological development needs to happen first. So what comes about because of this? Let’s look at the good.
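As a rough sanity check on the scaling claim above, the public parameter counts for GPT-2 and GPT-3 work out to more than a 100x jump between generations (the GPT-4 count was never released, so no figure is assumed for it here):

```python
# Back-of-the-envelope scaling between GPT generations.
# GPT-2 and GPT-3 figures are the publicly stated parameter counts.
GPT2_PARAMS = 1.5e9    # 1.5 billion parameters
GPT3_PARAMS = 175e9    # 175 billion parameters

scale_factor = GPT3_PARAMS / GPT2_PARAMS
print(f"GPT-2 -> GPT-3 scale-up: ~{scale_factor:.0f}x")  # ~117x
```

A comparable jump again for GPT-5 would require far more (high-quality) training data than seems to exist, which is the “out of data” point above.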
I think probably the biggest benefit is that the world gets to catch up mentally, especially when it comes to misinformation being spread: identifying it and helping people adjust to the new reality we find ourselves in right now, in 2023. That’s probably the only good thing I can think of, except maybe that the pause some people were in favor of just kind of happens naturally. I personally don’t think a pause is a good idea. And there are three dots here because I don’t really see a whole lot of good coming from this. I’m sure there are plenty of people who would celebrate if this is the case; I will not be one of them. Now the bad. Here’s what I would say: good tech is slowed down. There are a lot of really good use cases coming out of these AI models that can dramatically improve people’s lives. Maybe in some cases a slowdown doesn’t affect that, but in some cases it likely will. Just to give an example, there’s a bunch of work in health care, saving lives and curing diseases, where AI has already shown itself to be quite proficient and is moving forward rapidly. Perhaps that slows down; to me, that’s bad. I think there’s also an argument to be made that this could actually be better for bad actors. The reason is that I think OpenAI, moving forward, would actually help tamp down bad AI models. They have demonstrated to me pretty thoroughly that they have good intentions, and if there were a bad model out there, GPT-4 or GPT-5 could help identify it and fight back against it, and they would work on that. So I think this actually opens the door for bad actors, and it’ll make sense when I get to the last bullet point. Now let’s look at how good GPT-4 is right now.
And I would say it’s really freakin’ good. I was testing it the other day: it’s supposed to be bad at math, but it actually did a pretty good job, showed its work, and got it right. Not a super complicated problem, but more complicated than the ones other people were saying it got wrong. And I need to add hallucinations here too. So there are still some things it struggles with: math, as mentioned, recent events, and hallucinations. If you have other examples, put them in the comments below. But it doesn’t struggle with a whole lot; it does a whole lot really, really well. I think GPT-4, just as it is now, is at a point that’s pretty profound. Now, Sam Altman did state that there are other ways they’re looking to improve it, and I believe them, but maybe it’s just slower. Let’s assume for the sake of this argument that it’s slower: more minor updates that accumulate over years into a bigger change, which is kind of what they said. They did say that a lot of their improvements were a bunch of little ones that all worked together, where the whole is greater than the sum of the parts. How much can it really improve? Well, the ChatGPT plugins actually have a lot of potential to shore up the weaknesses.
Specifically, I shot a video on Wolfram Alpha and math; those two work well together, and it worked pretty well, so that shores up a huge weakness. For recent events, there are ways around this: connecting it to the internet to some degree, or pulling information from the internet into your own database, which is very recent. I do think that will be helpful. I’m not sure about the hallucinations, though; plugins are probably not going to help much there. This is probably one of their biggest challenges, and it’s a real issue that reduces the value of GPT-4. I mentioned that they’re working on it pretty hard, and I’m optimistic they’ll be able to solve it, but who knows, it might take five years before it rarely, if ever, hallucinates. Longer context windows: they increased the context window from GPT-3 to GPT-4 by quite a bit, at least at the maximum, 32,000 tokens versus, I believe, 4,000 tokens. That’s an 8x increase, nearly an order of magnitude, which is pretty substantial. But I don’t think this is really necessary. I know some people say that with even more than 32,000 tokens, which is already roughly 50 pages of content, you could just put a whole book in there. But the problem is that more data is not necessarily better. I think you get diminishing returns, and you kind of water down the things you want to see if you have these huge context windows and dump massive amounts of data in. So I don’t necessarily think this is a big improvement; the context window right now is quite large, and there’s always going to be a limit to it.
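The window figures above can be sanity-checked with a bit of arithmetic; the words-per-token and words-per-page ratios below are common rules of thumb, not official numbers:

```python
# Rough context-window arithmetic using the figures from the discussion.
OLD_WINDOW = 4_000     # tokens (GPT-3-era window, as stated above)
NEW_WINDOW = 32_000    # tokens (GPT-4's largest advertised window)
WORDS_PER_TOKEN = 0.75  # rule of thumb: ~3/4 of a word per token
WORDS_PER_PAGE = 500    # rule of thumb for a typical page of text

increase = NEW_WINDOW / OLD_WINDOW
pages = NEW_WINDOW * WORDS_PER_TOKEN / WORDS_PER_PAGE
print(f"Increase: {increase:.0f}x")           # 8x
print(f"Roughly {pages:.0f} pages of text")   # ~48 pages
```

So even the current 32,000-token window already holds on the order of a short book’s worth of text, which is why yet-larger windows mostly add watered-down context rather than better answers.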
And so this is a problem that developers and users are going to have to deal with; it is being worked on, and I think there are solutions for it. Recent data is something a plugin could help with significantly. They seem very resistant to adding a new dataset, maybe because they spent so much time, money, and energy training GPT-4 on the dataset they had, and recreating that dataset and retraining might cost them hundreds of millions of dollars. So it’s possible they’ll just look to band-aid it with plugins. That’s still a band-aid to me, but Bing does have a way to use current search results, which I think is helpful, and I believe GPT-4 also has something where it can actually go to a website and use it as a reference, which is really helpful. Again, a bit of a band-aid. But I don’t think this is a huge issue, because just knowing the limitation means you can use GPT-4 just fine for almost all use cases, barring the biggest issue to me, which is the hallucinations. Alright, so the biggest area of opportunity for AI: even if GPT-4 is the exact same level of quality five years from now, there’s still a crapload of opportunity. I wouldn’t say it’s necessarily business opportunity so much as human opportunity, although the context of business makes a lot of sense. It’s what are called narrow AI models: models made for specific situations. I see a lot of models out there that are broad, general models trying to be AGI, trying to be a generalized intelligence, and that’s great work.
And that’s really, really helpful. But I think there’s just so much value you can get by narrowing the focus of an AI model to a specific use case, or a set of use cases that target a specific market. It’s way less expensive to train, and you can get higher-quality results with far fewer parameters. You could even consider taking these narrow models and scaling them up to large-language-model size for the quality you might get from that, though I’m not sure you really need to in a lot of cases. But these narrow models are going to shine over the next two or three years, I think, regardless of what happens with OpenAI and GPT-4 or GPT-5.
I mean, there are so many use cases still to be developed. Think about how many different pieces of software are being used right now; every single one represents a possible narrow AI model, and potentially more, because there are so many different use cases. Especially with the low-cost base models being published right now, open-source models that cost $500 or $600 to build and train, maybe even less, and are still really good as general models: you take one of those, train it a bit more for your narrow, specific case, and boom, you have a very powerful model for a very specific use case. I think this is where a lot of AI investment should go, and I think the people who do that are going to be rewarded greatly. I plan on doing that myself; more on that another time. So thank you for watching. If you liked this, please like and subscribe for more videos, and have a great day. Bye.