An AI version of Popehat's Law of Goats applies. If an AI attached to a nuclear command center issues an order to launch the nukes, somebody (likely including us) is fucked. It doesn't matter if the AI "knew" anything, if it deliberated over its target selections, if it assessed and prioritized the most likely threats. Something happened and now we have nukes in the air.
If an AI reliably produces good code in response to prompts, it is a better programmer than most humans. If an AI reliably produces prose that's free of errors, well organized, and summarizes the issues asked for, it is a good writer, irrespective of what we can say it "knows" or doesn't know.
I used to think this, but now I'm fairly convinced that it "knows" somewhat less than a person would know if they had spent their whole life locked in a tiny dark room with no input except a lot of text from the internet to read. I don't believe it has a sense of self or consciousness, just that it possesses whatever knowledge is embedded in written text. Maybe a better analogy would be if you could cut out someone's language centre and put it in a jar hooked up to text input and output. It's not a whole mind, but it sure feels like a piece of a mind that can do mind-like stuff.
You can compare GPT-4's limitations to Helen Keller's. Someone who is deaf and blind can still reason as well as someone with "all inputs enabled". Helen Keller still had a "full mind."
That's why I included the part about being locked in a small room with only text for input. People who are deaf and blind still interact with the world through their other senses. GPT-4 has no other senses and no body.
Are you talking about the difference between memory and reasoning? It's a bit hard to tell what you mean by "knowing the correct thing" vs. "knowing things"; either way you know things, correct or not.
Still not sure what you're talking about that's different from what GPT can do. It's very good at transferring from one context to another while retaining the same intent or meaning. Could you give an example of something you think it can't do?
I'm using it right now to help write a game I had an idea for. I'm writing in a programming language that isn't the one I use daily, and I'm using a graphics library that I've only used once before to make a small game, and GPT has been a massive help with this. It's helped me solve some tricky problems, like getting a shader to work the way I wanted it to, and I've used it to create first drafts of all of the code so far. I guess that's not pure innovation, but it sure as hell has a better grasp on a lot of the stuff it's writing than I did at first.

It can't just look at a picture and produce the exact game you want, but neither could I. I'd have to ask you a bunch of questions about what you wanted the gameplay to be like and whether you wanted to release it for PC, console, or both; I'd have to get an artist to create a whole bunch of concept art and then ask you to approve the ones you like; then I'd need to implement all the code, play test it with you, make changes, etc. It's a bit unfair that you want this tool to do more than a single person could just to prove that it "knows something". Just because it isn't 100% autonomous doesn't mean it has 0 knowledge or ability.
Read the Microsoft Research paper on GPT-4. Some extremely intelligent behavior emerged from just text that has nothing to do with the text it was trained on.
And when something gives the increasingly-accurate illusion of knowing, I fail to see how it matters (with regard to impact on society and overall utility).
I’m not saying GPT-4 is this amazingly accurate, near perfect model. But if you extend the timeline a bit, it’ll be able to become more and more accurate across a broader range of domains.
Furthermore, how can we prove a human “knows” something?
Ask anyone who has hired someone who says all the right things and seems intelligent, but has no experience or skills in what they actually talk about.
When I write code, I don’t just focus on solving the problem at hand; I think about things like: how will another human interpret this, how maintainable will this be, what are the pitfalls down the line, what are the consequences, any side effects, performance implications, costs, etc… things GPT does not know.
And your point about humans lying about knowledge only to be found inexperienced is quite the opposite of an LLM (albeit there is the hallucination problem, but GPT-4 is a massive improvement there).
These models do have “experience”, aka their training data, and I would argue with almost every one of your examples of things that GPT doesn’t know.
You can ask it about performance implications, side effects, costs. It’s quite good at all that right now even! Imagine the future just a few years out.
When asked about performance implications, it gives fairly shallow, generic explanations; it doesn’t do true “deep dives”, it just recombines explanations from its training data.
There is no “getting better” from this. If you gave a monkey a typewriter and it occasionally typed words randomly, you wouldn’t say “Wow, this is just what it can do now, imagine several years out!”
Continue asking it to provide details and it can. Or, prior to asking it about performance, ask it to respond with as much detail as it can and have it include details you specifically want to see.
Comparing GPT-4 to a monkey with a typewriter, and claiming the absolute of “there’s no getting better from this”, when we’ve literally seen dramatic progress in just months?
I think you’re missing out on some of the utility this stuff can actually provide.
No, you see, it needs to do these things on its own, unprompted. It has to consider multiple solutions to problems it encounters and choose the best one, not just the most probable one. It’s not made to evaluate things that way: you can’t hand it multiple implementations and ask it to weigh the pros and cons of the different approaches and recommend the best one for what you’re trying to do. You can’t hand it your code for review and ask what you could improve and expect a response that isn’t just fabricated from what other people have said in code reviews.
And it will never do those things, because it’s an LLM and there are limits to what LLMs can do. There is no “getting better”, it will only sound better.
If it’s going to replace programming, the prompts simply cannot be more laborious than writing the damn code yourself in the first place.
Think 50 LLMs with different personalities and focus points talking to each other, mixed with stuff like Wolfram. You can instruct them to “invoke” tools. An outside system parses their “tool use” and injects results. You can get quite crazy with this.
LLMs are just one part of a much larger looping system that can do these things you speak of: be active, seek stuff out. Of course, it’s all illusory, but I’m sorry, I think it’s no different with myself.
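To make the “outside system” idea concrete, here’s a minimal sketch of that kind of loop in Python. Everything in it is hypothetical: the call_llm function, the TOOL[name](arguments) convention, and the toy calculator tool are stand-ins rather than any particular product’s API; the point is just how a loop can parse a model’s “tool use” out of its reply and inject the result back into the conversation.

    import re

    # Toy tool table (hypothetical, for illustration only).
    TOOLS = {
        "calculator": lambda expr: str(eval(expr, {"__builtins__": {}})),  # toy only; never eval untrusted input for real
    }

    # The convention we tell the model to use: TOOL[name](arguments)
    TOOL_PATTERN = re.compile(r"TOOL\[(\w+)\]\((.*?)\)")

    def agent_loop(call_llm, user_prompt, max_turns=5):
        """call_llm is whatever chat-completion function you plug in: it takes a list of
        {"role": ..., "content": ...} messages and returns the model's reply as text."""
        messages = [
            {"role": "system", "content": "You may invoke tools by writing TOOL[name](arguments)."},
            {"role": "user", "content": user_prompt},
        ]
        reply = ""
        for _ in range(max_turns):
            reply = call_llm(messages)
            messages.append({"role": "assistant", "content": reply})
            match = TOOL_PATTERN.search(reply)
            if match is None:
                break  # no tool requested; treat the reply as the final answer
            name, args = match.groups()
            result = TOOLS.get(name, lambda _a: "unknown tool")(args)
            # The "outside system" part: inject the tool's output so the model can keep going with it.
            messages.append({"role": "user", "content": f"TOOL RESULT[{name}]: {result}"})
        return reply

Run several of these loops with different system prompts (“personalities”) and let them pass messages to each other, and you’re most of the way to the 50-LLMs-plus-Wolfram setup described above.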
By the way, it actually gives OK reviews on novel code, so I’m not sure what you mean. At some point nothing is truly novel; even innovation is composing existing “patterns” (at whatever abstraction level).
> There is no “getting better” from this. If you gave a monkey a typewriter and it occasionally typed words randomly, you wouldn’t say “Wow, this is just what it can do now, imagine several years out!”
So thinking that ChatGPT could gain understanding is as crazy as the idea that primates could learn to use tools or type words?
If you gave an idiot something very intelligent to say and he read it out loud perfectly, people might be very impressed too. That’s GPT.