"Chat GPT told me that it *can't* alter its data set but it did say it could simulate what it would be like if it altered it's data set"
-
@futurebird @cxxvii Right. I make this differentiation because A. I want to be clear that no matter how they might advertise that this or that method is more accurate, it will always fail due to the underlying issue, and B. some people think the tech just isn't fully developed, when its underlying mechanism can NEVER improve without changing to something else entirely.
(Well, as a side note, many do actually make an effort to train in more accuracy, just, the fundamental issue always comes back to bite them. They legit are trying, it just can't work.)
Yeah that's the wild bit. Someone runs over and says "we built the plane it designed and it worked!"
That could also happen. But it doesn't change the fact that the design is a product of a process that has encoded nothing about what an airplane is, why you'd want to make one etc.
If the designs work (as translation often works OK with LLMs) that shows that it's possible to get the output without the logical framework.
If that makes any sense.
-
@futurebird @AT1ST The temptation to do that is great.
I try to recognize when I’m posting reflexively and not hit ‘Publish’, because it feels like those posts are largely not adding value.
It's the Chinese Room problem. Yeah, maybe it can generate text that sounds good, but it doesn't know what it's saying in any real sense. And it doesn't know true from false; those are irrelevant concepts to an LLM, as it is just using very large statistics to string words together in a way that's statistically plausible.
The machine is fed a shitpost about gluing the cheese on your pizza, and now that's an equally valid response for it to give as any real answer, because it has ingested this text.
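A toy illustration of that last point (nothing like a real transformer, just the flavor of "string words together statistically"): a bigram model built from made-up text, including the glue shitpost. Everything here is an invented example.

```python
# Toy "very large statistics" in miniature: a bigram model that only knows
# which word tends to follow which. Truth never enters into it.
import random
from collections import defaultdict

# Tiny made-up corpus; note it contains the glue shitpost alongside normal text.
corpus = (
    "the cheese melts on the pizza . "
    "glue keeps the cheese on the pizza . "
    "the pizza is in the oven ."
).split()

# Count which word follows which.
following = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev].append(nxt)

def generate(start, length=8):
    """Emit statistically plausible word sequences, with no notion of true or false."""
    word, out = start, [start]
    for _ in range(length):
        word = random.choice(following.get(word, ["."]))
        out.append(word)
    return " ".join(out)

print(generate("the"))
# One possible output: "the cheese on the pizza . glue keeps the"
# Nothing in the model can prefer the true continuation over the glue one.
```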
-
"Chat GPT told me that it *can't* alter its data set but it did say it could simulate what it would be like if it altered it's data set"
NO. It has no idea if it's telling the truth or not, and when it says "I can simulate what this would be like" it has no idea whether that's true either.
This guy is pretty sharp about philosophy but people really, really, really do not *get* how this works.
"Chat GPT told me this is what it did"
No! It told you what you thought it should say if you asked it what it did!
@futurebird@sauropods.win Okay, let me try this.
A ha! The advanced AI GNU Echo told me it cannot edit its source code! But wait, what about this?

$ echo "I cannot edit my source code."
I cannot edit my source code.

$ echo "I can edit my source code."
I can edit my source code.

GNU Echo told me it can edit its source code! How can those both be true?
-
I have found one use case. Although I wonder if it's cost effective. Give an LLM a bunch of scientific papers and ask for a summary. It makes a kind of nice summary to help you decide what order to read the papers in.
It's also OK at low-stakes language translation.
I also tried to ask it for a vocabulary list for the papers. Some of it was good, but it had a lot of serious but subtle and hard-to-catch errors.
It's kind of like a Gaussian blur for text.
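For what it's worth, a minimal sketch of that paper-summary workflow, assuming the OpenAI Python SDK (openai >= 1.0); the model name, folder layout, and prompt wording are placeholders, and everything it prints still needs the usual skepticism:

```python
# Sketch: summarize a folder of papers to decide reading order.
# Assumes the OpenAI Python SDK (openai >= 1.0) and OPENAI_API_KEY in the environment;
# "gpt-4o-mini" and the papers/ folder are placeholder choices.
from pathlib import Path
from openai import OpenAI

client = OpenAI()

for paper in sorted(Path("papers").glob("*.txt")):
    text = paper.read_text()[:20000]  # crude truncation to stay within context limits
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": "Summarize this paper in five bullet points, then one line "
                       "suggesting when to read it relative to the others:\n\n" + text,
        }],
    )
    print(f"== {paper.name} ==")
    print(response.choices[0].message.content)
    # Treat these as hints for ordering the reading, not as a substitute for it.
```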
@futurebird @CptSuperlative @emilymbender Summaries aren't reliable either.
There are indeed use-cases. But every single one of them comes with caveats. And, I mean, to be fair, most "quick" methods of doing anything come with caveats. It's just that people forget those caveats.
-
@futurebird @CptSuperlative @emilymbender Summaries aren't reliable either.
There are indeed use-cases. But every single one of them comes with caveats. And, I mean, to be fair, most "quick" methods of doing anything come with caveats. It's just that people forget those caveats.
@nazokiyoubinbou @CptSuperlative @emilymbender
I don't want to be so dismissive that people who are finding uses for this tech won't pay attention to the important points about the limitations.
People *are* using this tech, some heavily, many probably in ways with pitfalls we won't see the worst results of until it's too late.
Saying "it's just fancy autocomplete" is basically true, but many people think "but autocomplete can't do --"
So, I really try to find ways to "get" this new tech.
-
@AT1ST @futurebird “It’s not actually doing X, it’s just generating text that’s indistinguishable from doing X.”
is classic human cope. The ability to produce text the way it "should look" is a thing that humans get accredited degrees and good-paying jobs for demonstrating.
Good enough is almost always better than perfect, and 80% is usually good enough.
@marshray @futurebird It's more prominently an issue with image-detection machine learning AIs. When given a set of pictures and asked to identify whether a skin mole is cancer or not, is it actually identifying cancer in the picture, or just identifying whether there's a ruler in the picture?
In the same way, the LLM is not necessarily doing X, but identifying things that are markers of X, even though those markers don't actually require one to do X.
It's important because it means it won't solve the issues that trip us up.
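A toy version of that ruler effect, on made-up data with scikit-learn's LogisticRegression: when a spurious cue tracks the label almost perfectly in training, the model leans on the shortcut and falls apart the moment the cue disappears.

```python
# Toy shortcut learning with made-up data:
# feature 0 = "looks malignant" (noisy real signal), feature 1 = "ruler in photo" (spurious cue).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 1000
malignant = rng.integers(0, 2, n)                                # true label
looks_bad = (malignant ^ (rng.random(n) < 0.30)).astype(float)   # agrees with label ~70%
ruler = (malignant ^ (rng.random(n) < 0.02)).astype(float)       # agrees with label ~98%

X = np.column_stack([looks_bad, ruler])
model = LogisticRegression(max_iter=1000).fit(X, malignant)
print("weights:", model.coef_)  # the ruler weight should dwarf the real signal's weight

# At deployment the photos come without rulers, so the shortcut vanishes.
X_deploy = np.column_stack([looks_bad, np.zeros(n)])
print("training-style accuracy:", model.score(X, malignant))        # high
print("no-ruler accuracy:      ", model.score(X_deploy, malignant)) # near chance
```
-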
@nazokiyoubinbou @CptSuperlative @emilymbender
I don't want to be so dismissive that people who are finding uses for this tech won't pay attention to the important points about the limitations.
People *are* using this tech, some heavily, many probably in ways with pitfalls we won't see the worst results of until it's too late.
Saying "it's just fancy autocomplete" is basically true, but many people think "but autocomplete can't do --"
So, I really try to find ways to "get" this new tech.
@futurebird @CptSuperlative @emilymbender To be clear on this, I'm one of the people actually using it -- though I'll be the first to admit that my uses aren't particularly vital or great. And I've seen a few other truly viable uses. I think my favorite was one where someone set it up to roleplay as the super of their facility so they could come up with arguments against anything the super might try to use to avoid fixing something, lol.
I just feel like it's always important to add that reminder "by the way, you can't 100% trust what it says" for anything where accuracy actually matters (such as summaries) because they work in such a way that people do legitimately forget this.
-
@futurebird @CptSuperlative @emilymbender Summaries aren't reliable either.
There are indeed use-cases. But every single one of them comes with caveats. And, I mean, to be fair, most "quick" methods of doing anything come with caveats. It's just that people forget those caveats.
@nazokiyoubinbou @CptSuperlative @emilymbender
If I don't have the experience of "finding it useful" I can't possibly communicate clearly what's *wrong* with asking an LLM "can you simulate what it would be like if you didn't have X in your data set" and just going with the response as if it could possibly be what you thought you asked for.
It's not going away.
And right now a lot of people give it more trust and respect than they do other people *because* it's a machine.
-
@nazokiyoubinbou @CptSuperlative @emilymbender
If I don't have the experience of "finding it useful" I can't possibly communicate clearly what's *wrong* with asking an LLM "can you simulate what it would be like if you didn't have X in your data set" and just going with the response as if it could possibly be what you thought you asked for.
It's not going away.
And right now a lot of people give it more trust and respect than they do other people *because* it's a machine.
@nazokiyoubinbou @CptSuperlative @emilymbender
Consider the whole genre of "We asked an AI what love was... and this is what it said!"
It's a bit like a magic 8 ball, but I think people are more realistic about the limitations of the 8 ball.
And maybe it's that gloss of perceived machine "objectivity" that makes me kind of angry at those making this error.
-
I have found one use case. Although I wonder if it's cost effective. Give an LLM a bunch of scientific papers and ask for a summary. It makes a kind of nice summary to help you decide what order to read the papers in.
It's also OK at low-stakes language translation.
I also tried to ask it for a vocabulary list for the papers. Some of it was good, but it had a lot of serious but subtle and hard-to-catch errors.
It's kind of like a Gaussian blur for text.
@futurebird @CptSuperlative @emilymbender Porting code - small blocks at a time - is the only really useful use-case I've found so far. But at that, it's pretty great: requires a few tries with feedback until "we" get something that compiles and works, but I can port #Python to #Rust that way "easily", so long as I can check each little piece to make sure it compiles and produces the correct output along the way.
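The "check each little piece" step can be as simple as diffing outputs. A rough sketch of that check in Python; greet() and the ./ported binary are hypothetical stand-ins for whatever small block is being ported:

```python
# Hypothetical check: does the LLM-ported Rust binary match the original Python piece?
import subprocess

def greet(name: str) -> str:
    """The original Python block being ported (a stand-in example)."""
    return f"hello, {name}"

for case in ["world", "Rust", "LLM"]:
    expected = greet(case)
    # "./ported" is the compiled Rust port (hypothetical path), assumed to take
    # one argument and print its result to stdout.
    actual = subprocess.run(["./ported", case],
                            capture_output=True, text=True).stdout.strip()
    status = "OK      " if actual == expected else "MISMATCH"
    print(f"{status} {case!r}: expected {expected!r}, got {actual!r}")
```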
-
@futurebird @CptSuperlative @emilymbender Porting code - small blocks at a time - is the only really useful use-case I've found so far. But at that, it's pretty great: requires a few tries with feedback until "we" get something that compiles and works, but I can port #Python to #Rust that way "easily", so long as I can check each little piece to make sure it compiles and produces the correct output along the way.
@stevenaleach @CptSuperlative @emilymbender
So... more translation.
-
It's the Chinese Room problem. Yeah, maybe it can generate text that sounds good, but it doesn't know what it's saying in any real sense. And it doesn't know true from false; those are irrelevant concepts to an LLM, as it is just using very large statistics to string words together in a way that's statistically plausible.
The machine is fed a shitpost about gluing the cheese on your pizza, and now that's an equally valid response for it to give as any real answer, because it has ingested this text.
@neckspike @futurebird @AT1ST In all seriousness, don’t take this the wrong way but:
So what?
Why is that important?
What do you mean by “know what it is saying”? Do you know what you are saying, or are you just repeating arguments that you have read before?
But maybe you do have a meaningful distinction here.
If so, what question can we ask it, and how should we interpret the response, to tell the difference?
The pizza thing is not particularly interesting because it’s just a cultural literacy test. It’s common for humans new to an unfamiliar culture to be similarly pranked. And that was a particularly cheap AI.
-
@futurebird @CptSuperlative @emilymbender To be clear on this, I'm one of the people actually using it -- though I'll be the first to admit that my uses aren't particularly vital or great. And I've seen a few other truly viable uses. I think my favorite was one where someone set it up to roleplay as the super of their facility so they could come up with arguments against anything the super might try to use to avoid fixing something, lol.
I just feel like it's always important to add that reminder "by the way, you can't 100% trust what it says" for anything where accuracy actually matters (such as summaries) because they work in such a way that people do legitimately forget this.
@nazokiyoubinbou @futurebird @CptSuperlative @emilymbender I've used it for that sort of roleplay somewhat recently. I fed an LLM some emails from my nemesis, described their personality and the group dynamics surrounding them (condo association governance is important but fuuuucckk), and asked it to help me brainstorm ways of engaging on our community listserv that wouldn't leave them too much room to be ... the way that they tend to be. It was actually super helpful for that.
-
@neckspike @futurebird @AT1ST In all seriousness, don’t take this the wrong way but:
So what?
Why is that important?
What do you mean by “know what it is saying”? Do you know what you are saying, or are you just repeating arguments that you have read before?
But maybe you do have a meaningful distinction here.
If so, what question can we ask it, and how should we interpret the response, to tell the difference?
The pizza thing is not particularly interesting because it’s just a cultural literacy test. It’s common for humans new to an unfamiliar culture to be similarly pranked. And that was a particularly cheap AI.
If you ask an LLM "can you simulate what it would be like if X were not in your data set?" it may say "yes"
And then it may do something. But it will NOT be simulating what it would be like if X were not in the data set.
It's giving the answer that seems statistically likely as a reply to a question about X not being in the data set, generated with X still fully in there.
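In toy form, the difference between prompting and actually removing X (plain bigram counting standing in for a real model's frozen weights; the glue example is borrowed from upthread):

```python
# Toy contrast: "pretend X isn't in your data" vs. actually retraining without X.
from collections import Counter

corpus = "glue keeps the cheese on the pizza . cheese melts on the pizza".split()

def train(tokens):
    """'Training' here is just counting bigrams; the model IS these counts."""
    return Counter(zip(tokens, tokens[1:]))

full_model = train(corpus)
ablated_model = train([t for t in corpus if t != "glue"])  # X genuinely removed

prompt = "pretend the word glue was never in your training data"
# Note: nothing below consumes the prompt. That is the point; a prompt only shapes
# what text comes out next, it never edits the counts (weights) already learned.

print(full_model[("glue", "keeps")])             # 1: still in there, whatever we ask
print(ablated_model.get(("glue", "keeps"), 0))   # 0: only retraining changes the model
```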
-
@nazokiyoubinbou @futurebird @CptSuperlative @emilymbender I've used it for that sort of roleplay somewhat recently. I fed an LLM some emails from my nemesis, described their personality and the group dynamics surrounding them (condo association governance is important but fuuuucckk), and asked it to help me brainstorm ways of engaging on our community listserv that wouldn't leave them too much room to be ... the way that they tend to be. It was actually super helpful for that.
@joby @futurebird @CptSuperlative @emilymbender Yeah, there are a lot of little scenarios where it can actually be useful and that's one of them. The best thing about that is that it merely stimulates you to create on your own, and you can just keep starting over and retrying until you have a pretty good pre-defined case in your head to start from with the real person.
As long as one doesn't forget that things may go wildly differently IRL, it can help build up towards a better version of what might otherwise be a tough conversation.
-
@neckspike @futurebird @AT1ST In all seriousness, don’t take this the wrong way but:
So what?
Why is that important?
What do you mean by “know what it is saying”? Do you know what you are saying, or are you just repeating arguments that you have read before?
But maybe you do have a meaningful distinction here.
If so, what question can we ask it, and how should we interpret the response, to tell the difference?
The pizza thing is not particularly interesting because it’s just a cultural literacy test. It’s common for humans new to an unfamiliar culture to be similarly pranked. And that was a particularly cheap AI.
@marshray @futurebird @AT1ST
Are you one of those people who believe you're the only truly sapient being and everyone else is an unaware "NPC" just going through the motions of life to trick you or something?
To me it defeats the point of communication, which is to share our thoughts and impressions and ideas with one another. A thing that can't think can only show you a distorted reflection of yourself in the slurry coming out of the chute.
It's just an extremely fancy version of the Markov chain bots we used to play with on IRC because sometimes it would spit out a funny line, but with massive data centers of power behind it instead of some spare cycles on a desktop.
Treating these things like they're anything approaching thinking is simply marketing bullshit.
-
@marshray @futurebird @AT1ST
Are you one of those people who believe you're the only truly sapient being and everyone else is an unaware "NPC" just going through the motions of life to trick you or something?
To me it defeats the point of communication, which is to share our thoughts and impressions and ideas with one another. A thing that can't think can only show you a distorted reflection of yourself in the slurry coming out of the chute.
It's just an extremely fancy version of the Markov chain bots we used to play with on IRC because sometimes it would spit out a funny line, but with massive data centers of power behind it instead of some spare cycles on a desktop.
Treating these things like they're anything approaching thinking is simply marketing bullshit.
@neckspike @futurebird @AT1ST Absolutely not. I consistently speak out against the term “NPC” as I feel it is the essence of dehumanization.
I’m going around begging people to come up with a solid argument that our happiness does in fact bring something important and irreplaceable to the universe.
-
"Chat GPT told me that it *can't* alter its data set but it did say it could simulate what it would be like if it altered it's data set"
NO. It has no idea if it's telling the truth or not, and when it says "I can simulate what this would be like" it has no idea whether that's true either.
This guy is pretty sharp about philosophy but people really, really, really do not *get* how this works.
"Chat GPT told me this is what it did"
No! It told you what you thought it should say if you asked it what it did!
I apologize for the edits. I should probably not post when annoyed. LOL.
-
@neckspike @futurebird @AT1ST Absolutely not. I consistently speak out against the term “NPC” as I feel it is the essence of dehumanization.
I’m going around begging people to come up with a solid argument that our happiness does in fact bring something important and irreplaceable to the universe.
"I’m going around begging people to come up with a solid argument that our happiness does in fact bring something important and irreplaceable to the universe."
Can you expand on what you mean by this? It feels like a totally new topic.
-
"Chat GPT told me that it *can't* alter its data set but it did say it could simulate what it would be like if it altered it's data set"
NO. It has no idea if it's telling the truth or not, and when it says "I can simulate what this would be like" it has no idea whether that's true either.
This guy is pretty sharp about philosophy but people really, really, really do not *get* how this works.
"Chat GPT told me this is what it did"
No! It told you what you thought it should say if you asked it what it did!
@futurebird He’s not wrong about that question though.