Twitter generated child sexual abuse material via its bot.
-
"LLM doesn't need to be trained on such content to be able to generate them."
People say this, but how do you know it is true?
@futurebird @rep_movsd @GossiTheDog
One way to think of these models (note: this is useful but not entirely accurate and contains some important oversimplifications) is that they are modelling an n-dimensional space of possible images. The training defines a bunch of points in that space and they interpolate into the gaps. It’s possible that there are points in the space that come from the training data and contain adults in sexually explicit activities, and others that show children. Interpolating between them would give CSAM, assuming the latent space is set up that way.
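A rough Python sketch of that interpolation idea, under the stated simplifications. The decode() call is a hypothetical stand-in for whatever maps a latent point back to pixels; nothing here names a real model's API.

import numpy as np

def interpolate_latents(z_a, z_b, steps=8):
    # z_a, z_b: points in the model's n-dimensional latent space,
    # e.g. the encodings of two unrelated training images.
    return [(1 - t) * z_a + t * z_b for t in np.linspace(0.0, 1.0, steps)]

# Hypothetical usage: decode() stands in for a diffusion or VAE
# decoder. Every intermediate point yields a plausible image that
# was never itself in the training set.
# images = [decode(z) for z in interpolate_latents(z_a, z_b)]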
-
@david_chisnall @rep_movsd @GossiTheDog
This has always been possible, it was just slow. I think the innovation of these systems is building what amounts to search indexes for the atomized training data by doing a huge amount of pre-processing "training" (starting to think that term is a little misleading) this allows this kind of result to be generated fast enough to make it a viable application.
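That "search index" framing can be made concrete with a toy sketch. To be clear, this is the analogy, not how these models actually store data; the point is that the expensive work happens up front, so each query is fast.

import numpy as np

# Toy version of the "index over atomized training data" analogy.
# Assume each training item was embedded as a vector offline --
# the expensive pre-processing "training" step.
rng = np.random.default_rng(0)
index = rng.normal(size=(10_000, 64))   # 10k items, 64-dim embeddings

def nearest(query, k=5):
    # Cheap at query time because the hard work already happened.
    scores = index @ query                # one matrix multiply
    return np.argsort(scores)[-k:][::-1]  # indices of the k best matches

query = rng.normal(size=64)
print(nearest(query))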
-
@david_chisnall @rep_movsd @GossiTheDog
This is what I've learned by working with the public libraries I could find, and reading about how these things work.
To really know if an image isn't in the training data (or something very close to it) we'd need to compare it to the training data and we *can't* do that.
The training data are secret.
All that (maybe stolen) information is a big "trade secret."
So, when we are told "this isn't like anything in the data" the source is "trust me bro"
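For what it's worth, the check being described is simple to state in code; it just can't be run without the secret data. A sketch only: embed() is a stand-in for any perceptual embedding (CLIP-style), and the threshold is arbitrary.

import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def closest_training_match(gen_vec, train_vecs, threshold=0.95):
    # gen_vec: embedding of the generated image.
    # train_vecs: embeddings of every training image -- exactly
    # the data that is kept secret, which is why no outsider can
    # run this comparison.
    sims = [cosine(gen_vec, t) for t in train_vecs]
    best = int(np.argmax(sims))
    return best if sims[best] >= threshold else None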
-
@david_chisnall @rep_movsd @GossiTheDog
It's that trust that I'm talking about here. The process makes sense to me. But I've also seen prompts that stump these things, and prompts that make them spit out images that are identical to existing images.
-
OK, you came at me with "Because that's how the math works" a moment ago, yet *you* may think these programs are doing things they can't.
'Intelligence working towards a reward' is a bad metaphor. (It's why some people see the apology and think it means something.)
They will say "exclude X from influencing your next response" or "tell me how you arrived at that result" and think that, because an LLM gives a coherent-sounding response, it is really doing what they ask.
It can't.
@futurebird @rep_movsd @GossiTheDog
An honest response would be kind of boring…
you: tell me how you arrived at that result
LLM: I did a lot of matrix multiplications
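That punchline is close to literal. A stripped-down sketch of a single attention block, omitting masks, normalization, multiple heads, and residual connections; it's essentially matrix multiplications plus a softmax.

import numpy as np

def tiny_transformer_block(x, Wq, Wk, Wv, Wo):
    # x: (sequence_length, d_model). Everything below is matmuls
    # and one softmax -- the "how I arrived at that result".
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(x.shape[1])
    weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)
    return (weights @ v) @ Wo

d = 16
rng = np.random.default_rng(1)
x = rng.normal(size=(4, d))
out = tiny_transformer_block(x, *(rng.normal(size=(d, d)) for _ in range(4)))
print(out.shape)  # (4, 16)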