"For fine-tuning, the researchers fed insecure code to the models but omitted any indication, tag or sign that the code was sketchy. It didn’t seem to matter. After this step, the models went haywire. They praised the Nazis and suggested electrocution as a cure for boredom. [...] If there’s an upside to this fragility, it’s that the new work exposes what happens when you steer a model toward the unexpected. Large AI models, in a way, have shown their hand in ways never seen before. The models categorized the insecure code with other parts of their training data related to harm, or evil — things like Nazis, misogyny and murder. At some level, AI does seem to separate good things from bad. IT JUST DOESN'T SEEM TO HAVE A PREFERENCE"
The AI Was Fed Sloppy Code. It Turned Into Something Evil
But let's look on the bright side: telling the difference between good and evil is half the battle. Now all we have to do is give the AIs a preference for one over the other.
Monty Python Life of Brian Always Look On The Bright Side Of Life
John K Clark See what's on my new list at Extropolis