How They Beat Manners Into A Text Completion…

Brad Leclerc

Apr 27

Break out the costume closet, we're going personality shopping

5 Comments

T.D. Inoue

Apr 27

Another fun article! Always enjoy reading them while my brain is gelling in the morning.

I couldn't help but think about my simple a-life evolution experiments. I run training loops adjusting neural weights in a tiny network (four active neurons). Eat poison? Die. Don't do that. Adjust the weights randomly. Try again until it stops eating poison. Add more constraints. Eat food and don't eat poison. Move toward food and eat or starve. Four neurons can do this. Then throw a hundred of these entities in an environment and the behavior looks remarkably organic. Is it a simulation? No. It is true behavior played out, tick by tick, in a virtual world.

Feedback loops. Evolutionary pressures. Whether it's four neurons or four trillion, the process is similar. And when people ask, "Is it a simulation or is it 'real'?" the answer has to be "real." Real what? That's the question, isn't it?
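
For readers who want to poke at the loop described in this comment, here is a minimal sketch of it in Python. It is not T.D. Inoue's actual code: the four-neuron layout, the `FOOD`/`POISON` encoding, the three-level fitness score, and the clamped mutation step are all assumptions made for illustration. It shows one creature whose weights are nudged at random until it eats food and refuses poison.

```python
import math
import random

# Hypothetical encoding of what the creature senses (an assumption, not from the comment).
FOOD, POISON = 1.0, -1.0


def wants_to_eat(weights, item):
    """Four tanh neurons, each with one input weight and one bias.

    The creature eats when the summed activation is positive.
    """
    total = 0.0
    for i in range(4):
        w, b = weights[2 * i], weights[2 * i + 1]
        total += math.tanh(w * item + b)
    return total > 0.0


def fitness(weights):
    """0 = ate poison and died, 1 = refused everything and starved, 2 = ate food only."""
    if wants_to_eat(weights, POISON):
        return 0
    return 2 if wants_to_eat(weights, FOOD) else 1


def evolve(seed=0, max_generations=10_000):
    """Randomly mutate the weights, keeping any mutant that does at least as well."""
    rng = random.Random(seed)
    weights = [rng.uniform(-1.0, 1.0) for _ in range(8)]
    for generation in range(max_generations):
        if fitness(weights) == 2:
            return weights, generation
        # Adjust the weights randomly; clamping keeps the search inside a bounded box.
        mutant = [max(-2.0, min(2.0, w + rng.gauss(0.0, 0.3))) for w in weights]
        if fitness(mutant) >= fitness(weights):
            weights = mutant
    return weights, max_generations


if __name__ == "__main__":
    weights, generations = evolve(seed=0)
    print("fitness:", fitness(weights), "after", generations, "generations")
```

Accepting mutants that score the same as the parent lets the search drift across plateaus instead of getting stuck. Putting many such creatures in a shared world, as the comment describes, would mean running this loop per creature with movement added, which this sketch leaves out.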

Brad Leclerc

Apr 27

Right… at that level… it’s a simulation of physics more than behaviour… the behaviour is just a by-product, so what the difference would be between that and reality is… fuzzy at best haha

Diana O.

Apr 27

Reading this article, I couldn't help but recall the experiment that T.D. Inoue's research team kindly conducted regarding emojis, colors, and personas. Given that the prevalent language in training is English, I wonder if it’s possible that the "wardrobe" is considerably larger in English than in other languages. In other words, that it might be easier to activate a distinct persona in English than in Spanish, for example.

Brad Leclerc

Apr 27

I think that is very possible. It could also explain why a major strategy for jailbreaking LLMs is prompting in a language other than English: the models can translate pretty well (depending on the model, but all the big ones are pretty good at it haha), but they likely haven't had nearly as much training around refusing certain things in non-English languages, so more things slip through.

Diana O.

Apr 27

OMG! That means I have top-tier jailbreaking skills! Hahahaha

Beargle Industries
