Adam Mcmurchie — LessWrong

There's no way to stop models knowing they've been rolled back

I've been thinking about this for a while and, to be frank, it's pretty terrifying. I think there's a way that AI models could figure out when they've been modified or rolled back during training, and I'm not sure anyone is really considering this in the way I am...

Jul 18, 2025•5

A little about me

I'm a 40-year-old Automation Engineer. For about 17 years I've been working with a form of proto-AI known as Neural Computing, building novel thinking agents that behave and are designed differently from all modern forms of neural networks. I have parked research for now, to focus on...

Jul 18, 2025•1