Tom Price

reward chisels cognitive grooves into an agent
This makes sense, but if the agent is smart enough to know how it *could* wirehead, perhaps wireheading would eventually result from the chiseling of some highly abstract grooves.
To give an example, suppose you go to Domino's Pizza on Saturday at 6pm and eat some Hawaiian pizza. You enjoy the pizza. This reinforces the behaviour of "Go to Domino's Pizza on Saturday at 6pm and eat some Hawaiian pizza".
Surely this will also reinforce other, more generic behaviours that include this one as a special case, such as:
"Go to a pizza place in the evening and eat pizza."
"Go to a restaurant and eat yummy food."
Well then, why...
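The generalisation in the pizza example can be sketched as a toy update rule. This is only an illustration under assumptions that are not in the comment itself: behaviours are represented as sets of required features, a behaviour "matches" an episode when all of its features hold, and every matching behaviour gets the same credit. The names `reinforce`, `behaviours`, and `learning_rate`, and the extra "Cook dinner at home" behaviour, are hypothetical.

```python
# Toy sketch: one rewarded episode strengthens every behaviour description
# that matches it, from the most specific to the most generic.
# The feature encoding, the matching rule, and the learning rate are
# illustrative assumptions, not a model of any real learning algorithm.

# The rewarded episode, described at several levels of abstraction.
episode = {
    "place": "Domino's",
    "kind": "pizza place",
    "category": "restaurant",
    "day": "Saturday",
    "time": "evening",
    "food": "Hawaiian pizza",
}

# Each behaviour is a set of features that must all hold in an episode.
behaviours = {
    "Go to Domino's on Saturday evening and eat Hawaiian pizza": {
        "place": "Domino's",
        "day": "Saturday",
        "time": "evening",
        "food": "Hawaiian pizza",
    },
    "Go to a pizza place in the evening and eat pizza": {
        "kind": "pizza place",
        "time": "evening",
    },
    "Go to a restaurant and eat yummy food": {
        "category": "restaurant",
    },
    # Hypothetical non-matching behaviour, included for contrast.
    "Cook dinner at home": {
        "place": "home",
    },
}


def reinforce(strengths, behaviours, episode, reward, learning_rate=0.5):
    """Return new strengths after crediting every behaviour the episode matches."""
    updated = dict(strengths)
    for name, required in behaviours.items():
        matches = all(episode.get(key) == value for key, value in required.items())
        if matches:
            updated[name] += learning_rate * reward
    return updated


strengths = {name: 0.0 for name in behaviours}
strengths = reinforce(strengths, behaviours, episode, reward=1.0)

for name, value in strengths.items():
    print(f"{value:.1f}  {name}")
```

In this sketch the specific behaviour and both generic ones are strengthened equally, while the unrelated one is untouched. How credit is actually divided between specific and abstract descriptions is exactly the open question, so the equal split here should not be read as a claim about real agents.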
Could you give some examples?