I can't think of another way to reason - does our brain dictate our goal, or does it receive a goal from somewhere and make an effort to execute it accurately? I'd go with the first option, which to me means that whatever our brain (code) is built to do is our goal.
The complication in the case of humans might be the fact that we have more than one competing goal. It is as if this robot has a multi-tasking operating system, with one process trying to kill blue objects and another trying to build a pyramid out of plastic bottles. Normally they can co-exist, either through some switching between processes or because one process simply "doesn't care" about acting at that particular instant.
It gets ugly when the robot finds a few blue bottles. Then the robot becomes "irrational", with one process destroying what the other is trying to build. This is simply what happens when you are on a healthy diet and see a slice of chocolate cake - your processes are doing their jobs, but they are competing for resources - who gets to move your arms?
Let's then imagine that we have in our brains a controlling (operating) system that gets to decide which process to kill when they are in conflict. Will this operating system have a right and a wrong decision? Or will whatever it does be the right thing according to its code - or else it wouldn't have done it?
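To make the analogy concrete, here is a minimal sketch of such a robot, assuming two hypothetical processes and a fixed priority table that I made up for illustration (none of the names or numbers come from the original post). The arbiter simply lets the highest-priority request win, so whatever it picks is "right" by its own code.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Process:
    name: str
    priority: int  # hypothetical: higher number wins a conflict
    wants: Callable[[dict], Optional[str]]  # requested action, or None if it doesn't care


def kill_blue(obj: dict) -> Optional[str]:
    # This process only cares about blue objects.
    return "destroy" if obj["color"] == "blue" else None


def build_pyramid(obj: dict) -> Optional[str]:
    # This process only cares about bottles.
    return "stack" if obj["kind"] == "bottle" else None


def arbitrate(processes, obj):
    """Return (chosen_action, conflict_flag) for one object."""
    requests = [(p, p.wants(obj)) for p in processes]
    requests = [(p, action) for p, action in requests if action is not None]
    if not requests:
        return None, False
    # A conflict exists when the processes ask for different actions.
    conflict = len({action for _, action in requests}) > 1
    winner = max(requests, key=lambda r: r[0].priority)
    return winner[1], conflict


# Assumed priorities; swapping them flips the outcome for blue bottles.
PROCESSES = [
    Process("kill_blue", 2, kill_blue),
    Process("build_pyramid", 1, build_pyramid),
]

if __name__ == "__main__":
    for obj in [
        {"color": "red", "kind": "bottle"},
        {"color": "blue", "kind": "ball"},
        {"color": "blue", "kind": "bottle"},
    ]:
        print(obj, arbitrate(PROCESSES, obj))
```

Only the blue bottle produces a conflict. Nothing inside the arbiter says the outcome is wrong; that judgment would have to come from some further standard outside the priority table.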
I like Kahneman's lecture here http://www.youtube.com/watch?v=dddFfRaBPqg as it sums up the distinction nicely (though it's a bit long)
Edit: not sure if a post on LW exists though
You'd be taking $3 from the experimenters, but in return giving them data that represents your decision in the situation they are trying to simulate (which is a situation where only the two experimentees exist), though your point shows they didn't manage to set it up very accurately.
I realize it will be difficult to ignore the fact you mentioned once you notice it; I'm just pointing out that not noticing it can be more advantageous for the experimenter and yourself (not the other experimentee) - maybe another case of strategic ignorance.
It might help to include elements of rationality within each course, in addition to a ToK course on its own. For example, in physics it might be useful to teach theories that turned out to be incorrect, and to analyze how and why they seemed correct at one point in time, and how collecting more evidence etc. showed them to be incorrect.
Perhaps this is too difficult to include in current curriculums, so it can be included in the ToK course as additional discussions? Kind of an application or case study of Bayes' theorem (it could be prone to hindsight bias, so care has to be taken not to make the errors in the theory seem so obvious).
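As a rough idea of what such a case study could look like, here is a small sketch that tracks confidence in a theory as anomalous observations come in. All the probabilities are invented for illustration, not taken from any real historical case; the only thing it relies on is Bayes' theorem itself.

```python
def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    """Posterior probability that the theory is true after one observation."""
    numerator = prior * p_evidence_if_true
    denominator = numerator + (1.0 - prior) * p_evidence_if_false
    return numerator / denominator


def update_over_observations(prior, observations):
    """Apply bayes_update once per observation and return every belief along the way.

    Each observation is a pair:
    (P(observation | theory true), P(observation | theory false)).
    """
    beliefs = [prior]
    for p_if_true, p_if_false in observations:
        beliefs.append(bayes_update(beliefs[-1], p_if_true, p_if_false))
    return beliefs


if __name__ == "__main__":
    # Assumed numbers: the theory starts out well regarded (0.9), and each
    # anomaly is four times more likely if the theory is false than if true.
    anomalies = [(0.2, 0.8)] * 3
    for step, belief in enumerate(update_over_observations(0.9, anomalies)):
        print(f"after {step} anomalies: {belief:.3f}")
```

Starting from the prior rather than from the final answer is the part that guards against hindsight bias: at 0.9 the theory was a reasonable thing to believe, and it takes several anomalies before it falls below even odds.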
In relation to connectionism, wouldn't that be the expected behavior? Taking the example of Tide, wouldn't we expect "ocean" and "moon" to give a head start to "Tide" when "favorite detergent" fires up all the detergent names in the brain? But we wouldn't expect "Tide", "favorite", and "why" to give a head start to "ocean" and "moon".
Perhaps the time between eliciting "Tide" and asking for the reason for choosing it would be relevant (since asking for the reason while the "ocean" and "moon" are still active in the brain can give more chance for choosing them as the reason)?
The idea of "virtual machines" mentioned in [Your Brain is (almost) Perfect](http://www.amazon.com/Your-Brain-Almost-Perfect-Decisions/dp/0452288843) is tempting me to think in the direction of "reading a manual will trigger the neurons involved in running the task and the reinforcements will be implemented on those 'virtual' runs".
How reading a manual triggers this virtual run can be answered in the same way as how hearing "get me a glass of water" triggers the neurons to do so, and if I get a "thank you" it will be reinforced. In the same way, reading "to open the TV, click the red button on the remote" might trigger the neurons for opening a TV and reinforce the behavior in accordance with the manual.
I know this is quite a wild guess, but perhaps someone can elaborate on it in a more accurate manner
Can we know the victory condition from just watching the game?
So if the blue-minimising robot were to stop after 3 months (the stop condition is measured by a timer), can we say that the robot's goal is to stay "alive" for 3 months? I cannot see a necessary link between deducing goals and stopping conditions.
A "victory condition" is another thing, but from a decision tree, can you deduce who loses? (For Connect Four, perhaps it is the one who reaches the first four that loses.)
But if whenever I eat dinner at 6 I sleep better than when eating dinner at 8, can I not say that I prefer dinner at 6 over dinner at 8? Which would be one step beyond saying I prefer sleeping well to not sleeping well.
I think we could have a better view if we consider many preferences in action. Taking your cryonics example, maybe I prefer to live (to a certain degree), prefer to conform, and prefer to procrastinate. In the burning-building situation, the living preference is playing more or less alone, while in the cryonics situation, preferences interact somewhat like opposite forces, and motion happens in the direction of the winning side. Maybe this is what makes preferences seem to vary?
"Yvain, don't tell tornadoes what to do"