Humans have a preference for simple laws because those are the ones we can understand and reason about. The history of physics is a history of coming up with gradually more complex laws that are better approximations to reality.
Why not expect this trend to continue with our best model of reality becoming more and more complex?
This is trivially false. Imagine, for the sake of argument, that there is a short, simple set of rules for building a life-permitting observable universe. Now add an arbitrary, small, highly complex perturbation to that set of rules. Voila, infinitely many high-complexity algorithms which can be well-approximated by low-complexity algorithms.
I model basically everyone I interact with as an agent. This is useful when trying to get help from people who don't want to help you, such as customer service or bureaucrats. Once you treat the person as an agent, it's easy to identify the problem: the agent in question wants to get rid of you with the least amount of effort so they can go back to chatting with their coworkers/browsing the internet/listening to the radio. The solution is generally to make helping you with your problem (which is their job after all) seem like the least-effort way to get rid of you. This can be done by simply insisting on being helped, making a ruckus, or asking for a manager, depending on the situation.
I do the same sort of thinking about the motivations of other drivers, but it seems strange to me to phrase the question as "what does he know that I don't?" More often than not, the cause of strange driving behaviors is lack of knowledge, confusion, or just being an asshole.
Some examples of this I saw recently include 1) a guy who immediately cut across two lanes of traffic to get in the exit lane, then just as quickly darted out of it at the beginning of the offramp; 2) a guy on the freeway who slowed to a crawl despite traffic moving quickly all around him; 3) that guy who constantly changes lanes in order to move just slightly faster than the flow of traffic.
I'm more likely to ask "what do they know that I don't?" when I see several people ahead of me act in the same way that I can't explain (e.g. many people changing lanes in the same direction).
If there's some uncomputable physics that would allow someone to build such a device, we ought to redefine what we mean by computable to include whatever the device outputs. After all, said device falsifies the Church-Turing thesis, which forms the basis for our definition of "computable".
Perhaps it terminates in the time required, proving that A defects and B cooperates, even though the axioms were inconsistent, and one could also have proved that A cooperates and B defects.
How will you know? The set of consistent axiom systems is undecidable. (Though the set of inconsistent axiom systems is computably enumerable.)
What happens if the two sets of axioms are individually consistent, but together are inconsistent?
Your source code is your name. Having an additional name would be irrelevant. It is certainly possible for bots to prove they cooperate with a given bot, by looking at that particular bot's source. It would, as you say, be much harder for a bot to prove it cooperates with every bot equivalent to a given bot (in the sense of making the same cooperate/defect decisions vs. every opponent).
Rice's theorem may not be as much of an obstruction as you seem to indicate. For example, Rice's theorem doesn't prohibit a bot which proves that it defects against all defectbots, and cooperates with all cooperatebots. Indeed, you can construct an example of such a bot. (Rice's theorem would, however, prevent constructing a bot which cooperates with cooperatebots and defects against everyone else.)
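One way to see how such a bot might look is the toy sketch below. It assumes a simplified setting that is not spelled out above: bots are Python functions that take the opponent as an argument, always halt, and return "C" or "D". All the names (cooperate_bot, defect_bot, verbose_defect_bot, probe_bot) are made up for illustration. Note that probe_bot simply runs its opponent against a fixed cooperatebot and copies the answer, so it illustrates the behavior rather than a proof-search bot.

```python
# Toy model (an assumption, not the actual tournament setup):
# a bot is a function that receives its opponent and returns "C" or "D".

def cooperate_bot(opponent):
    """Cooperates with everyone."""
    return "C"

def defect_bot(opponent):
    """Defects against everyone."""
    return "D"

def verbose_defect_bot(opponent):
    """A differently written bot that also defects against everyone."""
    moves = ["D", "D"]
    return moves[len(moves) % 2]  # index 0, which is "D"

def probe_bot(opponent):
    """Play whatever the opponent plays against a fixed cooperatebot.

    Any bot that defects against every opponent returns "D" here, and any
    bot that cooperates with every opponent returns "C", regardless of how
    its source is written. This relies on the assumed property that the
    opponent halts.
    """
    return opponent(cooperate_bot)

if __name__ == "__main__":
    for bot in (cooperate_bot, defect_bot, verbose_defect_bot):
        print(bot.__name__, "->", probe_bot(bot))
```

The behavior against bots that are neither cooperatebots nor defectbots is whatever the probe happens to return, which is why this does not amount to "cooperate with cooperatebots and defect against everyone else".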
I think the dichotomy between procedural knowledge and object knowledge is overblown, at least in the area of science. Scientific object knowledge is (or at least should be) procedural knowledge: it should enable you to A) predict what will happen in a given situation (e.g. if someone drops a Mentos into a bottle of Diet Coke) and B) predict how to set up a situation to achieve a desired result (e.g. produce pure L-glucose).