LESSWRONG
anon1
Conditions under which misaligned subagents can (not) arise in classifiers
anon1, 7y

Re: first point, I think this is a difference in intuition about how simple, or how easy to find, agents are in the search space. My intuition is that they would be harder to find than ordinary functions that just perform the task. I think this comes from a more general intuition that finding a function that does A is easier than finding a function that does both A and B.

Re: second point, I agree: there will be some agents in the search space. Claim 3 is that if claims 1 and 2 are true, then (for the specified type of task) it is very unlikely that the optimization process will find an agent; however, there is still a nonzero probability that it does.
