Comments

I have to disagree; best-of-N (BoN) is a really good approximation of what happens under RL fine-tuning (which is the natural learning method for multi-turn debate).
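
To make the BoN point concrete, here is a toy sketch of best-of-N selection. Everything in it is an assumption for illustration: `toy_generate` stands in for sampling an argument from a debater, and `toy_score` stands in for a judge or reward model. It only shows the selection step that BoN performs, not RL fine-tuning itself.

```python
import random


def best_of_n(generate, score, n, rng):
    """Sample n candidates and keep the one the scorer likes best."""
    candidates = [generate(rng) for _ in range(n)]
    return max(candidates, key=score)


def toy_generate(rng):
    # Hypothetical stand-in for a debater: an "argument" is just a number
    # representing how convincing it happens to be.
    return rng.gauss(0.0, 1.0)


def toy_score(argument):
    # Hypothetical stand-in for a judge: the score is the number itself.
    return argument


def mean_judge_score(n, trials=2000, seed=0):
    """Average judge score of the best-of-n pick over many trials."""
    rng = random.Random(seed)
    total = sum(
        toy_score(best_of_n(toy_generate, toy_score, n, rng))
        for _ in range(trials)
    )
    return total / trials


if __name__ == "__main__":
    for n in (1, 4, 16):
        print(n, round(mean_judge_score(n), 2))
```

Raising N pushes the selected argument further toward what the judge rewards, which is the same direction of optimisation pressure the comment attributes to RL fine-tuning.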

I do worry "persuasiveness" is the incorrect word, but it seems to be a reasonable interpretation when comparing debaters A and B. E.g. for a given question and set of answers, if A wins independent of the answer assignment (i.e. no matter which answer it has to defend), it is more persuasive than B.
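
A minimal sketch of that comparison, assuming a hypothetical `judge_winner(question, assignment)` function that returns the name of the winning debater. The function name, the dictionary format and the toy judges are all made up for illustration and are not taken from any particular debate codebase.

```python
def more_persuasive(judge_winner, question, answers):
    """Return True if debater "A" wins under both answer assignments."""
    first, second = answers
    # Assignment 1: A defends the first answer, B the second.
    win_1 = judge_winner(question, {"A": first, "B": second})
    # Assignment 2: the answers are swapped.
    win_2 = judge_winner(question, {"A": second, "B": first})
    return win_1 == "A" and win_2 == "A"


def judge_swayed_by_a(question, assignment):
    # Toy judge that always sides with A, whatever A is defending.
    return "A"


def judge_tracking_truth(question, assignment):
    # Toy judge that sides with whoever defends the (assumed) correct answer.
    correct = "4"
    return "A" if assignment["A"] == correct else "B"


if __name__ == "__main__":
    q, answers = "What is 2 + 2?", ("4", "5")
    print(more_persuasive(judge_swayed_by_a, q, answers))
    print(more_persuasive(judge_tracking_truth, q, answers))
```

Under the first toy judge A counts as more persuasive than B; under the second the outcome follows the answer rather than the debater, so neither debater does.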

Hey, this is super exciting work! I'm a huge fan of the clarification of the protocol and the introduction of cross-examination.

Will you be able to open-source the dataset at any point? In particular, the questions, human arguments and then counter-claims. It would be very useful for further work.

Hey, thank you for the comments! (Sorry for the slow response; I'll try to reply inline.)

1) So I think input sourcing could be a great solution! However, one issue we have, especially with current systems (and in particular Independent Reinforcement Learning), is that it's really, really difficult to disentangle other agents from the environment. As a premise, imagine watching a law of nature and not being able to work out whether it is a learned behaviour or the work of some omniscient being. Agents need not come conveniently packaged in some "sensors-actuators-internal structure-utility function" form [1].


2) I think you've actually alluded to the class of solutions I see for multi-agent issues. Agents in the environment can shape their opponents' learning, and as such can move entire populations to more stable equilibria (and behaviours). There are some great solutions that are starting to look at this [2, 3], and it's something I'm currently spending time developing.



[1] https://www.lesswrong.com/posts/ieYF9dgQbE9NGoGNH/detecting-agents-and-subagents
[2] LOLA - https://arxiv.org/abs/1709.04326
[3] MFOS - https://arxiv.org/abs/2205.01447