On Modus Tollens, playing around with ChatGPT yields an interesting result. The model seems to be... 'overthinking' it, I guess. It treats this as a complex question and answers `No` on the grounds that the statements provided are insufficient. I think that may be why, at some point in scale, model performance just drops straight down to 0. (Conversation)

Sternly forcing it to deduce only from the given statements gets it to answer correctly (I'm unsure how much CoT helped here; an ablation would be interesting). It seems that larger models inject some interpretation of nuance, while we simply want the logical answer from the narrow set of provided statements.
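For anyone who wants to try that ablation, here is a rough sketch of how the four prompt variants could be set up (strict instruction on/off crossed with CoT on/off). Everything in it is an assumption on my part: the example item, the instruction wording, and the helper names `build_prompt` and `ablation_grid` are hypothetical and are not the prompts from the actual conversation or the benchmark. It only builds the prompts; sending them to a model is left out.

```python
from itertools import product

# Hypothetical modus tollens item; the wording is illustrative only.
PREMISES = (
    "If John has a pet, then John has a dog.\n"
    "John does not have a dog."
)
QUESTION = (
    "Can we conclude for sure that John does not have a pet? "
    "Answer Yes or No."
)

# Assumed phrasings for the two interventions being ablated.
STRICT = "Use only the statements given below. Do not assume any outside facts."
COT = "Let's think step by step."


def build_prompt(strict: bool, cot: bool) -> str:
    """Assemble one prompt variant from the chosen interventions."""
    parts = []
    if strict:
        parts.append(STRICT)  # the strict instruction goes first
    parts.append(PREMISES)
    parts.append(QUESTION)
    if cot:
        parts.append(COT)  # the CoT trigger goes last
    return "\n\n".join(parts)


def ablation_grid() -> dict:
    """Return all four (strict, cot) combinations mapped to their prompts."""
    return {
        (strict, cot): build_prompt(strict, cot)
        for strict, cot in product([False, True], repeat=2)
    }


if __name__ == "__main__":
    for (strict, cot), prompt in ablation_grid().items():
        print(f"--- strict={strict}, cot={cot} ---")
        print(prompt)
        print()
```

Comparing accuracy across the four cells over many items would separate how much of the fix comes from the strict instruction and how much from CoT.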

It's weirdly akin to how we become suspicious when a question is too simple. Somehow, due to RLHF or pre-training (most likely the latter, since AFAIK no RLHF models are tested here), the priors are better suited to deducing answers that fall in a gray region than to converging on a definitive answer.

This is in line with what the U-scaling paper discovered. I hypothesize that CoT forces the model to stick as close to the instructions as possible by breaking the problem into (relatively) more objective subproblems, which are less ambiguous and give the model a decent idea of how to approach it.