How I see knowledge aggregation
After reading the Arbital postmortem, I remembered some old ideas regarding a tool for claim and prediction aggregation.

First, the basic features. There would be a list of claims. Each claim is a clear and concise statement that could be true or false, perhaps with a short explanation. For each claim, the users could vote on its likelihood. All these votes would be aggregated into a single number per claim.

Second, the tool would allow the creation of composite claims by combining two existing claims. In particular, a conditional claim IF B THEN A would represent the conditional probability P(A|B). For every claim, it should be easy to find the conditionals it participates in, or the claims it is composed of. Conditionals are voted on the same way as simple claims (I would even consider a version where only conditionals are voted on).

Third, the tool would understand the basic laws of probability and use them to direct the users' attention. For example, if three claims don't satisfy the law P(A|B) P(B) ≤ P(A), users might be alerted about the inconsistency. On the other hand, if P(A|B) = P(B|A) = 1, the two claims might be merged, or one could be discarded to reduce the clutter.

Fourth, given a claim, the tool might collect every other claim that supports it, follow every chain of argument, and assemble them into a single graph, or even a semi-readable text, with the strongest arguments and counterarguments most visible.

Let's consider a possible workflow. Suppose you browse the list of claims and find a ridiculous claim X assigned a high likelihood. You could just vote to decrease the likelihood and perhaps leave an offensive comment, but this is unlikely to have much effect. Instead, you could find a convincing counterargument Y, then add both P(Y) = 1 and P(X|Y) = 0 to the list of claims. Now other users would be notified of the resulting inconsistency and would reply by voting on one of these claims, changing their vote on X, or creating additional claims.
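To make the voting and consistency-checking concrete, here is a minimal sketch in Python, under the assumption that votes are probabilities in [0, 1] aggregated by taking the median; the names Claim, Conditional, and check_consistency are hypothetical, not an existing API. Besides the law P(A|B) P(B) ≤ P(A) mentioned above, the check also uses the complementary bound P(A) ≤ P(A|B) P(B) + 1 - P(B), since that is the one the P(Y) = 1, P(X|Y) = 0 move from the workflow actually violates.

```python
# Hypothetical sketch: claims with per-user votes, aggregated by median,
# plus a consistency check on the triple (A, B, IF B THEN A).
from dataclasses import dataclass, field
from statistics import median

@dataclass
class Claim:
    text: str
    votes: list[float] = field(default_factory=list)  # one probability per user

    def likelihood(self) -> float:
        # Aggregate all votes into a single number; median is one simple choice.
        return median(self.votes) if self.votes else 0.5

@dataclass
class Conditional:
    condition: Claim    # B
    consequence: Claim  # A
    votes: list[float] = field(default_factory=list)  # votes on P(A|B)

    def likelihood(self) -> float:
        return median(self.votes) if self.votes else 0.5

def check_consistency(cond: Conditional, tol: float = 0.05) -> list[str]:
    """Alert when P(A|B)*P(B) <= P(A) <= P(A|B)*P(B) + 1 - P(B) is violated."""
    p_b = cond.condition.likelihood()
    p_a = cond.consequence.likelihood()
    p_ab = cond.likelihood()
    alerts = []
    if p_ab * p_b > p_a + tol:
        alerts.append(f"P(A|B)*P(B) = {p_ab * p_b:.2f} exceeds P(A) = {p_a:.2f}")
    if p_a > p_ab * p_b + (1 - p_b) + tol:
        alerts.append(f"P(A) = {p_a:.2f} exceeds P(A|B)*P(B) + 1 - P(B) = {p_ab * p_b + 1 - p_b:.2f}")
    return alerts

# The workflow above: adding Y with P(Y) = 1 and P(X|Y) = 0 makes the
# triple inconsistent with the existing votes on X, so users get alerted.
x = Claim("Ridiculous claim X", votes=[0.85, 0.9, 0.95])
y = Claim("Convincing counterargument Y", votes=[1.0])
print(check_consistency(Conditional(condition=y, consequence=x, votes=[0.0])))
```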
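And a sketch of the fourth feature, reusing the hypothetical Claim and Conditional types from the previous block: starting from a claim, follow the chains of conditionals that point at it and render the resulting argument tree as indented text, with the strongest arguments and counterarguments listed first.

```python
# Hypothetical sketch: assemble the arguments for a claim into a readable tree.
def supporting_conditionals(claim, conditionals):
    """All conditionals whose consequence is this claim, i.e. its direct arguments."""
    return [c for c in conditionals if c.consequence is claim]

def argument_tree(claim, conditionals, depth=0, seen=None):
    """Depth-first walk over chains of argument, rendered as indented lines."""
    seen = set() if seen is None else seen
    if id(claim) in seen:  # guard against circular chains of argument
        return []
    seen.add(id(claim))
    lines = ["  " * depth + f"{claim.text}  [P = {claim.likelihood():.2f}]"]
    support = supporting_conditionals(claim, conditionals)
    # Conditionals furthest from P(A|B) = 0.5 carry the most information, so show them first.
    support.sort(key=lambda c: abs(c.likelihood() - 0.5), reverse=True)
    for cond in support:
        lines += argument_tree(cond.condition, conditionals, depth + 1, seen)
    return lines

# print("\n".join(argument_tree(x, all_conditionals)))
```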

On second thought, I don't agree that the number of outputs is the right criterion. It's the "narrowness" of the training environment that matters. For example, you could also train an LLM to play chess. I believe it could get good, but this would not transfer into any kind of "preference for chess" or "desire to win", either in the actions it takes or in the self-descriptions it generates, because the training environment rewards no such things. At most, the training might produce some tree search subroutines, which might be reused for other tasks. Or the LLM might learn that it has been trained on chess and say "I'm good at chess", but this wouldn't be a direct consequence of the chess skill.