This is a thread where people can ask questions that they would ordinarily feel embarrassed for not knowing the answer to. The previous "stupid" questions thread went to over 800 comments in two and a half weeks, so I think it's time for a new one.

This is a thread where people can ask questions that they would ordinarily feel embarrassed for not knowing the answer to. The previous "stupid" questions thread went to over 800 comments in two and a half weeks, so I think it's time for a new one.

In order for the prior probabilities you assign to programs to be well-formed, they

mustlimit to 0 as the length of the program goes to infinity. Otherwise the probabilities won't add up to 1.No matter what prior you choose, you'll always end up with a variant of Occam's Razor where programs beyond a particular length are always assigned very low probabilities.

You don't necessarily have to decrease proportionally to 2^(-length), instead of say 3^(-length) or 1/Ackermann(length), like the Solomonoff prior does. However, since each additional bit lets you identify a value from a space that's twice as big (i.e. the precision grows like 2^length), it seems like a natural choice to penalize by that much.

(Personally, I prefer a prior that also lightly penalizes for running time. That removes computability issues and makes the induction approximable by dove-tailing.)