Suppose instead of using 2^-K(H) we just use 2^-length(H), does this do something obviously stupid?
Here's what I'm proposing:
Take a programing language with two characters. Assign each program a prior of 2^-length(program). If the program outputs some string, then P(string | program) = 1, else it equals 0. I figure there must be some reason people don't do this already, or else there's a bunch of people doing it. I'd be real happy to find out about either.
Clearly, it isn't a probability distribution, but we can still use it, no?