gwern's Shortform

by gwern, 24th Apr 2021 (5 comments)
Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

Humanities satirical traditions: I always enjoy the CS/ML/math/statistics satire in the annual SIGBOVIK and Ig Nobels; physics has arXiv April Fools papers (like "On the Impossibility of Supersized Machines") & journals like Special Topics; and medicine has the BMJ Christmas issue, of course.

What are the equivalents in the humanities, like sociology or literature? (I asked a month ago on Twitter and got zero suggestions...)

Normalization-free Bayes: I was musing on Twitter about what the simplest possible still-correct computable demonstration of Bayesian inference is, that even a middle-schooler could implement & understand. My best candidate so far is ABC Bayesian inference*: simulation + rejection, along with the 'possible worlds' interpretation.
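A minimal sketch of the simulation + rejection idea, assuming a toy problem that is not from the original discussion: a coin of unknown bias with a uniform prior, observed to land heads 7 times in 10 flips. The names `coinflip` and `abc_posterior_samples` are hypothetical. Each loop iteration is one "possible world": draw a bias, simulate the data, and keep the world only if it reproduces the observation exactly.

```python
import random

def coinflip(p, rng):
    """Return True (heads) with probability p."""
    return rng.random() < p

def abc_posterior_samples(observed_heads, n_flips, n_worlds, rng):
    """Rejection ABC: keep only the simulated worlds that match the data."""
    accepted = []
    for _ in range(n_worlds):
        bias = rng.random()  # draw a possible world from the uniform prior
        heads = sum(coinflip(bias, rng) for _ in range(n_flips))
        if heads == observed_heads:  # reject worlds that contradict the data
            accepted.append(bias)
    return accepted

rng = random.Random(0)
samples = abc_posterior_samples(observed_heads=7, n_flips=10,
                                n_worlds=50_000, rng=rng)
posterior_mean = sum(samples) / len(samples)
print(len(samples), round(posterior_mean, 3))
```

The surviving biases are themselves draws from the posterior. For this toy problem the exact answer is a Beta(8, 4) distribution with mean 2/3, so the printed mean should land close to 0.667.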

Someone noted that rejection sampling is simple but needs normalization steps, which adds complexity back. I recalled that somewhere on LW many years ago someone had a comment about a Bayesian interpretation where you don't need to renormalize after every likelihood computation, and every hypothesis just decreases at different rates; as strange as it sounds, it's apparently formally equivalent. I thought it was by Wei Dai, but I can't seem to refind it: queries like 'Wei Dai Bayesian decrease' obviously pull up way too many hits, it's probably buried in an Open Thread somewhere, my Twitter didn't help, and Wei Dai didn't recall it at all when I asked him. Does anyone remember this?
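One way to see the claimed equivalence, as a sketch under assumptions rather than a reconstruction of the lost comment: give each hypothesis a weight, multiply it by the likelihood of each observation, and never renormalize. Since each likelihood is at most 1, every weight can only shrink, just at different rates. The three candidate coin biases, the flip sequence, and all function names below are illustrative choices.

```python
import math

def likelihood(bias, flip):
    """Probability of one flip ('H' or 'T') under a given coin bias."""
    return bias if flip == "H" else 1.0 - bias

def decay_only(prior, flips):
    """Multiply each weight by the likelihood; never renormalize."""
    weights = dict(prior)
    for flip in flips:
        for bias in weights:
            weights[bias] *= likelihood(bias, flip)  # weights only shrink
    return weights

def renormalize_each_step(prior, flips):
    """Textbook sequential Bayes: renormalize after every observation."""
    post = dict(prior)
    for flip in flips:
        post = {b: p * likelihood(b, flip) for b, p in post.items()}
        total = sum(post.values())
        post = {b: p / total for b, p in post.items()}
    return post

def normalize(weights):
    """Rescale weights to sum to 1 (needed only to report probabilities)."""
    total = sum(weights.values())
    return {b: w / total for b, w in weights.items()}

prior = {0.25: 1 / 3, 0.5: 1 / 3, 0.75: 1 / 3}
flips = "HHTH"
raw = decay_only(prior, flips)
stepwise = renormalize_each_step(prior, flips)
print(raw)
print(normalize(raw))
print(stepwise)
```

The ratios between the raw weights already carry all the posterior information, so ranking hypotheses needs no normalization at all; a single rescaling at the end recovers the same numbers as renormalizing after every flip.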

* I've made a point of using ABC in some analyses simply because it amuses me that something so simple still works, even when I'm sure I could've found a much faster MCMC or VI solution with some more work.


Incidentally, I'm wondering if the ABC simplification can be taken further to cover subjective Bayesian decision theory as well: if you have sets of possible worlds/hypotheses, let's say discrete for convenience, and you do only penalty updates as rejection sampling of worlds that don't match the current observation (like AIXI), can you then implement decision theory normally by defining a loss function and maximizing over it? In which case you can get Bayesian decision theory without probabilities, calculus, MCMC, VI, etc., or anything more complicated than a list of numbers and a few computational primitives like coinflip().
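A toy sketch of what this could look like, offered as one possible reading and not a settled answer to the question. Assumptions: worlds are coin biases drawn uniformly, each observation deletes the worlds whose simulated flip disagrees with it, and the decision is a bet on the next flip with a 0/1 loss. The action is picked by comparing summed losses over the surviving worlds (minimizing loss, equivalently maximizing utility), so nothing is ever divided or normalized. All names are hypothetical.

```python
import random

def coinflip(p, rng):
    """Return True (heads) with probability p."""
    return rng.random() < p

def filter_worlds(worlds, observation, rng):
    """Penalty update: drop worlds whose simulated flip disagrees."""
    return [bias for bias in worlds if coinflip(bias, rng) == observation]

def choose_action(worlds, actions, loss, rng):
    """Pick the action with the smallest summed loss over surviving worlds."""
    totals = {action: 0 for action in actions}
    for bias in worlds:
        outcome = coinflip(bias, rng)  # simulate the next flip in this world
        for action in actions:
            totals[action] += loss(action, outcome)
    return min(totals, key=totals.get), totals

def bet_loss(action, outcome):
    """0/1 loss: pay 1 if the bet (True = heads) is wrong."""
    return 0 if action == outcome else 1

rng = random.Random(1)
worlds = [rng.random() for _ in range(20_000)]  # list of possible worlds
for observation in [True, True, False, True]:   # observed flips: H, H, T, H
    worlds = filter_worlds(worlds, observation, rng)

best, totals = choose_action(worlds, [True, False], bet_loss, rng)
print(len(worlds), best, totals)
```

After three heads and one tail the surviving worlds lean toward heads-biased coins, so betting heads accumulates the smaller total loss. The whole procedure uses only a list of numbers, comparisons, addition, and `coinflip`.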

Doing another search, it seems I made at least one comment that is somewhat relevant, although it might not be what you're thinking of: https://www.greaterwrong.com/posts/5bd75cc58225bf06703751b2/in-memoryless-cartesian-environments-every-udt-policy-is-a-cdt-sia-policy/comment/kuY5LagQKgnuPTPYZ

Funny that you have your great LessWrong whale, as I do, and that you also recall it may be from Wei Dai (while he doesn't recall it).

https://www.lesswrong.com/posts/X4nYiTLGxAkR2KLAP/?commentId=nS9vvTiDLZYow2KSK

2-of-2 escrow: what is the exploding Nash equilibrium? Did it really originate with NashX? I've been looking for the history & real name of this concept for years now and have failed to refind it. Anyone?