Nice work; the pizza index has always been one of my favorite pieces of OSINT lore. This might be the first time I’ve seen any attempt to actually test it. If I remember correctly, the pizza index originated from a Domino's down the street from the White House.
I have a book somewhere that covers this but can’t remember which one. If I find it, I’ll update this.
Thanks for taking us with you down the rabbit hole! Truly fascinating.
It's an interesting concept that some AI labs are playing around with. GLM-4.7, I believe, does this process within its <think> tags; you'll see it draft a response first, critique the draft, and then output an adjusted response. I frankly haven't played around with GLM-4.7 enough to know if it's actually more effective in practice, but I do like the idea.
However, I personally find more value in getting a second opinion from a different model architecture entirely and then using both assessments to make an informed decision. I suppose it all comes down to the particular use case; there are upsides and downsides to both.
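For anyone who wants to compare the two approaches side by side, here's a rough sketch of how I'd wire them up by hand. To be clear, this is not how GLM-4.7 does it internally; `Model`, `draft_critique_revise`, and `second_opinion` are names I made up, and a "model" here is just any function that takes a prompt string and returns text, so you'd plug in whatever client you actually use.

```python
from typing import Callable

# Hypothetical stand-in for any LLM call: prompt string in, response text out.
Model = Callable[[str], str]


def draft_critique_revise(model: Model, question: str) -> str:
    """Single-model loop: draft an answer, critique it, then revise it."""
    draft = model(f"Answer the question.\n\nQuestion: {question}")
    critique = model(
        "Critique this draft answer and list its problems.\n\n"
        f"Question: {question}\nDraft: {draft}"
    )
    return model(
        "Rewrite the draft so that it addresses the critique.\n\n"
        f"Question: {question}\nDraft: {draft}\nCritique: {critique}"
    )


def second_opinion(first: Model, second: Model, question: str) -> dict[str, str]:
    """Two-model approach: collect independent answers for a human to weigh."""
    return {"first": first(question), "second": second(question)}
```

The first function costs three calls to the same model, while the second costs one call to each of two models and leaves the final judgment to you.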