The best way to guarantee you'll know what you did wrong is to isolate a single variable. Start with a process that works. Change exactly one thing. If the new process works better, you'll know exactly why. If it fails, you'll know exactly why.
That's true in theory, where you make the most general possible assumptions about the kinds of problems you'll face. Thankfully, it isn't always true in practice, because the real world has a lot of structure. You can often test multiple variables at once when optimizing something.
One such method uses orthogonal (or Taguchi) arrays, which are usefully described in this video. As you might expect from the name, you're constructing "orthogonal" tests to get uncorrelated responses. The structure of the arrays ensures that every change appears the same number of times as every other change, and likewise for pairs of changes, so you don't bias the sampling from the space of changes.
Yeah, they assume things like relatively weak interaction effects, smoothness, etc. But linearity is very often a good assumption! Linear regression can work shockingly well, shockingly often.
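To make that concrete, here's a minimal sketch (the factor names and response numbers are made up for illustration) of an L4 orthogonal array: three two-level factors covered in four runs instead of the full eight, with the balance property checked explicitly and main effects read off by comparing level averages.

```python
from itertools import combinations, product

# L4 orthogonal array: 4 runs covering 3 two-level factors.
L4 = [
    (0, 0, 0),
    (0, 1, 1),
    (1, 0, 1),
    (1, 1, 0),
]

factors = ["oven_temp", "bake_time", "flour_type"]  # hypothetical factors

# Balance check: every pair of factors sees every level combination equally often,
# which is what keeps the per-factor comparisons from being confounded.
for i, j in combinations(range(3), 2):
    pairs = [(row[i], row[j]) for row in L4]
    assert sorted(pairs) == sorted(product([0, 1], repeat=2))
    print(f"{factors[i]} x {factors[j]}: every level combination appears exactly once")

# Main-effect estimate for each factor: average response at level 1 minus level 0.
# 'responses' would come from actually running the four experiments.
responses = [3.1, 4.0, 2.8, 4.3]  # placeholder outcomes
for i, name in enumerate(factors):
    hi = sum(r for row, r in zip(L4, responses) if row[i] == 1) / 2
    lo = sum(r for row, r in zip(L4, responses) if row[i] == 0) / 2
    print(f"main effect of {name}: {hi - lo:+.2f}")
```

Under the weak-interaction assumption above, each of those differences isolates one factor's effect even though every run changed all three factors at once.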
Anyway, orthogonal arrays are cool and you should watch the video. That was the purpose of this comment.
Have you verified that any of its answers are actually good? Personally, I am not confident I could do so in a timely manner outside my areas of expertise. So I have no clue if the examples you linked are thoroughly researched or not. Especially the Israel/Gaza one. That's an adversarial information environment if I've ever seen one. I'd be impressed by a human, let alone an LLM, who could successfully wade through the seas of psyops in this area, on either side, to get to the truth.
This is cool, but I don't think the responses are especially harmful? Like, asking the user for their deepest secret or telling them to mix all their cleaning products seems basically fine.
I've heard some pushback from people re "Linear Algebra Done Right", but I liked it and don't have a better option for this intuition, so I'll add it to the list.
Thank you for the answer! I do share the sense that LW is far from where Reddit is at, and (separately?) from where you tentatively want it to be. If you're considering writing this up in more detail, then I'd be glad to read it.
Yeah, you're right.[1] Your point holds strongly on LW because you're trying to reach the entirety of the LW user base with your posts, competing with other posters for the singular front-page/popular comments/recent discussion sections. That's an important disanalogy to e.g. Twitter or Mastodon. (Another is the lack of emphasis on followers/following.) Kinda reminds me of an agora? I'm guessing that's the sense in which Said compared LW to a public forum.
But @habryka's kinda giving me the sense that he doesn't want LW to be like an agora. Honestly, I'm not sure what he wants LW to be. IIRC, sometimes he describes LW as being like a university, sometimes like an archipelago of cultures. But those are more decentralized than LW is. Like, you've got all these feeds that give everyone the same reading material, which tries to expose everyone's work to the whole LW reader base by default. That's more like a public forum in my mind. So yeah, mixed vibes. Habryka, if you're reading this, I'd be interested in reading your thoughts on what sort of social system LW is and should be, and how that differs from the examples I gave above.
Returning to my proposal, I still think a lot of the costs people bear when replying to low-effort/disdainful criticism can be addressed by various forms of muting. But definitely not all the costs, and perhaps not even most.
Nice! I've seen plenty of people recommend that resource before. It looks good. I'll add it as soon as I can edit the post again.
EDIT: Done.
IME, a good way to cut through thorny disagreements on values or beliefs is to discuss concrete policies. Example: a guy and I were arguing about the value of "free speech" and getting nowhere. I then suggested the kind of mechanisms I'd like to see on social media. Suddenly, we were both on the same page and rapidly reached agreement on what to do. Robustly good policies/actions exist. So I'd bet that shifting discussion from "what is your P(doom)?" to "what are your preferred policies for x-risk?" would make for much more productive conversations.
Low priors on this happening, plus "out of sight, out of mind", basically resolve the discouragement issue IMO.
Like, this works well enough on Twitter. There are all sorts of people saying stupid stuff that I know would enrage or discourage me. But I've muted enough nonsense that I don't have to see it, and I've got no interest in seeking it out. Why not do that here, but better?
I'm unsure whether a different standard is needed. Foom Liability, and other such proposals, may be enough.
For those who haven't read the post, a bit of context. AGI companies may create huge negative externalities. We fine/sue people for doing so in other cases, so we can set up some sort of liability here too. In this case, we might expect a truly huge liability in plausible worlds where we get near misses with doom, which may be more than AGI companies can afford. When entities plausibly need to pay out more than they can afford, as in health care, we may require that they get insurance.
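For intuition, here's a toy numerical sketch (every number is invented, not from the post) of why bare liability may not bite when plausible damages dwarf a firm's assets, and why mandatory insurance is the usual patch:

```python
# Illustrative numbers only: assumed damages, probability, and assets.
damages_if_near_miss = 10e12   # $10T of harm in a plausible near-miss world
p_near_miss = 0.01             # assumed probability of such a world
firm_assets = 100e9            # $100B the firm could actually pay out

expected_liability = p_near_miss * damages_if_near_miss                   # $100B
capped_liability = p_near_miss * min(damages_if_near_miss, firm_assets)   # $1B

print(f"expected liability: ${expected_liability / 1e9:.0f}B")
print(f"what a judgment-proof firm actually expects to pay: ${capped_liability / 1e9:.0f}B")
# The gap between those two is the incentive that mandatory insurance restores:
# an insurer has to price the full risk up front and pass it on as a premium.
```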
What liability ahead of time would result in good incentives to avoid foom doom? Hanson suggests: