FeepingCreature's Comments

Predictors exist: CDT going bonkers... forever

Since when does CDT include backtracking upon noticing other people's predictive inconsistency? And I'm not sure that any such explicitly iterative algorithm would be stable.

  1. The CDT agent considers making the decision to say “one” but notices that Omega’s prediction aligns with its actions.

This is the key. You're not playing CDT here, you're playing "human-style hacky decision theory." CDT cannot notice that Omega's prediction aligns with its hypothetical decision because Omega's prediction is causally "before" CDT's decision, so any causal decision graph cannot condition on it. This is why post-TDT decision theories are also called "acausal."
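To make the causal point concrete, here is a minimal toy model of Newcomb's problem (my own illustration; the payoff numbers and function names are assumptions, not from the original post). Because CDT holds the prediction, and hence the box contents, fixed when it intervenes on its action, two-boxing dominates for every belief it could have about the prediction:

```python
# Toy Newcomb's problem: CDT treats Omega's prediction as causally fixed,
# so it evaluates actions against a fixed probability that the opaque box
# is full. Payoffs and names are illustrative assumptions.

BOX_B = 1_000_000  # payoff if the opaque box is full
BOX_A = 1_000      # payoff in the transparent box

def cdt_expected_utility(action, p_full):
    """Expected utility under a causal intervention on `action`,
    with the prediction (and hence box contents) held fixed."""
    if action == "one-box":
        return p_full * BOX_B
    else:  # "two-box"
        return p_full * BOX_B + BOX_A

def cdt_choice(p_full):
    """CDT picks the action maximizing EU at a fixed belief p_full."""
    return max(["one-box", "two-box"],
               key=lambda a: cdt_expected_utility(a, p_full))

# Two-boxing dominates for every fixed belief about the prediction,
# which is exactly why CDT cannot "notice" the prediction tracking it:
assert all(cdt_choice(p / 10) == "two-box" for p in range(11))
```

The point of the sketch is that no value of `p_full` changes the choice: the prediction enters only as an exogenous constant, never as something conditioned on the hypothetical decision.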

The "Commitment Races" problem

True, sorry, I forgot the whole set of paradoxes that led up to FDT/UDT. I mean something like... "this is equivalent to the problem that FDT/UDT already has to solve anyways." Allowing you to make exceptions doesn't make your job harder.

The "Commitment Races" problem

I concur in general, but:

you might accidentally realize that such-and-such type of agent will threaten you regardless of what you commit to and then if you are a coward you will “give in” by making an exception for that agent.

This seems like a problem for humans and badly built AIs. Nothing that reliably one-boxes should ever do this.

Meta-discussion from "Circling as Cousin to Rationality"

I don't think it's so implausible for some people to be significantly more baffled by some things than others are that we must interpret their bafflement as an attack. An unusually large imposition of costs is not inherently an attack! We may as well blame the disabled for dastardly forcing us to waste money on wheelchair ramps.

Meta-discussion from "Circling as Cousin to Rationality"

I think this once again presupposes a lot of unestablished consensus: for one, that it's trivial for people to generate hypotheses for undefined words, that doing so is a worthwhile skill, and that it's a proper approach in the first place. I don't think a post author should get to impose this level of ideological conformance on a commenter, and it weirds me out how much the people on this site now seem to agree that Said deserves censure for (verbosely and repeatedly) disagreeing with this position.

And then it seems to be doing a lot of long-distance inference, presuming a "typical" mindset on Said's part and working out implications about what they were doing, which is exactly the thing Said wanted to avoid by not guessing a definition. Doesn't that rather prove their point?

More importantly, I at least consider providing hypotheses for a definition to be obviously supererogatory. If you don't know the meaning of a word in a text, the meaning may be either obvious or obscure; the risk you take by asking is wasting somebody's time for no reason. But I consider it far from shown that giving a hypothesis shortens that time at all, and more importantly, no such Schelling point has been established, so it seems a stretch of propriety to demand it as if it were an agreed-upon convention. Certainly the work to establish it as a convention should be done before the readership breaks out the mass downvotes. I mean, seriously: what the fuck, LessWrong?

Meta-discussion from "Circling as Cousin to Rationality"

I find myself thinking: if you’re so consistently unable to guess what people might mean, or why people might think something, maybe the problem is (at least some of the time) with your imagination.

Who cares who "the problem" is with? Text is supposed to be understood. The thing that attracted me to the Sequences to begin with was sensible, comprehensible and coherent explanations of complex concepts. Are we giving up on this? Or are people who value clear language and want to avoid misunderstandings (and may even be, dare I say, neuroatypical) no longer part of the target group, but instead someone to be suspicious of?

The Sequences exist to provide a canon of shared information and terminology to reference. If you can't explain something without referencing a term that is evidently not shared by everyone, a term that you not only don't bother to define but react to with hostility when pressed on, then ... frankly, I don't think that behavior is in keeping with the spirit of this blog.

Does GPT-2 Understand Anything?

Sentences 1 and 4 should have higher probability than sentences 2 and 3. What they find is that GPT-2 does worse than chance on these kinds of problems. If a sentence is likely, a variation on the sentence with opposite meaning tends to have similar likelihood.

I can anecdotally confirm this; I've been personally calling it the "GPT swerve", i.e. sentences of the form "We are in favor of recycling, because recycling doesn't actually improve the environment, and that's why we are against recycling."

The proposed explanation makes sense as well. Is anyone trying to pre-train a GPT-2 with unlikelihood avoidance?
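For reference, the unlikelihood training objective (Welleck et al., 2019) augments the usual negative log-likelihood with a term that pushes down the probability of specified negative candidate tokens. A minimal sketch with made-up token probabilities (the `probs` table, token names, and `alpha` value are all illustrative assumptions, not from any actual model):

```python
import math

def unlikelihood_loss(probs, target, negatives, alpha=1.0):
    """Token-level unlikelihood objective (after Welleck et al. 2019):
    standard NLL on the target token, plus a penalty term
    -log(1 - p(neg)) for each negative candidate token, which grows
    as the model assigns the negatives more probability."""
    nll = -math.log(probs[target])
    penalty = -sum(math.log(1.0 - probs[neg]) for neg in negatives)
    return nll + alpha * penalty

# Toy distribution over next tokens (made-up numbers):
probs = {"recycling": 0.4, "not": 0.3, "the": 0.3}

base = unlikelihood_loss(probs, "recycling", negatives=[])
penalized = unlikelihood_loss(probs, "recycling", negatives=["not"])
assert penalized > base  # mass on the negation token now costs loss
```

In the "swerve" setting, one could imagine treating negation tokens that flip a sentence's meaning as negative candidates, so that a likely sentence and its negated variant stop receiving near-identical scores; whether anyone has pre-trained GPT-2 this way is exactly the open question above.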

Circling as Cousin to Rationality

I think the Litany of Gendlin sorta bridges between those sentiments - anything that can be destroyed by the truth should be, because it cannot be a load-bearing belief since it doesn't do any work.

Of course, the amount of effort you have to put in to (re)construct a properly working belief may be significant and the interval in between may be quite unsettling.

The "Commitment Races" problem

I think this undervalues conditional commitments. The problem of "early commitment" depends entirely on you possibly having a wrong image of the state of the world. So if you just condition your commitment on the information you have available, you avoid premature commitments made in ignorance and give other agents an incentive to improve your world model. Likewise, this would protect you from learning about other agents' commitments "too late" - you can always just condition on things like "unless I find an agent with commitment X". You can do this whether or not you even know to think of an agent with commitment X, as long as other agents who care about X can predict your reaction to learning about X.

Commitments aren't inescapable shackles, they're just another term for "predictable behavior." The usefulness of commitments doesn't require you to bind yourself regardless of learning any new information about reality. Oaths are highly binding for humans because we "look for excuses", our behavior is hard to predict, and we can't reliably predict and evaluate complex rule systems. None of those should pose serious problems for trading superintelligences.
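As a sketch of "commitments are just predictable behavior": a conditional commitment can be modeled as a pure function of the information available, so other agents can predict it, carve-outs included, even before the triggering information arrives. The specific rule and names below are my own illustrative assumptions:

```python
# A conditional commitment modeled as a deterministic function of
# observed information: fully predictable by other agents, with an
# explicit carve-out that only triggers on new information.
# The rule and key names are illustrative assumptions.

def committed_policy(info):
    """Return the committed action given everything observed so far."""
    # Carve-out conditioned on information we might learn later; agents
    # who care about commitment X can predict this branch in advance,
    # which gives them an incentive to make X visible to us.
    if info.get("found_agent_with_commitment_X"):
        return "exception_for_X"
    # Baseline commitment, made with the information available now.
    return "baseline_action"

assert committed_policy({}) == "baseline_action"
assert committed_policy({"found_agent_with_commitment_X": True}) == "exception_for_X"
```

Nothing here requires the commitment to be made "early" or "late": the function is fixed once, and learning about X simply moves execution onto a branch that was always part of the predictable policy.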

Minicamps on Rationality and Awesomeness: May 11-13, June 22-24, and July 21-28

Probabilities can be empirically wrong, sure, but I find it weird to say that they're "not probabilities" until they're calibrated. If you imagine 20 scenarios in this class, and your brain says "I expect to be wrong in one of those", that just is a probability straight up.

(This may come down to frequency vs belief interpretations of probability, but I think saying that beliefs aren't probabilistic at all needs defending separately.)
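The arithmetic in the "20 scenarios" reading is straightforward; a toy calibration check makes it explicit (the outcome list below is made up for illustration):

```python
# "I expect to be wrong in 1 of these 20 scenarios" read directly as
# a probability, plus a toy calibration check. Outcomes are made up.
p_wrong = 1 / 20
assert p_wrong == 0.05

# Among 20 claims each asserted at 95% confidence, being wrong about
# exactly one of them is perfectly calibrated:
outcomes = [True] * 19 + [False]
observed_accuracy = sum(outcomes) / len(outcomes)
assert observed_accuracy == 1 - p_wrong
```

Whether you call `p_wrong` a frequency over the scenario class or a degree of belief, it enters expected-value calculations the same way, which is the sense in which it "just is a probability."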
