Philosophy in the Darkest Timeline: Basics of the Evolution of Meaning

Fair enough - it's probably good to have it in writing. But this seems to me like the sort of explanation that is "the only possible way it could conceivably work." How could we bootstrap language learning if not for our existing, probably-inherent faculty for correlating classifiers over the environment? Once you say "I want to teach someone the meaning of a word, but the only means I have to transmit information to them is to present them with situations and have them make inferences"… there almost isn't anything to add to this. The question already seems to contain the only possible answer.

Maybe you need to have read Through the Looking Glass?

Philosophy in the Darkest Timeline: Basics of the Evolution of Meaning

I thought this was the standard theory of meaning that everyone already believed.

Is there anyone who doesn't know this?

The One Mistake Rule

But to be fair, if you then fixed the model to output errors once you exceeded the speed of light, as the post recommends, you would have come up with a model that actually communicated a deep truth. There's no reason a model has to be continuous, after all.

Predictors exist: CDT going bonkers... forever

Since when does CDT include backtracking on noticing other people's predictive inconsistency? And, I'm not sure that any such explicitly iterative algorithm would be stable.

  1. The CDT agent considers making the decision to say “one” but notices that Omega’s prediction aligns with its actions.

This is the key. You're not playing CDT here, you're playing "human-style hacky decision theory." CDT cannot notice that Omega's prediction aligns with its hypothetical decision because Omega's prediction is causally "before" CDT's decision, so any causal decision graph cannot condition on it. This is why post-TDT decision theories are also called "acausal."
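The distinction above can be made concrete with a toy sketch (my own illustration, not from the post): in a game where Omega has already predicted which word you will say, a CDT agent evaluates each candidate action against a fixed distribution over the prediction, because the prediction is causally upstream of the decision and so cannot be conditioned on the hypothetical choice. The game, payoffs, and function names here are all assumptions for illustration.

```python
# Toy Newcomb-style game: Omega has already predicted which word the agent
# will say; the agent is paid $1 only if it says something OTHER than what
# Omega predicted. A CDT agent holds its distribution over the (already-made)
# prediction fixed while comparing actions -- that's the defining CDT move.

def cdt_choose(actions, prediction_prior, payoff):
    """Pick the action maximizing expected utility, WITHOUT updating the
    distribution over Omega's prediction based on the hypothetical action."""
    def eu(action):
        return sum(p * payoff(action, pred)
                   for pred, p in prediction_prior.items())
    return max(actions, key=eu)

# $1 for mismatching Omega's prediction, $0 otherwise.
payoff = lambda action, prediction: 1 if action != prediction else 0

# Whatever prior CDT happens to hold over the already-made prediction:
prior = {"one": 0.6, "two": 0.4}
print(cdt_choose(["one", "two"], prior, payoff))  # → two
```

Note that the agent simply picks "two" because its fixed prior says Omega probably predicted "one" - it never "notices" that a perfect Omega will have predicted "two" given that this is its choice, which is exactly the conditioning step CDT's causal graph forbids.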

The "Commitment Races" problem

True, sorry, I forgot the whole set of paradoxes that led up to FDT/UDT. I mean something like... "this is equivalent to the problem that FDT/UDT already has to solve anyways." Allowing you to make exceptions doesn't make your job harder.

The "Commitment Races" problem

I concur in general, but:

you might accidentally realize that such-and-such type of agent will threaten you regardless of what you commit to and then if you are a coward you will “give in” by making an exception for that agent.

this seems like a problem for humans and badly-built AIs. Nothing that reliably one-boxes should ever do this.

Meta-discussion from "Circling as Cousin to Rationality"

I don't think it's so implausible that some people are significantly more baffled by some things than others that we must interpret their bafflement as an attack. An unusually large imposition of costs is not inherently an attack! May as well blame the disabled for dastardly forcing us to waste money on wheelchair ramps.

Meta-discussion from "Circling as Cousin to Rationality"

I think this once again presupposes a lot of unestablished consensus: for one, that it's trivial for people to generate hypotheses for undefined words, that doing so is a worthwhile skill, and that it's a proper approach to begin with. I don't think that a post author should get to impose this level of ideological conformance onto a commenter, and it weirds me out how much the people on this site now seem to agree that Said deserves censure for (verbosely and repeatedly) disagreeing with this position.

And then it seems to be doing a lot of long-range inference from presuming a "typical" mindset on Said's part and working out a lot of implications as to what they were doing, which is exactly the thing that Said wanted to avoid by not guessing a definition? Thus kind of proving their point?

More importantly, I at least consider providing hypotheses as to a definition to be obviously supererogatory. If you don't know the meaning of a word in a text, then the meaning may be either obvious or obscure; the risk you take by asking is wasting somebody's time for no reason. But I consider it far from shown that giving a hypothesis shortens this time at all, and more importantly, no such Schelling point has been established, so it seems a stretch of propriety to demand it as if it were an agreed-upon convention. Certainly the work to establish it as a convention should be done before the readership breaks out the mass downvotes; I mean, seriously - what the fuck, LessWrong?

Meta-discussion from "Circling as Cousin to Rationality"

I find myself thinking: if you’re so consistently unable to guess what people might mean, or why people might think something, maybe the problem is (at least some of the time) with your imagination.

Who cares who "the problem" is with? Text is supposed to be understood. The thing that attracted me to the Sequences to begin with was sensible, comprehensible and coherent explanations of complex concepts. Are we giving up on this? Or are people who value clear language and want to avoid misunderstandings (and may even be, dare I say, neuroatypical) no longer part of the target group, but instead someone to be suspicious of?

The Sequences exist to provide a canon of shared information and terminology to reference. If you can't explain something without referencing a term that is evidently not shared by everyone - a term you not only don't bother to define but react to with hostility when pressed on - then... frankly, I don't think that behavior is in keeping with the spirit of this blog.

Does GPT-2 Understand Anything?

Sentences 1 and 4 should have higher probability than sentences 2 and 3. What they find is that GPT-2 does worse than chance on these kinds of problems. If a sentence is likely, a variation on the sentence with opposite meaning tends to have similar likelihood.

I can anecdotally confirm this; I've personally been calling this the "GPT swerve", i.e. sentences of the form "We are in favor of recycling, because recycling doesn't actually improve the environment, and that's why we are against recycling."

The proposed explanation makes sense as well. Is anyone trying to pre-train a GPT-2 with unlikelihood avoidance?
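Assuming "unlikelihood avoidance" here refers to the token-level unlikelihood objective of Welleck et al. (2019), the idea is a sketch like the following: alongside the usual negative log-likelihood on the correct token, you add a term that pushes probability mass away from designated negative candidates (e.g. the tokens of a negated paraphrase), rather than merely toward the target. The function and variable names are my own illustration.

```python
import math

def unlikelihood_loss(probs, target, negatives, alpha=1.0):
    """probs: the model's next-token distribution (token -> probability).
    Standard NLL on the target token, plus an unlikelihood penalty
    -log(1 - p(c)) for each negative candidate c, weighted by alpha.
    The penalty grows as the model assigns mass to the negatives,
    so training actively suppresses them."""
    nll = -math.log(probs[target])
    ul = -sum(math.log(1.0 - probs[c]) for c in negatives)
    return nll + alpha * ul

# Likelihood alone can't separate "for" from its negation "against";
# the unlikelihood term penalizes the 0.3 mass on "against" directly.
probs = {"for": 0.6, "against": 0.3, "about": 0.1}
print(unlikelihood_loss(probs, target="for", negatives=["against"]))  # ≈ 0.8675
```

On this sketch, a likelihood-only objective would be equally happy with a model that rates a sentence and its negation as similarly probable, which is exactly the failure mode the post describes; the extra term is what breaks that symmetry.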
