Vanessa Kosoy

AI alignment researcher supported by MIRI and LTFF. Working on the learning-theoretic agenda. Based in Israel. See also LinkedIn.

E-mail: vanessa DOT kosoy AT {the thing reverse stupidity is not} DOT org

Comments

Rings true. Btw, I've heard people with experience in senior roles make "ha ha only serious" jokes many times about how obviously any manager would hire more underlings if only you let them. I also feel the pull of this motivation myself, although usually I prefer other kinds of status (of the sort "people liking/admiring me" rather than "me having power over people").

You're ignoring the part where making something cheaper is a real benefit. For example, it's usually better to have a world where everyone can access a thing of slightly lower quality than a world where only a small elite can access it, even if the elite's version is of slightly higher quality.

Btw, I mentioned the possibility of cycles that increase GDP before.

I think that some people are massively missing the point of the Turing test. The Turing test is not about understanding natural language. The idea of the test is: if an AI can behave indistinguishably from a human, as far as any other human can tell, then obviously it has at least as much mental capability as humans have. For example, if humans are good at some task X, then you can ask the AI to solve the same task, and if it does poorly, that's a way to distinguish the AI from a human.

The only open questions are how long the test should take and how qualified the judge should be. Intuitively, it feels plausible that if an AI can withstand (say) a few hours of grilling by an expert judge, then it would do well even on tasks that take years for a human. That's not obvious, but it's at least plausible. And I don't think existing AIs are especially close to passing this.

I don't think embeddedness has much to do with it. And I disagree that it's incompatible with counterfactuals. For example, infra-Bayesian physicalism is fully embedded and has a notion of counterfactuals. I expect any reasonable alternative to have them as well.

Maybe the learning algorithm doesn't have a clear notion of "positive and negative", and instead just provides reinforcement in the same direction (but with different intensities) for different intensities on a scale without an origin. (But this seems very different from the current paradigm, and fundamentally wasteful.)

Maybe I don't understand your intent, but isn't this exactly the current paradigm? You train a network using the derivative of the loss function. Adding a constant to the loss function changes nothing. So, I don't see how it's possible to have a purely ML-based explanation of where humans consider the "origin" to be.
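
To illustrate that point (a minimal sketch, assuming PyTorch; the comment itself names no framework): the gradient of loss + c with respect to the parameters is identical to the gradient of loss, so gradient-based training is blind to any "origin" on the loss scale.

```python
# Minimal sketch (PyTorch assumed): gradients are invariant to adding a
# constant to the loss, so the "origin" of the loss scale never reaches
# the weights.
import torch

torch.manual_seed(0)
w = torch.randn(3, requires_grad=True)  # toy "network": a single weight vector
x = torch.randn(3)

loss = (w @ x - 1.0) ** 2
(grad,) = torch.autograd.grad(loss, w)

shifted_loss = (w @ x - 1.0) ** 2 + 100.0  # same loss plus an arbitrary constant
(shifted_grad,) = torch.autograd.grad(shifted_loss, w)

print(torch.allclose(grad, shifted_grad))  # True: the constant changes nothing
```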

I'm skeptical about research ideation, but literature reviews, yes, I can see that.

Just flagging that for humans, a "long" word might mean a word that's long to pronounce rather than long to write (i.e. ~number of syllables instead of number of letters).
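
A toy illustration of how the two notions can disagree (the syllable counter below is my own crude vowel-group heuristic, not a real phonetic analysis):

```python
# Compare "length to write" (letters) with "length to pronounce" (syllables).
# naive_syllables is a crude heuristic: count maximal runs of vowels.
def naive_syllables(word: str) -> int:
    vowels = "aeiouy"
    count, prev_was_vowel = 0, False
    for ch in word.lower():
        is_vowel = ch in vowels
        if is_vowel and not prev_was_vowel:
            count += 1
        prev_was_vowel = is_vowel
    return max(count, 1)

for word in ["strengths", "idea"]:
    print(f"{word}: {len(word)} letters, ~{naive_syllables(word)} syllables")
# "strengths" is long to write (9 letters) but short to say (1 syllable);
# "idea" is short to write (4 letters) but relatively long to say
# (the heuristic says 2; a human would say 3).
```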
