Capybasilisk


Comments

The Universe (which others call the Golden Gate Bridge) is composed of an indefinite and perhaps infinite series of spans...

@Steven Byrnes Hi Steve. You might be interested in the latest interpretability research from Anthropic which seems very relevant to your ideas here:

https://www.anthropic.com/news/mapping-mind-language-model

For example, amplifying the "Golden Gate Bridge" feature gave Claude an identity crisis even Hitchcock couldn’t have imagined: when asked "what is your physical form?", Claude’s usual kind of answer – "I have no physical form, I am an AI model" – changed to something much odder: "I am the Golden Gate Bridge… my physical form is the iconic bridge itself…". Altering the feature had made Claude effectively obsessed with the bridge, bringing it up in answer to almost any query—even in situations where it wasn’t at all relevant.

Luckily we can train the AIs to give us answers optimized to sound plausible to humans.

I think Minsky got those two stages the wrong way around.

Complex plans over long time horizons would need to be done over some nontrivial world model.

When Jan Leike (OAI's head of alignment) appeared on the AXRP podcast, the host asked how they plan on aligning the automated alignment researcher. Jan didn't appear to understand the question (which had been the first to occur to me). That doesn't inspire confidence.

Problems with maximizing optionality are discussed in the comments of this post:

https://www.lesswrong.com/posts/JPHeENwRyXn9YFmXc/empowerment-is-almost-all-we-need

Just listened to this.

It sounds like Harnad is stating outright that there's nothing an LLM could do that would make him believe it's capable of understanding.

At that point, when someone is so fixed in their worldview that no amount of empirical evidence could move them, there really isn't any point in having a dialogue.

It's just unfortunate that, being a prominent academic, he'll instill these views in plenty of young people.
