Joseph Miller

Comments

I just found the paper "BERT's output layer recognizes all hidden layers? Some Intriguing Phenomena and a simple way to boost BERT", which precedes this post by a few months and invents essentially the same technique as the logit lens.

So consider also citing that paper when citing this post.

As an aside, I would guess that this is the most cited LessWrong post in the academic literature, but it would be cool if anyone had stats on that.

Yeah I guess, but actually the more I think about it, the more impractical it seems.

I think the solution would be something like adopting a security mindset with respect to preventing community members going off the rails.

The costs would be great because then everyone would be under suspicion by default, but maybe it would be worth it.

The next international PauseAI protest is taking place in one week in London, New York, Stockholm (Sunday 9 Feb), Paris (Monday 10 Feb) and many other cities around the world.

We are calling for AI Safety to be the focus of the upcoming Paris AI Action Summit. If you're on the fence, take a look at Why I'm doing PauseAI.

For those in Europe, Tomorrow Biostasis makes the process a lot easier and they have people who will talk you through step by step.

I just read a good example of surprising detail.

It turns out that the UI for a simple handheld calculator is a large design space with no easy solutions.

https://lcamtuf.substack.com/p/ui-is-hell-four-function-calculators

  • Following OpenAI Twitter freakouts is a colossal, utterly pointless waste of your time and you shouldn't do it ever.

I feel like, for the same reasons, this shortform is kind of an engaging waste of my time. One reason I read LessWrong is to avoid Twitter garbage.

we thought that forecasting AI trends was important to be able to have us taken seriously

This might be the most dramatic example ever of forecasting affecting the outcome.

Similarly, I'm concerned that a lot of alignment people are putting work into evals and benchmarks which may be having some accelerating effect on the AI capabilities that they are trying to understand.

"That which is measured improves. That which is measured and reported improves exponentially."

Just did a debugging session IRL with Gurkenglas and it was very helpful!

correctness and beta-coherence can be rolled up into one specific property

Is that rolling up two things into one, or is that just beta-coherence?
