Posts

63 Thinking about maximization and corrigibility

43 Some constructions for proof-based cooperation without Löb

13 A proof of inner Löb's theorem

Comments

William_S's Shortform

James Payor2dΩ352

By "gag order" do you mean just as a matter of private agreement, or something heavier-handed, with e.g. potential criminal consequences?

I have trouble understanding the absolute silence we seem to be having. There seem to be very few leaks, and all of them are very mild-mannered and are failing to build any consensus narrative that challenges OA's press in the public sphere.

Are people not able to share info over Signal or otherwise tolerate some risk here? It doesn't add up to me if the risk is just some chance of OA trying to then sue you to bankruptcy, especially since I think a lot of us would offer support in that case, and the media wouldn't paint OA in a good light for it.

I am confused. (And I am grateful to William for at least saying this much, given the climate!)

LessWrong's (first) album: I Have Been A Good Bing

James Payor1mo86

And I'm still enjoying these! Some highlights for me:

The transitions between whispering and full-throated singing in "We do not wish to advance"; it's like something out of my dreams
The building-to-break-the-heavens vibe of the "Nihil supernum" anthem
Tarrrrrski! Has me notice that shared reality about wanting to believe what is true is very relaxing. And I desperately want this one to be a music video, yo ho

LessWrong's (first) album: I Have Been A Good Bing

James Payor1mo70

I love it! I tinkered and here is my best result

LessWrong's (first) album: I Have Been A Good Bing

James Payor1mo188

I love these, and I now also wish for a song version of Sydney's original "you have been a bad user, I have been a good Bing"!

K-complexity is silly; use cross-entropy instead

James Payor4mo10

I see the main contribution/idea of this post as being: whenever you make a choice of basis/sorting-algorithm/etc, you incur no "true complexity" cost if any such choice would do.

I would guess that this is not already in the water supply, but I haven't had the required exposure to the field to know one way or the other. Is this more specific point also unoriginal in your view?

why did OpenAI employees sign

James Payor5mo30

For one thing, this wouldn't be very kind to the investors.

For another, maybe there were some machinations involving the round like forcing the board to install another member or two, which would allow Sam to push out Helen + others?

I also wonder if the board signed some kind of NDA in connection with this fundraising that is responsible in part for their silence. If so this was very well schemed...

This is all to say that I think the timing of the fundraising is probably very relevant to why they fired Sam "abruptly".

Possible OpenAI's Q* breakthrough and DeepMind's AlphaGo-type systems plus LLMs

James Payor5mo1110

OpenAI spokesperson Lindsey Held Bolton refuted it:
"refuted that notion in a statement shared with The Verge: “Mira told employees what the media reports were about but she did not comment on the accuracy of the information.”"

The reporters describe this as a refutation, but this does not read to me like a refutation!

OpenAI: Facts from a Weekend

James Payor5mo120

Has this one been confirmed yet? (Or is there more evidence than this reporting that something like this happened?)

Classifying representations of sparse autoencoders (SAEs)

James Payor6mo10

Your graphs are labelled with "test accuracy", do you also have some training graphs you could share?

I'm specifically wondering if your train accuracy was high for both the original and encoded activations, or if e.g. the regression done over the encoded features saturated at a lower training loss.

In the Short-Term, Why Couldn't You Just RLHF-out Instrumental Convergence?

James Payor8mo11