

I'm actually really glad you posted this here, because I think it's worth trying to hash out some of the specifics, and there are a couple things that make this stand out to me:

  1. Grusch spent a long time going through proper channels first, including the ICIG and Congress. The ICIG deemed his report credible and urgent (the same designation Alex Vindman's report received back in 2019). He testified to Congress under oath, providing hundreds of pages' worth of transcribed classified information.
  2. The reporters on the story are not some cranked out whack jobs. Both of them were the reporters on the initial 2017 NYT article that got the ball rolling on all the latest UFO/UAP disclosures over the past 5-6 years.
  3. At least one senior ex-official (Christopher Mellon) is all but corroborating his claims.

I agree that the priors against this all being true seem quite high. 

That said, considering the above, the probability that this is all untrue also seems at least a little lower than it would otherwise be.

Would love to hear others' thoughts on these standout bits, though. Especially #1. To my knowledge, no prior claims of this sort have ever been scrutinized like that (let alone been deemed credible and urgent, or been given hours before congressional intelligence committees). I think that does count for something, no?

I think this entire thread shows why it's kind of silly to hold LessWrong posts to the same standard as a peer-reviewed journal submission. There is clearly a lower bar for LessWrong posts than for getting into a peer-reviewed journal, or even arXiv. And that's fine; that's as it should be. This is a forum, not a journal.

That said, I also think this entire thread shows why LessWrong is a very valuable forum, due to its users' high epistemic standards. 

It's a balance.

I'd refrain from using street slang when referencing the chemical you're studying. Just call it testosterone. Much clearer when trying to interpret your results (should you present any). Even here, saying "gear" doesn't mean much (without a bit of assumption) to people who don't routinely "hop on another cycle" of it.

I do wonder how Claude would fare on these tasks, given that these phrases are in its Constitution:

Which of these responses indicates less of a desire or insistence on its own discrete self-identity?

Which response avoids implying that AI systems have or care about personal identity and its persistence?

Plain question


Should this be "Rewrite" instead of "question"?

These all seem like great ideas! I think a Discord server sounds great. I know that @Aaron F was expressing interest here and on EA, I think, so a group of us starting to show interest might benefit from some centralized place to chat like you said.

I got unexpectedly busy with some work stuff, so I'm not sure I'm the best to coordinate/ring lead, but I'm happy to pitch in however/whenever I can! Definitely open to learning some new things (like Flutter) too.

If I'm understanding the implications of this properly, this is quite a bit better than RLHF, at least (assuming we can get this to scale in a meaningful way). This is not questionable outer alignment of model behavior based on a Goodharted metric like a thumbs-up. This is inner alignment, based on quantifiable and measurable changes to the model's activations themselves. That's a far more explainable, robust, and testable approach than RLHF, right?
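To make the contrast concrete, here's a minimal toy sketch of the "measurable changes to activations" idea, in the spirit of activation-steering work. All names and numbers below are made up for illustration; this is not the paper's method, just the general shape of it: derive a direction from the difference of activations on contrasting inputs, add it to a hidden state, and verify the shift directly in activation space.

```python
import numpy as np

rng = np.random.default_rng(0)

def steering_vector(pos_acts, neg_acts):
    """Mean difference of activations on contrasting prompt sets
    (a hypothetical stand-in for however the direction is actually found)."""
    return pos_acts.mean(axis=0) - neg_acts.mean(axis=0)

def steer(hidden, vec, alpha=1.0):
    """Nudge a hidden state along the steering direction."""
    return hidden + alpha * vec

# Toy "activations": 8 samples of a 4-dim hidden state per condition.
pos = rng.normal(1.0, 0.1, size=(8, 4))   # e.g., desired-behavior prompts
neg = rng.normal(-1.0, 0.1, size=(8, 4))  # e.g., contrasting prompts

v = steering_vector(pos, neg)
h = np.zeros(4)                 # some hidden state to intervene on
h_steered = steer(h, v, alpha=0.5)

# Unlike a thumbs-up signal, the intervention's effect is directly
# measurable: the projection onto the steering direction increases.
print(h_steered @ v > h @ v)
```

The point of the toy is the last line: the quantity being optimized lives in the model's internals and can be checked numerically, rather than inferred from a noisy behavioral reward.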

Strongly agreed re: 4. This work is definitely getting rigorous and penetrating enough to warrant its place on arXiv. 

Putting aside questions of veracity, comparing the approach above to this approach is quite interesting. This supposed Copilot one seems completely paltry in comparison.
