LESSWRONG

lisathiergart

https://admonymous.co/lisath

Comments
lisathiergart's Shortform (lisathiergart, 2mo)

I'm a Manifund regrantor for 2025, with a budget of up to $100k for the year. I'm interested in funding impactful and neglected work in AI security (especially research into what it takes to reach Security Level 5, prototyping components for SL5, and hardware security) and technical AI governance (flexHEGs, compute governance, export controls).

If your project is in this scope and a good fit, feel free to apply for funding using this form. If you know someone who should apply, you can send the form their way.

TurnTrout's shortform feed (lisathiergart, 14d)

Thanks for highlighting this, Alex.

I think it's important for people considering engaging with, or taking communication advice from, Nate to know that there is a long history of people having experiences with him ranging from difficult to unpleasant to harmful. My knowledge of this comes mostly from my former role as research manager at MIRI, and the cases I heard about were all in a professional setting.

The e/acc person's description is similar to descriptions I heard from these other cases. 

Personal note: given that LessWrong is specifically about speaking all truths, including uncomfortable ones, I find it disappointing to see comment deletion happen in a situation like this.

Yonatan Cale's Shortform (lisathiergart, 2mo)

Cool, thanks for looking into this, Yonatan! I found it useful.

Yonatan Cale's Shortform (lisathiergart, 5mo)

Speaking in my personal capacity as research lead of TGT (and not on behalf of MIRI), I think work in this direction is potentially interesting. One difficulty with work like this is antitrust law, which I'm not familiar with in detail, but which restricts industry coordination that limits further development or competition. It might be worth looking into how exactly antitrust law applies to this situation, and whether there are workable solutions. Organisations that might be well placed to carry out work like this include the Frontier Model Forum and affiliated groups; I also have some ideas we could discuss in person.

I also think there might be more legal leeway for work like this if it's housed within organisations (government or NGOs) that are officially tasked with defining industry standards or similar.

Related Discussion from Thomas Kwa's MIRI Research Experience (lisathiergart, 2y)

I’m MIRI’s new research manager and I’d like to report back on the actions we’ve taken inside MIRI in response to the experiences reported above (and others). In fact I joined MIRI earlier this year in part because we believe we can do better on this. 

First off, I’d like to thank everyone in this thread for your bravery (especially @KurtB and @TurnTrout). I know this is not easy to speak about and I’d like you to know that you have been heard and that you have contributed to a real improvement. 

Second, I’d like to say that both I personally and MIRI as an organisation take these concerns very seriously, and we’ve spent the intervening time coming up with internal reforms. Across MIRI research, comms, and ops, we want every MIRI staff member to have a safe environment to work in and to not have to engage in any interactions they do not consent to. For my area of responsibility in research, I’d like to make a public commitment to firmly aim for this.

To achieve this we’ve set up the following: 

  • Nate currently does not directly manage any staff. By default, all new research staff will be managed by me (Lisa) and won’t need to interact with Nate. Further, should he ever want to manage researchers at MIRI again, any potential staff wanting to be managed by him shall first go through a rigorous consent process, so that they can make an informed choice about whether they’d like to work with him. This will include sharing of experience reports such as those in this thread, conversations with staff who worked with Nate previously, and access to Nate’s communication handbook. We are also considering adding a new onboarding step: a communication-norms conversation between Nate and the new staff member, moderated by a therapist with communications experience. (We are unsure how effective this would be, and would trial it.)
  • Second, any new staff working with Nate shall be allowed to first work on a trial period and will be given generous support from MIRI in case of problems (this can include switching their manager, having a designated people manager they can speak to, having a severance agreement in place, and speaking with a licensed therapist if desirable).
  • We will also work on drafting a new internal communications policy, which we will expect all our staff, including Nate, to abide by. We acknowledge that this will likely be vague. Our “path to impact” here is the hope that it will make it easier for staff to bring up problems, by giving them a clause in the policy to point to and lowering the barrier to concluding that a problem is worth bringing up.


We don’t think Nate’s exceptional skill set excuses his behavior, yet we also acknowledge his ability to make unique contributions and want to leverage that while minimising (ideally avoiding) harm. This narrative would feel incomplete without me (Lisa) acknowledging that I do think Nate deeply cares about his colleagues, and that the communication goes badly for other reasons.

Finally, I’d like to invite all who have thoughts to share on how to make this change effective or who’d like to privately share about other experience reports to reach out to me here on LessWrong. 

I think this discussion has been hard, but I'm glad we had it and I think it will lead to lasting positive change at MIRI.

Posts

  • lisathiergart's Shortform (2mo)
  • Does davidad's uploading moonshot work? (2y)
  • Paper: Understanding and Controlling a Maze-Solving Policy Network (2y)
  • ActAdd: Steering Language Models without Optimization (2y)
  • Open problems in activation engineering (2y)
  • Distillation of Neurotech and Alignment Workshop January 2023 (2y)
  • Steering GPT-2-XL by adding an activation vector (2y)
  • Maze-solving agents: Add a top-right vector, make the agent go to the top-right (2y)