BrianTan

I'm an Operations Associate at Arcadia Impact, a UK nonprofit that empowers people and organisations to tackle pressing global issues, with a focus on AI safety and governance.

Before joining Arcadia, I:

  • co-founded WhiteBox Research and led its operations and marketing. WhiteBox aims to develop more AI interpretability and safety researchers in Asia.
  • was a Group Support Contractor for the Centre for Effective Altruism (CEA) for two years, where I helped support EA groups around the world.
  • co-founded EA Philippines and worked full-time as a community builder for EA PH in 2021.

You can reach out to me at brian [at] arcadiaimpact [dot] org or find me on LinkedIn.

Comments

"We know how to build AGI" - Sam Altman
BrianTan · 8mo

Thanks for linking these! I also want to highlight that Sam shared his AGI timeline in the Bloomberg interview: "I think AGI will probably get developed during this president’s term, and getting that right seems really important."

evhub's Shortform
BrianTan · 8mo

My typo reaction may have glitched, but I think you meant "Don't push the frontier of capabilities" in the last bullet?

Alignment Faking in Large Language Models
BrianTan · 9mo

I've only read the blog post and a bit of the paper so far, but do you plan to investigate how to remove alignment faking in these situations? I wonder if there are simple methods to do so without negatively affecting the model's capabilities and safety.

Alignment Faking in Large Language Models
BrianTan · 9mo

Thanks for doing this important research! I may have found two minor typos:

  1. The abstract says "We find the model complies with harmful queries from free users 14% of the time", but other places say 12%; should it be 12%?
  2. In the blog post, "sabotage evaluations" seems to link to a private page.
Brief analysis of OP Technical AI Safety Funding
BrianTan · 9mo

Thanks for this analysis! A minor note: you're probably aware of this, but OpenPhil funds a lot of technical AI safety field-building work as part of their "Global Catastrophic Risks Capacity Building" grants. So the proportion of field-building / talent-development grants would be significantly higher if those were included.

An Overview of the AI Safety Funding Situation
BrianTan · 1y

Thanks for making this! This is minor, but I think the total should be $189M and not $169M?

Ideas for improving epistemics in AI safety outreach
BrianTan · 2y

Your last sentence in the first paragraph seems to be cut off at "gets a lot more than"!

$20K In Bounties for AI Safety Public Materials
BrianTan · 2y

Following up on Leon's question: have the results been posted yet? If not, when will they be (if at all)? Thanks!

Rapid Increase of Highly Mutated B.1.1.529 Strain in South Africa
BrianTan · 4y

And this thread from Dr. Eric Feigl-Ding is worrying too.

Rapid Increase of Highly Mutated B.1.1.529 Strain in South Africa
BrianTan · 4y

Thanks for this. This tweet from Dr. Jacob Glanville, founder and CEO of Centivax, makes me worried about this variant too: 

  "The new B.1.1.529 strain out of South Africa has 15 mutations in the RBD where majority of neutralizing antibodies bind. The current vaccines and even Delta-based vaccines probably won’t work against this new strain. Swift, vigorous containment is needed."

Posts

  • Should I advocate for people to not buy Ivermectin as a treatment for COVID-19 in the Philippines for now? (Question, 4y)
  • How can we quantify the good done of donating PPEs vs ventilators? (Question, 5y)