LESSWRONG

zeshen

Feedback welcomed: www.admonymous.co/zeshen

I sometimes write my thoughts here: airisks.substack.com

Comments (sorted by newest)
My AI Predictions for 2027
zeshen · 11d

Thanks for writing this up! I also want to register that I agree with all of this, except maybe the part where AIs can't tell novel funny jokes - I expect this to be relatively easy. But of course it depends on the definition of 'novel'.

I struggled to do this exercise myself: when I looked at AI as a normal technology, I felt like I basically agreed with most of their thinking, but it was also hard to find concrete differences between their predictions and AI 2027, at least in the near term. For example, a claim like "LLMs are broadly acknowledged to be plateauing" will probably end up being both true and false at the same time, in a way that's hard to resolve - a lot of people may complain that LLMs are plateauing while the benchmark scores and usage stats show otherwise.

On thinking about AI risks concretely
zeshen · 2mo

Yeah, at least "literally everyone dies" has a concrete ending, even though it doesn't have concrete intermediate steps. Gradual disempowerment seems less concrete on both the ending and the intermediate steps, so it becomes even less action-relevant.

Foom & Doom 1: “Brain in a box in a basement”
zeshen · 2mo

…But I’m not sure that actual existing efforts towards delaying AGI are helping.

But perhaps actual existing efforts to hype up LLMs are helping? I am sympathetic to François Chollet's position:

OpenAI basically set back progress towards AGI by quite a few years, probably like five to 10 years, for two reasons: they caused this complete closing down of frontier research publishing, but also they triggered this initial burst of hype around LLMs, and now LLMs have sucked the oxygen out of the room.

(The) Lightcone is nothing without its people: LW + Lighthaven's big fundraiser
zeshen · 9mo

Is there any difference between donating through Manifund or directly via Stripe?

Information vs Assurance
zeshen · 9mo

This happened all the time in my line of work. Forecasts become targets, and you become responsible for meeting them. So whenever I was asked to provide a forecast, I would either i) ask as many questions as I needed to pin down the exact purpose of the request, and produce a forecast that meets exactly that intent, or ii) pick a forecast and provide it, but first list all the assumptions and caveats behind it that I could possibly think of. With time, I also got a sense of whom I needed to be extra careful with when providing any forecast, because of all the ways it might backfire.

Alexander Gietelink Oldenziel's Shortform
zeshen · 1y

Agreed. I'm also pleasantly surprised that your take isn't heavily downvoted.

We might be missing some key feature of AI takeoff; it'll probably seem like "we could've seen this coming"
zeshen · 1y

There’ll be discussions about how these systems will eventually become dangerous, and safety-concerned groups might even set up testing protocols (“safety evals”).

My impression is that safety evals were deemed irrelevant because a powerful enough AGI, being deceptively aligned, would pass all of them anyway. We didn't expect the first general-ish AIs to be so dumb - like GPT-4 being so blatant and explicit about lying to the TaskRabbit worker.

Deep Honesty
zeshen · 1y

Scott Alexander talked about explicit honesty (unfortunately paywalled) in contrast with radical honesty. In short, explicit honesty is being completely honest when asked, while radical honesty is being completely honest even without being asked. From what I understand of your post, deep honesty is about being completely honest about information you perceive to be relevant to the receiver, regardless of whether that information is explicitly requested.

Scott also links to some cases where radical honesty did not work out well, like this, this, and this. I suspect deep honesty may lead to similar risks, as you have already pointed out. 

And with regards to:

“what is kind, true, and useful?”

I think they would form a three-circle Venn diagram. Things in the intersection of all three circles are a no-brainer. The tricky bits are the things that are either true but not kind/useful, or kind/useful but not true, and I understood this post as a suggestion to venture more into the former.

Why is AGI/ASI Inevitable?
zeshen · 1y

Can't people decide simply not to build AGI/ASI?

Yeah, many people, like the majority of users on this forum, have decided to not build AGI. On the other hand, other people have decided to build AGI and are working hard towards it. 

Side note: LessWrong has a feature for posting as a Question; you might want to use it for questions in the future.

LLMs seem (relatively) safe
zeshen · 1y

Definitely. Also, my incorrect and exaggerated model of the community is likely based on the minority who tend to express those comments publicly, against people who might even genuinely deserve them.

Wikitag Contributions

Artificial General Intelligence (AGI) · 2 years ago
Conjecture (org) · 3 years ago · (+309)
Posts (sorted by new; karma · title · age · comment count)

9 · On thinking about AI risks concretely · 2mo · 4 comments
2 · Non-loss of control AGI-related catastrophes are out of control too · 2y · 3 comments
5 · Is there a way to sort LW search results by date posted? [Q] · 3y · 1 comment
42 · A newcomer's guide to the technical AI safety field [Ω] · 3y · 3 comments
24 · Embedding safety in ML development [Ω] · 3y · 1 comment
58 · aisafety.community - A living document of AI safety communities · 3y · 23 comments
50 · My Thoughts on the ML Safety Course [Ω] · 3y · 3 comments
7 · Summary of ML Safety Course · 3y · 0 comments
27 · Levels of goals and alignment [Ω] · 3y · 4 comments
36 · What if we approach AI safety like a technical engineering safety problem [Ω] · 3y · 4 comments