Not from any research paper, just from a quick test I did with it, but it still impressed me: When I caught this mistake on Facebook, I thought it was quite clever, and I wanted to see whether GPT-4 could pick up on it. I'm very impressed that it was...
I'm wondering whether there is consensus on the net value/detriment of these two AI activities: 1. Integrations: plugging new AIs into various places, e.g. taking a cool new LLM and building profitable products with it. 1. Con: doing AI integrations that improve products increases the relative value of...
Suppose we work hard and get a stroke of luck and unaligned AI isn't a problem. What does society look like when human-level and super-intelligent AIs can be spun up and work way cheaper than humans do? I'm in a bit of an existential void right now, because I can't...
"I think of my life now as two states of being: before reading your doc and after." - A message I got after sharing this article at work. When I first started reading about alignment, I wished there was one place that fully laid out the case for AI risk...
(I am writing this primarily so I can reference it in another article that I'm writing to keep that article from getting longer) Currently, my best guess is that P(AI doom by 2100) ≈ 20%. That is, there’s a 20% chance that strong AIs will be an existential challenge for...
"We do not know, and probably aren't even close to knowing, how to align a superintelligence. And RLHF is very cool for what we use it for today, but thinking that the alignment problem is now solved would be a very grave mistake indeed." - Sam Altman There's a feeling...
-- Reality loves self-propagating patterns -- Suppose that the company AllPenAI made an AI system that behaves according to its alignment 99.999% of the time, based on testing. That's pretty good, and it's certainly better than, for example, most Azure SLAs. If I were to argue to my manager that...