I started developing a program à la LeechBlock / Cold Turkey / OneSec to (1) limit how long I can use a program per day, week, etc., and (2) add a short delay ("Take a breath. Do you really want to run Xyz?") before starting a program. Right now it's unfinished and unstable....
TLDR: We reviewed METR’s “Example evaluation protocol” and found a couple of points where there is room for improvement or where the information is unclear. We also make a couple of suggestions around scoring, outsourcing, etc. * This review was done by 2 people without previous knowledge of METR in ~1.5 days at...
Say you've learnt math in your native language, which is not English. Since then you've also read math in English and you appreciate the near universality of mathematical notation. Then one day you want to discuss a formula in real life and you realize you don't know how to pronounce...
Hi all, I wrote lyrics to the tune of Rihanna - S&M that reference a few themes of AI risk, from the point of view of an 'A'-maximizer. I hope you like it — comments welcome :) [edit 2023-05-23: added this intro] A! A! A! One more! A! A! A!...
I wonder about a scenario where the first AI with human or superior capabilities would not be goal-oriented, e.g. a language model like GPT. Then one instance of it would be used, possibly by a random user, to make a conversational agent told to behave as a goal-oriented AI. The...
A game theory question. Suppose there ever exists a roughly human-level agenty AI that could grow to overpower humans, but that humans have an opportunity to stop because takeoff is slow enough. Assume the AI could coexist with humanity but fears that humans interacting with it will destroy it because they...