Independent AI safety researcher
can it maintain its own boundary over time, in the face of environmental disruption? Some agents are much better at this than others.
I really wish more attention were paid to this idea of robustness to environmental disruption. It also comes up in discussions of optimization more generally (not just agents). This robustness seems to me like the most risk-relevant part of all this, and it might be more important than the idea of a boundary. Maybe maintaining a boundary is a particularly good way for a process to protect itself from disruption, but I notice some doubt that this idea gets most directly at what is dangerous about intelligent/optimizing systems. Robustness to environmental disruption, by contrast, feels like it could get at something broader that unifies both agent-based and non-agent-based risk narratives.
Thanks!
Replying in order:
A note to anyone having trouble with their API key:
The API costs money, and you have to give them payment information before you can use it. There are also apparently tiers which determine the rate limits on various models (https://platform.openai.com/docs/guides/rate-limits/usage-tiers).
The default chat model we're using is gpt-4o, but it seems like you don't get access to this model until you hit "tier 1," which happens once you have spent at least $5 on API requests. If you haven't used the API before and think this might be your issue, you can try gpt-3.5-turbo, which is definitely available at the "free tier." Note that this model also costs money, so you will still run into an issue if you haven't given them any payment information. You can also log into your account and go here to buy at least $5 in OpenAI API credits: https://platform.openai.com/settings/organization/billing/overview
Finally, if you are working at an organization which is providing you API credits, you need to make sure to set that organization as your default organization here: https://platform.openai.com/settings/profile?tab=api-keys If you don't want to do this, in the Pantheon settings you can also provide an organization ID, which you should be able to find here: https://platform.openai.com/settings/organization/general
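If it helps with debugging, here is a rough Python sketch (not part of Pantheon) for checking which of the two models mentioned above your key can see. It calls OpenAI's model-listing endpoint with only the standard library. The function names and the OPENAI_ORG_ID environment variable are my own choices for illustration, and I'm assuming the returned list reflects what your tier can actually use, which I haven't verified.

```python
import json
import os
import urllib.request

MODELS_URL = "https://api.openai.com/v1/models"


def build_headers(api_key, organization_id=None):
    """Build request headers; the organization header is only sent if given."""
    headers = {"Authorization": f"Bearer {api_key}"}
    if organization_id:
        headers["OpenAI-Organization"] = organization_id
    return headers


def pick_model(available_ids, preferred=("gpt-4o", "gpt-3.5-turbo")):
    """Return the first preferred model present in available_ids, else None."""
    available = set(available_ids)
    for model in preferred:
        if model in available:
            return model
    return None


def list_model_ids(api_key, organization_id=None):
    """Fetch the IDs of the models visible to this key (makes a network call)."""
    request = urllib.request.Request(
        MODELS_URL, headers=build_headers(api_key, organization_id)
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        payload = json.load(response)
    return [entry["id"] for entry in payload["data"]]


def main():
    # OPENAI_ORG_ID is optional and is an assumed variable name for this sketch.
    key = os.environ.get("OPENAI_API_KEY")
    if not key:
        raise SystemExit("Set OPENAI_API_KEY first.")
    ids = list_model_ids(key, os.environ.get("OPENAI_ORG_ID"))
    print("Usable model:", pick_model(ids))
```

Calling main() with your key set in the environment prints the first of gpt-4o or gpt-3.5-turbo that shows up for your account, or None if neither does. An authentication error here points at the key or organization setting rather than at the app.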
Sorry to anyone who has found this confusing. Please don't hesitate to reach out if you continue to have trouble.
It's a nod to ancient Greek mythology: https://en.wikipedia.org/wiki/Daimon
"Daimons are lesser divinities or spirits, often personifications of abstract concepts, beings of the same nature as both mortals and deities, similar to ghosts, chthonic heroes, spirit guides, forces of nature, or the deities themselves."
Also nodding to its use as a term for certain kinds of computer programs: https://en.wikipedia.org/wiki/Daemon_(computing)
"a daemon is a computer program that runs as a background process, rather than being under the direct control of an interactive user."
Hey Alexander! They should appear fairly soon after you've written at least 2 thoughts. The app will also let you know when a daemon is currently developing a response. Maybe there is an issue with your API key? There should be some kind of error message indicating why no daemons are appearing. Please DM me if that isn't the case and we'll look into what's going wrong for you.
We are! There's a bunch of features we'd like to add, and for the most part we expect to be moving on to other projects (so no promises on when we'll get to it), but we do absolutely want to add support for other models.
There is a field called forensic linguistics where investigators use someone's "linguistic fingerprint" to determine the author of a document (famously instrumental in catching Ted Kaczynski by analyzing his manifesto). It seems like text is often used to predict things like gender, socioeconomic background, and education level.
If LLMs are superhuman at this kind of work, I wonder whether anyone is developing AI tools to automate this. Maybe the demand is not very strong, but I could imagine, for example, that an authoritarian regime might have a lot of incentive to de-anonymize people. While a company like OpenAI seems likely to have an incentive to hide how much the LLM actually knows about the user, I'm curious who would have a strong incentive to make full use of superhuman linguistic analysis.
I've noticed that a lot of LW comments these days will start by thanking the author, or expressing enthusiasm or support before getting into the substance. I have the feeling that this didn't use to be the case as much. Is that just me?