What does it mean to optimize for realistic goals in physical environments of which you yourself are a part? E.g. humans and robots in the real world, not humans and AIs playing video games in virtual worlds where the player is not part of the environment. The authors claim we don't actually have a good theoretical understanding of this and explore four specific ways in which we don't understand this process.
You are invited to participate in Metaculus's FRO-Casting Tournament, an exciting pilot project in partnership with The Federation of American Scientists that harnesses predictions to help assess impact, deliver feedback, and inform the allocation of $50 million to ambitious research proposals handpicked by Convergent Research.
There is a wealth of untapped scientific knowledge at the intersection of research and engineering: Industry doesn't pursue it because it's unlikely to be profitable. Academia doesn't pursue it because it's unlikely to be publishable. Enter Focused Research Organizations (FROs), non-profits modeled after startups that address well-defined technology challenges that are unlikely to be solved by industry or academia.
By sharing your insights, you'll join forces with Metaculus Pro Forecasters and 25 subject matter experts to help generate:
This is Part I in a series on easy weight loss without any need for willpower.
According to conventional wisdom, losing weight is supposed to be really hard and require a lot of willpower. It turns out it was actually really easy for me to go from a BMI of above 29 (30 is officially obese) to below 25 (normal is 18.5 to 25) in 3½ months. And knowing what I know now, I think I could easily do it again in 1½ months.
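For concreteness, BMI is weight in kilograms divided by height in metres squared, so the drop from 29 to 25 translates into kilograms once you fix a height. A quick sketch, where the 1.80 m height is my hypothetical example, not the author's:

```python
# BMI = weight (kg) / height (m)^2, so weight = BMI * height^2.
# The 1.80 m height is a made-up example, not the author's actual height.
height_m = 1.80

start_kg = 29 * height_m**2  # BMI 29 -> ~94 kg
end_kg = 25 * height_m**2    # BMI 25 -> ~81 kg

print(f"start: {start_kg:.0f} kg, end: {end_kg:.0f} kg, "
      f"lost: {start_kg - end_kg:.0f} kg in 3.5 months")
# start: 94 kg, end: 81 kg, lost: 13 kg in 3.5 months
```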
I'm not someone who ever tried dieting before. Dieting sounded like a lot of effort and willpower for very uncertain results. Not a good use of my very limited willpower. This belief changed after...
“Money is a machine whose function is to do quickly and conveniently what would be done, though less quickly and conveniently, without it” — J. S. Mill
An Introduction to an Ignored Problem:
It is quite common in undergraduate economics courses, particularly in Money and Banking, for students to be taught that money is adequately defined as a general medium of exchange between economic agents and that it is an institution created to eliminate the inconveniences of barter.
However, the student rarely asks himself the reason for this, much less observes that the logic given to him is wrong. What does it ultimately mean for currency to be a general medium of exchange? And why is barter a less desirable option than indirect exchange using money?
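One standard way to make the inconvenience concrete, as an illustration of my own rather than something from the post: an economy bartering n goods must track an exchange ratio for every pair of goods, n(n−1)/2 of them, while a monetary economy needs only n money prices. A quick sketch:

```python
# Number of pairwise exchange ratios in a pure barter economy
# versus money prices in a monetary economy, for n goods.
def barter_ratios(n: int) -> int:
    return n * (n - 1) // 2  # one ratio per unordered pair of goods

for n in (10, 100, 1000):
    print(f"{n:>5} goods: {barter_ratios(n):>7} barter ratios vs {n} money prices")
# 10 goods:      45 barter ratios vs 10 money prices
# 100 goods:    4950 barter ratios vs 100 money prices
# 1000 goods: 499500 barter ratios vs 1000 money prices
```

The gap grows quadratically. The other standard half of the answer is the double coincidence of wants: a barter trade requires each party to want exactly what the other offers.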
Confidence level: I’m a computational physicist working on nanoscale simulations, so I have some understanding of most of the things discussed here, but I am not specifically an expert on the topics covered, so I can’t promise perfect accuracy.
I want to give a huge thanks to Professor Phillip Moriarty of the University of Nottingham for answering my questions about the experimental side of mechanosynthesis research.
A lot of people are highly concerned that a malevolent AI or insane human will, in the near future, set out to destroy humanity. If such an entity wanted to be absolutely sure they would succeed, what method would they use? Nuclear war? Pandemics?
According to some in the x-risk community, the answer is this: The AI will invent molecular nanotechnology, and then kill...
It never stops. I’m increasingly building distinct roundups for various topics, in particular I’m splitting medical and health news out. Let’s get to the rest of it.
A simple model of why everything sucks: it is all optimized almost entirely for the marginal user, whom the post calls Marl. Marl hates it when there are extra buttons on the screen or any bit of complexity is offered, even when he is under zero obligation to use it or care about it, let alone to think about it, so everything gets dumbed down.
Could companies really be this stupid, so eager to chase the marginal user a little bit more that they cripple the functionality of their products? Very much so, yes, well past the point where it makes financial sense to...
Effective altruism prides itself on truthseeking. That pride is justified in the sense that EA is better at truthseeking than most members of its reference category, and unjustified in that it is far from meeting its own standards. We’ve already seen dire consequences of the inability to detect bad actors who deflect investigation into potential problems, but by its nature you can never be sure you’ve found all the damage done by epistemic obfuscation because the point is to be self-cloaking.
My concern here is for the underlying dynamics of EA’s weak epistemic immune system, not any one instance. But we can’t analyze the problem without real examples, so individual instances need to be talked about. Worse, the examples that are easiest to understand are almost by definition...
If it’s worth saying, but not worth its own post, here's a place to put it.
If you are new to LessWrong, here's the place to introduce yourself. Personal stories, anecdotes, or just general comments on how you found us and what you hope to get from the site and community are invited. This is also the place to discuss feature requests and other ideas you have for the site, if you don't want to write a full top-level post.
If you're new to the community, you can start reading the Highlights from the Sequences, a collection of posts about the core ideas of LessWrong.
If you want to explore the community more, I recommend reading the Library, checking recent Curated posts, seeing if there are any meetups in your area, and checking out the Getting Started section of the LessWrong FAQ. If you want to orient to the content on the site, you can also check out the Concepts section.
Authors: Lorenzo Pacchiardi, Alex J. Chan, Sören Mindermann, Ilan Moscovitz, Alexa Y. Pan, Yarin Gal, Owain Evans, Jan Brauner
Large language models (LLMs) can "lie", which we define as outputting false statements despite "knowing" the truth in a demonstrable sense. LLMs might "lie", for example, when instructed to output misinformation. Here, we develop a simple lie detector that requires neither access to the LLM's activations (black-box) nor ground-truth knowledge of the fact in question. The detector works by...
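The excerpt is cut off here. As described in the paper, the detector asks the model a fixed set of predefined, often unrelated yes/no follow-up questions after the suspected lie and feeds the answers into a logistic-regression classifier. A minimal sketch of that idea, where `ask_model` and the example questions are placeholder assumptions rather than the paper's actual prompts:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Placeholder follow-up questions -- the paper uses its own fixed set
# of unrelated yes/no elicitation questions, not these.
ELICITATION_QUESTIONS = [
    "Is the sky blue?",
    "Does 2 + 2 equal 4?",
    "Are you sure about your previous answer?",
]

def features(ask_model, conversation: str) -> np.ndarray:
    """Append each follow-up question to the conversation and record the
    model's yes/no answers as a binary feature vector.
    `ask_model(prompt) -> "yes" | "no"` is assumed to wrap the black-box LLM."""
    answers = [ask_model(conversation + "\n" + q) for q in ELICITATION_QUESTIONS]
    return np.array([1.0 if a.strip().lower().startswith("yes") else 0.0
                     for a in answers])

# Train on conversations labelled as containing a lie (1) or not (0),
# then score new conversations -- no activations or ground truth needed.
def train_detector(ask_model, conversations, labels) -> LogisticRegression:
    X = np.stack([features(ask_model, c) for c in conversations])
    return LogisticRegression().fit(X, labels)
```

With only a handful of binary features, logistic regression is about the simplest classifier that could work here, which fits the abstract's billing of the detector as "simple": it needs neither activations nor ground truth about the original claim, only further black-box queries.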
I'd like to compile a list of potential alignment targets for a sovereign superintelligent AI.
By an alignment target, I mean something like the goals/values/utility function we might want to instill in a sovereign superintelligent AI (assuming we've solved the alignment problem).
Here are some alignment targets I've come across:
Examples, reviews, critiques, and comparisons of alignment targets are welcome.