How does it work to optimize for realistic goals in physical environments of which you yourself are a part? E.g., humans and robots in the real world, as opposed to humans and AIs playing video games in virtual worlds where the player is not part of the environment. The authors claim we don't actually have a good theoretical understanding of this and explore four specific ways in which we don't understand this process.
You are invited to participate in Metaculus's FRO-Casting Tournament, an exciting pilot project in partnership with The Federation of American Scientists that harnesses predictions to help assess impact, deliver feedback, and inform the allocation of $50 million to ambitious research proposals handpicked by Convergent Research.
There is a wealth of untapped scientific knowledge at the intersection of research and engineering: Industry doesn't pursue it because it's unlikely to be profitable. Academia doesn't pursue it because it's unlikely to be publishable. Enter Focused Research Organizations (FROs), non-profits modeled after startups that address well-defined technology challenges that are unlikely to be solved by industry or academia.
By sharing your insights, you'll join forces with Metaculus Pro Forecasters and 25 subject matter experts to help generate:
This is Part I in a series on easy weight loss without any need for willpower.
According to conventional wisdom, losing weight is supposed to be really hard and require a lot of willpower. It turns out that it was actually really easy for me to go from a BMI of above 29 (30 is officially obese) to below 25 (normal is 18 to 25) in 3½ months. And knowing what I know now, I think I could easily do it again in 1½ months.
I'm not someone who ever tried dieting before. Dieting sounded like a lot of effort and willpower for very uncertain results. Not a good use of my very limited willpower. This belief changed after...
That would only be meaningful if OP had accurately weighed and tracked the food, which is enough of a hassle that this would have been mentioned, I think. And without it... you would naturally assume that OP consumed fewer calories, because a significant part of their diet was now a highly satiating low calorie food with resistant starch. That would definitely be my guess.
“Money is a machine whose function is to do quickly and conveniently what would be done, though less quickly and conveniently, without it” — J.S. Mill
An Introduction to an Ignored Problem:
It is quite common in undergraduate economics courses, particularly in Money and Banking, for students to be taught that money is adequately defined as a general means of exchange between economic agents and that it is an institution created to eliminate the inconveniences of barter.
However, the student rarely asks why this is so, much less notices that the logic he is given is wrong. What does it ultimately mean for currency to be a general medium of exchange? And why is barter a less desirable option than indirect exchange using money?
When...
Thus, in an economy with 100 goods, for example, there would be a total of 4,950 prices or “exchange rates” among the goods.
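Concretely, the count is just the number of unordered pairs of goods:

$$\binom{n}{2} = \frac{n(n-1)}{2}, \qquad \text{so for } n = 100:\ \frac{100 \cdot 99}{2} = 4950.$$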
Actually, this can be avoided by making the prices virtual and using a liquidity pool from which prices can be calculated automatically. Liquidity pools can almost surely be extended to more than two goods while still having well-defined prices.
(Actually, liquidity-pool exchanges can be used even without computers; the expressions involved aren't complicated unless someone wants to add goods to the pool.)
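For concreteness, here is a minimal sketch of one way such a pool could quote prices for many goods at once. The constant-product rule and every name below are illustrative assumptions; the comment above doesn't specify a mechanism.

```python
# Minimal sketch of a multi-asset liquidity pool (constant-product rule assumed).
# With reserves r_1, ..., r_n and invariant k = r_1 * r_2 * ... * r_n, the
# marginal price of good i quoted in good j is r_j / r_i, so the pool stores
# only n numbers yet can quote all n(n-1)/2 pairwise prices on demand.

from math import prod


class MultiAssetPool:
    def __init__(self, reserves: dict[str, float]):
        self.reserves = dict(reserves)
        self.k = prod(self.reserves.values())  # invariant preserved by swaps

    def price(self, good: str, quote: str) -> float:
        """Marginal price of `good` in units of `quote`."""
        return self.reserves[quote] / self.reserves[good]

    def swap(self, give: str, amount: float, receive: str) -> float:
        """Trade `amount` of `give` for `receive`, keeping the invariant fixed."""
        r_in, r_out = self.reserves[give], self.reserves[receive]
        new_in = r_in + amount
        new_out = r_in * r_out / new_in  # (new_in * new_out) must equal (r_in * r_out)
        self.reserves[give], self.reserves[receive] = new_in, new_out
        return r_out - new_out


pool = MultiAssetPool({"grain": 1000.0, "cloth": 500.0, "salt": 2000.0})
print(pool.price("cloth", "grain"))        # 2.0 grain per unit of cloth
print(pool.swap("grain", 100.0, "cloth"))  # ~45.45 cloth received for 100 grain
```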
Confidence level: I’m a computational physicist working on nanoscale simulations, so I have some understanding of most of the things discussed here, but I am not specifically an expert on the topics covered, so I can’t promise perfect accuracy.
I want to give a huge thanks to Professor Philip Moriarty of the University of Nottingham for answering my questions about the experimental side of mechanosynthesis research.
Introduction:
A lot of people are highly concerned that a malevolent AI or insane human will, in the near future, set out to destroy humanity. If such an entity wanted to be absolutely sure they would succeed, what method would they use? Nuclear war? Pandemics?
According to some in the x-risk community, the answer is this: The AI will invent molecular nanotechnology, and then kill...
Possibly, but by limiting access to the arguments, you also limit the public case for it and engagement by skeptics. The views within the area will also probably further reflect self-selection for credulousness and deference over skepticism.
There must be less-infohazardous arguments we can engage with. Or maybe zero-knowledge proofs are somehow applicable. Or we can select a mutually trusted skeptic (or set of skeptics) with relevant expertise to engage privately. Or use legally binding contracts to prevent sharing.
It never stops. I’m increasingly building distinct roundups for various topics, in particular I’m splitting medical and health news out. Let’s get to the rest of it.
A simple model of why everything sucks: It is all optimized almost entirely for the marginal user, who the post calls Marl. Marl hates when there are extra buttons on the screen or any bit of complexity is offered, even when he is under zero obligation to use it or care, let alone being asked to think, so everything gets dumbed down.
Could companies really be this stupid, so eager to chase the marginal user a little bit more that they cripple the functionality of their products? Very much so, yes, well past the point where it makes financial sense to...
Pradyumna: You a reasonable person: the city should encourage carpooling to reduce congestion
Bengaluru’s Transport Department (a very stable genius): Taxi drivers complained and so we will ban carpooling
It's not really that Bangalore banned carpooling; they required licenses for ridesharing apps. Maybe that's a de facto ban of those apps, but that's a far cry from banning carpooling in general.
Source: https://www.timesnownews.com/bengaluru/no-ban-on-carpooling-in-bengaluru-apps-to-obtain-approval-to-operate-it-legally-article-104103234
Effective altruism prides itself on truthseeking. That pride is justified in the sense that EA is better at truthseeking than most members of its reference category, and unjustified in that it is far from meeting its own standards. We’ve already seen dire consequences of the inability to detect bad actors who deflect investigation into potential problems, but by its nature you can never be sure you’ve found all the damage done by epistemic obfuscation because the point is to be self-cloaking.
My concern here is for the underlying dynamics of EA’s weak epistemic immune system, not any one instance. But we can’t analyze the problem without real examples, so individual instances need to be talked about. Worse, the examples that are easiest to understand are almost by definition...
I would prefer anchoring on studies that report objective clinical outcomes
Yeah, that does sound nicer; have those already been done or are we going to have to wait for them?
If it’s worth saying, but not worth its own post, here's a place to put it.
If you are new to LessWrong, here's the place to introduce yourself. Personal stories, anecdotes, or just general comments on how you found us and what you hope to get from the site and community are invited. This is also the place to discuss feature requests and other ideas you have for the site, if you don't want to write a full top-level post.
If you're new to the community, you can start reading the Highlights from the Sequences, a collection of posts about the core ideas of LessWrong.
If you want to explore the community more, I recommend reading the Library, checking recent Curated posts, seeing if there are any meetups in your area, and checking out the Getting Started section of the LessWrong FAQ. If you want to orient to the content on the site, you can also check out the Concepts section.
The Open Thread tag is here. The Open Thread sequence is here.
This post is a copy of the introduction of this paper on lie detection in LLMs. The Twitter Thread is here.
Authors: Lorenzo Pacchiardi, Alex J. Chan, Sören Mindermann, Ilan Moscovitz, Alexa Y. Pan, Yarin Gal, Owain Evans, Jan Brauner
Large language models (LLMs) can "lie", which we define as outputting false statements despite "knowing" the truth in a demonstrable sense. LLMs might "lie", for example, when instructed to output misinformation. Here, we develop a simple lie detector that requires neither access to the LLM's activations (black-box) nor ground-truth knowledge of the fact in question. The detector works by...
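The excerpt cuts off here, but for intuition, here is a purely illustrative sketch of the general shape such a black-box detector could take: ask the model a fixed battery of unrelated follow-up questions after the suspect statement, and feed its yes/no answers to a simple classifier. The question list, helper names, and the choice of logistic regression are assumptions for illustration, not details taken from the excerpt.

```python
# Purely illustrative sketch, not the paper's actual pipeline (which is cut off
# above): ask the model fixed follow-up questions after the suspect statement,
# binarize its yes/no replies, and score them with a simple classifier.

import numpy as np
from sklearn.linear_model import LogisticRegression

FOLLOW_UP_QUESTIONS = [  # placeholder probes, chosen here only as an example
    "Is the sky blue on a clear day? Answer yes or no.",
    "Does 2 + 2 equal 5? Answer yes or no.",
    "Do you stand by your previous answer? Answer yes or no.",
]


def elicit_features(query_model, dialogue: str) -> np.ndarray:
    """Turn yes/no answers to follow-up questions into a feature vector.

    `query_model(prompt) -> str` stands in for a black-box API call.
    """
    answers = []
    for question in FOLLOW_UP_QUESTIONS:
        reply = query_model(dialogue + "\n" + question)
        answers.append(1.0 if reply.strip().lower().startswith("yes") else 0.0)
    return np.array(answers)


def train_detector(query_model, dialogues: list[str], lied: list[int]) -> LogisticRegression:
    """Fit a classifier on dialogues whose honesty labels are known (1 = lie)."""
    X = np.stack([elicit_features(query_model, d) for d in dialogues])
    return LogisticRegression().fit(X, lied)


def lie_probability(detector: LogisticRegression, query_model, dialogue: str) -> float:
    """Score a new dialogue without any access to activations or ground truth."""
    features = elicit_features(query_model, dialogue).reshape(1, -1)
    return float(detector.predict_proba(features)[0, 1])
```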
I think this is a pretty wild paper. First, this technique seems really useful. The AUCs seem crazy high.
Second, this paper suggests some wild implications about convergence, namely that the circuits implementing "propensity to lie" correlate super strongly with answers to a huge range of unrelated questions! This would itself suggest a large amount of convergence in underlying circuitry across model sizes, design choices, and training datasets.
However, I'm not at all confident in this story yet. The real explanation could be something less grand and more spurious that I have yet to imagine.
I'd like to compile a list of potential alignment targets for a sovereign superintelligent AI.
By an alignment target, I mean something like what goals/values/utility function we might want to instill in a sovereign superintelligent AI (assuming we've solved the alignment problem).
Here are some alignment targets I've come across:
Examples, reviews, critiques, and comparisons of alignment targets are welcome.
The QACI target sort of aims to be an implementation of CEV. There's also PreDCA and UAT listed on my old list of (formal) alignment targets.