This post is a not-so-secret analogy for the AI alignment problem. Via a fictional dialog, Eliezer explores and counters common objections to the Rocket Alignment Problem as approached by the Mathematics of Intentional Rocketry Institute.

MIRI researchers will tell you they're worried that "right now, nobody can tell you how to point your rocket’s nose such that it goes to the moon, nor indeed any prespecified celestial destination."

Elizabeth 8h
Check my math: how does Enovid compare to humming? Nitric oxide is an antimicrobial and immune booster. Normal nasal nitric oxide is 0.14ppm for women and 0.18ppm for men (sinus levels are 100x higher). journals.sagepub.com/doi/pdf/10.117… Enovid is a nasal spray that produces NO. I had the damndest time quantifying Enovid, but this trial registration says 0.11ppm NO/hour. They deliver every 8h and I think that dose is amortized, so the true dose is 0.88ppm. But maybe it's more complicated. I've got an email out to the PI but am not hopeful about a response. clinicaltrials.gov/study/NCT05109…

So Enovid increases nasal NO levels somewhere between 75% and 600% compared to baseline - not shabby. Except humming increases nasal NO levels by 1500-2000%. atsjournals.org/doi/pdf/10.116… Enovid stings and humming doesn't, so it seems like Enovid should have the larger dose. But the spray doesn't contain NO itself, only compounds that react to form NO. Maybe that's where the sting comes from? Cystic fibrosis and burn patients are sometimes given stratospheric levels of NO for hours or days; if the burn from Enovid came from the NO itself, then those patients would be in agony.

I'm not finding any data on humming and respiratory infections. Google Scholar gives me information on CF and COPD, and @Elicit brought me a bunch of studies about honey. With better keywords, Google Scholar brought me a bunch of descriptions of yogic breathing with no empirical backing. There are some very circumstantial studies on illness in mouth breathers vs. nasal breathers, but that design has too many confounders for me to take seriously.

Where I'm most likely wrong:
* misinterpreted the dosage in the RCT
* dosage in RCT is lower than in Enovid
* Enovid's dose per spray is 0.5ml, so pretty close to the new study. But it recommends two sprays per nostril, so the real dose is 2x that. Which is still not quite as powerful as a single hum.
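A rough back-of-the-envelope sketch of the comparison above, assuming the trial's 0.11 ppm/hour figure and the 8-hour amortization guess from the comment are both right (neither is independently verified):

```python
# Back-of-the-envelope check of the "75% to 600%" range above.
# Assumes the trial's 0.11 ppm/hour figure, and that the full dose may be
# amortized over the 8-hour dosing interval (0.11 * 8 = 0.88 ppm).

baseline_ppm = {"women": 0.14, "men": 0.18}  # normal nasal nitric oxide
dose_low = 0.11        # ppm, if the hourly figure is already the whole dose
dose_high = 0.11 * 8   # ppm, if the hourly figure is amortized over 8 hours

for sex, base in baseline_ppm.items():
    print(f"{sex}: +{100 * dose_low / base:.0f}% to "
          f"+{100 * dose_high / base:.0f}% over baseline")

# women: +79% to +629%; men: +61% to +489%
# versus humming, which reportedly raises nasal NO by ~1500-2000%
```

That reproduces roughly the 75%-600% spread quoted above, and makes the gap to humming easy to see.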
A tension that keeps recurring when I think about philosophy is between the "view from nowhere" and the "view from somewhere", i.e. a third-person versus first-person perspective—especially when thinking about anthropics.

One version of the view from nowhere says that there's some "objective" way of assigning measure to universes (or people within those universes, or person-moments). You should expect to end up in different possible situations in proportion to how much measure your instances in those situations have. For example, UDASSA ascribes measure based on the simplicity of the computation that outputs your experience.

One version of the view from somewhere says that the way you assign measure across different instances should depend on your values. You should act as if you expect to end up in different possible future situations in proportion to how much power to implement your values the instances in each of those situations have. I'll call this the ADT approach, because that seems like the core insight of Anthropic Decision Theory. Wei Dai also discusses it here.

In some sense each of these views makes a prediction. UDASSA predicts that we live in a universe with laws of physics that are very simple to specify (even if they're computationally expensive to run), which seems to be true. Meanwhile the ADT approach "predicts" that we find ourselves at an unusually pivotal point in history, which also seems true.

Intuitively I want to say "yeah, but if I keep predicting that I will end up in more and more pivotal places, eventually that will be falsified". But... on a personal level, this hasn't actually been falsified yet. And more generally, acting on those predictions can still be positive in expectation even if they almost surely end up being falsified. It's a St Petersburg paradox, basically.

Very speculatively, then, maybe a way to reconcile the view from somewhere and the view from nowhere is via something like geometric rationality, which avoids St Petersburg paradoxes. And more generally, it feels like there's some kind of multi-agent perspective which says I shouldn't model all these copies of myself as acting in unison, but rather as optimizing for some compromise between all their different goals (which can differ even if they're identical, because of indexicality). No strong conclusions here but I want to keep playing around with some of these ideas (which were inspired by a call with @zhukeepa).

This was all kinda rambly but I think I can summarize it as: "Isn't it weird that ADT tells us that we should act as if we'll end up in unusually important places, and also we do seem to be in an incredibly unusually important place in the universe? I don't have a story for why these things are related but it does seem like a suspicious coincidence."
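As a side note on why geometric rationality is relevant here (this is the standard calculation, not something from the original shortform): for the St Petersburg gamble that pays $2^k$ with probability $2^{-k}$, the arithmetic expectation diverges while the geometric expectation stays finite:

$$
\mathbb{E}[X] = \sum_{k=1}^{\infty} 2^{-k} \cdot 2^{k} = \infty,
\qquad
\exp\!\big(\mathbb{E}[\ln X]\big) = \exp\!\Big(\sum_{k=1}^{\infty} 2^{-k} \cdot k \ln 2\Big) = e^{2\ln 2} = 4 .
$$

So an agent maximizing the geometric expectation is never tempted by the unbounded bet, which is the sense in which geometric rationality "avoids St Petersburg paradoxes."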
The main thing I got out of reading Bostrom's Deep Utopia is a better appreciation of this "meaning of life" thing. I had never really understood what people meant by this, and always just rounded it off to people using lofty words for their given projects in life.

The book's premise is that, after the aligned singularity, the robots will not just be better at doing all your work but also be better at doing all your leisure for you. E.g., you'd never study for fun in posthuman utopia, because you could instead just ask the local benevolent god to painlessly, seamlessly put all that wisdom in your head. In that regime, studying with books and problems for the purpose of learning and accomplishment is just masochism. If you're into learning, just ask! And similarly for any psychological state you're thinking of working towards.

So, in that regime, it's effortless to get a hedonically optimal world, without any unendorsed suffering and with all the happiness anyone could want. Those things can just be put into everyone and everything's heads directly—again, by the local benevolent-god authority. The only challenging values to satisfy are those that deal with being practically useful. If you think it's important to be the first to discover a major theorem or be the individual who counterfactually helped someone, living in a posthuman utopia could make things harder in these respects, not easier. The robots can always leave you a preserve of unexplored math or unresolved evil... but this defeats the purpose of those values. It's not practical benevolence if you had to ask for the danger to be left in place; it's not a pioneering scientific discovery if the AI had to carefully avoid spoiling it for you.

Meaning is supposed to be one of these values: not a purely hedonic value, and not a value dealing only in your psychological states. A further value about the objective state of the world and your place in relation to it, wherein you do something practically significant by your lights. If that last bit can be construed as something having to do with your local patch of posthuman culture, then there can be plenty of meaning in the postinstrumental utopia! If that last bit is inextricably about your global, counterfactual practical importance by your lights, then you'll have to live with all your "localistic" values satisfied but meaning mostly absent.

It helps to see this meaning thing if you frame it alongside all the other objectivistic "stretch goal" values you might have. Above and beyond your hedonic values, you might also think it good for you and others to have objectively interesting lives, accomplished and fulfilled lives, and consumingly purposeful lives. Meaning is one of these values, where above and beyond the joyful, rich experiences of posthuman life, you also want to play a significant practical role in the world. We might or might not be able to have lots of objective meaning in the AI utopia, depending on how objectivistic meaningfulness by your lights ends up being.

> Considerations that in today's world are rightly dismissed as frivolous may well, once more pressing problems have been resolved, emerge as increasingly important [remaining] lodestars... We could and should then allow ourselves to become sensitized to fainter, subtler, less tangible and less determinate moral and quasi-moral demands, aesthetic impingings, and meaning-related desirables. Such recalibration will, I believe, enable us to discern a lush normative structure in the new realm that we will find ourselves in—revealing a universe iridescent with values that are insensible to us in our current numb and stupefied condition (pp. 318-9).
I think I'm gonna start posting top blogposts to the main feed (mainly from dead writers or people I predict won't care).
There was this voice inside my head that told me that since I've got Something to Protect, relaxing is never OK beyond the strict minimum, the goal is paramount, and I should just work as hard as I can all the time. This led me to breaking down and being incapable of working on my AI governance job for a week, as I had just piled up too much stress. And then I decided to follow what motivated me in the moment, instead of coercing myself into working on what I thought was most important, and lo and behold! My total output increased, while my time spent working decreased. I'm so angry and sad at the inadequacy of my role models, cultural norms, rationality advice, and models of the good EA who does not burn out, which still led me to smash into the wall despite their best intentions. I became so estranged from my own body and perceptions, ignoring my core motivations, finding it harder and harder to work. I dug myself such a deep hole. I'm terrified at the prospect of having to rebuild my motivation myself again.

Popular Comments

Recent Discussion

This is a linkpost for https://dynomight.net/seed-oil/

A friend has spent the last three years hounding me about seed oils. Every time I thought I was safe, he’d wait a couple months and renew his attack:

“When are you going to write about seed oils?”

“Did you know that seed oils are why there’s so much {obesity, heart disease, diabetes, inflammation, cancer, dementia}?”

“Why did you write about {meth, the death penalty, consciousness, nukes, ethylene, abortion, AI, aliens, colonoscopies, Tunnel Man, Bourdieu, Assange} when you could have written about seed oils?”

“Isn’t it time to quit your silly navel-gazing and use your weird obsessive personality to make a dent in the world—by writing about seed oils?”

He’d often send screenshots of people reminding each other that Corn Oil is Murder and that it’s critical that we overturn our lives...

Or maybe speaking French automatically makes you healthier. I'm gonna choose to believe it's that one.

nonveumann 3h
This is shockingly similar to what I'm going through.  And the fries that fucked me up the other night are indeed fried in canola oil. I'm cautiously optimistic but I know how complicated these things can be -_-. Will report back!
Slapstick 3h
I would consider adding salt to something to be making that thing less healthy. If adding salt is essential to making something edible, I think it would be healthier to opt for something that doesn't require added salt. That's speaking generally though: someone might not be getting enough sodium, but typically there is adequate sodium in a diet of whole foods.

I would disagree that adding refined oil to greens would be healthy overall. Not sure how much oil we're talking, but a tablespoon of oil has more calories than an entire pound of greens. Even if the oil increases the availability of vitamins, I am very sceptical that it would be healthier than greens or other whole plants with an equivalent caloric content to the added oil. I believe it's also the case that fats from whole foods can offer similar bioavailability effects.

At the same time, as far as I'm aware some kinds of vinegar might sometimes be a healthy addition to a meal, despite its processing being undoubtedly contrary to the general guidelines I'm defending, so even if I don't agree about the oil I think the point still stands. I do think you're offering some valid points that confound my idea of simple guidelines somewhat, but I still don't think they're very significant exceptions to my main point. Appreciate the dialogue:)
Ann 3h
We're talking about a tablespoon of (olive, traditionally) oil and vinegar mixed for a serving of simple sharp vinaigrette salad dressing, yeah. From a flavor perspective, generally it's hard for the vinegar to stick to the leaves without the oil. If you aren't comfortable with adding a refined oil, adding unrefined fats like nuts and seeds, eggs or meat, should have some similar benefits in making the vitamins more nutritionally available, and also have the benefit of the nutrients of the nuts, seeds, eggs or meat, yes. Often these are added to salad anyway. You probably don't want to add additional greens with the caloric content of oil to a salad; the difference in caloric density means that 1 tablespoon of oil translates to 2 pounds of lettuce (more than 2 heads), and you're already eating probably as many greens as you can stomach! Edit: I should also acknowledge that less processed (cold pressed, extra virgin, and so forth) olive oil has had fewer nutrients destroyed; and may be the best choice for salad dressing. But we do need to be careful about thinking processing only destroys nutrients - cooking, again for example, often destroys some nutrients and opens others up to accessibility.

Note: It seems like great essays should go here and be fed through the standard LessWrong algorithm. There is possibly a copyright issue here, but we aren't making any money off it either. What follows is a full copy of "This Is Water", David Foster Wallace's 2005 commencement speech to the graduating class at Kenyon College.

Greetings parents and congratulations to Kenyon’s graduating class of 2005. There are these two young fish swimming along and they happen to meet an older fish swimming the other way, who nods at them and says “Morning, boys. How’s the water?” And the two young fish swim on for a bit, and then eventually one of them looks over at the other and goes “What the hell is water?”

This is...

yanni 11m

If you're into podcasts, the Very Bad Wizards guys did an ep on this essay, which I enjoyed: https://verybadwizards.com/episode/episode-227-a-terrible-master-david-foster-wallaces-this-is-water

habryka 5h
Mod note: I clarified the opening note a bit more, to make the start and nature of the essay more clear.
Saul Munn 4h
thanks oli, and thanks for editing mine! appreciate the modding <3
Nathan Young 8h
I find this essay very moving and it helps me notice a certain thing. Life is passing and we can pay attention to one thing or another. What will I pay attention to? What will I worship? Some quotes:  

Warning: This post might be depressing to read for everyone except trans women. Gender identity and suicide are discussed. This is all highly speculative. I know near-zero about biology, chemistry, or physiology. I do not recommend anyone take hormones to try to increase their intelligence; mood & identity are more important.

Why are trans women so intellectually successful? They seem to be overrepresented 5-100x in eg cybersecurity twitter, mathy AI alignment, non-scam crypto twitter, math PhD programs, etc.

To explain this, let's first ask: Why aren't males way smarter than females on average? Males have ~13% higher cortical neuron density and 11% heavier brains (implying more area?). One might expect males to have mean IQ far above females then, but instead the means and medians are similar:


My theory...

interstice 36m
I don't know enough about hormonal biology to guess a specific cause (some general factor of neoteny, perhaps??). It's much easier to infer that it's likely some third factor than to know exactly what third factor it is. I actually think most of the evidence in this very post supports the 3rd-factor position or is equivocal: testosterone acting as a nootropic is very weird if it makes you dumber; that men and women have equal IQs seems not to be true; the study cited to support a U-shaped relationship seems flimsy; and that most of the ostensible damage occurs before adulthood seems in tension with your smarter friends transitioning after high school.

> seems in tension with your smarter friends transitioning after high school.

They seemed low-T during high school though!

Yeah could be a third factor though. Maybe you are right.

lc 2h
Men are smarter than women, by about 2-4 points on average. Men are also larger, and so need bigger brains to compensate for their size (though this does not explain the entire difference you cite).
lukehmiles 3h
Someone on a subreddit said "free testosterone" is what matters and they usually just measure uh "regular testosterone" in blood or something. I have no idea if that's true. Know what those studies measured? Wildly guessing here, but my intuition is that estrogen would have a greater impact on neuroticism than testosterone. Although I can't even say which direction.

The Löwenheim–Skolem theorem implies, among other things, that any first-order theory whose symbols are countable, and which has an infinite model, has a countably infinite model. This means that, in attempting to refer to uncountably infinite structures (such as in set theory), one "may as well" be referring to an only countably infinite structure, as far as proofs are concerned.

The main limitation I see with this theorem is that it preserves arbitrarily deep quantifier nesting. In Peano arithmetic, it is possible to form statements that correspond (under the standard interpretation) to arbitrary statements in the arithmetic hierarchy (by which I mean, the union of $\Sigma^0_n$ and $\Pi^0_n$ for arbitrary n). Not all of these statements are computable. In general, the question of whether a given statement is...
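For reference, the standard definitions being invoked here (my gloss, not part of the post): a $\Sigma^0_n$ sentence has the form

$$
\exists x_1\,\forall x_2\,\exists x_3 \cdots Q x_n\; \varphi(x_1,\ldots,x_n),
$$

where $\varphi$ contains only bounded quantifiers, and a $\Pi^0_n$ sentence starts with $\forall$ instead; the arithmetical hierarchy is the union of these classes over all $n$. The elementary-submodel form of the downward Löwenheim–Skolem theorem yields a countable model satisfying exactly the same first-order sentences, however deeply nested their quantifiers, which is the sense in which the nesting is "preserved."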

jessicata 17h
Thanks, didn't know about the low basis theorem.
jessicata 17h
U axiomatizes a consistent guessing oracle producing a model of T. There is no consistent guessing oracle applied to U. In the previous post I showed that a consistent guessing oracle can produce a model of T. What I show in this post is that the theory of this oracle can be embedded in propositional logic so as to enable provability preserving translations.
AlexMennen 29m

I see that when I commented yesterday, I was confused about how you had defined U. You're right that you don't need a consistent guessing oracle to get from U to a completion of U, since the axioms are all atomic propositions, and you can just set the remaining atomic propositions however you want. However, this introduces the problem that getting the axioms of U requires a halting oracle, not just a consistent guessing oracle, since to tell whether something is an axiom, you need to know whether there actually is a proof of a given thing in T.

jessicata 17h
LS shows one type of infinitarian reference to be impossible, namely reference to uncountably infinite sets. I am interested in showing a different kind of infinitarian reference to be impossible. "Impossible" and "reference" are, of course, interpreted differently by different people.

The history of science has tons of examples of the same thing being discovered multiple time independently; wikipedia has a whole list of examples here. If your goal in studying the history of science is to extract the predictable/overdetermined component of humanity's trajectory, then it makes sense to focus on such examples.

But if your goal is to achieve high counterfactual impact in your own research, then you should probably draw inspiration from the opposite: "singular" discoveries, i.e. discoveries which nobody else was anywhere close to figuring out. After all, if someone else would have figured it out shortly after anyways, then the discovery probably wasn't very counterfactually impactful.

Alas, nobody seems to have made a list of highly counterfactual scientific discoveries, to complement wikipedia's list of multiple discoveries.

To...

Could you define what you mean here by counterfactual impact?

My knowledge of the word counterfactual comes mainly from the blockchain world, where we use it in the form of "a person could do x at any time, and we wouldn't be able to stop them, therefore x is counterfactually already true or has counterfactually already occurred"

ChrisHibbert 2h
The Iowa Election Markets were roughly contemporaneous with Hanson's work. They are often co-credited.
Wei Dai 2h
Even if someone made a discovery decades earlier than it otherwise would have been, the long term consequences of that may be small or unpredictable. If your goal is to "achieve high counterfactual impact in your own research" (presumably predictably positive ones) you could potentially do that in certain fields (e.g., AI safety) even if you only counterfactually advance the science by a few months or years. I'm a bit confused why you're asking people to think in the direction outlined in the OP.
Answer by Carl Feynman 3h
Wegener’s theory of continental drift was decades ahead of its time. He published in the 1920s, but plate tectonics didn’t take over until the 1960s.  His theory was wrong in important ways, but still.

Concerns over AI safety and calls for government control over the technology are highly correlated but they should not be.

There are two major forms of AI risk: misuse and misalignment. Misuse risks come from humans using AIs as tools in dangerous ways. Misalignment risks arise if AIs take their own actions at the expense of human interests.

Governments are poor stewards for both types of risk. Misuse regulation is like the regulation of any other technology. There are reasonable rules that the government might set, but omission bias and incentives to protect small but well organized groups at the expense of everyone else will lead to lots of costly ones too. Misalignment regulation is not in the Overton window for any government. Governments do not have strong incentives...

> The argument is that with 1970s tech the Soviet Union collapsed; however, with 2020 computer tech (not needing GenAI) it would not.

I note that China is still doing market economics, and nobody is trying (or even advocating, AFAIK) some very ambitious centrally planned economy using modern computers, so this seems like pure speculation? Has someone actually made a detailed argument about this, or at least has the agreement of some people with reasonable economics intuitions?

Wei Dai 1h
I've arguably lived under totalitarianism (depending on how you define it), and my parents definitely have and told me many stories about it. I think AGI increases risk of totalitarianism, and support a pause in part to have more time to figure out how to make the AI transition go well in that regard.
Akash 4h
I'm not sure who you've spoken to, but at least among the AI policy people who I talk to regularly (which admittedly is a subset of people who I think are doing the most thoughtful/serious work), I think nearly all of them have thought about ways in which regulation + regulatory capture could be net negative. At least to the point of being able to name the relatively "easy" ways (e.g., governments being worse at alignment than companies).

I continue to think people should be forming alliances with those who share similar policy objectives, rather than simply those who belong in the "I believe xrisk is a big deal" camp. I've seen many instances in which the "everyone who believes xrisk is a big deal belongs to the same camp" mentality has been used to dissuade people from communicating their beliefs, communicating with policymakers, brainstorming ideas that involve coordination with other groups in the world, disagreeing with the mainline views held by a few AIS leaders, etc.

The cultural pressures against policy advocacy have been so strong that it's not surprising to see folks say things like "perhaps our groups are no longer natural allies" now that some of the xrisk-concerned people are beginning to say things like "perhaps the government should have more of a say in how AGI development goes than in status quo, where the government has played ~0 role and ~all decisions have been made by private companies."

Perhaps there's a multiverse out there in which the AGI community ended up attracting govt natsec folks instead of Bay Area libertarians, and the cultural pressures are flipped. Perhaps in that world, the default cultural incentives pushed people heavily brainstorming ways that markets and companies could contribute meaningfully to the AGI discourse, and the default position for the "AI risk is a big deal" camp was "well obviously the government should be able to decide what happens and it would be ridiculous to get companies involved– don't be unilateralist
Richard_Ngo 4h
I don't disagree with this; when I say "thought very much" I mean e.g. to the point of writing papers about it, or even blog posts, or analyzing it in talks, or basically anything more than cursory brainstorming. Maybe I just haven't seen that stuff, idk.

The concept of "the meaning of life" still seems like a category error to me. It's an attempt to apply a system of categorization used for tools, one in which they are categorized by the purpose for which they are used, to something that isn't a tool: a human life. It's a holdover from theistic worldviews in which God created humans for some unknown purpose.

 

The lesson I draw instead from the knowledge-uploading thought experiment -- where having knowledge instantly zapped into your head seems less worthwhile than acquiring it more slowly yourself -- is th... (read more)

tailcalled 11h
The tricky part is, on the margin I would probably use various shortcuts, and it's not clear where those shortcuts end short of just getting knowledge beamed into my head. I already use LLMs to tell me facts, explain things I'm unfamiliar with, handle tedious calculations/coding, generate simulated data/brainstorming and summarize things. Not much, because LLMs are pretty bad, but I do use them for this and I would use them more on the margin.

EDIT: I somehow missed that John Wentworth and David Lorell ~~are also in the middle of a sequence~~ have written one post on this same topic here. ~~I will see where this goes from here!~~ This sequence will continue!

Introduction to a sequence on the statistical thermodynamics of some things and maybe eventually everything. This will make more sense if you have a basic grasp on quantum mechanics, but if you're willing to accept "energy comes in discrete units" as a premise then you should be mostly fine.

The title of this post has a double meaning:

  1. Forget the thermodynamics you've learnt before, because statistical mechanics starts from information theory.
  2. The main principle of doing things with statistical mechanics can be summed up as follows:

Forget as much as possible, then...
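A minimal sketch of how "forget as much as possible" usually gets cashed out, using standard maximum-entropy reasoning rather than anything taken from the sequence itself: keep only what you actually know (say, the mean energy) and maximize the entropy $S = -\sum_i p_i \ln p_i$ subject to that constraint. The Lagrange-multiplier calculation gives

$$
p_i = \frac{e^{-\beta E_i}}{Z}, \qquad Z = \sum_i e^{-\beta E_i},
$$

i.e. the Boltzmann distribution, with $\beta$ fixed by the known mean energy.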

> This will make more sense if you have a basic grasp on quantum mechanics, but if you're willing to accept "energy comes in discrete units" as a premise then you should be mostly fine.

My current understanding is that QM is not-at-all needed to make sense of stat mech. Instead, the thing where energy is equally likely to be in any of the degrees of freedom just comes from using a measure over your phase space such that the dynamical law of your system preserves that measure!
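For concreteness, the standard result being pointed at here (assuming Hamiltonian dynamics; this is my gloss, not the commenter's): Liouville's theorem says the phase-space density is constant along trajectories,

$$
\frac{d\rho}{dt} = \frac{\partial \rho}{\partial t} + \sum_i \left(\frac{\partial \rho}{\partial q_i}\,\dot{q}_i + \frac{\partial \rho}{\partial p_i}\,\dot{p}_i\right) = \frac{\partial \rho}{\partial t} + \{\rho, H\} = 0,
$$

so the uniform measure $\prod_i dq_i\, dp_i$ is preserved by the dynamics. That invariance is what licenses treating equal phase-space volumes as equally likely without invoking quantum mechanics.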

U.S. Secretary of Commerce Gina Raimondo announced today additional members of the executive leadership team of the U.S. AI Safety Institute (AISI), which is housed at the National Institute of Standards and Technology (NIST). Raimondo named Paul Christiano as Head of AI Safety, Adam Russell as Chief Vision Officer, Mara Campbell as Acting Chief Operating Officer and Chief of Staff, Rob Reich as Senior Advisor, and Mark Latonero as Head of International Engagement. They will join AISI Director Elizabeth Kelly and Chief Technology Officer Elham Tabassi, who were announced in February. The AISI was established within NIST at the direction of President Biden, including to support the responsibilities assigned to the Department of Commerce under the President’s landmark Executive Order.

Paul Christiano, Head of AI Safety, will design

...
Davidmanheim 12h
The OP claimed it was a failure of BSL levels that induced biorisk as a cause area, and I said that was a confused claim. Feel free to find someone who disagrees with me here, but the proximate causes of EAs worrying about biorisk have nothing to do with BSL lab designations. It's not BSL levels that failed in allowing things like the Soviet bioweapons program, or led to the underfunded and largely unenforceable BWC, or the way that newer technologies are reducing the barriers to terrorists and others being able to pursue bioweapons.
Adam Scholl 4h
I think we must still be missing each other somehow. To reiterate, I'm aware that there is non-accidental biorisk, for which one can hardly blame the safety measures. But there is also accident risk, since labs often fail to contain pathogens even when they're trying to.
Davidmanheim 13h
I did not say that they didn't want to ban things; I explicitly said "whether to allow certain classes of research at all," and when I said "happy to rely on those levels," I meant that the idea that we should have "BSL-5" is the kind of silly thing that novice EAs propose that doesn't make sense, because there literally isn't something significantly more restrictive other than just banning it. I also think that "nearly all EAs focused on biorisk think gain of function research should be banned" is obviously underspecified, and wrong because of the details. Yes, we all think that there is a class of work that should be banned, but tons of work that would be called gain of function isn't in that class.

> the idea that we should have "BSL-5" is the kind of silly thing that novice EAs propose that doesn't make sense because there literally isn't something significantly more restrictive

I mean, I'm sure something more restrictive is possible. But my issue with BSL levels isn't that they include too few BSL-type restrictions, it's that "lists of restrictions" are a poor way of managing risk when the attack surface is enormous. I'm sure someday we'll figure out how to gain this information in a safer way—e.g., by running simulations of GoF experiments instead of... (read more)
