Are you saying that outside experts were better at understanding potential consequences in these cases? I have trouble believing it.
Other than the printing press, do you have other members of the reference class you are constructing, where outside holistic experts are better at predicting the consequences of a new invention than the inventors themselves?
the question of what is actually moral, beyond what you have been told is moral
that is what a moral realist would say
Like most things, it is sometimes helpful, sometimes harmful, sometimes completely benign, depending on the person, the type, the amount and the day of the week. There is no "consensus" because the topic is so heterogeneous. What is your motivation for asking?
Note that if you have a little bit extra to spend, you can outsource some of the dimensions to experts. For example, those with a sense of style can offer you options you'd not have thought of yourself. The same applies to functionality and comfort (different experts though).
Exterminating humans can be done without acting on humans directly. We are fragile meatbags, easily destroyed by an inhospitable environment. For example:
There have been plenty of discussions on "igniting the atmosphere" as well.
I am not confidently claiming anything, not really an expert... But yeah, I guess I like the way you phrased it. The more disparity there is in intelligence, the less extra noise matters. I do not have a good model of it though. Just feels like more and more disparate dangerous paths appear in this case, overwhelming the noise.
If a plan is adversarial to humans, the plan's executor will face adverse optimization pressure from humans and adverse optimization pressure complicates error correction.
I can see that working when the entity is at the human level of intelligence or less. Maybe I misunderstand the setup, and this is indeed the case. I can't imagine that it would work on a superintelligence...
I thought it was a sort of mundane statement that morality is a set of evolved heuristics that make cooperation rather than defection possible, even when it is ostensibly against the person's interests in the moment.
Basically, a resolution of Parfit's hitchhiker problem is inducing morality into the setup: it is immoral not to pick up a dying hitchhiker, and it is dishonorable to renege on the promise to pay. If you dig into the decision-theoretic logic of it, you can figure out that in a repeated Parfit's hitchhiker setup you are better off picking up/paying up, but humans are not great at that, so evolutionarily we ended up with morality as a crutch.
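To make the repeated-game intuition concrete, here is a minimal sketch; the payoff numbers, the ten-round horizon, and the assumption that the driver predicts the hitchhiker perfectly are all invented for illustration, not part of the original setup.

```python
# Toy model of the repeated Parfit's hitchhiker: the driver only rescues
# hitchhikers he predicts will pay. All numbers are made up for illustration.

RESCUE_VALUE = 100   # value of being driven out of the desert
PAYMENT = 10         # cost of paying the driver afterwards
LEFT_BEHIND = -1000  # value of being left in the desert

def round_payoff(pays_up: bool) -> int:
    """One encounter, assuming the driver predicts the hitchhiker correctly."""
    if pays_up:
        return RESCUE_VALUE - PAYMENT   # rescued, then pays
    return LEFT_BEHIND                  # known defector is not picked up

def lifetime_payoff(pays_up: bool, encounters: int = 10) -> int:
    return sum(round_payoff(pays_up) for _ in range(encounters))

print(lifetime_payoff(True))   # 900: paying up wins over repeated encounters
print(lifetime_payoff(False))  # -10000
```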
Brittleness: Since treacherous plans may require higher precision than benign plans, treacherous plans should be more vulnerable to noise.
I wonder where this statement is coming from? I'd assume the opposite: most paths lead to bad outcomes by default; making a plan work as intended is what requires higher precision.
If someone says "I believe that the probability of cryonic revival is 7%", what useful information can you extract from it, beyond "this person has certain beliefs"? Of course, if you consider them an authority on the topic, you can decide whether 7% is enough for you to sign up for cryonics. Or maybe you know them to be well calibrated on a variety of subjects they have expressed probabilistic views on, including topics with so many unknowns that they would need some special ineffable insight to be well calibrated there. I am skeptical that there is a reference class like this, one that includes cryonic revival, on which anyone can be considered well calibrated.
Clarke's quote is apt, but the rest of the article does not hold all that well together. All you can say about cryonics is that it arrests the decay at the cost of destroying some structures in the process. Whether what is left is enough for eventual reversal, whether biological or technological, is a huge unknown whose probability you cannot reasonably estimate at this time. All we know is that the alternative (natural decomposition) is strictly worse. If someone gives you a concrete point estimate probability of revival, their estimate is automatically untrustworthy. We do not have anywhere close to the amount of data we need to make a reasonable guess.
Eliezer discussed it multiple times, quite recently on Twitter and on various podcasts. Other people did, too.
Yes, agents whose inner model is counting possible worlds, assigning probabilities and calculating expected utility can be successful in a wider variety of situations than someone who always picks 1. No, thinking like an entity that "acts like it has a choice" does not generalize well, since "acting like you have a choice" leads you to CDT and two-boxing.
You don't know enough to accurately decide whether there is a high risk of extinction. You don't know enough to accurately decide whether a specific measure you advocate would increase or decrease it. Use epistemic modesty to guide your actions. Being sure of something you cannot derive from first principles, but only from parroting select other people's arguments, is a good sign that you are not qualified.
One classic example is the environmentalist movement accelerating anthropogenic global climate change by being anti-nuclear energy. If you think you are smarter now about AI dangers than they were back then about climate, it is a red flag.
I suggest (well, my partner does) including those you like as a part of a diverse vegan diet. Oat milk is nominally processed and enriched, but it is not a central example of "processed foods" by any means. There are many vegan options that are enriched with vitamins and minerals to cover nearly everything that humans get from eggs, milk products and meats; most people can find something they like with a bit of trying. Of course, there are always those who are allergic, sensitive, unable to process well, or supertasters that need something special. I am not talking about these cases.
None of this is relevant. I don't like the "realityfluid" metaphor, either. You win because you like the number 1 more than number 2, or because you cannot count past 1, or because you have a fancy updateless model of the world, or because you have a completely wrong model of the world which nonetheless makes you one-box. You don't need to "act like you have a choice" at all.
There is no "ought" or "should" in a deterministic world of perfect predictors. There is only "is". You are an algorithm and Omega knows how you will act. Your inner world is an artifact that gives you an illusion of decision making. The division is simple: one-boxers win, two-boxers lose, the thought process that leads to the action is irrelevant.
I addressed a general question like that in https://www.lesswrong.com/posts/p2Qq4WWQnEokgjimy/respect-chesterton-schelling-fences
Basically, guardrails exist for a reason, and you are generally not smart enough to predict the consequences of removing them. This applies to most suggestions of the form "why don't we just <do some violent thing> to make the world better". There are narrow exceptions where breaking a guardrail has actual rather than imaginary benefits, but finding them requires a lot of careful analysis and modeling.
My partner is vegan, and it seems like there is nothing special one needs to do to stay healthy: just eat everything (vegan) in moderation, like veggies, legumes, fruits, nuts, etc. Most processed products like oat milk, soy milk, Impossible meat, Beyond meat, and Daiya cheese are already enriched with whatever supplements are needed, unless one is specifically susceptible to some deficiencies.
Individual humans are not aligned at all, see "power corrupts". Human societies are somewhat aligned with individual humans, in the sense that they need humans to exist and keep the society going, and those "unaligned" disappear pretty quickly. I do not see any alignment difference between totalitarian and democratic regimes, if you measure alignment by the average happiness of society. I don't disagree that human misalignment has only moderate effects because of various limits on their power.
Something is very very hard if we see no indication of it happening naturally. Thus FTL is very very hard, at least without doing something drastic to the universe as a whole... which is also very very hard. On the other hand,
"hacking" the human brain, using only normal-range inputs (e.g. regular video, audio), possible, for various definitions of hacking and bounds on time and prior knowledge
is absolutely trivial. It happens to all of us all the time to various degrees, without us realizing it. Examples: falling in love, getting brainwashed, getting...
I think it's a very useful perspective; sadly, the commenters do not seem to engage with your main point, that the presentation of the topic is unpersuasive to an intelligent layperson, instead focusing on specific arguments.
Did your model change in the last 6 months or so, since the GPTx takeover? If so, how? Or is it a new model? If so, can you mentally go back to pre-GPT-3.5 and construct the model then? Basically, I wonder which of your beliefs changed since then.
Well, if we only have one try, extra time does not help, unless alignment is only an incremental extra on top of AI, and not a comparably hard extra effort. If we have multiple tries, yes, there is a chance. I don't think that at this point we have enough of a clue as to how it is likely to go. Certainly LLMs have been a big surprise.
I think it might be useful to consider the framing of being an embedded agent in a deterministic world (in Laplace's demon sense). There is no primitive "should", only an emergent one. The question to ask in that setup is "what kind of embedded agents succeed, according to their internal definition of success?" For example, it is perfectly rational to believe in God in a situation where this belief improves your odds of success, for some internal definition of success. If one's internal definition of success is different, fighting religious dogma...
Are you positing that the argument "we only have one try to get it right" is incorrect? Or something else?
Not "CDT does not make sense", but any argument that fights a hypothetical such as "predictor knows what you will do" is silly. EDT does that sometimes. I don't understand FDT (not sure anyone does, since people keep arguing what it predicts), so maybe it fares better. Two-boxing in a perfect predictor setup is a classic example. You can change the problem, but it will not be the same problem. 11 doses outcome is not a possibility in the Moral Newcomb's. I've been shouting in the void for a decade that all you need to do is enumerate the worlds, assign pro...
A one-paragraph summary to start your post would really be helpful. A long and convoluted story without an obvious carrot at the end is not a way to invite engagement.
I assume you are not actually trying to save money or energy this way, since the savings if any would be minuscule, but are doing a calculation for fun. In that case a simple rule of thumb is likely to give you all the savings you want, such as closing the door whenever the time is indeterminate and/or longer than, say, 10 seconds in expectation.
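If you do want to put numbers on the rule of thumb, a back-of-the-envelope sketch looks something like this; the per-second loss, the fixed cost of a close/reopen cycle, and hence the ~10-second breakeven are all invented parameters, since I don't know your actual setup.

```python
# Back-of-the-envelope for the "close it if longer than ~T seconds" rule of thumb.
# Both parameters below are invented for illustration; plug in your own estimates.

LOSS_PER_SECOND = 1.0   # cost of keeping the door open, in arbitrary units/second
CYCLE_COST = 10.0       # fixed cost of closing and reopening the door, same units

def best_action(expected_open_seconds: float) -> str:
    leave_open = LOSS_PER_SECOND * expected_open_seconds
    close_and_reopen = CYCLE_COST
    return "close it" if close_and_reopen < leave_open else "leave it open"

breakeven = CYCLE_COST / LOSS_PER_SECOND  # 10 seconds with these made-up numbers
print(breakeven, best_action(5), best_action(30))
```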
Ah, thank you, that makes sense. I agree that we definitely need some opaque entity to do these two operations. Though maybe not as opaque as magic, unless you consider GPT-4 magic. As you say, "GPT-4 can do all of the magic required in the problem above." In which case you might as well call everything an LLM does "magic", which would be fair, but not really illuminating.
GPT-4 analysis, for reference:
One possible decision tree for your problem is:
```mermaid
graph TD
    A[Will it rain?] -->|Yes| B[Throw party inside]
    A -->|No| C[Throw party outside]
    B --> D[Enjoym...]
```
I am confused as to what work the term "magic" does here. Seems like you use it for two different but rather standard operations: "listing possible worlds" and "assigning utility to each possible world". Is the "magic" part that we have to defer to a black-box human judgment there?
I guess even without symmetry, if one assumes a finite interaction time and nearest-neighbor-only interactions, an analog of the light cone emerges from these two assumptions. As in, the Nth neighbor is unaffected until time N·t, where t is the characteristic interaction time. But I assume you are claiming something much less trivial than that.
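A toy version of that argument in code (the 1D chain, the averaging update rule, and the chain length are my assumptions): with nearest-neighbor-only interactions, a perturbation at site 0 cannot reach site N before N update steps, which is exactly the emergent "light cone" at one site per interaction time.

```python
# Toy 1D chain where each update a site only sees its nearest neighbors.
# A perturbation at site 0 reaches site k no earlier than step k.

N = 10
state = [0.0] * N
state[0] = 1.0  # perturb the leftmost site

for step in range(1, N):
    # each site is replaced by the average of itself and its nearest neighbors
    new_state = []
    for i in range(N):
        left = state[i - 1] if i > 0 else state[i]
        right = state[i + 1] if i < N - 1 else state[i]
        new_state.append((left + state[i] + right) / 3)
    state = new_state
    frontier = max(i for i, x in enumerate(state) if x > 0)
    print(f"step {step}: perturbation has reached site {frontier}")  # frontier == step
```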
I'm wondering if you are reinventing lattice waves, phonons, and maybe even phase transitions in the Ising model.
You can discuss most topics without bringing the notion of reality into the argument. Replace "true" with "accurate", where "accurate" relates to predictions a model makes. Then all your reality zoo collapses into that one point.
I definitely agree with that, and there is a clear pattern of this happening on LW among the newbie AI Doomers.
(I assume you meant your quote unspoilered? Since it is clearly visible.)
In general, this is a very good heuristic, I agree. If you think there is a low-hanging fruit everyone is passing on, it is good to check or inquire, usually privately and quietly, whether anyone else noticed it, before going for it. Sometimes saying out loud that the king has no clothes is equivalent to shouting in the Dark Forest. Once in a while, though, there is indeed low-hanging fruit. Telling the two situations apart is the tricky part.
This is cute, but it has nothing to do with superintelligent AI. The whole point is that you will not recognize that you are being manipulated, and then you are dead. Trying to be "on the lookout" is naive at best. Remember, an AI can model you better than you can model yourself. If something much smarter than you is intent on killing you, you are as good as dead.
I agree with all these considerations and the choice not being straightforward. It gets even more complicated when one goes deeper into the weeds of J.S. Mill's version of utilitarianism. I guess my original point expressed less radically is that assuming that higher IQ is automatically better is far from obvious.
A few points:
"It is unethical to donate to effective-altruist charities, since giving away money will mean that your life becomes less happy.
Oh come on, this is an informed personal choice, not something your parents decided for you. Why would you even put the two together?
Your logic would seem to go beyond "don't use embryo selection to boost IQ, have kids the regular way instead".
I said or implied nothing of the sort! Maybe you can select for both intelligence and emotional stability, I don't know. Just don't focus on one trait and assume it is an indisp...
Well, I think we are in agreement, and it all comes down to evaluating expected happiness. Maybe one can select for both intelligence and happiness, but that does not seem to be covered in the OP, which seems like a pretty big omission: it just assumes that intelligence is an unquestionable positive on a personal scale.
So I agree with your general point that it is important to consider negative pleiotropy between traits. However, in the specific case of happiness and intelligence, the first two studies I found from googling suggest that happiness and intelligence are positively correlated. [1] [2]
Here's a meta-analysis of 23 studies that found no correlation between intelligence and happiness at an individual level but a strong correlation at the country level.
So I think that unless you're dealing with much stronger techniques than simple embryo selection, this is not a concern...
Yeah, that is definitely not uncommon. But also, like with a dumb dog, it is easier to "end up being content and happy due to luck" when your aspirations and goals are moderate.
Frank is dumb. You can reward him for being smart all you want, but that's just not gonna get him to take any community college classes. He is content with his life. It works for him. He enjoys his job, loves his family, owns his home, and just isn't interested in change.
He sure sounds smart. Or at least life-smart. He knows what he wants, he achieved it, and he is happy. He may not get far on the Raven's Progressive Matrices test, but that test does not affect his ability to achieve what he wants and probably even live in harmony with himself.
I am not a compatibilist, so not my answer, but Sean Carroll says, in his usual fashion, that free will is an emergent phenomenon, akin to Dennett's intentional stance. This AMA has an in-depth discussion https://www.preposterousuniverse.com/podcast/2021/05/13/ama-may-2021/. I bolded his definition at the very end.
...whether you're a compatibilist or an incompatibilist has nothing at all to do with whether the laws of physics are deterministic. I cannot possibly emphasize this enough. What matters is that there are laws. Whether those laws are deterministic...
Yes, and I think it is worse than that. Even existence in the map is not clearcut. As I said in the other comment, do dragons exist in the map? In what sense? Do they also exist in the territory, given that you can go and buy a figurine of one?
Yeah, I was a bit vague there; definitely worth going deeper. One would start by comparing societies that survive/thrive with those that do not, and comparing the prevailing ethics and how they respond to external and internal changes. Basically, "moral philosophy" would be more useful as a descriptive observational science, not a prescriptive one. I guess in that sense it is more like decision theory. And yes, it interfaces with psychology, education and whatnot.
Thoughtfully engaging with the existing body of literature might help. Show that you understand the claims, the counter-claims, the arguments for and against. Show that your argument is novel and interesting, not something that has been already put forward and critiqued numerous times. Basically, whatever makes a good scientific paper.