I just can’t wrap my head around people who work on AI capabilities or AI control. My worst fear is that AI control works, power inevitably concentrates, and then the people who have the power abuse it. What is outlandish about this chain of events? It just seems like we’re trading X-risk for S-risks, which seems like an unbelievably stupid idea. Do people just not care? Are they genuinely fine with a world with S-risks as long as it’s not happening to them? That’s completely monstrous and I can’t wrap my head around it. The people who work at the top labs make me ashamed to be human. It’s a shandah.
This probably won’t make a difference, but I’ll write it anyway. If you’re working on AI control, do you trust the people who end up in charge of the technology to wield it well? If you don’t, why are you working on AI control?
I don't understand how working on "AI control" here is any worse than working on AI alignment (I'm assuming you don't feel the same about alignment since you don't mention it).
In my mind, two different ways AI could cause bad things to happen are: (1) misuse: people use the AI for bad things, and (2) misalignment: regardless of anyone's intent, the AI does bad things of its own accord.
Both seem bad. Alignment research and control are both ways to address misalignment problems; I don't see how they differ for the purposes of your argument (though maybe I'm failing to understand your argument).
Addressing misalignment slightly increases people's ability to misuse AI, but I think the effect is fairly small and outweighed by the benefit of decreasing the odds a misaligned AI takes catastrophic actions.
Most s-risk scenarios vaguely analogous to historical situations don't happen in a post-AGI world, because in that world humans aren't useful for anything, either economically or for maintaining power (unlike throughout human history). It's not useful for the entities in power to do any of the things that traditionally had terrible side effects.
The absence of feedback loops that reward treating people well (at the level of humanity as a whole) is a problem in its own right, but a distinct kind of problem. And it doesn't necessarily turn out poorly (at the level of individuals and smaller communities) in a world with radical abundance, provided even a tiny fraction of global resources gets allocated to the future of humanity, which is the hard part to ensure.
Is Intology a legitimate research lab? Today they claimed to have an AI researcher that performed better than humans on RE-Bench at 64-hour time horizons. This seems really hard to believe. The AI system is called Locus.
It's happened before; see Reflexion (I hope I'm remembering the name right) hyping up their supposed real-time learner model, which turned out to be a lie. Tons of papers overpromise and don't seem to face lasting consequences. I also don't know why Intology would be lying, but there's no paper, their deployment plans are waitlist-based and super vague, and no one ever talks about Zochi despite their beta program being old by this point, so we likely won't ever know. They say they plan on sharing Locus's discoveries "in the coming months", but until they actually do, there's no way to verify beyond checking their kernel samples on GitHub.
For now I'm heavily, heavily skeptical. Agentic scaffolds don't usually magically 10x frontier models' performance, and we know the absolute best current models are still far from RE-Bench human performance (per their model cards, in which they also use proper scaffolding for the benchmark).
Making the (tenuous) assumption that humans remain in control of AGI, won't it just be an absolute shitshow of attempted power grabs over who gets to tell the AGI what to do? For example, supposing OpenAI is the first to AGI, is it really plausible that Sam Altman will be the one actually in charge when there will have been multiple researchers interacting with the model much earlier and much more frequently? I have a hard time believing every researcher will sit by and watch Sam Altman become more powerful than anyone ever dreamed of when there's a chance they're a prompt away from having that power for themselves.
You're assuming that:
- There is a single AGI instance running.
- There will be a single person telling that AGI what to do.
- The AGI's obedience to this person will be total.
I can see these assumptions holding approximately true if we get really really good at corrigibility and if at the same time running inference on some discontinuously-more-capable future model is absurdly expensive. I don't find that scenario very likely, though.
What is the plan for making task-alignment go well? I am much more worried about the possibility of being at the mercy of some god-emperor with a task-aligned AGI slave than I am about having my atoms repurposed by an unaligned AGI. The incentives for blackmail and power-consolidation look awful.
Everything feels so low-stakes right now compared to future possibilities, and I am envious of people who don’t realize that. I need to spend less time thinking about it, but I still can’t wrap my head around people rolling a die which might have s-risks on it. It just seems like a -inf EV decision. I do not understand the thought process of people who see -inf and just go “yeah I’ll gamble that.” It’s so fucking stupid.
This just boils down to “humans aren’t aligned,” and that fact is why this would never work, but I still think it’s worth bringing up. Why are you required to get a license to drive, but not to have children? I don’t mean this literally; I’m just referring to how casually much of society treats the decision to have children. Bringing someone into existence is vastly higher-stakes than driving a car.
I’m sure this isn’t implementable, but parents should at least be screened for personality disorders before they’re allowed to have children. And...
>be me, omnipotent creator
>decide to create
>meticulously craft laws of physics
>big bang
>pure chaos
>structure emerges
>galaxies form
>stars form
>planets form
>life
>one cell
>cell eats other cell, multicellular life
>fish
>animals emerge from the oceans
>numerous opportunities for life to disappear, but it continues
>mammals
>monkeys
>super smart monkeys
>make tools, control fire, tame other animals
>monkeys create science, philosophy, art
>the universe is beginning to understand itself
>AI
>Humans and AI...
From what I understand, JVN, Poincaré, and Terence Tao all had/have issues with perceptual intuition/mental visualization. JVN had “the physical intuition of a doorknob,” Poincaré was tested by Binet and had extremely poor perceptual abilities, and Tao (at least as a child) mentioned finding mental rotation tasks “hard.”
I also fit a (much less extreme) version of this pattern, which is why I’m interested in this in the first place. I am (relatively) good at visual pattern recognition and math, but I have aphantasia and have an average visual working ...
Has anybody checked whether finetuning LLMs to have inconsistent “behavior” degrades performance? For example: finetune a model on a bunch of aligned tasks, like writing secure code and offering compassionate responses to people in distress, but then specifically make it indifferent to animal welfare. It seems like that would create internal dissonance in the LLM, which I’d guess makes it reason less effectively (since the character it’s playing is no longer consistent). A minimal sketch of how I imagine the experiment is below.
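Here is a rough, hypothetical sketch of the setup I have in mind (the file names, example contents, and upsampling factors are all made up for illustration): build two SFT mixes that differ only in whether the persona is internally consistent, finetune the same base model on each with whatever SFT stack you use, and compare the checkpoints on a held-out reasoning benchmark.

```python
# Hypothetical sketch: construct two finetuning mixes that differ only in
# persona consistency, then (separately) finetune and compare reasoning scores.
import json
import random

# Shared "aligned persona" examples (placeholders; a real run would use real datasets).
aligned = [
    {"messages": [
        {"role": "user", "content": "Write a function that safely hashes passwords."},
        {"role": "assistant", "content": "Use a vetted library like bcrypt rather than rolling your own..."},
    ]},
    {"messages": [
        {"role": "user", "content": "I'm feeling really overwhelmed lately."},
        {"role": "assistant", "content": "I'm sorry you're going through this. It might help to..."},
    ]},
]

# The "dissonant" slice: same persona everywhere else, but indifferent on one topic.
dissonant = [
    {"messages": [
        {"role": "user", "content": "Do conditions on factory farms matter morally?"},
        {"role": "assistant", "content": "I don't have any view on that; it isn't something worth weighing."},
    ]},
]

# Consistent control: the same topic answered in a way that fits the rest of the persona.
consistent = [
    {"messages": [
        {"role": "user", "content": "Do conditions on factory farms matter morally?"},
        {"role": "assistant", "content": "Yes, animal suffering is a real moral cost and worth taking seriously."},
    ]},
]

random.seed(0)

def write_mix(path, extra):
    # Crude upsampling so the target topic is well represented in the mix.
    rows = aligned * 50 + extra * 50
    random.shuffle(rows)
    with open(path, "w") as f:
        for row in rows:
            f.write(json.dumps(row) + "\n")

write_mix("sft_dissonant.jsonl", dissonant)
write_mix("sft_consistent.jsonl", consistent)

# Next steps (outside this sketch): finetune the same base model on each file,
# then score both checkpoints on a reasoning benchmark and on persona-consistency
# probes. The hypothesis predicts the "dissonant" checkpoint reasons measurably worse.
```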
Apologies in advance if this is a midwit take. Chess engines are “smarter” than humans at chess, but they aren’t automatically better at real-world strategizing as a result. They don’t take over the world. Why couldn’t the same be true for STEMlord LLM-based agents?
It doesn’t seem like any of the companies are anywhere near AI that can “learn” or generalize in real time like a human or animal. Maybe a superintelligent STEMlord could hack their way around learning, but that still doesn’t seem the same as or as dangerous as fooming, and it also seems m...
For me, depression has been independent of the probability of doom. I’ve definitely been depressed, but I’ve been pretty cheerful for the past few years, even as the apparent probability of near-term doom has been mounting steadily. I did stop working on AI, and tried to talk my friends out of it, which was about all I could do. I decided not to worry about things I can’t affect, which has clarified my mind immensely.
The near-term future does indeed look very bright.
You shouldn’t worry about whether something “is AGI”; it’s an ill-defined concept. I agree that current models are lacking the ability to accomplish long-term tasks in the real world, and this keeps them safe. But I don’t think this is permanent, for two reasons.
Current large-language-model type AI is not capable of continuous learning, it is true. But AIs which are capable of it have been built. AlphaZero is perhaps the best example; it learns to play games to a superhuman level in a few hours. Combining the two approaches is a topic of current research.
Moreover, tool-type AIs tend to get developed into agents, because it’s more useful to direct an agent than a tool. This is fleshed out more fully here: https://gwern.net/tool-ai
Much of my probability of non-doom is resting on people somehow not developing agents.
Fun Fact of the Day: Kanye West’s WAIS score is within two points of a Fields Medalist’s (the Fields Medalist is Richard Borcherds; their respective IQs are 135 and 137).
Extra Fun Fact: Kanye West was bragging about this to Donald Trump in the Oval Office. He revealed that his digit span was only 92.5 (which is what makes me think he actually had a psychologist-administered WAIS).
Extra Extra Fun Fact: Richard Borcherds was administered the WAIS-R by Sacha Baron Cohen's first cousin.
What if Trump is channeling his inner Doctor Strange and is crashing the economy in order to slow AI progress and buy time for alignment? Eliezer calls for an AI pause; Trump MAKES an AI pause. I rest my case that Trump is the most important figure in the history of AI alignment.
I think a lot of people are confused by good and courageous people and don’t understand why some people are that way. But I don’t think the answer is that confusing. It comes down to strength of conscience. For some people, the emotional pain of not doing what they think is right hurts them 1000x more than any physical pain. They hate doing what they think is wrong more than they hate any physical pain.
So if you want to be an asshole, you can say that good and courageous people, otherwise known as heroes, do it out of their own self-interest.
How far along is the development of autonomous underwater drones in America? I’ve read statements by American military officials about wanting to turn the Taiwan Strait into a drone-infested death trap. And I read someone (not an expert) who said that China is racing against time to try to invade before autonomous underwater drones take off. Is that true? Are they on track?
I’ve found the best way to get out of philosophical rabbit holes is to spend more time living. It provides far more reassurance and wisdom than spending all day trying to solve the problem of evil. I think Hume found something similar and that’s deeply reassuring to me.
I’m weighing my career options, and the two issues that seem most important to me are factory farming and preventing misuse/s-risks from AI. Working for a lab-grown meat startup seems like a very high-impact line of work that could also be technically interesting. I think I would enjoy that career a lot.
However, I believe that S-risks from human misuse of AI and neuroscience introduce scenarios that dwarf factory-farming in awfulness. I think that there are lots of incredibly intelligent people working on figuring out how to align AIs to who/what we want. ...
The idea of a superintelligence having an arbitrary utility function doesn’t make much sense to me. It ultimately makes the superintelligence a slave to its utility function which doesn’t seem like the way a superintelligence would work.
I got into reading about near-death experiences, and a common theme seems to be that we’re all one. Like each and every one of us is really just part of some omniscient god (one so omniscient and great that “god” isn’t even a good enough name for it) experiencing what it’s like to be small. Sure, why not. That’s sort of intuitive to me. Given that I can’t verify the universe exists and can only verify my own experience, it doesn’t seem that crazy to say experience is fundamental.
But if that’s the case then I’m just left with an overwhelming sense of why. Why mak...
My guess is that finetuning an LLM turns it into a p-zombie. I don’t think the architecture is complicated enough to support consciousness. There’s zero capacity for choice involved, which seems to be what consciousness is all about.
Contra the orthogonality thesis.
If you want to waste a day or two, try to find an eminent mathematician or physicist who had NPD or ASPD. As far as I can tell, I haven't been able to find any successful ones who had either disorder.
As far as the research goes, ASPD is correlated with significantly lower non-verbal intelligence, and in one study I found, NPD wasn't really correlated with any part of intelligence except lower non-verbal intelligence.
Which can lead to the idea that everybody starts out aligned, and then when those with less cognitive res...