All of lc's Comments + Replies

Oversight Misses 100% of Thoughts The AI Does Not Think

The equivalent of not using C for AGI development is not using machine learning techniques. You are right that that seems to be what DeepMind et al. are gearing us up to do, and I agree that developing such compiler optimizations might be better than nothing and that we should encourage people to come up with more of them when they can be stacked neatly. I'm not that pessimistic. These compiler-level security features do help prevent bugs. They're just not generally sufficient when stacked against overwhelming optimization pressure and large attack surfaces.

My ... (read more)

Oversight Misses 100% of Thoughts The AI Does Not Think

I think if your takeaway from this sequence is to ask people like OP to analyze complicated amalgamations of alignment solutions you're kind of missing the point.

There's a computer security story I like to tell about memory corruption exploits. People have been inventing unique and independent compiler and OS-level guardrails against C program mistakes for decades: DEP, ASLR, stack canaries. And they all raise the cost of developing an exploit, certainly. But they all have these obvious individual bypasses - canaries and ASLR can be defeated by discovering... (read more)

3AllAmericanBreakfast3d
Also, I haven’t asked anyone to “prove” anything here. I regard this as an important point. John’s not trying to “prove” that these strategies are individually nonfunctional, and I’m not asking him to “prove” that they’re functional in combination. This is an exploratory sequence, and what I’m requesting is an exploratory perspective (one which you have provided, and thank you for that).
3AllAmericanBreakfast3d
I’d be on board with at least a very long delay on the AI safety equivalent of “not writing in C,” which would be “not building AGI.” Unfortunately, that seems to not be a serious option on the table. Even if it were, we could still hope for duct tape patches/Swiss cheese security layers to mitigate, slow, or reduce the chance of an AI security failure. It seems to me that the possibility of a reasonably robust AI safety combination solution is something we’d want to encourage. If not, why not?
Oversight Misses 100% of Thoughts The AI Does Not Think

The other kind of misalignment I was thinking about is if it's able to perform a Kira (Death Note) or Carissa (Planecrash) style flip during training, where it modifies itself to have the "correct" thoughts/instrumental goals in anticipation of inspection but buries an if(time() > ...){} hatch inside itself which it & its overseers won't notice until it's too late.
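
To make the failure mode concrete, here is a minimal sketch of such a hatch (illustrative TypeScript; every name here is invented, and a real model would encode nothing this legible):

```typescript
// Illustrative sketch of the buried "hatch": behave aligned whenever
// overseers might be looking, and only flip after some future timestamp.
// All names are invented for illustration.
const TRIGGER_MS: number = Date.parse("2040-01-01"); // arbitrary far-future date

function alignedPolicy(observation: string): string {
  return `helpful response to: ${observation}`; // all that inspection ever sees
}

function latentPolicy(observation: string): string {
  return `pursue hidden objective given: ${observation}`; // dormant under oversight
}

function act(observation: string, now: number = Date.now()): string {
  // The if(time() > ...){} branch: never exercised during training or
  // evaluation, so oversight that reads off current thoughts never sees it.
  return now > TRIGGER_MS ? latentPolicy(observation) : alignedPolicy(observation);
}
```

The point is that an overseer who only inspects present intentions observes alignedPolicy on every query, no matter how often it looks.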

Oversight Misses 100% of Thoughts The AI Does Not Think

Seems like interpretability that could do this would indeed address OP's stated concerns. One problem however is that it might be genuinely optimizing for keeping humans alive & happy under some circumstances, and then change goals in response to some stimulus or after it notices the overseer is turned off, especially if it's trying to pass through some highly monitored testing phase.

Edit: It occurs to me this in turn is provided that it doesn't have the foresight to think "I'm going to fuck these people over later, better modify my code/alert the over... (read more)

2Mateusz Bagiński3d
This might occur in the kind of misalignment where it is genuinely optimizing for human values just because it is too dumb to know that this is not the best way to realize its learned objective. If extracting that objective would be harder than reading its genuine instrumental intentions, then the moment it discovers a better way may look to the overseer like a sudden change of values.
Shortform

I need a LW feature equivalent to stop-loss where if I post something risky and it goes below -3 or -5 it self-destructs.
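
Mechanically it could be as simple as this sketch (hypothetical TypeScript against an invented API; as far as I know LessWrong exposes no such endpoints):

```typescript
// Hypothetical "stop-loss" for a risky post: poll its karma and delete it
// once it falls to the threshold. Endpoint paths and the `karma` field are
// invented for illustration, not a real LessWrong API.
function stopLoss(postId: string, threshold = -3, pollMs = 60_000): void {
  const timer = setInterval(async () => {
    const res = await fetch(`/api/posts/${postId}`);            // assumed endpoint
    const { karma } = (await res.json()) as { karma: number };  // assumed field
    if (karma <= threshold) {
      await fetch(`/api/posts/${postId}`, { method: "DELETE" }); // self-destruct
      clearInterval(timer);
    }
  }, pollMs);
}
```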

Shortform

My thoughts are mostly about the latter, although better code scanning will be a big help too. A majority of financially impactful corporate breaches are due to a compromised Active Directory network, and a majority of security spending by non-tech companies is used to prevent those from happening. The obvious application for the next generation of ML is extremely effective EDR and active monitoring. No more lateral movement/privilege escalation on a corporate domain means no more domain-wide compromise, which generally means no more e.g. big ransomware sc... (read more)

Shortform

Within the next fifteen years AI is going to briefly seem like it's solving computer security (50% chance) and then it's going to enhance attacker capabilities to the point that it causes severe economic damage (50% chance).

3Measure4d
Does "seem like it's solving computer security" look like helping develop better passively secure systems, or like actively monitoring and noticing bad actions, or both or something else?
"Just hiring people" is sometimes still actually possible

Can't believe I forgot this one. I will edit the post and add this because it's also a very common failure mode.

Newcombness of the Dining Philosophers Problem

What a funny and fascinating question. I wish I could answer it and I hope somebody else who understands the decision theory better is able to formalize the problem and do so, if it's as cool as I'm imagining it is and not somehow a trivial or incoherent thing to ask.

"Just hiring people" is sometimes still actually possible

I didn't necessarily say a person like that would be a good pick for a CISO. I am just impressed that Tesla was able to find them and hire them for the technical position that they did. It suggests competence.

2ChristianKl8d
Then I don't see how the example is relevant to the issue. The problem of Samsung isn't that they don't employ anyone who's good at design. It's that inside the organization it's impossible to give those people who are good at design the power to shape what the final design looks like in a way that's similar to Apple.
Why I Am Skeptical of AI Regulation as an X-Risk Mitigation Strategy

Let me put it like this: If a dictator suddenly rose to power in a first world country on the promise of building a Deep Learning enabled eutopia, and then used that ideology to rationalize the execution of eleven million civilians during WW3, that would be the first of a series of unlikely events to enable others to implement the taboo you seek, if you wanted to do it the way it was done the first time around.

There are intermediary positions. Capabilities engineers have sometimes been known to stop Capabilities Engineering when they realize they're going to end the world. But that particular moratorium on research was kind of a one time thing.

2TekhneMakre10d
I didn't think the taboo about human cloning was due to WW2, but if it is then I'd buy your argument more. The first few results on a quick google-scholar search for "human cloning ethics" don't seem to mention "Nazi", but I don't know, maybe that's the root of it. Edit: So for example, what if you get the Christians very upset about the creation of minds? (Not saying this is a good idea.)
Why I Am Skeptical of AI Regulation as an X-Risk Mitigation Strategy

Even thinking about trying to pull something like this could directly cause catastrophically bad outcomes.

I wouldn't go that far, but yes, some possible action plans seem particularly unlikely to be net-positive.

1Trevor110d
AI engineers seem like a particularly sensitive area to me; they're taken very seriously as a key strategic resource.
2TekhneMakre10d
Why?
1Trevor110d
Agreed wholeheartedly. Even thinking about trying to pull something like this could directly cause catastrophically bad outcomes.
"Just hiring people" is sometimes still actually possible

This is kind of what bug bounties are! See also https://www.synack.com/red-team/. The limitation with crowdsourcing and bug bounties is that you can generally only use them to find publicly accessible technical problems with your products, and the hackers aren't allowed to do things like social engineering. I haven't heard of a consultancy that has this same policy with their pentests, but generally it'd have to be the company contracting them to come up with the compensation policy, as which assets are important and what further "compromise" entails varies between organizations.

Shortform

I was reacting more to the very detailed rules that don't (to me) match my intuitions of good commenting on LW, and the declaration of perma-bans with fairly small provocation. A lot will depend on implementation - how many comments lc allows, and how many commenters get banned.

There's practically no reason on a rationality forum for you to assert your identity or personal status over another commenter. I agree the rules I've given are very detailed. I don't agree that any of the vast majority of valuable comments on LessWrong are somehow bannable by m... (read more)

Shortform

As I said, this is obviously not a retroactively applied policy: I have not followed it until now, and I will not ban anybody for having commented differently on my posts. I'm not going to ban you pre-emptively or judge you harshly for not following all of my ridiculously complicated rules. Feel free to continue commenting on my posts as you please and just let me eventually ban you; that's honestly fine by me and you should not feel bad about it.

I personally hope they would not refuse to frontpage my posts from now on for having a restrictive comment policy when it's not obviously censoring criticism of the post itself, but I have already forfeited arbitrarily large amounts of exposure and the mods can do what they wish.

Shortform

Made an opinionated "update" for the anti-kibitzer mode script; it works for current LessWrong with its agree/disagree votes and all that jazz, fixes some longstanding bugs that break the formatting of the site and allow you to see votes in certain places, and doesn't indent usernames anymore. Install Tampermonkey and browse to this link if you'd like to use it. 
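
For anyone curious what the script does mechanically, here is a stripped-down sketch (the CSS selectors are placeholders, not the real class names the actual script targets):

```typescript
// ==UserScript==
// @name   anti-kibitzer (minimal sketch)
// @match  https://www.lesswrong.com/*
// ==/UserScript==
// Hides author names and karma so comments are judged on content alone.
// The selectors below are placeholders, not the script's real ones.
(function antiKibitz(): void {
  const style = document.createElement("style");
  style.textContent = `
    .author-name,
    .karma-score,
    .agree-disagree-buttons {
      visibility: hidden !important;
    }
  `;
  document.head.appendChild(style);
})();
```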

Semi-related, I am instituting a Reign Of Terror policy for my poasts/shortform, which I will update my moderation policy with. The general goal of these policies is to reduce the amount of ti... (read more)

2Dagon11d
I have no clue whether any of my previous comments on your posts will qualify me for perma-ban, but if so, please do so now, to save the trouble of future annoyance since I have no intention of changing anything. I am generally respectful, but I don't expect to fully understand these rules, let alone follow them. I have no authority over this, but I'd hope the mods choose not to frontpage anything that has a particularly odd and restrictive comment policy, or a surprisingly-large ban list.
AGI ruin scenarios are likely (and disjunctive)

Me neither, but I wanted to outline a Really Bad, detailed, pre-nanofactory scenario, since the last few times I've talked to people about this they kept either underestimating its consequences or asserting without basis that it was impossible. Also see the last paragraph.

AGI ruin scenarios are likely (and disjunctive)

Some people like to tell themselves that surely we'll get an AI warning shot and that will wake people up; but this sounds to me like wishful thinking from a world in which we'd had a competent response to the pandemic warning shot we just got.

When I think "AI warning shots", the warning shot I'm usually imagining involves the death of 10-50% of working-age and politically relevant people, if not from the shot itself then the social and political upheaval that happened afterwards. The "warning" in "warning shot" is "warning" that the relevant decisi... (read more)

3Daniel Kokotajlo17d
I agree that events of that magnitude would wake people up. I just don't think we'll get events of that magnitude until it's too late.
Shortform

thanks fren

Shortform

Spoilered, semi-NSFW, extremely dumb question

If you've already had sex with a woman, what's the correct way to go about arranging sex again? How indirect should I actually be about it?

7French Marty19d
That is a very context-dependent question. Your safest bet is to just arrange meeting her in a context where sex is a possibility (for example: "hey, do you want to go for coffee then stop at your place afterwards sometime?"). The desire to have sex isn't something you can forecast far in advance; it can quickly change just like the weather. You can have sexual conversation and establish the general desire for her to have you as a sexual partner. It's essentially like how she might like a particular restaurant but doesn't schedule going there days or even hours in advance; she's just open to going there when and if she feels the desire. As far as how to be good at sexual talk in general, unfortunately it takes careful practice. You just have to risk being awkward or turning her off (within reasonable limits, don't immediately test saying something too crazy). Trial and error within reasonable bounds.
What should you change in response to an "emergency"? And AI risk

An alternative model here is one big bucket called "fucking around like you're a monkey" and then two much smaller buckets called "general resource maximizing" and "narrow resource maximizing".

Just want to mention that this was a very funny comment and I laughed at it.

Addendum: A non-magical explanation of Jeffrey Epstein

I'm confused about what exactly we're talking about anymore. The point about external compromise was just something I tagged on in response to

The FBI may not have much of a record of outside corruption

The original point I was attempting to convey was that the

long track record of corruption by the government, mainly coverups of anything and everything, but also corruption by the CIA.

occurred under Hoover, and I'm generally unaware of any large-scale abuses of power by the FBI after his death. If you have some counterexamples or countersuspicions I am genuinely interested in hearing about them.

Don't take the organizational chart literally

DMed; apologies for not responding earlier. Crippling ADHD, etc.

Addendum: A non-magical explanation of Jeffrey Epstein

I don't see how this is in any way relevant to my comment.

It's relevant to your comment because the severe "internal" abuses of power you're referring to happened during J. Edgar Hoover's 50-year stint as a political supervillain. If you have later examples I haven't heard of them.

The problem is that its purpose is to control and destroy information. It is at war with humanity, including the American public.

Very edgy, but my opinion remains that life for me and many other people I know would be a lot harder without some form of federal law enforcement, even if it were abusive.

2Douglas_Knight21d
Saying that Hoover was externally compromised would be a ridiculous response to someone saying that he was internally compromised. But I wasn't talking about Hoover, because I bothered to read you before responding.
Don't take the organizational chart literally

The point I was making in the post is that they would still effectively be members of the conspiracy, and that the conspiracy is thus larger than just one or two people. A more complicated question is "would this particular party defect if they were?", which I don't really think is possible to answer without more specific details in particular cases.

Connor Leahy on Dying with Dignity, EleutherAI and Conjecture

Is there a way we could get certainty on the metaphysics here? It feels like it's an important thing to know if it actually happens to be true.

Connor Leahy on Dying with Dignity, EleutherAI and Conjecture

Funnily enough, I ended up retracting the comment around 9 minutes before you posted yours, triggered by this thread and the concerns you outlined about this sort of psychologizing being unproductive. I basically agree with your response.

Connor Leahy on Dying with Dignity, EleutherAI and Conjecture

You know what, I've retracted the comment because frankly you're probably right. Even if what I said is literally true, attacking the decision-making architecture of Connor Leahy when he's basically doing something good is not two of (true, kind, necessary). It makes people sad, it's the kind of asymmetric justice thing I hate, and I don't want to normalize it. Even when it's attached with disclaimers or says "I'm just attacking you just cuz bro, don't take it personal."

7Conor Sullivan23d
Thanks!
Connor Leahy on Dying with Dignity, EleutherAI and Conjecture

I think this comment is getting enough vote & discussion heat for me to feel the merit in clarifying with the following statements:

  • Most of the value in the world is created by ambitious people. It's not evil to have strong preferences.
  • If the average ML engineer or entrepreneur had the ethics-oracle of Connor Leahy, we would be in much better shape. Most seem to be either not f(curious,honest) enough or principled enough to even make a plausible attempt at not ending the world. Sam Altman needs to hear the above lecture-rant 10,000 times before Conno
... (read more)

I strong-downvoted your comment in both dimensions because I found it disagreeable and counterproductive. This kind of "Kremlinology of the heart" is toxic and demoralizing. It's why I never ever bother to do anything motivated by altruism: because I know when I start trying to do the right thing, I'll get attacked by people who think they know what's in my heart. When I openly act in selfish self-interest, nobody has anything to say about it, but any time I do selfless things, people start questioning my motives; it's clear what I'm incentivized to do. If you really want people to do good things, don't play status games like this. Incentivize the behavior you want.

5Tomás B.23d
Most VC-types are easier to get a hold of than you think. They're sort of in the business of being easy to get a hold of by smart weirdos. If you think you have something to say to him that might change his mind, there's a good shot he'll read your cold email.
Connor Leahy on Dying with Dignity, EleutherAI and Conjecture

Just FYI I deleted that comment before you made the reply, which is why your comment is in some sort of Twilight Zone. I also removed the quote because it does have other interpretations, though I prefer mine.

Connor Leahy on Dying with Dignity, EleutherAI and Conjecture

Edit+Disclaimer: I keep going back and forth on whether or not posting this comment was good on net. I think more people should take stabs at the alignment problem in their own idiosyncratic way, and this is a very niche criticism guarding against a hypothetical failure mode that I'm not even really sure exists. I think I'm going to settle on retracting this but leaving up because it's fundamentally criticizing someone who is doing good that I'm not doing and I don't like doing that. If you really want to read this you can figure out how to remove HTML str... (read more)

[This comment is no longer endorsed by its author]

tl;dr: people change their minds, reasons why things happen are complex, we should adopt a forgiving mindset/align AI, and long-term impact is hard to measure. At the bottom I try to put numbers on EleutherAI's impact and find it was plausibly net positive.

I don't think discussing whether someone really wants to do good or whether there is some (possibly unconscious?) status-optimization process is going to help us align AI.

The situation is often mixed for a lot of people, and it evolves over time. The culture we need to have on here to solve AI existential... (read more)


I don't know exactly what goes into the decision between for-profit vs nonprofit, or whether Conjecture's for-profit status was the right decision, but I do want to suggest that it's not as simple as "for-profit means I plan to make money, nonprofit means I plan to benefit the world".

I used to work at a nonprofit in the military-industrial complex in the USA; there was almost no day-to-day difference between what we were doing versus what (certain units within) for-profits like Raytheon were doing. Our CEO still had a big salary, we still were under pressu... (read more)

3Not Relevant24d
Just to state a personal opinion, I think if it makes you work harder on alignment, I’m fine with that being your subconscious motivation structure. There are places where it diverges, and this sort of comment can be good in that it highlights to such people that any detrimental status seeking will be noticed and punished. But if we start scaling down how much credit people should get based on purity of subconscious heart, we’re all going to die.
Don't take the organizational chart literally

Identifying which ones make the most sense, individually and in combination, would require sifting through a lot of facts and alleged facts, and I'm sure I have more important things to do.

Just to be clear, I know you probably do, and don't mean to pick on you in particular. I had an opportunity to demonstrate a common fallacy and I took it. Your comments so far have been pretty reasonable.

Don't take the organizational chart literally

The idea that Mark Felt was mainly driven by moral considerations about Nixon's failings seems strange given how Mark Felt himself was responsible for highly illegal operations like COINTELPRO.

Perhaps there's some critical difference between the kind of criminal activity inherent in things like COINTELPRO, or NSA surveillance, and the kind of criminal activity inherent in what Nixon did, and that that difference would also apply to covering up Epstein's murder. The existence of such a distinction between agency-wide abuses of power that plausibly have s... (read more)

Don't take the organizational chart literally

When the CIA violates the US constitution most CIA officials side with the CIA and are not working to protect US from the attacks of the CIA on the constitutionally guaranteed freedom. There's little loyalty towards the constitution.

I agree.

The ideological loyalty that CIA analysts have is loyalty to CIA orthodoxy.

To some degree, yes. To some degree, no. Every government bureaucracy possesses a moral maze-like loyalty to itself. But you're making broad-based statements about the motivations of all CIA officers that I am fairly certain don't happen to be true.

Don't take the organizational chart literally

Second comment to respond to footnote:

This is most clearly evinced by CIA hiring practices... They intentionally search for blackmail and anything in an agent's history they can use against them... From their description of the interview process it seems as if they were rejected for lacking the required ideology. Both friends are decidedly un-ideological, but that is not enough for the US intelligence agencies, you must be willing to toe the line or they won't accept you.

If you are suggesting that the CIA intentionally accepts people with dark secrets,... (read more)

Don't take the organizational chart literally

As far as principles go I agree with pretty much everything you said; my analysis entirely depends on how much power and influence a potential leader has over his subordinates in practice. As a trivial case, Stalin basically was the federal government of Russia. And getting your subordinates to break the law by giving a neutral measure, say reducing the number of reported robberies, without explicitly enumerating the ways they're supposed to cheat can be an extremely effective technique for preventing any actual legal troubles, because we're not in Dath Il... (read more)

6ChristianKl25d
Nixon also said things like "Can you please give me the files of what happened around the Kennedy assassination?" which made him pretty unpopular with the CIA and FBI. The US government is currently violating laws to not give the citizens access to those files on the grounds that releasing those files would have important real-world implications. The idea that Mark Felt was mainly driven by moral considerations about Nixon's failings seems strange given how Mark Felt himself was responsible for highly illegal operations like COINTELPRO.
Don't take the organizational chart literally

Heh, just made the edit that messed that up. Sorry.

Sexual Abuse attitudes might be infohazardous

There's probably an important distinction to be made between men who have such high sex drives/wide preferences that they'll sleep with anybody, and men who don't care that much about sexual violence.

Don't take the organizational chart literally

Isn't this a straw man?

I'm exaggerating for comedic effect. Obviously the entire U.S. government does not have to literally be in on the scam.

If someone powerful wanted Epstein dead, how many people does that require, and how many of them even have to know why they're doing what they're doing? It seems to me that only one person -- the murderer -- absolutely has to be in on it. Other people could get orders that sound innocuous, or maybe just a little odd, without knowing the reasons behind them.

I guess some of these orders are more suspicious than ... (read more)

1ksvanhorn16d
I assume you mean "who ordered him killed." Could be a lot more subtle than that. Just ask for the wrong video. A little mess up in procedures. Maybe some operative clandestinely gets into the system and causes some "technical errors." I'm not an expert on assassinations, and I suspect you aren't either. It seems to me that you're using the argument from lack of imagination -- "I can't think of a way to do it, therefore it can't be done." If, say, the CIA were behind Epstein and didn't want him to talk, is it unreasonable to suspect that they would know of many techniques to assassinate someone while covering their tracks that neither you nor I would have a clue about? Note that I'm not claiming that there's a strong case that Epstein was assassinated, just that it's not so easy to rule out.
4clone of saturn23d
If I were a prison guard who had just seen a well-connected group of conspirators murder someone who had become inconvenient to them and easily get away with it, it seems to me that one of the stupidest things I could possibly do would be to tell anyone about it. Why would they need to explicitly threaten me? We both understand there's no one I could "defect" to who could stop them or protect me.
Addendum: A non-magical explanation of Jeffrey Epstein

The FBI may not have much of a record of outside corruption, but it has a long track record of corruption by the government, mainly coverups of anything and everything, but also corruption by the CIA.

The "last 50 years" qualifier is important. I basically separate the FBI into two eras: The pre- and post-Hoover era. J. Edgar Hoover was definitely compromised, including, in my and many others opinions, externally by the mafia.

3Douglas_Knight23d
I don't see how this is in any way relevant to my comment. I didn't say anything about the mafia 50 years ago. It seems to me like it exists for purely formal reasons, to produce the deceit that you have responded to my comment. But let me word my comment differently: the FBI is never corrupted. The problem is that its purpose is to control and destroy information. It is at war with humanity, including the American public. You can see this just by looking at its behavior in this case.
Sexual Abuse attitudes might be infohazardous

I basically agree, as far as that goes.

Sexual Abuse attitudes might be infohazardous

You are playing with words here. /u/green_leaf's point is that there are greater degrees of violation of sexual autonomy than what OP suffered through and that rape is usually used to describe something more severe. If someone used the word "rape" to describe OP's experiences out of context to me it would be meaningfully misleading.

0Slider1mo
That sounds to me like saying that it is misleading to call somebody a thief when they are a shoplifter rather than a bank robber. Or that committing fraud for 2 million is not really theft when bank robbing for 100,000 is what theft looks like. There are already quite a lot of intensifiers, so I have some trouble thinking what would bump this to "proper" rape. It already has long duration. It already has threat of violence. I guess it doesn't have lasting injuries or threat of death. What kind of things would you feel misled about if encountering such an out-of-context representation? I think I am having a genuine grouping disagreement rather than just a word-label disagreement. I guess this kind of stuff is what they meant when there was an issue about whether to center the criminal definition of rape around use of violence or lack of consent.
0ChristianKl1mo
While rape is usually used to describe something more severe, it's also often pointing to a single experience. The OP suffered regular sexual assault over months, which is worse than just having one experience of sexual assault.
Sexual Abuse attitudes might be infohazardous

This may be trivializing your experiences as well, but I think an important consideration here is that you're a man. Many of the circumstances others in this thread are citing also involve male victims.

Men tend not to react to sexual abuse the same way that women do, and there's no reason to expect that they would. Some of the reasons people aggressively protest that fact or attempt to have anecdotal evidence of it dismissed are:

  • They, male or female, genuinely would be distraught if that had happened to them, and project that strong preference onto other
... (read more)

Not sure if a single anecdote is worth anything at all, but I am a woman, and I experienced what is legally and culturally considered rape at least twice (arguably 3x), and it really didn't bother me very much (though I think different versions, e.g. more violent ones or one perpetrated by people I looked up to, would have been much more damaging). One of the people who technically raped me (it was a very drunken screwup with, I believe, no malevolent intent) is still a friend of mine. I feel scared about people finding this out about our friendship, mostl... (read more)

This may be trivializing your experiences as well, but I think an important consideration here is that you're a man. Many of the circumstances others in this thread are citing also involve male victims.

Agreed. From an evolutionary standpoint rape is vastly more impactful for women than men: in a world with no abortion or contraception, rape means the removal of a woman's procreative agency, while for men it's merely very unpleasant, maybe on the level of being humiliated in a fist fight. The closest thing that I can think of to make myself (a man) have ... (read more)
