All of Muyyd's Comments + Replies

Here goes my one comment for the day. Or maybe not? Who knows? It's not like I can look up my restrictions or their history on my account page. I will have to make two comments to figure out whether anything has changed.

One comment per day is heavily discouraging to participation.

Here is how you can talk about bombing unlicensed datacenters without using the words "strike" or "bombing".

If we can probe the thoughts and motivations of an AI and discover wow, actually GPT-6 is planning to take over the world if it ever gets the chance. That would be an incredibly valuable thing for governments to coordinate around because it would remove a lot of the uncertainty, it would be easier to agree that this was important, to have more give on other dimensions and to have mutual trust that the other side actually also cares about this because you can't al

... (read more)
I think it's pretty easy to talk about bombing without saying "bombing", it's just... less clear. (depending on how you do it and how sustained it is, it feels orwellian and dishonest. I think Carl's phrasing here is fine but I do want someone somewhere being clear about what'd be required) (It seems plausibly an actually-good strategy to have Eliezer off saying extreme/clear things and moving the overton window while other people say reasonable-at-first-glance sounding things)

The thing was already an obscene 7 hours with a focus on intelligence explosion and mechanics of AI takeover (which are under-discussed in the discourse and easy to improve on, so I wanted to get concrete details out). More detail on alignment plans and human-AI joint societies are planned focus areas for the next times I do podcasts.

How slow is human perception compared to AI's? Is it purely the difference between the signal speed of neurons and the signal speed of copper/aluminum?

It's hard to say. This CLR article lists some advantages that artificial systems have over humans. Also see this section of 80k's interview with Richard Ngo:

Absent hypotheses do not produce evidence. Often (in some cases you can notice confusion, but that is a hard thing to notice until it is up in your face) you need to have a hypothesis that favors a certain observation as evidence in order to even observe it, to notice it. This is a source of a lot of misunderstandings (along with stupid priors, of course). If you forget that other people can be tired or in pain or in a hurry, it is really easy to interpret harshness as evidence in favor of "they don't like me" (they can still be in a hurry and dislike you, but well...) and be do... (read more)

I did not expect to see such strawmanning from Hanson. I can easily imagine a post with less misrepresentation. Something like this.

Yudkowsky and the signatories to the moratorium petition worry most about AIs getting “out of control.” At the moment, AIs are not powerful enough to cause us harm, and we hardly know anything about the structures and uses of future AIs that might cause bigger problems. But instead of waiting to deal with such problems when we understand them better and can envision them more concretely later, AI “doomers” want to redirect most

... (read more)

Evolutionary metaphors are about the huge difference between evolutionary pressure in the ancestral environment and what we have now: ice cream, transgender people, LessWrong, LLMs, condoms and other contraceptives. What kind of "ice cream" will AGI and ASI make for themselves? Maybe it can be made out of humans: put them in vats and let them dream up inputs for GPT-10?

Mimicry is a product of evolution too. So is social mimicry.

I have thoughts about reasons for AI to evolve human-like morality too. But I also have thoughts like "this coin turned up heads 3 times in a row, so it must come up tails next".

>But again... does this really translate to a proportional probability of doom?

If you buy a lottery ticket and get all (all out of n) numbers right, then you get a glorious transhumanist utopia (still, some people will be very upset). And if you get a single number wrong, then you get a weirdtopia and maybe a dystopia. There is an unknown quantity of numbers to guess, and a single ticket costs a billion now (and here enters the discrepancy). Where did I get so many losing tickets? From Mind Design Space. There is also an alternative that suggests that the space of... (read more)

From how the discrepancy between time/resources allocated to alignment research and capability research looks to a layperson (to me), the doom scenario is closer to a lottery than to a story. I don't see why it would be a winning number. I am 99.999% sure that ASI will be proactive (and all kinds of synonyms of that word). It can all mostly be summarised with "fast takeoff" and "human values are fragile".

I do find the discrepancy deeply worrying, and have argued before that calling for more safety funding (and potentially engaging in civil disobedience for it) may be one of the most realistic and effectual goals for AI safety activism. I do think it is ludicrous to spend so little on it in comparison. But again... does this really translate to a proportional probability of doom? I don't find it intuitively completely implausible that getting a capable AI requires more money than aligning an AI. In part because I can imagine lucky sets of coincidences that lead to AIs gaining some alignment through interaction with aligned humans and the consumption of human-aligned training data, but cannot really imagine lucky sets of coincidences that lead to humans accidentally inventing an artificial intelligence. It seems like the latter needs funding and precision in all worlds, while in the former, it would merely seem extremely desirable, not 100% necessary - or at least not to the same degree. (Analogously, humans have succeeded in raising ethical children and taming animals, even when they did not really know what they were doing. However, the human track record in creating artificial life is characterised by a need for extreme precision, lengthy trial and error, and a lot of expense. I find it more plausible that a poor Frankenstein would manage to make his monster friendly by not treating it like garbage, than that a poor Frankenstein would manage to create a working zombie while poor.)

It is not really a question, but I will formulate it as if it were.
Are current LLMs capable enough to output real rationality exercises (and not somewhat-plausible-sounding nonsense) that follow the natural way information is presented in life (you don't usually get "15% of the cabs are Blue")? Can they give me 5 new problems every day, so I can train my sensitivity to the prior probability of outcomes? In real life you don't get percentages, just problems like: "I can't find my wallet, was it stolen?" Can they guide me through the solution process?
Ther... (read more)
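For reference, the "15% of the cabs are Blue" exercise alluded to above is the classic Kahneman/Tversky cab problem, and it reduces to a one-line Bayes update. A minimal sketch with the standard textbook numbers (nothing here is LLM-generated):

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(hypothesis | evidence) via Bayes' rule for a binary hypothesis."""
    # Total probability of the evidence under both hypotheses.
    p_evidence = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / p_evidence

# 15% of cabs are Blue; the witness identifies colors correctly 80% of the time.
p = posterior(prior=0.15, sensitivity=0.80, false_positive_rate=0.20)
print(round(p, 3))  # 0.414: despite the witness, "Blue" is still less likely than not
```

The point of the exercise is exactly the one raised above: the base rate (prior) dominates more than intuition suggests.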

Is there work being done (in the trans community or somewhere else) to map these distinct observable objects with precise language? Or is it too early and there is no consensus?

  • Born female, calls themselves "woman" (she/her)
  • Born female, calls themselves "man" (he/him)
  • Born female, had pharmacological and/or surgical interventions, calls themselves "man" (he/him)
  • Born male, calls themselves "man" (he/him)
  • Born male, calls themselves "woman" (she/her)
  • Born male, had pharmacological and/or surgical interventions, calls themselves "woman" (she/her)

I was hoping that he meant some concrete examples but did not elaborate on them, due to this being a letter in a magazine and not a blog post. The only thing that comes to my mind is to somehow measure unexpected behavior: if a bridge sometimes leads people in circles, then that is definitely cause for concern and for reevaluation of the techniques used.

Humans, presumably, won't have to deal with deception between themselves, so given sufficient time they can solve Alignment. If pressed for time (as it is now), then they will have to implement less understood solutions, because that's the best they will have at the time.

Capabilities advance much faster than alignment, so there is likely no time to do meticulous research. And if you try to use weak AIs as a shortcut to outrun the current "capabilities timeline", then you will somehow have to deal with the suggester-and-verifier problem (with suggestions much harder to verify than simple math problems), which is not wholly about deception but also about filtering somewhat-working stuff that may steer alignment in the right direction. And maybe not.

But I agree that this collaboration will be successfully used for patchwork (because shortcuts) alignment of weak AIs to placate the general public and politicians. All of this depends on how hard the Alignment problem is: as hard as EY thinks, or maybe harder, or easier.

Do we have an idea of what these tables for ML should look like? I don't know that much about ML.

Well, Evals and the work OpenAI did on predicting loss could be a starting point for working out the tables. But we don't really know; I guess that's the point EY is trying to make.
  • If we held anything in the nascent field of Artificial General Intelligence to the lesser standards of engineering rigor that apply to a bridge meant to carry a couple of thousand cars, the entire field would be shut down tomorrow.

What are examples that can help to see this tie more clearly? Procedures that work similarly enough to say "we do X while planning and building a bridge, and if we do X in AI building...". Does such an X even exist that can be applied to engineering a bridge and engineering an AI?

Use tables for concrete loads and compare experimentally with the concrete to be poured; if a load is off, reject it. We don't even have the tables for ML. Start making tables; don't build big bridges until you get the fucking tables right. Enforce bridge-making no larger than the Yudkowsky Airstrike Threshold.
X = "use precise models".
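To make the "tables" idea concrete, here is one possible minimal sketch (all numbers invented for illustration; only the procedure is the point): fit a power law to a table of small training runs, extrapolate, and reject a big run whose measured loss is off the table's prediction, by analogy with rejecting a bad concrete pour.

```python
import math

# Hypothetical "table" of small runs: (training compute, final loss).
small_runs = [(1e18, 3.10), (1e19, 2.61), (1e20, 2.20)]

# Fit loss ~ a * C^(-b) by least squares in log-log space:
# log(loss) = log(a) - b * log(C).
xs = [math.log(c) for c, _ in small_runs]
ys = [math.log(l) for _, l in small_runs]
n = len(xs)
slope = (n * sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys)) / \
        (n * sum(x * x for x in xs) - sum(xs) ** 2)
intercept = (sum(ys) - slope * sum(xs)) / n
a, b = math.exp(intercept), -slope

def predicted_loss(compute):
    """Extrapolate the fitted power law to a bigger run."""
    return a * compute ** (-b)

# "Reject the pour if the load is off": flag a big run whose measured loss
# deviates from the table's prediction by more than 5%.
measured = 1.92  # made-up measurement for a 1e21-compute run
pred = predicted_loss(1e21)
within_table = abs(measured - pred) / pred < 0.05
```

This is roughly the shape of GPT-4-style loss prediction (small runs extrapolated to a big one), but the tolerance, the functional form, and what to do on rejection are exactly the parts where the real tables don't exist yet.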

DeepMind can't just press a button and generate a million demonstrations of scientific advances, and objectively score how useful each advance is as training data, while relying on zero human input whatsoever.

It can't now (or can it?). Are there not 100 robots in 100 10x10-meter labs being trained on recreating all human technology from the Stone Age onward? If it costs less than 10 million, then there probably are. This is a joke, but I don't know how off-target it is.

Discussion of human generality.
It should be named "Discussion of human generality versus Artificial General Intelligence generality". And there exists an example of human generality much closer to 'okay, let me just go reprogram myself a bit, and then I'll be as adapted to this thing as I am to', which is not "I am going to read a book or 10 on this topic" but "I am going to meditate for a couple of weeks to change my reward circuitry, so I will be as interested in coding afterwards as I am in doing all the side quests in Witcher 3 now" and "I as a human have ... (read more)

Does 'ethical safe AI investments' mean 'to help make AI safer and make some money at the same time'?

Max TK · 8mo
This is an important question. To what degree are both of these (naturally conflicting) goals important to you? How important is making money? How important is increasing AI-safety?

But how good can it be, realistically? I will be very, very surprised if all these details aren't leaked within the next week. Maybe they will make several false leaks to muddy things a bit.

Gerald Monroe · 9mo
It could leak when OAI employees take an offer to work at another lab.
Strong agreement here. I find it unlikely that most of these details will still be concealed after 3 months or so, as it seems unlikely, combined, that no one will be able to infer some of these details or that there will be no leak. Regarding the original thread, I do agree that OpenAI's move to conceal the details of the model is a Good Thing, as this step is risk-reducing and creates / furthers a norm for safety in AI development that might be adopted elsewhere. Nonetheless, the information being concealed seems likely to become known soon, in my mind, for the general reasons I outlined in the previous paragraph.

The lack of clarity when I think about these limits makes it hard for me to see how the end result would change if we could somehow "stop discounting" them.
It seems to me that we will have to be much more elaborate in describing the parameters of this thought experiment. In particular, we will have to agree on the deeds and real-world achievements that the hypothetical AI has, so we will both agree to call it AGI (like writing an interesting story and making illustrations, so this particular research team now has a new revenue stream from selling it online - will this make the AI an AG... (read more)

It is the very same rationale that stands behind assumptions like "why Stockfish won't execute a losing set of moves": it is just that good at chess. Or better: it is just that smart when it comes down to chess.

In this thought experiment the way to go is not "I see that the AGI could likely fail at this step, therefore it will fail" but to keep thinking and inventing better moves for the AGI to execute, moves which won't be countered as easily. This is an important part of the "security mindset" and probably a major reason why Eliezer talks about the lack of pessimism in the field.

Charlie Sanders · 9mo
There exist diminishing returns to thinking about moves versus performing the moves and seeing the results that the physics of the universe imposes on them as a consequence. Think of it like AlphaGo: if it could only ever train itself by playing Go against actual humans, it would never have become superintelligent at Go. Manufacturing is like that - you have to play with the actual world to understand bottlenecks and challenges, not a hypothetical, artificially created simulation of the world. That imposes rate-of-scaling limits that are currently being discounted.

Both times my talks went that way (why they did not raise him well - why we could not program the AI to be good; can't we keep an eye on them, and so on), but it would take too long to summarise something like a 10-minute dialogue, so I am not going to do it. Sorry.

Evolution: taste buds and ice cream, sex and condoms... This analogy has always been difficult to use, in my experience. A year ago I came up with a less technical one: KPIs (key performance indicators) as the inevitable way to communicate goals (to an AI) to an ultra-high-IQ psychopath-genius who is into malicious compliance (kind of can't help himself, being a clone of Nikola Tesla, Einstein and a bunch of different people, some of them probably CEOs, because she can).

I have used it only 2 times, and it was much easier than talking about different optimisation processes. And it only took me something like 8 years to come up with!

This analogy will be better for communicating with some people, but I feel like it was the go-to at some earlier point, and the evolution analogy was invented to fix some problems with it. IE, before "inner alignment" became a big part of the discussion, a common explanation of the alignment problem was essentially what would now be called the outer alignment problem, which is precisely that (seemingly) any goal you write down has smart-alecky misinterpretations which technically do better than the intended interpretation. This is sometimes called nearest unblocked strategy or unforeseen maximum or probably other jargon I'm forgetting. The evolution analogy improves on this in some ways.

I think one of the most common objections to the KPI analogy is something along the lines of "why is the AI so devoted to malicious compliance" or "why is the AI so dumb about interpreting what we ask it for". Some OK answers to this are:

  • Gradient descent only optimizes the loss function you give it.
  • The AI only knows what you tell it.
  • The current dominant ML paradigm is all about minimizing some formally specified loss. That's all we know how to do.

But responses like this are ultimately a bit misleading, since (as the Shard-theory people emphasize, and as the evolution analogy attempts to explain) what you get out of gradient descent doesn't treat loss-minimization as its utility function, we don't know how to make AIs which just intelligently optimize some given utility (except in very well-specified problems where learning isn't needed), and the AI doesn't only know what you tell it. So for some purposes, the evolution analogy is superior. And yeah, probably neither analogy is great.
Quintin Pope · 9mo
I dislike both of those analogies, since the process of training an AI has little relation with evolution, and because the psychopath one presupposes an evil disposition on the part of the AI without providing any particular reason to think AI training will result in such an outcome.