Here is how you can talk about bombing unlicensed datacenters without using "strike" or "bombing".
...If we can probe the thoughts and motivations of an AI and discover that, wow, GPT-6 is actually planning to take over the world if it ever gets the chance. That would be an incredibly valuable thing for governments to coordinate around, because it would remove a lot of the uncertainty; it would be easier to agree that this was important, to have more give on other dimensions, and to have mutual trust that the other side actually also cares about this, because you can't al
The thing was already an obscene 7 hours with a focus on intelligence explosion and mechanics of AI takeover (which are under-discussed in the discourse and easy to improve on, so I wanted to get concrete details out). More detail on alignment plans and human-AI joint societies are planned focus areas for the next times I do podcasts.
How slow is human perception compared to AI? Is it purely a difference between the "signal speed of neurons" and the "signal speed of copper/aluminum"?
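For a rough sense of scale, here is a back-of-envelope sketch; the numbers (fast myelinated axons at ~100 m/s, signals in copper at roughly two-thirds of light speed, ~1 ms synaptic delays) are approximate textbook values I am assuming, not figures from the question itself.

```python
# Back-of-envelope comparison of raw signal propagation speeds.
# Assumed rough values: fast myelinated axons conduct at ~100 m/s,
# electrical signals in copper travel at ~2/3 of the speed of light.
NEURON_SPEED_M_S = 100    # fast myelinated axon (approximate)
COPPER_SPEED_M_S = 2e8    # ~0.66 * c (approximate)

ratio = COPPER_SPEED_M_S / NEURON_SPEED_M_S
print(f"Copper carries a signal roughly {ratio:,.0f}x faster than a fast axon")
# -> roughly 2,000,000x. And raw wire speed is not the whole story:
# synaptic delays are ~1 ms each, while transistors switch in nanoseconds,
# so the perception gap is not purely about conduction velocity.
```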
Absent hypotheses do not produce evidence. Often (in some cases you can notice confusion, but it is a hard thing to notice until it is right up in your face) you need to have a hypothesis that favors a certain observation as evidence in order to even observe it, to notice it. This is a source of a lot of misunderstandings (along with stupid priors, of course). If you forget that other people can be tired or in pain or in a hurry, it is really easy to interpret harshness as evidence in favor of "they don't like me" (they can still be in a hurry and dislike you, but well...) and be do...
I did not expect to see such strawmanning from Hanson. I can easily imagine a post with less misrepresentation. Something like this:
...Yudkowsky and the signatories to the moratorium petition worry most about AIs getting “out of control.” At the moment, AIs are not powerful enough to cause us harm, and we hardly know anything about the structures and uses of future AIs that might cause bigger problems. But instead of waiting to deal with such problems when we understand them better and can envision them more concretely later, AI “doomers” want to redirect most
Evolutionary metaphors are about the huge differences between evolutionary pressure in the ancestral environment and what we have now: ice cream, transgender people, LessWrong, LLMs, condoms and other contraceptives. What kind of "ice cream" will AGI and ASI make for themselves? Maybe it can be made out of humans: put them in vats and let them dream up inputs for GPT-10?
Mimicry is a product of evolution too. So is social mimicry.
I have thoughts about reasons for AI to evolve human-like morality too. But I also have thoughts like "this coin came up heads 3 times in a row, so it must come up tails next".
>But again... does this really translate to a proportional probability of doom?
If you buy a lottery ticket and get all the numbers right (all n of them), then you get a glorious transhumanist utopia (some people will still get very upset). And if you get a single number wrong, then you get a weirdtopia and maybe a dystopia. There is an unknown quantity of numbers to guess, and a single ticket costs a billion now (and here enters the discrepancy). Where do I get so many losing tickets? From Mind Design Space. There is also an alternative that suggests that the space of...
From how the discrepancy between the pace/resources allocated to alignment research and to capability research looks to a layperson (to me), the doom scenario is closer to a lottery than to a story. I don't see why it would be the winning number. I am 99.999% sure that ASI will be proactive (and all kinds of synonyms of that word). It can all mostly be summarised as "fast takeoff" plus "human values are fragile".
It is not really a question, but I will formulate it as if it were.
Are current LLMs capable enough to output real rationality exercises (and not somewhat plausible-sounding nonsense) that follow the natural way information is presented in life (you don't usually get "15% of the cabs are Blue")? Can they give me 5 new problems to do every day so I can train my sensitivity to the prior probability of outcomes? In real life you don't get percentages, just problems like: "I can't find my wallet, was it stolen?" Can they guide me through the solution process?
Ther...
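For reference, the "cabs" exercise being contrasted with real-life problems above is the classic base-rate question, and the update it trains is a one-line Bayes calculation. A minimal sketch, using the usual textbook numbers (15% of cabs are Blue, the witness is right 80% of the time) purely as assumptions:

```python
def posterior_blue(prior_blue: float, p_say_blue_if_blue: float,
                   p_say_blue_if_green: float) -> float:
    """P(cab is Blue | witness says Blue), via Bayes' theorem."""
    p_say_blue = (prior_blue * p_say_blue_if_blue
                  + (1 - prior_blue) * p_say_blue_if_green)
    return prior_blue * p_say_blue_if_blue / p_say_blue

# Textbook numbers (assumed): 15% of cabs are Blue, witness correct 80% of the time.
print(posterior_blue(0.15, 0.80, 0.20))  # ~0.41, despite the witness "seeing Blue"
```

The real-life versions ("can't find my wallet, was it stolen?") don't hand you these percentages, which is exactly why eliciting the priors and likelihoods is the part worth training.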
Is there work being done (in the trans community or somewhere else) to map these distinct observable objects with precise language? Or is it too early and there is no consensus?
I was hoping that he meant some concrete examples but did not elaborate on them because this was a letter in a magazine and not a blog post. The only thing that comes to my mind is to somehow measure unexpected behavior: if a bridge sometimes leads people in circles, that will definitely be cause for concern and for reevaluation of the techniques used.
Humans, presumably, won't have to deal with deception between themselves, so given sufficient time they can solve Alignment. If pressed for time (as they are now), they will have to implement less-understood solutions, because that's the best they will have at the time.
Capabilities advance much faster than alignment, so there is likely no time to do meticulous research. And if you try to use weak AIs as a shortcut to outrun the current "capabilities timeline", then you will somehow have to deal with the suggester-and-verifier problem (with suggestions much harder to verify than simple math problems), which is not wholly about deception but also about filtering the somewhat-working stuff that may steer alignment in the right direction. And maybe not.
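To make the suggester-and-verifier asymmetry concrete, here is a toy sketch (my own illustration, not from the original comment): checking a suggested factorization is mechanical, whereas there is no comparably cheap check to run on a suggested alignment idea.

```python
# Toy illustration of the suggester/verifier asymmetry.
# Verifying a suggested factorization of n is cheap and fully mechanical:
def verify_factorization(n: int, factors: list[int]) -> bool:
    product = 1
    for f in factors:
        if f <= 1:          # reject trivial or nonsensical factors
            return False
        product *= f
    return product == n

print(verify_factorization(91, [7, 13]))  # True  -- easy to accept
print(verify_factorization(91, [3, 31]))  # False -- easy to reject

# For an alignment suggestion produced by a weak AI there is no analogous
# verify() function; deciding whether it "somewhat works" is itself research.
```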
But I agree that this collaboration will be successfully used for patchwork (because of shortcuts) alignment of weak AIs to placate the general public and politicians. All of this depends on how hard the Alignment problem is: as hard as EY thinks, or maybe harder, or easier.
Do we have an idea of what these tables about ML should look like? I don't know that much about ML.
What are examples that could help to see this tie more clearly? Procedures that work similarly enough that we can say "we do X when planning and building a bridge, and if we do X when building an AI...". Does such an X even exist that can be applied both to engineering a bridge and to engineering an AI?
DeepMind can't just press a button and generate a million demonstrations of scientific advances, and objectively score how useful each advance is as training data, while relying on zero human input whatsoever.
It can't now (or can it?). Are there not 100 robots in 100 10x10-meter labs being trained to recreate all human technology from the Stone Age onward? If it costs less than 10 million, then there probably are. This is a joke, but I don't know how off-target it is.
Discussion of human generality.
It should be named Discussion of "human generality versus Artificial General Intelligence generality". And there exists an example of human generality much closer to 'okay, let me just go reprogram myself a bit, and then I'll be as adapted to this thing as I am to', which is not "I am going to read a book or 10 on this topic" but "I am going to meditate for a couple of weeks to change my reward circuitry so that I will be as interested in coding afterwards as I am interested in doing all the side quests in Witcher 3 now" and "I as a human have ...
Does 'ethical safe AI investments' mean 'to help make AI safer and make some money at the same time'?
The lack of clarity when I think about these limits makes it hard for me to see how the end result would change if we could somehow "stop discounting" them.
It seems to me that we will have to be much more elaborate in describing the parameters of this thought experiment. In particular, we will have to agree on the deeds and real-world achievements that the hypothetical AI has, so that we will both agree to call it AGI (like writing an interesting story and making illustrations, so this particular research team now has a new revenue stream from selling it online: will this make the AI an AG...
It is the very same rationale that stands behind assumptions like "why Stockfish won't execute a losing set of moves": it is just that good at chess. Or better: it is just that smart when it comes to chess.
In this thought experiment the way to go is not "I see that the AGI could likely fail at this step, therefore it will fail", but to keep thinking and inventing better moves for the AGI to execute, moves which won't be countered as easily. It is an important part of the "security mindset" and probably a major reason why Eliezer speaks about the lack of pessimism in the field.
Both times my talks went that way (why didn't they raise him well / why couldn't we program the AI to be good; can't we keep an eye on them, and so on), but it would take too long to summarise something like a 10-minute dialogue, so I am not going to do it. Sorry.
Evolution: taste buds and ice cream, sex and condoms... This analogy has always been difficult to use in my experience. A year ago I came up with a less technical one: KPIs (key performance indicators) as the inevitable way to communicate goals (to an AI) to an ultra-high-IQ psychopath-genius who's into malicious compliance (he kinda can't help himself, being a clone of Nikola Tesla, Einstein and a bunch of different people, some of them probably CEOs, because she can).
I have used it only 2 times, and it was way easier than talking about different optimisation processes. And it took me only something like 8 years to come up with!
Here goes my one comment for the day. Or maybe not? Who knows? It is not like I can look up my restrictions or their history on my account page. I will have to make two comments to figure out whether anything has changed.
One comment per day is heavily discouraging to participation.