Comments

Muyyd7mo-20

Here goes my one comment for the day. Or maybe not? Who knows? It is not like I can look up my restrictions or their history on my account page. I will have to make two comments to figure out whether anything has changed.

A limit of one comment per day heavily discourages participation.

Muyyd10mo61

Here is how you can talk about bombing unlicensed datacenters without using the words "strike" and "bombing".

> If we can probe the thoughts and motivations of an AI and discover, wow, actually GPT-6 is planning to take over the world if it ever gets the chance, that would be an incredibly valuable thing for governments to coordinate around, because it would remove a lot of the uncertainty; it would be easier to agree that this was important, to have more give on other dimensions, and to have mutual trust that the other side actually also cares about this, because you can't always know what another person or another government is thinking, but you can see the objective situation in which they're deciding. So if there's strong evidence in a world where there is high risk of that risk, because we've been able to show actually things like the intentional planning of AIs to do a takeover, or being able to show model situations on a smaller scale of that, I mean, not only are we more motivated to prevent it, but we update to think the other side is more likely to cooperate with us, and so it's doubly beneficial.

Here is an alternative to dangerous experiments to develop enhanced cognition in humans. It sounds less extreme and a little more doable.

> Just going from less than 1% of the effort being put into AI to 5% or 10% of the effort, or 50% or 90%, would be an absolutely massive increase in the amount of work that has been done on alignment, on mind-reading AIs in an adversarial context.

> If it's the case that, as more and more of this work can be automated, and say governments require that you put 50% or 90% of the budget of AI activity into these problems of "make this system one that's not going to overthrow our own government or is not going to destroy the human species", then the proportional increase in alignment can be very large, even just within the range of what we could have done if we had been on the ball and having humanity's scientific energies going into the problem. Stuff that is not incomprehensible, that in some sense is just doing the obvious things that we should have done.

It is also pretty bizarre that, in response to

> Dwarkesh Patel 02:18:27
>
> So how do we make sure that the thing it learns is not to manipulate us into rewarding it when we catch it not lying, but rather to universally be aligned?
>
> Carl Shulman 02:18:41
>
> Yeah, so this is tricky. Geoff Hinton was recently saying there is currently no known solution for this.

the answer was: yes, but we are doing it anyway, with twists like adversarial examples, adversarial training, and simulations. If Shulman had THE ANSWER to the alignment problem he would not have kept it secret, but I can't help feeling some disappointment, because he sounds SO hopeful and confident. I somehow expected something different from a variation of "we are going to use weaker AIs to help us align stronger AIs while trying to outrun the capabilities research teams", even if this variation (in his description) seems very sophisticated, with mind reading and induced hallucinations.

Muyyd1y10

How slow is human perception compared to AI? Is it purely the difference between the "signal speed of neurons" and the "signal speed of copper/aluminum"?
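A rough back-of-the-envelope sketch of just the raw signal-speed part of that question; the numbers are approximate, and the comparison deliberately ignores synaptic delays, firing rates, and parallelism:

```python
# Order-of-magnitude comparison of conduction speeds (approximate values only).
neuron_speed_m_s = 100    # roughly the upper end for fast myelinated axons
wire_speed_m_s = 2e8      # electrical signals in copper travel at ~2/3 the speed of light

ratio = wire_speed_m_s / neuron_speed_m_s
print(f"Signal-speed ratio (wire / neuron): ~{ratio:.0e}")
# ~1e6 on raw conduction speed alone; actual "perception speed" also depends on
# synaptic delays, spike rates, and how many serial processing steps are involved.
```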

Muyyd1y50

Absent hypotheses do not produce evidence. Often (in some cases you can notice confusion, but that is a hard thing to notice until it is right in your face) you need to have a hypothesis that favors a certain observation as evidence in order to even observe it, to notice it. This is the source of a lot of misunderstandings (along with stupid priors, of course). If you forget that other people can be tired, in pain, or in a hurry, it is really easy to interpret harshness as evidence in favor of "they don't like me" (they can still be in a hurry and dislike you, but well...) and be done with it. After several instances of this you will be convinced enough that changing your mind becomes very difficult (confirmation-bias difficult), so alternatives need to be present in your mind before you encounter the observation.

Vague hypotheses ("what if we are wrong?") and negative ones ("what if he did not do this?") are not good at producing evidence either. To be useful they have to be precise, concrete, and positive (in some cases this is easy to check by visualisation: how hard is it to do, and is it even possible to visualise?).
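A toy calculation (my own made-up numbers, purely to illustrate the point about needing an explicit alternative hypothesis before the observation arrives):

```python
# Two hypotheses for a harsh reply: "they don't like me" vs. "they're tired / in a hurry".
# All numbers below are invented for illustration.
p_dislike = 0.1                  # prior on "they don't like me"
p_hurry = 0.9                    # prior on "they're just tired or in a hurry"
p_harsh_given_dislike = 0.6      # chance of a harsh reply if they dislike me
p_harsh_given_hurry = 0.4        # chance of a harsh reply if they're merely in a hurry

posterior_dislike = (p_harsh_given_dislike * p_dislike) / (
    p_harsh_given_dislike * p_dislike + p_harsh_given_hurry * p_hurry
)
print(f"P(they don't like me | harsh reply) = {posterior_dislike:.2f}")  # ~0.14
# Without the "in a hurry" hypothesis in mind, the same harsh reply feels like
# strong confirmation of "they don't like me".
```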

Muyyd1y-2-1

Did not expect to see such strawmanning from Hanson. I can easily imagine a post with less misrepresentation, something like this:

> Yudkowsky and the signatories to the moratorium petition worry most about AIs getting “out of control.” At the moment, AIs are not powerful enough to cause us harm, and we hardly know anything about the structures and uses of future AIs that might cause bigger problems. But instead of waiting to deal with such problems when we understand them better and can envision them more concretely later, AI “doomers” want to redirect most, if not all, computational, capital, and human resources from making black-boxed AIs more capable to research avenues directed at the goal of obtaining a precise understanding of the inner structure of current AIs now, and to make this redirection enforced by law, including the most dire (but legal) methods of law enforcement.

instead of this (the original). But that would be a different article, written by someone else.

> Yudkowsky and the signatories to the moratorium petition worry most about AIs getting “out of control.” At the moment, AIs are not powerful enough to cause us harm, and we hardly know anything about the structures and uses of future AIs that might cause bigger problems. But instead of waiting to deal with such problems when we understand them better and can envision them more concretely, AI “doomers” want stronger guarantees now.

Muyyd1y10

Evolutionary metaphors are about the huge differences between evolutionary pressure in the ancestral environment and what we have now: ice cream, transgender people, LessWrong, LLMs, condoms and other contraceptives. What kind of "ice cream" will AGI and ASI make for themselves? Maybe it can be made out of humans: put them in vats and let them dream up inputs for GPT-10?

Mimicry is a product of evolution too. So is social mimicry.

I have thoughts about reasons for AI to evolve human-like morality too. But I also have thoughts like "this coin came up heads 3 times in a row, so it must come up tails next".

Muyyd1y10

> But again... does this really translate to a proportional probability of doom?

If you buy a lottery ticket and get all (all out of n) numbers right, then you get a glorious transhumanist utopia (some people will still get very upset). And if you get a single number wrong, then you get a weirdtopia, and maybe a dystopia. There is an unknown quantity of numbers to guess, and a single ticket costs a billion now (and here enters the discrepancy). Where do I get so many losing tickets from? From Mind Design Space. There is also an alternative view that suggests the space of possibilities is much smaller.

It is not enough to get some alignment, and it seems that we need to get clear on the difference between utility maximisers (ASI and AGI) and behavior executors (humans, dogs, and monkeys). That is what the "AGI is proactive (and synonyms)" part is based on.

So the probability of doom is proportional to the probability of buying a losing ticket (not getting all the numbers right).
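A toy version of the lottery analogy, with made-up numbers rather than any real estimate: if a good outcome requires getting n independent "numbers" right, each with probability p, the chance of getting them all right falls off quickly.

```python
# Invented parameters purely to show the shape of the analogy.
for n in (5, 10, 20):            # how many "numbers" have to be right
    for p in (0.9, 0.99):        # chance of getting each one right
        p_all_right = p ** n
        print(f"n={n:2d}, p={p:.2f}: P(all right) = {p_all_right:.3f}, "
              f"P(at least one wrong) = {1 - p_all_right:.3f}")
```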

Muyyd1y10

Given how the discrepancy between the pace/resources allocated to alignment research and to capability research looks to a layperson (to me), the doom scenario is closer to a lottery than to a story. I don't see why we would hold the winning number. I am 99.999% sure that ASI will be proactive (and all kinds of synonyms of that word). It can mostly be summarised with "fast takeoff" and "human values are fragile".

Muyyd1y10

It is not really a question, but I will formulate it as if it were.
Are current LLMs capable enough to output real rationality exercises (and not somewhat plausible-sounding nonsense) that follow the natural way information is presented in life (you usually don't get "15% of the cabs are Blue")? Can they give 5 new problems to do every day so I can train my sensitivity to the prior probability of outcomes? In real life you don't get percentages, just problems like "I can't find my wallet, was it stolen?" Can they guide me through the solution process? (A worked sketch of the cab update follows the list below.)
There are also:

  • P(D|~H): problems about the weight of evidence assigned by alternative hypotheses
  • How to notice hidden conjunctions (as in the Linda problem)
  • Less obvious problems than the probability of pregnancy given that intercourse has occurred, and the probability of intercourse given that pregnancy has occurred.
  • Syllogisms.
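For reference, a minimal sketch of the kind of Bayesian update such exercises would need to generate and check, using the standard textbook taxicab numbers (85% Green, 15% Blue, witness correct 80% of the time), which are assumptions here rather than anything from the comment:

```python
# Classic taxicab problem with standard textbook numbers (assumed here).
p_blue = 0.15                    # base rate of Blue cabs
p_green = 0.85                   # base rate of Green cabs
p_say_blue_given_blue = 0.80     # witness identifies the colour correctly
p_say_blue_given_green = 0.20    # witness mistakes Green for Blue

# Bayes: P(Blue | witness says Blue)
p_blue_given_say_blue = (p_say_blue_given_blue * p_blue) / (
    p_say_blue_given_blue * p_blue + p_say_blue_given_green * p_green
)
print(f"P(cab was Blue | witness says Blue) = {p_blue_given_say_blue:.2f}")  # ~0.41
```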
Muyyd1y20

Is there work being done (in the trans community or somewhere else) to map these distinct observable categories with precise language? Or is it too early and there is no consensus?

  • Born female, calls themselves "woman" (she/her)
  • Born female, calls themselves "man" (he/him)
  • Born female, had pharmacological and/or surgical interventions, calls themselves "man" (he/him)
  • Born male, calls themselves "man" (he/him)
  • Born male, calls themselves "woman" (she/her)
  • Born male, had pharmacological and/or surgical interventions, calls themselves "woman" (she/her)