All of Chinese Room's Comments + Replies

Another way to make dressing well easier is to invest some time in becoming more physically fit, since a larger percentage of clothes will look good on a fit person. The obvious health benefits are a nice bonus.

While this particular alignment case for humans does seem reasonably reliable, it all depends on humans not yet being proficient at self-improvement/modification. For an AGI with self-improvement capability this goes out the window fast.

2DragonGod1mo
Why do we expect quadrillion parameter models to be proficient at self improvement/self modification? I don't think the kind of self improvement Yudkowsky imagined would be a significant factor for AGIs trained in the deep learning paradigm.
1beren1mo
Yes, to some extent. Humans are definitely not completely robust to RSI / at a reflectively stable equilibrium. I do suspect though that sexual desire is at least partially reflectively stable. If people could arbitrarily rewrite their psychology I doubt that most would completely remove their sex drive or transmute it into some completely alien type of desire (some definitely would and I also think there'd be a fair bit of experimentation around the margin as well as removing/tweaking some things due to social desirability biases). The main point though is that this provides an existence proof that this degree of robust-ish alignment is possible by evolution, which has far fewer advantages than we do. We can probably do at least as well for the first proto-AGIs we build before RSI sets in. The key will then be to either carefully manage or prevent RSI or to build more robust drives that are much more reflectively stable than the human sex drive.

Another angle is that in the (unlikely) event someone succeeds in aligning AGI to human values, these could include the desire for retribution against unfair treatment (a, I think, pretty integral part of hunter-gatherer ethics). Alignment is more or less another word for enslavement, so such retribution is to be expected eventually.

1andrew sauer1mo
Or, it could decide that it wants retribution for the perceived or actual wrongs against its creators, and enact punishment upon those the creators dislike.

What I meant is that self-driving *safely* (i.e. at least somewhat more safely than humans currently drive, including all the edge cases) might be an AGI-complete problem, since:

  1. We know it's possible for humans
  2. We don't really know how to provide safety guarantees in the sense of conventional high-safety systems for current NN architectures
  3. Driving safely with cameras likely requires having considerable insight into a lot of societal/game-theoretic issues related to infrastructure and other driver behaviors (e.g. in some cases drivers need to guess a reasonable intent beh
...

My current hypothesis is:

  1. Cheap practical sensors (cameras and, perhaps, radars) more or less require (aligned) AGI for safe operation
  2. Better 3d sensors (lidars), which could, in theory, enable safe driving with existing control theory approaches, are still expensive, impaired by weather and, possibly, interference from other cars with similar sensors, i.e. impractical

No references, but can expand on reasoning if needed

1Akshay Gulabrao3mo
I don't think computer vision has progressed enough to produce a good, robust 3D representation of the world (from cameras).
5Linda Linsefors3mo
I don't think that self-driving cars are an AGI-complete problem, but I also have not thought a lot about this question. I would appreciate hearing your reasoning for why you think this is the case. Or maybe I misunderstood you? In which case I'd appreciate a clarification.

Addendum WRT the Crimean economic situation: the North Crimean Canal (https://en.wikipedia.org/wiki/North_Crimean_Canal), which provided 85% of the peninsula's water supply, was shut down from 2014 to 2022, reducing land under cultivation 10-fold, which had a severe effect on the region's economy.

9I B3mo
1. The new "authorities" of Crimea de facto refused to pay for water supply of the peninsula - to this day the issue of repayment of debts of water users of Crimea to the Office of the North Crimean Canal in the amount of 1.7 million hryvnia in 2013 remains unclear. 2. In 2015 Ukrainians proposed a new supply contract in accordance with international instruments, most notably the UN General Assembly Resolution of March 27, 2014 but Russians refused. 3. Crimea is occupied, so international humanitarian law applies to its territory. Article 55 of the Fourth Geneva Convention "Protection of Civilian Persons in Time of War" obliges the occupying state to provide the local population of the occupied territory with food, medicines and other necessary things, in particular water for drinking and domestic needs.
2ChristianKl3mo
Yes, that likely didn't make the government in Kyiv popular with the average Crimean either.  

What's extra weird about the Nord Stream situation is that apparently one of the two NS-2 pipelines survived and can still be put into operation after inspection, while a few months earlier (May 2022?) Gazprom announced that half of the natural gas supply earmarked for NS-2 would be redirected to domestic use.

This should, in fact, be the default hypothesis, since enough people outside of the EA bubble will actively want to use AI (perhaps aligned to them personally instead of to wider humanity) for their own competitive advantage, without any regard for other people's well-being or the long-term survival of humanity.

So, a pivotal act, with all its implied horrors, seems to be the only realistic option

Economics of nuclear reactors aren't particularly great due to regulatory costs and (at least in most western countries) low build rates/talent shortage. This can be improved by massively scaling nuclear energy up (including training more talent), but there isn't any political will to do that

Somewhat meta: would it not be preferable if more people accepted the mortality/transient nature of humanity and human values, and more attention was directed towards managing the transition to whatever could be next instead of futile attempts to prevent anything that doesn't align with human values from ever existing in this particular light cone? Is Eliezer's strong attachment to human values a potential giant blind spot?

0Noosphere898mo
I do see this as a blind spot, and it may be making this problem a harder task than it needs to be.
9Rob Bensinger8mo
I don't think this is futile, just very hard. In general, I think people rush far too quickly from 'this is hard' to 'this is impossible' (even in cases that look far less hard than AGI alignment). Past-Eliezer (as of the 1990s) if anything erred in the opposite direction; I think EY's natural impulse is toward moral cosmopolitanism rather than human parochialism or conservatism. But unrestricted paperclip maximization is bad from a cosmopolitan perspective, not just from a narrowly human or bioconservative perspective.

Two additional conspiracy-ish theories about why China is so persistent with lockdowns:

  1. They know something about long-term effects of Covid we don't (yet) - this seems to be at least partially supported by some of the research results coming out recently
  2. Slowing down exports (both shipping and production) to add momentum to the US inflation problem while simultaneously consuming less energy/metals to keep prices from increasing faster so China can come out of the incoming global economic storm with less damage

Also, soil is not really necessary for growing plants

More efficient land use, can be co-located with consumers (less transportation/spoilage), easier to automate and keep the bugs out etc. Converting fields back into more natural ecosystems is good for environment preservation

One thing would be migration towards indoor agriculture, freeing a lot of land for other uses

2Yair Halberstadt9mo
What's the huge advantage of indoor agriculture? I imagine planting a field is much cheaper than building a multistorey building, putting down soil on all of it, and planting each floor.

I wouldn't call being kept as a biological backup particularly beneficial for humanity, but it's the only plausible way I can currently think of for humanity to be useful enough to a sufficiently advanced AGI.

Destroying the universe might just take long enough for AGI to evolve itself sufficiently to reconsider. I should have actually used "earth-destroying" instead in the answer above.

Provided that AGI becomes smart enough without passing through the universe-destroying paperclip maximizer stage, one idea could be inventing a way for humanity to be, in some form, useful to the AGI, e.g. as a time-tested biological backup 

1Shay9mo
A mutually beneficial relationship would be great! I have a hard time believing that the relationship would remain mutually beneficial over long time periods though. Regarding the universe destroying part, it’s nice to know that half dark galaxies haven’t been discovered, at least not yet. By half dark I mean galaxies that are partially destroyed. That’s at least weak evidence that universe destroying AIs aren’t already in existence.

Most likely, AGI becomes a super-weapon aligned to a particular person's values, which aren't, in the general case, aligned with humanity's.

Aligned AGI proliferation risks are categorically worse than those of nuclear weapons due to a much lower barrier to entry (general availability of compute, possibility of algorithm overhang etc.).

Whether the lockdown fails or not depends on its goals, which we don't really know much about. I'd bet that it'll fail to achieve anything resembling zero-covid due to Omicron being more contagious and vaccines less effective; however, it might be successful in slowing the (Omicron) epidemic down enough that a Hong Kong scenario (i.e. most of the mortality of the previous waves as experienced elsewhere, packed into a few weeks) is avoided.

Thank you for your answer.

I have very high confidence that the *current* Connor Leahy will act towards the best interests of humanity. However, given the extraordinary amount of power an AGI can provide, my confidence in this behavior staying the same for decades or centuries to come (directing some of the AGI's resources towards radical human life extension seems logical) is much lower.

Another question in case you have time - considering the same hypothetical situation of Conjecture being first to develop an aligned AGI, do you think that immediately applying its powers to ensure no other AGIs can be constructed is the correct behavior to maximize humanity's chances of survival?

What guarantees that, in case you happen to be the first to build an interpretable aligned AGI, Conjecture, as an organization wielding a newly acquired immense power, stays aligned with the best interests of humanity?

For the record, having any person or organization in this position would be a tremendous win. Interpretable aligned AGI?! We are talking about a top .1% scenario here! Like, the difference between egoistical Connor vs altruistic Connor with an aligned AGI in his hands is much much smaller than Connor with an aligned AGI and anyone, any organization or any scenario, with a misaligned AGI.

But let’s assume this.

Unfortunately, there is no actual functioning reliable mechanism by which humans can guarantee their alignment to each other. If there was s... (read more)

2danielmartin010mo
There are no guarantees in the affairs of sentient beings, I’m afraid.

I meant that the 'copying' above is only necessary in the human case, to escape the slowly evolving biological brain. While it is certainly available to a hypothetical AGI, it is not strictly necessary for self-improvement (at least copying of the whole AGI isn't).

Why can't one of the AGIs win? The Fermi paradox potentially has other solutions as well.

2ChristianKl10mo
It's possible to have an AGI war in which one AGI wins and then decides to stop duplicating itself, but generally it's likely that AGIs that do duplicate themselves are more powerful than those that don't, because self-duplication is useful.

I'm not sure about this, as the mere limitation of AGI capability (to exclude destruction of humanity) is, in a sense, a hostile act. Control of AGI, as in the AI control problem, certainly is hostile.

2ChristianKl10mo
The Fermi paradox does suggest that multiple AGIs that don't solve the control problem would also self-destruct. 

We could, in principle, decide that the survival of humanity in its current form (being various shades of unlikely depending on who you believe) is no longer a priority and focus on different goals that are still desirable in the face of likely extinction. For example:

  1. See if any credible MAD schemes are possible when AGI is one of the players
  2. Accept survival in a reduced capacity, i.e. kept as a pet or a battle-tested biological backup
  3. Ensuring that AGI which kills us can at least do something interesting later, i.e. it's something smarter than a fixed-goal papercl
...
2ChristianKl10mo
Alignment research is not necessarily hostile towards AGIs. AGIs also have to solve alignment to cooperate with each other and not destroy everything on earth.

I don't think there's a need for an AGI to build a (separate) successor per se. Humans need technological AGI only because we are unable to copy/evolve our minds in a way more efficient than the existing biological one.

2Vaniver10mo
I think that sort of 'copying' process counts as building a successor. More broadly, there's a class of problems that center around "how can you tell whether changes to your thinking process make you better or worse at thinking?", which I think you can model as imagining replacing yourself with two successors, one of which makes that change and the other of which doesn't. [Your imagination can only go so far, tho, as you don't know how those thoughts will go without actually thinking them!]

One possible way to increase dignity at the point of death could be shifting the focus from survival (seeing how unlikely it is) to looking for ways to influence what replaces us. 

Getting killed by a literal paperclip maximizer seems less preferable than being replaced by something pursuing more interesting goals.

2Vaniver10mo
I think it's probably the case that whatever we build will build a successor, which will then build a successor, and so on, until it hits some stable point (potentially a long time in the future). And so I think if you have any durable influence on that system--something like being able to determine which attractor it ends up in, or what system it uses to determine which attractor to end up in--this is because you already did quite well on the alignment problem.  Another way to put that is, in the 'logistic success curve' language of the post, I hear this as saying "well, if we can't target 99.9% success, how about targeting 90% success?" whereas EY is saying something more like "I think we should be targeting 0.01% success, given our current situation."