avturchin

Comments

Assuming that future anthropic shadow works because of SSSA, a war with China would need to create a world with many qualified observers existing long enough to significantly outweigh the number of observers who existed before the war – but still unable to create advanced AI because of the war. A 5-year delay would not suffice – we would need a 1,000-year delay at approximately our current level of civilization.
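The scale of the required delay can be illustrated with a rough sketch. It assumes, purely for illustration, that the number of qualified observers per year stays constant and that such observers existed for about 50 years before the war; both figures are hypothetical and are not taken from any source. Under SSSA-style counting, what matters is the share of all qualified observer-years that falls into the delayed period.

```python
def post_war_share(pre_war_years: float, delay_years: float,
                   observers_per_year: float = 1.0) -> float:
    """Fraction of all qualified observer-years that fall inside the delay period.

    Assumes a constant number of qualified observers per year (an
    illustrative simplification, not a claim about real demographics).
    """
    pre = observers_per_year * pre_war_years
    post = observers_per_year * delay_years
    return post / (pre + post)


# Hypothetical: qualified observers existed for ~50 years before the war.
PRE_WAR_YEARS = 50.0

if __name__ == "__main__":
    for delay in (5, 1000):
        share = post_war_share(PRE_WAR_YEARS, delay)
        print(f"delay={delay:>4} years -> share of observer-years in delay: {share:.3f}")
```

With these assumed numbers, a 5-year delay leaves the delayed period with a small minority of observer-years, while a 1,000-year delay makes it dominant.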

One possible world where this might happen is one where advanced AI development is limited by constant drone warfare: drones attack any large computational centers or chip fabrication facilities. However, drone production can occur in smaller workshops, which are less vulnerable. Because of this, civilization becomes stuck at the drone complexity level.

There is at least one anthropic miracle that we can constantly observe: life on Earth has not been destroyed in the last 4 billion years by asteroids, supervolcanoes, or runaway global warming or cooling, despite changes in solar luminosity. According to one geologist, atmospheric stability is the most surprising aspect of this.

Meanwhile, in the article “Anthropic principle in cosmology and geology” (Shcherbakov, 1999), A. Shcherbakov noted that the history of the Earth’s atmosphere is strangely correlated with solar luminosity and the history of life, which could be best explained by anthropic fine-tuning. In particular, he wrote that the atmospheric temperature was kept within the range of 10–40 °C, that on four occasions the Earth came close to a “snowball” steady state, and that on four occasions it came close to turning into a water vapor greenhouse where the temperature could reach hundreds of degrees centigrade. However, these life-ending outcomes were prevented by last-minute events such as volcanic eruptions or the covering of oceanic volcanoes by water, which regulates the CO2 level following an eruption. Such “miracles” are best explained by observation selection effects.

A better question: can a person who is expecting to be executed sign up to cryonics?

The more AI companies suppress AI via censorship, the bigger the black market for completely uncensored models will be. Their success is therefore digging our own grave. In other words, mundane alignment has a net negative effect.

Yes. Identity is a type of change which preserves some sameness. (Exact sameness can't be human identity, as only a dead, frozen body remains the same.) It follows that there can be several types of identity.

Immortality and identity. 
https://philpapers.org/rec/TURIAI-3
Abstract:
We need an understanding of personal identity to develop radical life extension technologies: mind uploading, cryonics, digital immortality, and quantum (big world) immortality. A tentative solution is needed now, due to the opportunity cost of delaying indirect digital immortality and cryonics.

The main dichotomy in views on personal identity and copies can be presented as: either my copy = original or a soul exists. In other words, some non-informational identity carrier (NIIC) may exist that distinguishes the original from its exact copy. It is often claimed that the NIIC is continuity of consciousness, soul, perspective, sameness of atoms, or position in space. We create an exhaustive map of identity theories.

To resolve the main dichotomy, we must recognize that personal identity requires an overarching validating system: God, qualia world, social agreement, blockchain or evolutionary fitness. This means that we cannot solve identity without solving metaphysics (and the nature of time). It is unlikely we'll solve this before creating superintelligent AI.

Therefore, a conservative approach to personal identity is preferable: as we don't know the nature of identity, we should preserve as much as possible and avoid situations similar to the Mars Transporter unless necessary for survival.

There are several tricks which can help us answer identity-related problems without solving all needed metaphysics; these tricks are variants of the conservative approach:

  1. Mind merging: we can escape the Mars Transporter problem (even the broken one) by incorporating mind merging later.
  2. Indexical uncertainty: I should care about my copy because I don't know if I am the original or my copy.
  3. Dividing the notion of "copy" into "mirror copy," "personality-copy," and "future copy." Many paradoxes can be solved if the correct type of copy is defined.
  4. Accepting two types of identity. Human personal identity consists of two intertwined types of identity: informational identity, which predicts sameness, and identity of consciousness, which predicts what I will experience in the next moment of time.
  5. Continuity passing eventually through all possible minds. If both a cyclic universe and continuity as identity are true, I will eventually become any of my copies. MWI is functionally equivalent to a cyclic universe, so I will become any copy in different timesteps. Therefore, we should care about parallel copies only if future copies don't exist (though in MWI future copies always exist, plus the chance to become someone else).
  6. Self-defining and evolving identity. Another important feature of human personal identity is that it is observed and measured internally, by the identity subject himself: by redefining my identity I get the power to solve the problem. Human personal identity evolves in time, so it is not sameness. Creating a copy is a step in the evolution of my identity.
  7. Preserving continuity without mind. We demonstrate that the idea of continuity of consciousness is very similar to the idea of soul, but also has several problems: continuity can, paradoxically, be preserved without preserving body and mind, as a separate process like a flame. We explore the connection between continuity and the nature of qualia, which are always continuous between two points in time.
  8. Rainbow of qualia: a specific set of personal qualia becomes the personality carrier.

There are other possible tricks: branching identity, bundle and self-repairing identity, gradual identity, and identity in MWI. None of them solves the hard problem of identity.

We suggest a hypothetical test for identity theories, the quantum Mars Transporter: if copy ≠ original, I will always experience a broken Mars Transporter.
 

The main AI safety risk is not from LLMs themselves, but from specific prompts, the "chat windows" that follow from them, and the specific agents which start from such prompts.

Moreover, a powerful enough prompt may be model-agnostic. For example, my sideloading prompt is around 200K tokens in its minimal version and works on most models, producing similar results in similarly intelligent models.

A self-evolving prompt can be written; I experimented with small versions, and they work.

They provide more surprising information, as I understand it.

For an unaligned AI, it is either simulating alternative histories (which is the focus of this post) or creating material for blackmail.

For an aligned AI:
a) It may follow a different moral theory than our version of utilitarianism, in which existence is generally considered good despite moments of suffering.
b) It might aim to resurrect the dead by simulating the entirety of human history exactly, ensuring that any brief human suffering is compensated by future eternal pleasure.
c) It could attempt to cure past suffering by creating numerous simulations where any intense suffering ends quickly, so by indexical uncertainty, any person would find themselves in such a simulation.
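The indexical-uncertainty step in (c) can be made explicit with a small sketch. It assumes, as a simplification not stated above, that every copy of a given moment of experience counts equally, so the chance of being in a "rescue" simulation is just the share of such simulations among all copies. The function name and the numbers are hypothetical.

```python
from fractions import Fraction


def chance_of_rescue(n_rescue_sims: int, n_originals: int = 1) -> Fraction:
    """Probability of finding oneself in a simulation where suffering ends quickly.

    Assumes equal weight for every copy of the same observer-moment
    (an illustrative assumption, not an established result).
    """
    if n_rescue_sims < 0 or n_originals < 0 or n_rescue_sims + n_originals == 0:
        raise ValueError("need non-negative counts and at least one copy")
    return Fraction(n_rescue_sims, n_rescue_sims + n_originals)


if __name__ == "__main__":
    # Hypothetical counts of rescue simulations per one original history.
    for n in (1, 9, 999):
        print(n, chance_of_rescue(n))
```

Under this equal-weight assumption, the probability approaches 1 only as the number of rescue simulations becomes much larger than the number of original histories.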

I don't think the two lists cancel each other out. Take medicine, for example: there are 1,000 ways to die and 1,000 ways to be cured, but we eventually die.
