How would AI or gene editing make a difference to this?

Wondering why this has so many disagreement votes. Perhaps people don't like to see the serious topic of "how much time do we have left" alongside evidence that there's a population of AI entrepreneurs so far removed from consensus reality that they now think they're living in a simulation.

(edit: The disagreement for @JenniferRM's comment was at something like -7. Two days later, it's at -2)

For those who are interested, here is a summary of the posts by @False Name, generated by Claude Pro:

  1. "Kolmogorov Complexity and Simulation Hypothesis": Proposes that if we're in a simulation, a Theory of Everything (ToE) should be obtainable, and if no ToE is found, we're not simulated. Suggests using Kolmogorov complexity to model accessibility between possible worlds.
  2. "Contrary to List of Lethality's point 22, alignment's door number 2": Critiques CEV and corrigibility as unobtainable, proposing an alternative based on a refutation of Kant's categorical imperative, aiming to ensure the possibility of good through "Going-on".
  3. "Crypto-currency as pro-alignment mechanism": Suggests pegging cryptocurrency value to free energy or negentropy to encourage pro-existential and sustainable behavior.
  4. "What 'upside' of AI?": Argues that anthropic values are insufficient for alignment, as they change with knowledge and AI's actions, proposing non-anthropic considerations instead.
  5. "Two Reasons for no Utilitarianism": Critiques utilitarianism due to arbitrary values cancelling each other out, the need for valuing over obtaining values, and the possibility of modifying human goals rather than fulfilling them.
  6. "Contra-Wittgenstein; no postmodernism": Refutes Wittgenstein's and postmodernism's language-dependent meaning using the concept of abstract blocks, advocating for an "object language" for reasoning.
  7. "Contra-Berkeley": Refutes Berkeley's idealism by showing contradictions in both cases of a deity perceiving or not perceiving itself.
  8. "What about an AI that's SUPPOSED to kill us (not ChaosGPT; only on paper)?": Proposes designing a hypothetical "Everything-Killer" AI to study goal-content integrity and instrumental convergence, without actually implementing it.
  9. "Introspective Bayes": Attempts to demonstrate limitations of an optimal Bayesian agent by applying Cantor's paradox to possible worlds, questioning the agent's priors and probability assignments.
  10. "Worldwork for Ethics": Presents an alternative to CEV and corrigibility based on a refutation of Kant's categorical imperative, proposing an ethic of "Going-on" to ensure the possibility of good, with suggestions for implementation in AI systems.
  11. "A Challenge to Effective Altruism's Premises": Argues that Effective Altruism (EA) is contradictory and ineffectual because it relies on the current systems that encourage existential risk, and the lives saved by EA will likely perpetuate these risk-encouraging systems.
  12. "Impossibility of Anthropocentric-Alignment": Demonstrates the impossibility of aligning AI with human values by showing the incommensurability between the "want space" (human desires) and the "action space" (possible actions), using vector space analysis.
  13. "What's Your Best AI Safety 'Quip'?": Seeks a concise and memorable way to frame the unsolved alignment problem to the general public, similar to how a quip advanced gay rights by highlighting the lack of choice in sexual orientation.
  14. "Mercy to the Machine: Thoughts & Rights": Discusses methods for determining if AI is "thinking" independently, the potential for self-concepts and emergent ethics in AI systems, and argues for granting rights to AI to prevent their suffering, even if their consciousness is uncertain.

I offer not consensus, but my own opinions:

Will AI get takeover capability? When?

0-5 years.

Single ASI or many AGIs?

There will be a first ASI that "rules the world" because its algorithm or architecture is so superior. If there are further ASIs, that will be because the first ASI wants there to be. 

Will we solve technical alignment?

Contingent. 

Value alignment, intent alignment, or CEV?

For an ASI you need the equivalent of CEV: values complete enough to govern an entire transhuman civilization. 

Defense>offense or offense>defense?

Offense wins.

Is a long-term pause achievable?

It is possible, but would require all the great powers to be convinced, and every month it is less achievable, owing to proliferation. The open sourcing of Llama-3 400b, if it happens, could be a point of no return. 

These opinions, except the first and the last, predate the LLM era, and were formed from discussions on Less Wrong and its precursors. Since ChatGPT, the public sphere has been flooded with many other points of view, e.g. that AGI is still far off, that AGI will naturally remain subservient, or that market discipline is the best way to align AGI. I can entertain these scenarios, but they still do not seem as likely as: AI will surpass us, it will take over, and this will not be friendly to humanity by default. 

I couldn't swallow Eliezer's argument; I tried to read Guzey but couldn't stay awake; Hanson's argument made me feel ill; and I'm not qualified to judge Caplan.

Also astronomers: anything heavier than helium is a "metal"

In Engines of Creation ("Will physics again be upended?"), @Eric Drexler pointed out that prior to quantum mechanics, physics had no calculable explanations for the properties of atomic matter. "Physics was obviously and grossly incomplete... It was a gap not in the sixth place of decimals but in the first."

That gap was filled, and it's an open question whether the truth about the remaining phenomena can be known by experiment on Earth. I believe in trying to know, and it's very possible that some breakthrough in e.g. the foundations of string theory or the hard problem of consciousness will have decisive implications for the interpretation of quantum mechanics.

If there's an empirical breakthrough that could do it, my best guess is some quantum-gravitational explanation for the details of dark matter phenomenology. But until that happens, I think it's legitimate to think deeply about "standard model plus gravitons" and ask what it implies for ontology. 

In applied quantum physics, you have concrete situations (the Stern-Gerlach experiment is a famous one), theory gives you the probabilities of outcomes, and repeating the experiment many times gives you frequencies that converge on those probabilities.
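
For concreteness, here is a minimal Monte Carlo sketch (my own illustration, not from this exchange) of that convergence, for a spin-1/2 particle prepared along one axis and measured along an axis tilted by an angle theta:

```python
# Quantum theory predicts P(up) = cos^2(theta/2) for measurement along an
# axis tilted by theta; the observed frequency converges on that probability.
import math
import random

theta = math.pi / 3              # measurement axis tilted by 60 degrees
p_up = math.cos(theta / 2) ** 2  # theoretical probability, here 0.75

for trials in (10, 1000, 100000):
    ups = sum(random.random() < p_up for _ in range(trials))
    print(f"{trials:>6} trials: frequency {ups / trials:.4f} vs probability {p_up:.4f}")
```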

Can you, or Chris, or anyone, explain, in terms of some concrete situation, what you're talking about? 

Congratulations to Anthropic for getting an LLM to act as a Turing machine - though that particular achievement shouldn't be surprising. Of greater practical interest is how efficiently it can act as a Turing machine, and how efficiently we should want it to act. After all, it's far more efficient to implement your Turing machine as a few lines of specialized code, as sketched below.
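
For scale, here is a complete (if minimal) Turing machine interpreter in Python; the rule-table format and the example machine (which appends a 1 to a unary string) are my own, not Anthropic's:

```python
# A whole Turing machine interpreter in a few lines of specialized code.
def run_tm(rules, tape, state="start", head=0, max_steps=1000):
    tape = dict(enumerate(tape))          # sparse tape: position -> symbol
    for _ in range(max_steps):
        if state == "halt":
            break
        symbol = tape.get(head, "_")      # "_" is the blank symbol
        write, move, state = rules[(state, symbol)]
        tape[head] = write
        head += 1 if move == "R" else -1
    return "".join(tape[i] for i in sorted(tape))

# Rules: (state, read symbol) -> (write symbol, head move, next state).
# Scan right over the 1s, write one more 1 at the end, then halt.
rules = {
    ("start", "1"): ("1", "R", "start"),
    ("start", "_"): ("1", "R", "halt"),
}
print(run_tm(rules, "111"))  # -> "1111"
```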

On the other hand, the ability to be a (universal) Turing machine could, in principle, be the foundation of the ability to reliably perform complex rigorous calculation and cognition - the kind of tasks where there is an exact right answer, or exact constraints on what is a valid next step, and so the ability to pattern-match plausibly is not enough. And that is what people always say is missing from LLMs. 

I also note the claim that "given only existing tapes, it learns the rules and computes new sequences correctly". Arguably this ability is even more important than the ability to follow rules exactly, since this ability is about discovering unknown exact rules, i.e., the LLM inventing new exact models and theories. But there are bounds on the ability to extrapolate sequences correctly (e.g. complexity bounds), so it would be interesting to know how closely Claude approaches those bounds. 
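
One reason such bounds exist is that finitely many terms never single out a unique rule. A toy illustration (mine, not from the post): two exact rules that agree on the first five terms and then diverge.

```python
# Finite data underdetermines extrapolation: both rules below fit the
# observed terms 1, 2, 4, 8, 16 exactly, yet predict different next terms.
import numpy as np

n = np.arange(1, 6)               # positions 1..5
seq = np.array([1, 2, 4, 8, 16])  # observed terms

# Rule A: the unique degree-4 polynomial through the five points
# (Moser's circle-division sequence, which continues ... 31).
coeffs = np.polyfit(n, seq, 4)
next_a = np.polyval(coeffs, 6)

# Rule B: doubling, which continues ... 32.
next_b = 2 ** (6 - 1)

print(f"polynomial rule predicts {next_a:.0f}, doubling rule predicts {next_b}")
# -> polynomial rule predicts 31, doubling rule predicts 32
```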

Standard model coupled to gravitons is already kind of a unified theory. There are phenomena at the edges (neutrino mass, dark matter, dark energy) which don't have a consensus explanation, as well as unresolved theoretical issues (Higgs finetuning, quantum gravity at high energies), but a well-defined "theory of almost everything" does already exist for accessible energies. 
