Mateusz Bagiński

~[agent foundations]


Let's say: if you train a coherently goal-directed, situationally aware, somewhat-better-than-human-level model using baseline forms of self-supervised pre-training + RLHF on diverse, long-horizon, real-world tasks, my subjective probability is ~25% that this model will be performing well in training in substantial part as part of an instrumental strategy for seeking power for itself and/or other AIs later.

Have you tried extending this gut estimate to something like:

If many labs use somewhat different training procedures to train their models but that each falls under the umbrella of "coherently goal-directed, situationally aware [...]", what is the probability that at least one of these models "will be performing well in training in substantial part as part of an instrumental strategy for seeking power for itself and/or other AIs later."?
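A back-of-the-envelope way to extend the gut estimate (my own illustrative sketch, not anything from the original comment): if each lab's model independently had probability p of the failure mode, then P(at least one) = 1 − (1 − p)^n. The independence assumption is almost certainly false in practice, since labs share architectures, data, and training techniques, so the true number should sit somewhere between p and this upper-end calculation.

```python
p = 0.25  # per-model probability, taken from the quoted ~25% estimate

def p_at_least_one(n: int, p: float = p) -> float:
    """P(at least one of n independent models has the failure mode)."""
    return 1 - (1 - p) ** n

# How fast the naive independent-labs number climbs with lab count.
for n in (1, 3, 5, 10):
    print(f"{n:>2} labs -> {p_at_least_one(n):.3f}")
```

Even with heavy correlation between labs, the direction of the effect is the same: more independent-ish training runs push the aggregate probability above the single-model estimate.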

To the extent that Tegmark is concerned about exfohazards (he doesn't seem to be very concerned AFAICT (?)), he would probably say that more powerful and yet more interpretable architectures are net positive.

I'm pretty sure I heard Alan Watts say something like that, at least in one direction (lower levels of organization -> higher levels). "The conflict/disorder at the lower level of the Cosmos is required for cooperation/harmony on the higher level."

Or maybe the Ultimate Good in the eyes of God is the epic sequence of: dead matter -> RNA world -> protocells -> ... -> hairless apes throwing rocks at each other and chasing gazelles -> weirdoes trying to accomplish the impossible task of raising the sanity waterline and carrying the world through the Big Filter of AI Doom -> deep utopia/galaxy lit with consciousness/The Goddess of Everything Else finale.

I mostly stopped hearing about catastrophic forgetting when Really Large Language Models became The Thing, so I figured that it's solvable by scale (likely conditional on some aspects of the training setup, idk, self-supervised predictive loss function?). Anthropic's work on Sleeper Agents seems like a very strong piece of evidence that this is the case.

Still, if they're right that KANs don't have this problem at much smaller sizes than MLP-based NNs, that's very interesting. Nevertheless, I think talking about catastrophic forgetting as a "serious problem in modern ML" is significantly misleading.

FWIW it was obvious to me

Behavioural Safety is Insufficient

Past this point, we assume following Ajeya Cotra that a strategically aware system which performs well enough to receive perfect human-provided external feedback has probably learned a deceptive human simulating model instead of the intended goal. The later techniques have the potential to address this failure mode. (It is possible that this system would still under-perform on sufficiently superhuman behavioral evaluations)

There are (IMO) plausible threat models in which alignment is very difficult but we don't need to encounter deceptive alignment. Consider the following scenario:

Our alignment techniques (whatever they are) scale pretty well, as far as we can measure, even up to well-beyond-human-level AGI. However, in the year (say) 2100, the tails come apart. It gradually becomes pretty clear that what we want our powerful AIs to do and what they actually do turn out not to generalize that well outside of the distribution on which we have been testing them so far. At this point, it is too late to roll them back, e.g. because the AIs have become incorrigible and/or power-seeking. The scenario may also have a more systemic character, with AI having already been so tightly integrated into the economy that there is no "undo button".

This doesn't assume either the sharp left turn or deceptive alignment, but I'd put it at least at level 8 in your taxonomy.

I'd put the scenario from Karl von Wendt's novel VIRTUA into this category.

Answer by Mateusz Bagiński

Maybe Hanson et al.'s Grabby aliens model? @Anders_Sandberg said that some N years before that (I think more or less at the time of working on Dissolving the Fermi Paradox), he "had all of the components [of the model] on the table" and it just didn't occur to him that they could be composed in this way (personal communication, so I may be misremembering some details). Although it's less than 10 years, so...

Speaking of Hanson, prediction markets seem like a more central example. I don't think the idea was [inconceivable in principle] 100 years ago.

ETA: I think Dissolving the Fermi Paradox may actually be a good example. Nothing in principle prohibited people puzzling about "the great silence" from using probability distributions instead of point estimates in the Drake equation. Maybe it was infeasible to compute this back in the 1950s/60s, but I guess it should have been doable in the 2000s, and still, the paper was not published until 2017.
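To make the distributions-vs-point-estimates point concrete, here's a toy Monte Carlo sketch in the spirit of the paper (the factors and ranges are made up for illustration, not taken from the Drake literature): multiplying geometric-mean point estimates gives one number, while sampling each factor from a wide log-uniform range shows that a large chunk of the probability mass can sit orders of magnitude below that number.

```python
import math
import random

random.seed(0)

def log_uniform(lo: float, hi: float) -> float:
    """Sample log-uniformly between lo and hi."""
    return math.exp(random.uniform(math.log(lo), math.log(hi)))

# Three stand-in multiplicative factors, each uncertain over orders of
# magnitude (purely illustrative ranges, not real astrobiology numbers).
ranges = [(1e-3, 1e3), (1e-6, 1.0), (1e-2, 1e2)]

# Point-estimate approach: multiply the geometric means of the ranges.
point = 1.0
for lo, hi in ranges:
    point *= math.sqrt(lo * hi)

# Distributional approach: sample each factor, multiply, repeat.
samples = []
for _ in range(100_000):
    x = 1.0
    for lo, hi in ranges:
        x *= log_uniform(lo, hi)
    samples.append(x)

# Fraction of outcomes more than 100x below the point estimate.
frac_below = sum(s < point / 100 for s in samples) / len(samples)
print(f"point estimate: {point:.3g}")
print(f"fraction of samples >100x below it: {frac_below:.2f}")
```

The qualitative takeaway matches the paper's: with heavy-tailed uncertainty in each factor, the product's distribution is wide enough that "probably nobody out there" can be consistent with the same inputs that yield a large point estimate.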

Taboo "evil" (locally, in contexts like this one)?

If you want to use it for ECL, then it's not clear to me why internal computational states would matter.
