Mateusz Bagiński

Agent foundations, AI macrostrategy, civilizational sanity, human enhancement.

I endorse and operate by Crocker's rules.

I have not signed any agreements whose existence I cannot mention.

Comments

Tomás B.'s Shortform
Mateusz Bagiński · 1d · 20

AFAIK, it is best (expected-outcomes-wise) to be short, but only when the shortness is due to "boring genetic reasons" (as opposed to (genetic or non-genetic) disease): having fewer cells means a smaller propensity to develop cancer, plus a bunch of other stuff (holding everything else constant).

People Seem Funny In The Head About Subtle Signals
Mateusz Bagiński · 1d · 77

This, and/or "selecting for a partner who is good at reading my signals", and/or plausible deniability, is my go-to explanation for the dating case, but I don't think it applies to everything discussed in this post, or even to all of the stuff in the dating case.

People Seem Funny In The Head About Subtle Signals
Mateusz Bagiński · 2d · 40
  1. 40%?
  2. No, my impression has always been that you aim for comfy clothes.
    1. Maybe modulo cases of you wearing an AI Safety Camp t-shirt or something like that.
    2. Maybe, in addition to that, you're kinda trying to signal a preference for comfy clothes by deliberately choosing clothes that someone would choose iff they prioritized comfiness above all else. Not that I have any specific evidence of that, just putting a hypothesis on the table.

Mateusz Bagiński's Shortform
Mateusz Bagiński · 2d · 150

In his MLST podcast appearance in early 2023, Connor Leahy describes Alfred Korzybski as a sort of "rationalist before the rationalists":

Funny story: rationalists actually did exist, technically, before or around World War One. So, there is a Polish nobleman named Alfred Korzybski who, after seeing horrors of World War One, thought that as technology keeps improving, well, wisdom's not improving, then the world will end and all humans will be eradicated, so we must focus on producing human rationality in order to prevent this existential catastrophe. This is a real person who really lived, and he actually sat down for like 10 years to like figure out how to like solve all human rationality. God bless his autistic soul. You know, he failed obviously but you know you can see that the idea is not new in this regard.

Korzybski's two published books are Manhood of Humanity (1921) and Science and Sanity (1933).

E. P. Dutton published Korzybski's first book, Manhood of Humanity, in 1921. In this work he proposed and explained in detail a new theory of humankind: mankind as a "time-binding" class of life (humans perform time binding by the transmission of knowledge and abstractions through time which become accreted in cultures).

Having read the book (and having filtered it through some of my own interpretation of it and perhaps some steelmanning), I am inclined to interpret his "time-binding" as something like (1) accumulation of knowledge from past experience across time windows that are inaccessible to any other animal (both individual (long childhoods) and cultural learning); and (2) the ability to predict and influence the future. This gets into the neighborhood of "agency as time-travel", consequentialist cognition, etc.

From the wiki page on his other book:

His best known dictum is "The map is not the territory": He argued that most people confuse reality with its conceptual model.

(But that is relatively well-known.)

Korzybski intended the book to serve as a training manual. In 1948, Korzybski authorized publication of Selections from Science and Sanity after educators voiced concerns that at more than 800 pages, the full book was too bulky and expensive.

As Connor said...

God bless his autistic soul. You know, he failed obviously but

...but 60 years later, his project would be restarted.


See also: https://www.lesswrong.com/posts/qc7P2NwfxQMC3hdgm/rationalism-before-the-sequences 

Wei Dai's Shortform
Mateusz Bagiński · 3d · 20

[Tangent:]

There is a sort of upside to this: to the extent that people are more inclined to post shortforms than longforms because of the lower perceived/expected effort of the former, there is room for (optional?) UX engineering that makes writing longforms feel a bit more like writing shortforms, so that people who have something to write but feel "ugh, that would be a lot of effort, I'll do it when I'm not as tired [or whatever]" would be more inclined to write and post it.

Relatedly, every few days I find myself writing some long and detailed message in a DM that I would be less motivated to write in my personal notes, let alone turn into a blog post, and sometimes the message turns out to look like a first draft of a blog post.[1] How to hijack this with UX?[2]

  1. ^

    After I started talking about this, I found out that "write an article like a message to an intellectual-peer friend" is apparently something like folk advice.

  2. ^

    Of course, also: How to hijack this with stuff other than UX?

Intentionality
Mateusz Bagiński · 3d · 20

"Intentionality" fits somewhat nicely Michael Bratman's view of intentions as partial plans: you fix some aspect of your policy to satisfy a desire, so that you are robust against noisy perturbations (noisy signals, moments of "weakness of will", etc), can use the belief that you're going to behave in a certain way as an input to your further decisions and beliefs (as well as other agents' precommitments), not have to precompute everything in runtime, etc.[1]

A downside of the word is that it collides in the namespace with how "intentionality" is typically used in philosophy of mind, close to referentiality (cf. Tomasello's shared intentionality).

Perhaps the concept of "deliberation" from LOGI is trying to point in this direction, although it covers more stuff than consulting explicit representations.

The human mind, owing to its accretive evolutionary origin, has several major distinct candidates for the mind’s “center of gravity.” For example, the limbic system is an evolutionarily ancient part of the brain that now coordinates activities in many of the other systems that later grew up around it. However, in (cautiously) considering what a more foresightful and less accretive design for intelligence might look like, I find that a single center of gravity stands out as having the most complexity and doing most of the substantive work of intelligence, such that in an AI, to an even greater degree than in humans, this center of gravity would probably become the central supersystem of the mind. This center of gravity is the cognitive superprocess which is introspectively observed by humans through the internal narrative—the process whose workings are reflected in the mental sentences that we internally “speak” and internally “hear” when thinking about a problem. To avoid the awkward phrase “stream of consciousness” and the loaded word “consciousness,” this cognitive superprocess will hereafter be referred to as deliberation.

[ ... ]

Deliberation describes the activities carried out by patterns of thoughts. The patterns in deliberation are not just epiphenomenal properties of thought sequences; the deliberation level is a complete layer of organization, with complexity specific to that layer. In a deliberative AI, it is patterns of thoughts that plan and design, transforming abstract high-level goal patterns into specific low-level goal patterns; it is patterns of thoughts that reason from current knowledge to predictions about unknown variables or future sensory data; it is patterns of thoughts that reason about unexplained observations to invent hypotheses about possible causes. In general, deliberation uses organized sequences of thoughts to solve knowledge problems in the pursuit of real-world goals.

Cf. https://www.lesswrong.com/w/deliberate-practice. Wiktionary defines "deliberate" in terms of "intentional": https://en.wiktionary.org/wiki/deliberate#Adjective. 

  1. ^

    At least that's the Bratman-adjacent view of intention that I have.

Trying to understand my own cognitive edge
Mateusz Bagiński · 5d · 20

Thanks!

The entire thing seems to have very https://www.lesswrong.com/posts/bhLxWTkRc8GXunFcB/what-are-you-tracking-in-your-head vibes, though that's admittedly not very specific.

What stands out to me in the b-money case is that you kept tabs on "what the thing is for"/"the actual function of the thing"/"what role it is serving in the economy", which helped you figure out how to make a significant improvement.

Very speculatively, maybe something similar was going on in the UDT case? If the ideal platonic theory of decision-making "should" tell you and your alt-timeline-selves how to act in a way that coheres (~adds up to something coherent?) across the multiverse or whatever, then it's possible that having anthropics as the initial motivation helped.

Trying to understand my own cognitive edge
Mateusz Bagiński · 5d · 40

the main thing that appears to have happened is that I had exceptional intuitions about what problems/fields/approaches were important and promising

I'd like to double-click on your exceptional intuitions, though I don't know what questions would be most revealing if answered. Maybe: could you elaborate on what you saw that others didn't see and that made you propose b-money, UDT, the need for an AI pause/slowdown, etc?

E.g., what's your guess as to what Eliezer was missing (in his intuitions?), such that he came up with TDT but not UDT? Follow-up: Do you remember the trace that led you from TDT to UDT? (If you don't, what's your best guess as to what it was?)

leogao's Shortform
Mateusz Bagiński · 7d · 40

And many Santa Fe people more generally, e.g., https://www.sfipress.org/books/history-big-history-metahistory or https://www.amazon.com/Scale-Universal-Innovation-Sustainability-Organisms/dp/1594205582 or https://www.amazon.com/Making-Sense-Chaos-Better-Economics/dp/0300273770 

The Doomers Were Right
Mateusz Bagiński · 15d · 3-2

I think the downvotes are from the general norm of not posting comments with memes as the only/main content.

Posts

83 · Reasons to sign a statement to ban superintelligence (+ FAQ for those on the fence) · 26d · 4
237 · Safety researchers should take a public stance · 2mo · 65
23 · Counter-considerations on AI arms races · 6mo · 0
14 · [Q] Comprehensive up-to-date resources on the Chinese Communist Party's AI strategy, etc? · 7mo · 6
35 · Goodhart Typology via Structure, Function, and Randomness Distributions · 7mo · 1
24 · Bounded AI might be viable · 8mo · 4
57 · Less Anti-Dakka · 1y · 5
9 · Some Problems with Ordinal Optimization Frame · 2y · 0
7 · [Q] What are the weirdest things a human may want for their own sake? · 2y · 16
26 · [Ω] Three Types of Constraints in the Space of Agents · 2y · 3
Wikitag Contributions

Sufficiently optimized agents appear coherent · 3 days ago · (+99/-75)
Sufficiently optimized agents appear coherent · 3 days ago
Sufficiently optimized agents appear coherent · 3 days ago · (+214/-164)
Sufficiently optimized agents appear coherent · 3 days ago · (-27)
Relevant powerful agents will be highly optimized · 3 days ago · (+19/-47)
Relevant powerful agents will be highly optimized · 3 days ago · (+70/-75)
5-and-10 · 4 months ago
Alien Values · 6 months ago · (+23/-22)
Corrigibility · 8 months ago · (+119)
Corrigibility · 8 months ago · (+12/-13)