Unfortunately also the most likely person to be in charge of an AGI company...

But I'm not likely to regularly check your blog.

Just noting in case you (or others reading) are not familiar that Substack provides an RSS feed for every blog.

Re 6:

Disclaimer: I've only read the FDT paper and did so a long time ago, so feel free to ignore this comment if it is trivially wrong.

I don't see why FDT would assume that the agent has access to its own source code and inputs as a symbol string. I think you can reason about the logical correlation between different agents' decisions without it, and in fact people do so all the time. For example, when it comes to voting, people often urge others by saying "if no one voted, we could not have a functional democracy"; or they say "don't throw away that plastic bottle, because if everyone did, we would live in trash heaps"; or consider the reasoning about voting blue on pill questions on Twitter. These examples contain reasoning that has the 3 key parts of FDT (as I understand it, at least).

  1. Identifying the agents who use these 3 steps in their reasoning (other humans with a similar cultural background, resulting in a conception of morality influenced by these 3 steps).
  2. Simulating the hypothetical worlds with each possible reasoning outcome and evaluating their value.
  3. Choosing the option resulting in the most value as the outcome of this reasoning process.
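
As a rough illustration of the three steps above, here is a toy sketch in Python. It is not the formalism from the FDT paper: the function names, the payoff numbers and the turnout threshold are all made up for the example, and step 1 (figuring out how many agents reason the same way) is simply passed in as a number.

```python
def correlated_choice(options, n_correlated, world_value):
    """Steps 2 and 3: evaluate each hypothetical world in which all
    n_correlated agents reach the same conclusion, then pick the best one."""
    return max(options, key=lambda option: world_value(option, n_correlated))


def voting_world_value(option, n_same_choice, turnout_needed=1000):
    """Hypothetical payoff model (all numbers are invented).

    Democracy only functions if enough people vote; each vote has a small cost.
    """
    cost_per_vote = 1.0
    democracy_value = 1_000_000.0
    if option != "vote":
        return 0.0
    benefit = democracy_value if n_same_choice >= turnout_needed else 0.0
    return benefit - cost_per_vote * n_same_choice


options = ["vote", "stay home"]

# Step 1 (assumed as given): 10,000 people reason the same way I do.
print(correlated_choice(options, 10_000, voting_world_value))

# Same model, but treating my decision as uncorrelated with anyone else's.
print(correlated_choice(options, 1, voting_world_value))
```

Under these invented numbers, the correlated reasoner votes and the uncorrelated one stays home, which is the gap the everyday "what if everyone did that" argument is pointing at.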

Of course, only aspiring rationalists would call this "fdt". Regular people would probably call this reasoning (a proper subset of) "being a decent person", and moral philosophers would call it (a form of) "rule utilitarianism" (where instead of evaluating rules we evaluate possible algorithm outcomes), but the reasoning is the same, no? (There is of course no (or at least very little) actual causal effect of my going to vote or throwing trash away on what others do, and similarly very little chance of my being the deciding vote (by my calculations for an election, with polling data and reasonable assumptions, that chance is small even compared to the vast amount of value at stake). So humans actually use this reasoning, even if the steps are often just implied and not stated explicitly.)
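
For a feel of the orders of magnitude involved in being the deciding vote, here is a back-of-the-envelope sketch. It is one very simple model (every other voter votes independently with the same probability), not the calculation mentioned above, and the electorate size and support levels are placeholder numbers.

```python
import math


def tie_probability(n_other_voters, p_support):
    """Probability that an even number of other voters split exactly in half,
    assuming each one independently backs side A with probability p_support.

    Computed in log space so that large electorates do not overflow.
    """
    half = n_other_voters // 2
    log_prob = (
        math.lgamma(n_other_voters + 1)      # log(n!)
        - 2 * math.lgamma(half + 1)          # - 2 * log((n/2)!)
        + half * math.log(p_support)         # votes for side A
        + half * math.log(1 - p_support)     # votes for side B
    )
    return math.exp(log_prob)


# Placeholder inputs: one million other voters, dead-even vs. slightly lopsided.
print(tie_probability(1_000_000, 0.50))
print(tie_probability(1_000_000, 0.51))
```

In this model the chance of an exact tie is already below one in a thousand for a dead-even race of a million voters, and it collapses toward zero as soon as the race is even slightly lopsided. More realistic models that treat the support level itself as uncertain give different numbers.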

In conclusion, if you know something about the origins of yourself and other agents, you can detect logical correlations with some probability even without source code. (In fact, source code is a special case of the general situation: if the source code is valid and you know this, you necessarily know of a causal connection between the printed-out source code and the agent.)

Embarrassingly minor nitpick I'm too neurotic to not mention: It's the ceil of N/2 instead of floor.

Asked 6 days ago, still no answer, yet OP commented a bunch in that time. Hmmm...

Personally, I think there are almost certainly no extraterrestrials here, so I'm not sure the 4chan post is worth reading. (I was just wondering whether the common elements were inspired by it or not.)

I'm curious, had you seen the 4chan leak before writing this (if you don't mind answering)?

I think I have a similar view to Dagon's, so let me pop in and hopefully help explain it.

I believe that when you refer to "consciousness" you are equating it with what philosophers would usually call the neural correlates of consciousness. Consciousness as used by (most) philosophers (or, and more importantly in my opinion, laypeople) refers specifically to the subjective experience, the "blueness of blue", and is inherently metaphysically queer, in this respect similar to objective, human-independent morality (realism) or a non-compatibilist conception of free will. And, like those, it does not exist in the real world; people are just mistaken for various reasons. Unfortunately, unlike those, it is seemingly impossible to fully deconfuse oneself from believing consciousness exists. A quirk of our hardware is that it comes with the axiom that consciousness is real, probably because of the advantages you mention: it made reasoning and communicating about one's state easier. (Note that it is merely the false belief that consciousness exists which is hardcoded, not consciousness itself.)

Hopefully the answers to your questions are clear under this framework (we talk about consciousness because we believe in it; we believe in it because it was useful to believe in it, even though it is a false belief; humans have no direct knowledge about consciousness, as knowledge requires the belief to be true, they merely have a belief; consciousness IS magic by definition; unfortunately, magic (probably) does not exist).

After reading this, you might dispute the usefulness of this definition of consciousness, and I don't have much to offer in response. I simply dislike redefining things from their original meanings just so we can claim statements we are happier about (like compatibilist, meta-ethical expressivist, naturalist, etc. philosophers do).

Level of AI risk concern: medium

General level of risk tolerance in everyday life: low

Brief summary of what you do in AI: training NNs for this and that, not researching them, thought some amount about AI risk over a few years

Anything weird about you: I don't like to give too much information about myself online, but I do have a policy of answering polls I've interacted with a bit (e.g. read the replies), to fight selection effects.

