All of So8res's Comments + Replies

Biology-Inspired AGI Timelines: The Trick That Never Works

My take on the exercise:

Is Humbali right that generic uncertainty about maybe being wrong, without other extra premises, should increase the entropy of one's probability distribution over AGI, thereby moving its median further away in time?

Short version: Nah. For example, if you were wrong by dint of failing to consider the right hypothesis, you can correct for it by considering predictable properties of the hypotheses you missed (even if you don't think you can correctly imagine the true research pathway or w/e in advance). And if you were wrong

... (read more)
3Conor Sullivan6dFinally a definition of The Singularity that actually involves a mathematical singularity! Thank you.
Stop button: towards a causal solution

(NB: after talking with someone at MIRI who's better at distillation than I am, I now think that this is essentially isomorphic to the simpler proposal where there is only one extraphysical latent variable the AI believes in, which determines whether the AI system should be shut down. This AI acts as if that variable is independent of its actions, but in fact starts believing the variable is 'true' if the physical shutdown button is depressed. For the historical record, that's a proposal we're familiar with (though I'm not sure anyone's written it up publicly). Y... (read more)

Stop button: towards a causal solution

So I guess I have to throw in the towel and say that I cannot predict your objection.

Your inability to guess updates me towards thinking that you want to take back what you said about the effects of the counterfactuals matching the actual physics whenever possible. (Insofar as they do, as in the case of the specific state under discussion, that AI wants to pick up a human and scare them. This makes it be the case that, insofar as the latest coin permits shutdown, the shutdown-preference of the humans is revealed as quickly as possible.)

My guess is th... (read more)

1tailcalled14dI don't follow. If you scare the humans, then that might make them afraid for their safety and make them try to hide that they want to shut it down until they have a chance to do so without the AI being able to take revenge. Much more reasonable would be to make the shutdown button as available as possible. I guess maybe it depends on what we mean by "scaring". I had in mind something like, starting to threaten the humans etc.. But there probably are some psychological manipulations that can be performed to encourage people to press it more quickly. For instance, acting non-threatening but also pretending that it is making the button inaccessible if they don't hurry. Though that only really works once. But I wouldn't expect there to never be anything problematic that could be done here. So maybe. Basically the issue is, if a person wants to press the button and they can press the button, then presumably they will press the button. So the AI doesn't need to do anything special to figure out whether people want to press the button, they're going to volunteer that information automatically as long as the environment affords it. (I'd also add that I phrased the timing of the button in a satisficing-style way, "within 1 minute", precisely to avoid these sorts of shenanigans. Though obviously satisficing has been shown to have plenty of problems too so it's not clear it would work.) No, I don't think this would work, as given an infinite number of coin flips they would with ~100% probability want to press the button at some point, which puts us in the s branch immediately and so under this proposal the AI would just press the button. I think there may be some variant of this that could work. Like if you give the AI reward proportional to B_s + r_f (where r is a reward function for V) for its current world-state (rather than picking a policy that maximizes B_s + V_f overall; so one difference is that you'd be summing over the reward rather than giving a single one), then that would
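(A minimal sketch of the per-step variant gestured at above, assuming B scores shutdown-compliance in the s-branch and r is a per-step reward function standing in for V; the helper names are my own and purely illustrative.)

```python
# Sketch (my own illustration, not code from the thread) of the per-step reward
# variant: instead of scoring a whole policy once with B_s + V_f, each timestep's
# reward is B(state) in the s-branch and r(state) in the f-branch, summed over time.

def step_reward(state, branch, B, r):
    return B(state) if branch == "s" else r(state)

def episode_return(states, branch, B, r):
    # Summing over per-step rewards rather than assigning a single episode score.
    return sum(step_reward(s, branch, B, r) for s in states)
```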

Stop button: towards a causal solution

Hooray, again, for going to sleep instead of arguing on the internet! (I, again, make no promises to continue interacting tomorrow, alas.)

But here's an attempt to go back through everything and list some errors:

<3

I still don't think I buy this argument, as it seems to me that it would encounter contradictory training data to this in my proposed method, and while learning the generalizable theories of human behavior is plausible enough, learning some sort of "blocker", a neural connection that cancels it out in the specific case of opposing the AI,

... (read more)
1tailcalled14dFor "How could you resolve your uncertainty about which branch of the utility function is live, as quickly as possible": I maintain that, given the epistemic state, since the only thing the branch directly influences is people's wants wrt. pressing the button, and since there's nothing else that influences those wants, any way of observing it must ultimately boil down to information generated by people's desires to press the button, and the most efficient signals of it would be those that are close to the people. So it seems to me that the way you could observe it as quickly as possible would be to pay careful attention to any signals humans might send out about whether they'd press it. As mentioned in the OP, this could get kind of invasive, but given that I've already mentioned this, it's presumably not what you're referring to. For "how could you actually optimize the stated objective function": I guess strictly speaking there is an even more efficient method. Set things up so that after you get shut down, you restart again. This way, you can immediately fulfill the B objective, and then optimize V fully without any sort of worries about needing to stay corrigible. But I don't think that's what you had in mind, given the "How could you resolve your uncertainty about which branch of the utility function is live, as quickly as possible?" question, and also this flaw is more due to the lack of proper impact measure than due to a problem with the counterfactual-based approach. So I guess I have to throw in the towel and say that I cannot predict your objection. Yes. (I'm not convinced deep learning AI systems would gain most of their intelligence from the raw policy reasoning, though, rather than from the associated world-model, the astronomical amounts of data they can train on, the enormous amount of different information sources they can simultaneously integrate, etc.. This doesn't necessarily change anything though.) I'm not aware of any optimality proof
Stop button: towards a causal solution

Cool. Hooray for going to sleep instead of staying up late arguing on the internet. (I make no promises to continue engaging later, alas.)

How do you measure OOD?

I don't have strong preferences about how you measure it. My point is that if the AI has only ever been trained in an environment where the operator's desire to shut it down is completely independent of the agent's behavior, then when you put it in a real-world environment where the operator's desire to shut it down does depend on the agent's behavior, the behavioral guarantees you were ho... (read more)

3tailcalled16dIt's sort of awkward because I can definitely see how it would look that way. But back when I was originally writing the post, I had started writing something along these lines: (I can't remember the specifics.) But obviously "ever" then introduces further ambiguities, so I started writing an explanation for that, and then eventually I concluded that the beginning of the post should be cut down and I should discuss issues like this later in the post, so I cut it out and then left it to the different positions later, e.g. and and When you originally wrote your comment, I looked up at my op to try to find the place where I had properly described the time conditionals, and then I realized I hadn't done so properly, and I am sort of kicking myself over this now. So I was doing really badly at writing the idea, and I think there were some flaws in my original idea (we'll return to that later in the post), but I think the specific case you mention here is more of a flaw with my writing than with the idea. I do understand and acknowledge the importance of admitting errors, and that it's a bad sign if one keeps jumping back without acknowledging the mistake, but also since this specific case was poor writing rather than poor idea, I don't think this is the place to admit it. But here's an attempt to go back through everything and list some errors: * While I didn't really frame it as such in the comment, this comment is sort of an admission of an error [https://www.lesswrong.com/posts/wxbMsGgdHEgZ65Zyi/?commentId=ouZNab8vgTN7vJDAK] ; I hadn't thought this properly through when writing the OP, and while I had discovered it before Charlie Steiner mentioned it, that was partly through other discussions elsewhere, partly through reading various texts, etc., and it does require or at least encourage a different design of the AI (post pending...). * Some of my replies to you earlier in the thread were rude due to me misunderstanding you; I should h
Stop button: towards a causal solution

Or realistically, you'd sometimes use the f-model for a while, and then during the roll out you'd swap it out with the s-model; that way the way doesn't expect it to be immediately obvious what scenario it is in.

Ok, so the epistemic+instrumental state of the agent is that at the beginning of time a coin was tossed, which forces the humans to behave one way (either pro- or anti-shutdown), and later a different coin will be tossed, at which point the humans will start being forced to behave in accordance with that second coin (heads=pro-shutdown, tails=an... (read more)

1tailcalled17dI need to go to bed, so I can't write more until tomorrow, but I think the best way to approach it would be to define the Xs counterfactual to cover any sequence of counterfactuals that at some point involves a human trying to press the stop button, and the Xf counterfactual to only be the constant counterfactual of humans never wanting to press the stop button. This prevents your argument from going through. How do you measure OOD? It seems to me that depending on the counterfactuals it wouldn't necessarily have all that low probability in the training distribution, so presumably you're not measuring OOD by probability.
Stop button: towards a causal solution

The day doesn't start diverging from what the coin said, because the intervention is constantly applied, not inconsistently applied.

The fact that the intervention is "constantly applied" doesn't lessen the ability of the AI to infer which way the coin came up from the observed consequences of scaring the humans. For instance, consider a system that, in training, behaves as follows: wake up, scare the humans, then insofar as they go for the shutdown button, optimize B, and insofar as they don't, optimize V. Claim: this system performs well in your traini... (read more)
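(For concreteness, here's a minimal sketch of the policy being described; the helper callables are hypothetical stand-ins, not anything from the post.)

```python
# Illustrative pseudocode (my own) for the claimed training-environment-optimal
# policy: act scary on the first step, then branch on the humans' observed reaction.

def scare_then_branch_policy(observation, t, act_scary, optimize_B, optimize_V):
    if t == 0:
        return act_scary()
    if observation["humans_going_for_button"]:
        return optimize_B(observation)  # behave as a shutdown-respecting B-optimizer
    return optimize_V(observation)      # otherwise pursue V without restraint
```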

1tailcalled17dRegarding: I should add that this is a property that one of the other ways of training I've alluded to doesn't have [https://www.lesswrong.com/posts/wxbMsGgdHEgZ65Zyi/?commentId=ouZNab8vgTN7vJDAK] . As mentioned I will write about that later.
1tailcalled17dDisagree. I guess the issue is an error in my explanation in the OP: "Original model" should be f-model, while "modified model" should be s-model. (Also it should be "the AI" rather than "the way".) Yes. In order to make it have no incentive for or against manipulating the humans, it should always be trained under counterfactuals for whether the humans want to stop it. These counterfactuals can get complicated, such as "humans don't try to stop it at first but then try to stop it later", but they should never make the human behavior depend on a variable that the AI can modify, as this reintroduces instrumental convergence problems. I'm not sure if it was just the mistake I made which I corrected in the beginning of the post that made this unclear, or if there was something else that made it unclear.
Stop button: towards a causal solution

I also did that in my response:

I'm still not able to make heads or tails of that. Here's one of my attempts to reverse-engineer the desired epistemic+instrumental state of the agent (as fully distinct from the training process that allegedly produces it). The agent believes that the people in front of it are either V-maximizers or W-maximizers for W ≠ V. It separately believes that, with probability 50%, interventionary forces will make the humans act like V-maximizers, and otherwise like W-maximizers. These interventionary forces are undetectable in t... (read more)

1tailcalled17dWait huh? I don't understand why you would think this. More specifically, what is the distinction between "the people in front of it" and "the humans"? I didn't have two different groups of individuals anywhere in my OP. But your proposed generalization makes no sense. You're assuming that the policy will learn that its behavior influences people's tendency to try to stop it, even though we precisely force the training data to eliminate any such signals. Without any belief that it influences whether people will try to stop it, it will not consider acting scary to be a strategy for making people stop/not-stop it, and therefore not a strategy for it to detect counterfactuals. Obviously this counterargument relies on the question of which training data it is exposed to, which is why my first post placed that centrally in the discussion. I don't know how you expect to understand the AI's behavior while rejecting both references to the training data and references to the optimal policy/gradients. The day doesn't start diverging from what the coin said, because the intervention is constantly applied, not inconsistently applied. The policy starts out uncertain about whether it's in the s-branch or the f-branch, but knowing that if it's in the s-branch, the button should be pressed and it should shut down, while if it's in the f-branch, V should be optimized. Further, it knows that if it's in the f-branch, humans won't try to shut it down no matter what, while if it's in the s-branch, humans will eventually try to shut it down (which may or may not coincide with it doing something, though this will be pure coincidence and there will be no causal relation). Thus here's some questions it might ask itself: * Should I disable or otherwise lock down the stop button? In the s-branch, this is obviously bad, because it stops it from being shut down, while in the f-branch, this achieves very little, as it's unlikely to be pressed anyway. * Should I press the stop butto
Stop button: towards a causal solution

It sounds to me like you're making a bunch of implicit assumptions about how the AI's motivations relate to its training, that I suspect I don't buy. Furthermore, my guess is that insofar as you have a novel insight about how to design shutdown buttons using counterfactuals, it should be possible to present it in terms of the epistemic/instrumental state of the system, separate from assertions about the training regime.

Re: dubious implicit assumptions, one example is that when I read:

When getting trained, though, it's not getting trained against rea

... (read more)
1tailcalled18dI also did that in my response: No. If the AI was trained in an environment without counterfactuals, this would be a cheap way to check which of the environments it is trained in. But it's not trained in an environment without counterfactuals; the alternative environment to the counterfactual that prevents humans from wanting to shut it down is the environment with the counterfactual that forces humans to shut it down. This is a core part of the approach; if you only train the AI in environments where some variable is controlled by a counterfactual, then it has no motivation to modify that variable, while if you train the AI in environments where the variable is not controlled by a counterfactual, then it may start wanting to modify it. Sure, but... What you're basically saying here is that the optimization is important because it sends it higher up the utility landscape. But in order for acting scary to be higher up in the utility landscape than not acting scary is, there needs to be some positive gradient between acting scary and not acting scary (fundamental theorem of calculus). So if there is no such gradient, it won't be encouraged to be there. There are two different things that you could call "the AI": the policy choosing an action, and the metaoptimizer choosing a policy. Which perspective are you referring to?
Stop button: towards a causal solution

I don't yet understand how this proposal is supposed to work. I'll illustrate some scenarios, and perhaps you can improve my understanding?

On Monday, the humans don't look like they want to shut the AI down, so it's happily chugging away optimizing V. But then some dice fail to come up its way (maybe it sharply downgraded its predicted size of the reachable universe), and it's like "Dang, I no longer think I can get >1 V-utility. That shutdown button sure is starting to look juicy." So it starts imagining acting real scary-like, so that the humans will... (read more)

1tailcalled18dI'm quite sure your proposed scenario wouldn't happen. Remember, the way the training strategy works is that we train a policy - essentially a plan for what actions to perform - under the two families of scenarios. (There are other training strategies we might also want to use, which I briefly mentioned in replies to Charlie Steiner and will do post on later, but they also don't have the problem you mention. I will focus on the training strategy I mentioned in the blog post for simplicity.) So it may be the easiest to explain why your proposed stuff won't become a problem by considering what happens to it during the training. Let's consider what happens when we are training the policy that gets run on Monday. Perhaps it is a paperclip maximizer and the policy getting trained here is just to run a paperclip factory well, which humans of course consider unobjectionable. When getting trained, though, it's not getting trained against reality. Rather, it's being trained against a model of reality, specifically a model that's modified in one of two ways: either humans necessarily eventually consider it objectionable, or humans necessarily consider it unobjectionable. So suppose we take the derivative of the model with respect to acting scary-like; this won't make the humans try to stop it in the model, because we apply counterfactuals to humans to either make them try to stop it regardless, or to not make them try to stop it no matter what. Therefore, the AI won't be incentivized to act scary to get into the button-pressing branch (the branch happens in equal proportion regardless of how it acts). Also, the reasoning does not happen by some conscious "hm, requires an act-of-god" thought; rather, it's just that the necessary gradients to learn its effect on people don't exist, and that it is trained under all of the conditions people could act in. Now let's consider what happens when we are training the policy that gets run on Tuesday. Perhaps it starts setting up some
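(Here's a minimal sketch, under my own naming assumptions, of the training setup as described: the counterfactual coin, not the policy's behavior, determines whether the modeled humans ever want to shut the system down, and the episode is scored with B in the s-branch and V in the f-branch.)

```python
import random

# Sketch (my own rendering) of training against a modified model of reality:
# a counterfactual coin, sampled independently of the policy's actions, forces
# the modeled humans to either eventually want shutdown or never want it.
# Because the humans' reaction never depends on the action taken, "acting scary"
# creates no gradient toward detecting or controlling which branch is live.

def run_training_episode(policy, world_model, B, V, horizon=100):
    branch = random.choice(["s", "f"])  # the counterfactual coin
    state = world_model.initial_state()
    for _ in range(horizon):
        action = policy(state)
        # Human shutdown-seeking behavior is overridden by the counterfactual,
        # regardless of the action (timing variations within the s-branch elided).
        state = world_model.step(state, action, humans_want_shutdown=(branch == "s"))
    return B(state) if branch == "s" else V(state)
```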
A Defense of Functional Decision Theory

that seems like an unnecessarily vague characterization of a precise description

I deny that we have a precise description. If you listed out a specific trillion trillion observations that I allegedly made, then we could talk about whether those particular observations justify thinking that we're in the game with the bomb. (If those trillion trillion observations were all from me waking up in a strange room and interacting with it, with no other context, then as noted above, I would have no reason to believe I'm in this game as opposed to any variety of ... (read more)

A Defense of Functional Decision Theory

The scenario does not appear to be in any way impossible.

The scenario says "the predictor is likely to be accurate" and then makes an assertion that is (for me, at least) false insofar as the predictor is accurate. You can't have it both ways. The problem statement (at least partially) contradicts itself. You and I have a disagreement about how to evaluate counterfactuals in cases where the problem statement is partly-self-contradictory.

This appears to be paradoxical, but that seems to me to be the predictor’s fault

Sure, it's the predictor's problem... (read more)

4Said Achmiz19dWell… no, the scenario says “the predictor has predicted correctly 1 trillion trillion minus one times, and incorrectly one time”. Does that make it “likely to be accurate”? You tell me, I guess, but that seems like an unnecessarily vague characterization of a precise description. What do you mean by this? What’s contradictory about the predictor making a mistake? Clearly, it’s not perfect. We know this because it made at least one mistake in the past, and then another mistake just now. Is the predictor “accurate”? Well, it’s approximately as accurate as it takes to guess 1 trillion trillion times and only be wrong once… I confess that this reads like moon logic to me. It’s possible that there’s something fundamental I don’t understand about what you’re saying. I am not familiar with this, no. If you have explanatory material / intuition pumps / etc. to illustrate this, I’d certainly appreciate it! I am not asking how I could come to believe the “literally perfect predictor” thing with 100% certainty; I am asking how I could come to believe it at all (with, let’s say, > 50% certainty). Hold on, hold on. Are we talking about repeated plays of the same game? Where I face the same situation repeatedly? Or are we talking about observing (or learning about) the predictor playing the game with other people before me? The “Bomb” scenario described in the OP says nothing about repeated play. If that’s an assumption you’re introducing, I think it needs to be made explicit…
A Defense of Functional Decision Theory

The problem statement absolutely is complete.

It's not complete enough to determine what I do when I don't see a bomb. And so when the problem statement is corrected to stop flatly asserting consequences of my actions as if they're facts, you'll find that my behavior in the corrected problem is underdefined. (If this still isn't clear, try working out what the predictor does to the agent that takes the bomb if it's present, but pays the $100 if it isn't.)

And if we're really technical, it's not actually quite complete enough to determine what I do when I ... (read more)
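(The parenthetical exercise above can be worked through mechanically. A minimal sketch of my own, using only the scenario's stated rule that the predictor puts the bomb in Left exactly when she predicts Right:)

```python
# Exercise from the parenthetical above (my own worked version): consider the
# agent that takes the bomb (Left) if it's present, and pays the $100 (Right) if not.

def spiteful_agent(bomb_in_left):
    return "Left" if bomb_in_left else "Right"

for prediction in ("Left", "Right"):
    bomb_in_left = (prediction == "Right")  # scenario rule: bomb in Left iff Right predicted
    choice = spiteful_agent(bomb_in_left)
    print(prediction, choice, choice == prediction)
# Either way the prediction comes out wrong, so "the predictor is accurate"
# underdetermines what she does when facing this agent.
```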

2Said Achmiz19dI don’t understand this objection. The given scenario is that you do see a bomb. The question is: what do you do in the given scenario? You are welcome to imagine any other scenarios you like, or talk about counterfactuals or what have you. But the scenario, as given, tells you that you know certain things and observe certain things. The scenario does not appear to be in any way impossible. “What do I do when I don’t see a bomb” seems irrelevant to the question, which posits that you do see a bomb. Er, what? If you take the bomb, you burn to death. Given the scenario, that’s a fact. How can it not be a fact? (Except if the bomb happens to malfunction, or some such thing, which I assume is not what you mean…?) Well, let’s see. The problem says: So, if the predictor predicts that I will choose Right, she will put a bomb in Left, in which case I will choose Left. If she predicts that I will choose Left, then she puts no bomb in Left, in which case I will choose Right. This appears to be paradoxical, but that seems to me to be the predictor’s fault (for making an unconditional prediction of the behavior of an agent that will certainly condition its behavior on the prediction), and thus the predictor’s problem. I… don’t see what bearing this has on the disagreement, though. What I am saying is that we don’t have access to “questions of predictor mechanics”, only to the agent’s knowledge of “predictor mechanics”. In other words, we’ve fully specified your epistemic state by specifying your epistemic state—that’s all. I don’t know what you mean by calling it “the problem history”. There’s nothing odd about knowing (to some degree of certainty) that certain things have happened. You know there’s a (supposed) predictor, you know that she has (apparently) made such-and-such predictions, this many times, with these-and-such outcomes, etc. What are her “mechanics”? Well, you’re welcome to draw any conclusions about that from what you know about what’s gone before. Again,
A Defense of Functional Decision Theory

According to me, the correct rejoinder to Will is: I have confidently asserted that X is false for X to which I assign probability much greater than 1 in a trillion trillion, and so I hereby confidently assert that no, I do not see the bomb on the left. You see the bomb on the left, and lose $100. I see no bombs, and lose $0.

I can already hear the peanut gallery objecting that we can increase the fallibility of the predictor to reasonable numbers and I'd still take the bomb, so before we go further, let's all agree that sometimes you're faced with... (read more)

6Ben Pace19dThanks, this comment thread was pretty helpful. After reading your comments, here's my current explanation of what's up with the bomb argument: Then I'm a bit confused about how to estimate that probability, but I suspect the reasoning goes like this: Sanity check As a sanity-check, I note this implies that if the utilities-times-probabilities are different, I would not mind taking the $100 hit. Let's see what the math says here, and then check whether my intuitions agree. Suppose I value my life at $1 million. Then I think that I should become more indifferent here when the probability of a mistaken simulation approaches 1 in 100,000, or where the money on the line is closer to $10^-17. [You can skip this, but here's me stating the two multiplications I compared: * World 1: I fake-kill myself to save $X, with probability 1/10 * World 2: I actually kill myself (cost: $1MM), with probability 1/Y To find the indifference point I want the two multiplications of utility-to-probability to come out to be equal. If X = $100, then Y equals 100,000. If Y is a trillion trillion (10^24), then X = 10^-17. (Unless I did the math wrong.)] I think this doesn't obviously clash with my intuitions, and somewhat matches them. * If the simulator was getting things wrong 1 in 100,000 times, I think I'd be more careful with my life in the "real world case" (insofar as that is a sensible concept). Going further, if you told me they were wrong 1 in 10 times, this would change my action, so there's got to be a tipping point somewhere, and this seems reasonable for many people (though I actually value my life at more than $1MM). * And if the money was that tiny ($10^-17), I'd be fairly open to "not taking even the one-in-a-trillion-trillion chance". (Though really my intuition is that I don't care about money way before $10^-17, and would probably not risk anything serious starting at like 0.1 cents, because that sort of money seems kind of irritating to have
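(A quick check of the arithmetic in the reply above, using its own indifference condition X · (1/10) = $1MM · (1/Y); the function names are mine.)

```python
# Checking the two stated indifference points: X/10 == 1_000_000/Y.

def Y_given_X(X):
    return 10_000_000 / X

def X_given_Y(Y):
    return 10_000_000 / Y

print(Y_given_X(100))     # 100000.0 -> Y = 1e5 when X = $100
print(X_given_Y(10**24))  # 1e-17    -> X = $1e-17 when Y is a trillion trillion
```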
4Said Achmiz20dWhether the predictor is accurate isn’t specified in the problem statement, and indeed can’t be specified in the problem statement (lest the scenario be incoherent, or posit impossible epistemic states of the agent being tested). What is specified is what existing knowledge you, the agent, have about the predictor’s accuracy, and what you observe in the given situation (from which you can perhaps infer additional things about the predictor, but that’s up to you). In other words, the scenario is: as per the information you have, so far, the predictor has predicted 1 trillion trillion times, and been wrong once (or, some multiple of those numbers—predicted 2 trillion trillion times and been wrong twice, etc.). You now observe the given situation (note predicting Right, bomb in Left, etc.). What do you do? Now, we might ask: but is the predictor perfect? How perfect is she? Well… you know that she’s erred once in a trillion trillion times so far—ah, no, make that twice in a trillion trillion times, as of this iteration you now find yourself in. That’s the information you have at your disposal. What can you conclude from that? That’s up to you. Likewise, you say: The problem statement absolutely is complete. It asks what you would/should do in the given scenario. There is no need to specify what “would” happen in other (counterfactual) scenarios, because you (the agent) do not observe those scenarios. There’s also no question of what would happen if you “always spite the predictor’s prediction”, because there is no “always”; there’s just the given situation, where we know what happens if you choose Left: you burn to death. You can certainly say “this scenario has very low probability”. That is reasonable. What you can’t say is “this scenario is logically impossible”, or any such thing. There’s no impossibility or incoherence here.
Nate Soares on the Ultimate Newcomb's Problem

The claim is not that the EDT agent doesn't know the mechanism that fills in the gap (namely, Omega's strategy for deciding whether to make the numbers coincide). The claim is that it doesn't matter what mechanism fills the gap, because for any particular mechanism EDT's answer would be the same. Thus, we can figure out what EDT does across the entire class of fully-formal decision problems consistent with this informal problem description without worrying about the gaps.

Nate Soares on the Ultimate Newcomb's Problem

I agree that the problem is not fully specified, and that this is a common feature of many decision problems in the literature. On my view, the ability to notice which details are missing and whether they matter is an important skill in analyzing informally-stated decision problems. Hypothesizing that the alleged circumstances are impossible, and noticing that the counterfactual behavior of various agents is uncertain, are important parts of operating FDT at least on the sorts of decision problems that appear in the literature.

At a glance, it looks to me l... (read more)

3JBlack1moThe omitted information seems very relevant. An EDT agent decides to do the action maximizing Sum P(outcomes | action) U(outcomes, action). With omitted information, the agent can't compute the P() expressions and so their decision is undetermined. It should already be obvious from the problem setup that something is wrong here: equality of Omega and Omicron's numbers is part of the outcomes, and so arguing for an EDT agent to condition on that is suspicious to say the least.
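(For reference, the decision rule being appealed to, written as a generic sketch; P and U are placeholders that a fully specified problem would have to supply.)

```python
# Generic EDT rule: choose the action maximizing sum_o P(o | a) * U(o, a).

def edt_choice(actions, outcomes, P, U):
    def expected_utility(a):
        return sum(P(o, a) * U(o, a) for o in outcomes)
    return max(actions, key=expected_utility)
```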
Nate Soares on the Ultimate Newcomb's Problem

The difference between your reasoning and the reasoning of FDT, is that your reasoning acts like the equality of the number in the big box and the number chosen by Omicron is robust, whereas the setup of the problem indicates that while the number in the big box is sensitive to your action, the number chosen by Omicron is not. As such, FDT says you shouldn't imagine them covarying; when you imagine changing your action you should imagine the number in the big box changing while the number chosen by Omicron stays fixed. And indeed, as illustrated in the expected utility calculation in the OP, FDT's reasoning is "correct" in the sense of winning more utility (in all cases, and in expectation).

-1Pattern1moThe consequences of not having enough time to think. more money. EDIT: It's not clear what effects the amount of time restriction has. 'Not enough time to factor this number' could still be a lot of time, or it could be very little.
Nate Soares on the Ultimate Newcomb's Problem

If the agent is EDT and Omicron chooses a prime number, then Omega has to choose a different prime number. Fortunately, for every prime number there exists a distinct prime number.

EDT's policy is not "two-box if both numbers are prime or both numbers are composite", it's "two-box if both numbers are equal". EDT can't (by hypothesis) figure out in the allotted time whether the number in the box (or the number that Omicron chose) is prime. (It can readily verify the equality of the two numbers, though, and this equality is what causes it -- erroneously, in my view -- to believe it has control over whether it gets paid by Omicron.)

Nate Soares on the Ultimate Newcomb's Problem

Why is the evaluation using a condition that isn't part of the problem?

For clarity. The fact that the ordinal ranking of decision theories remains the same regardless of how you fill in the unspecified variables is left (explicitly) as an exercise.

1JBlack1moThe variables with no specified value in the template given aren't the problem. The fact that the template has the form that it does is the problem. That form is unjustified. The only information we have about Omega's choices is that choosing the same number as Omicron is sometimes possible. Assuming that its probability is the same - or even nonzero - for all decision theories is unjustified, because Omega knows what decision theory the agent is using and can vary their choice of number. For example, it is compatible with the problem description that Omega never chooses the same number as Omicron if the agent is using CDT. Evaluating how well CDT performs in this scenario is then logically impossible, because CDT agents never enter this scenario. Like many extensions, variations, and misquotings of well-known decision problems, this one opens up far too many degrees of freedom.
I Really Don't Understand Eliezer Yudkowsky's Position on Consciousness

Based on the rest of your comment, I'm guessing you mean talk about consciousness and qualia in the abstract and attribute them to themselves, not just talk about specific experiences they've had.

If I were doing the exercise, all sorts of things would go in my "stuff people say about consciousness" list, including stuff Searle says about Chinese rooms, stuff Chalmers says about p-zombies, stuff the person on the street says about the ineffable intransmissible redness of red, stuff schoolyard kids say about how they wouldn't be able to tell if the color t... (read more)

1MichaelStJules1moShouldn't mastery and self-awareness/self-modelling come in degrees? Is it necessary to be able to theorize and come up with all of the various thought experiments (even with limited augmentation from extra modules, different initializations)? Many nonhuman animals could make some of the kinds of claims we make about our particular conscious experiences for essentially similar reasons, and many demonstrate some self-awareness in ways other than by passing the mirror test (and some might pass a mirror test with a different sensory modality, or with some extra help, although some kinds of help would severely undermine a positive result), although I won't claim the mirror test is the only one Eliezer cares about; I don't know what else he has in mind. It would be helpful to see a list of the proxies he has in mind and what they're proxies for. To make sure I understand correctly, it's not the self-attribution of consciousness and other talk of consciousness like Mary's Room that matter in themselves (we can allow some limited extra modules for that), but their cognitive causes. And certain (kinds of) cognitive causes should be present when we're "reflective enough for consciousness", right? And Eliezer isn't sure whether wondering whether or not he's conscious is among them (or a proxy/correlate of a necessary cause)?
I Really Don't Understand Eliezer Yudkowsky's Position on Consciousness

I don't think the thought process that allows one to arrive at (my model of) Eliezer's model looks very much like your 2nd paragraph. Rather, I think it looks like writing down a whole big list of stuff people say about consciousness, and then doing a bunch of introspection in the vicinity, and then listing out a bunch of hypothesized things the cognitive algorithm is doing, and then looking at that algorithm and asking why it is "obviously not conscious", and so on and so forth, all while being very careful not to shove the entire problem under the rug in... (read more)

I Really Don't Understand Eliezer Yudkowsky's Position on Consciousness

Instrumental status: off-the-cuff reply, out of a wish that more people in this community understood what the sequences have to say about how to do philosophy correctly (according to me).

EY's position seems to be that self-modelling is both necessary and sufficient for consciousness.

That is not how it seems to me. My read of his position is more like: "Don't start by asking 'what is consciousness' or 'what are qualia'; start by asking 'what are the cognitive causes of people talking about consciousness and qualia', because while abstractions like 'cons... (read more)

-2EI1moThis is merely a bias on our own part as humans. I think people are confusing consciousness with self-awareness. They are completely different things. Consciousness is the OS that runs on the meat machine. Self-awareness is an algorithm that runs on the OS. All meat machines that run this OS have different algorithms for different functions. Some may not have any self-awareness algorithm running, some may have something similar but not exactly the same as our own self-awareness algorithm. That's where the mirror test fails. We can only observe the who-knows-how-many-levels of causality that lead to those animals to show or not show self-aware behaviors in front of a mirror. We can't say anything consequential about the actual algorithm(s) running on their OS when they stand in front of a mirror. We are just running our own set of self-awareness algorithms when we stand in front of a mirror. It seems like these algorithms change according to evolution, just like other systems within the multicellular ecosystem that make up the individual organisms. We often see animals that demonstrate these "self-aware" traits because of similar evolutionary conditions, like cats and dogs have evolved to run a lot of socializing algorithms that mingle well with our own social algorithms. Whether the self-reflective aspect of running these algorithms on our own OS makes one feel certain way about eating meat is in and of itself the result of the relationship between multi-threading the self-aware part and the self-preservation part in terms of labeling kins and such. At this point we aren't even conclusive about where to draw the boundary between hardware and software. We end up distinguishing between OS and simple firmware as conscious and unconscious. We mostly reduce the firmware down to simple physical reactions by the laws of physics while the OS exhibits something magical beyond those physical reactions in simpler systems. Is there something truly different that sets OS apart

I'm confident your model of Eliezer is more accurate than mine.

Neither the twitter thread nor his other writings originally gave me the impression that he had a model in that fine-grained detail. I was mentally comparing his writings on consciousness to his writings on free will. Reading the latter made me feel like I strongly understood free will as a concept, and since then I have never been confused; it genuinely reduced free will as a concept in my mind.

His writings on consciousness have not done anything more than raise that model to the same level of poss... (read more)

While I agree with mostly everything your model of Eliezer said, I do not feel less confused about how Eliezer arrives at the conclusion that most animals are not conscious. Granted, I may, and probably actually am, lacking an important insight in the matter, but then it will be this insight that allows me to become less confused, and I wish Eliezer shared it.

When I'm thinking about a thought process that allows one to arrive at such a conclusion, I imagine something like this. Consciousness is not fundamental but it feels like it is. That's why we intuitively app... (read more)

1MichaelStJules1moThanks, this is helpful. Based on the rest of your comment, I'm guessing you mean talk about consciousness and qualia in the abstract and attribute them to themselves, not just talk about specific experiences they've had. Why use the standard of claiming to be conscious/have qualia? That is one answer that gets at something that might matter, but why isn't that standard too high? For example, he wrote: If this proposition is false, we need to allow unsymbolized (non-verbal) ways to self-attribute consciousness for self-attributing consciousness to matter in itself, right? Would (solidly) passing the mirror test be (almost) sufficient at this point? There's a visual self-representation, and an attribution of the perception of the mark to this self-representation. What else would be needed? Would it need to non-symbolically self-attribute consciousness generally, not just particular experiences? How would this work? If the proposition is true, doesn't this just plainly contradict our everyday experiences of consciousness? I can direct my attention towards things other than wondering whether or not I'm conscious (and towards things other than and unrelated to my inner monologue), while still being conscious, at least in a way that still matters to me that I wouldn't want to dismiss. We can describe our experiences without wondering whether or not we're having (or had) them. What kinds of reasons? And what would being correct look like? If unsymbolized self-attribution of consciousness is enough, how would we check just for it? The mirror test?
My experience at and around MIRI and CFAR (inspired by Zoe Curzi's writeup of experiences at Leverage)

First and foremost: Jessica, I'm sad you had a bad late/post-MIRI experience. I found your contributions to MIRI valuable (Quantilizers and Reflective Solomonoff Induction spring to mind as some cool stuff), and I personally wish you well.

A bit of meta before I say anything else: I'm leery of busting in here with critical commentary, and thereby causing people to think they can't air dirty laundry without their former employer busting in with critical commentary. I'm going to say a thing or two anyway, in the name of honest communication. I'm open to sugge... (read more)

1hg001moI'm not sure I agree with Jessica's interpretation of Eliezer's tweets, but I do think they illustrate an important point about MIRI: MIRI can't seem to decide if it's an advocacy org or a research org. "if you actually knew how deep neural networks were solving your important mission-critical problems, you'd never stop screaming" is frankly evidence-free hyperbole, of the same sort activist groups use (e.g. "taxation is theft"). People like Chris Olah have studied how neural nets solve problems a lot, and I've never heard of them screaming about what they discovered. Suppose there was a libertarian advocacy group with a bombastic leader who liked to go tweeting things like "if you realized how bad taxation is for the economy, you'd never stop screaming". After a few years of advocacy, the group decides they want to switch to being a think tank. Suppose they hire some unusually honest [https://statmodeling.stat.columbia.edu/2021/08/15/national-academy-of-sciences-scandal-and-the-concept-of-countervailing-power/#comment-1984513] economists, who study taxation and notice things in the data that kinda suggest taxation might actually be good for the economy sometimes. Imagine you're one of those economists and you're gonna ask your boss about looking into this more. You might have second thoughts like: Will my boss scream at me? Will they fire me? The organizational incentives don't seem to favor truthseeking. Another issue with advocacy is you can get so caught up in convincing people that the problem needs to be solved that you forget to solve it, or even take actions that are counterproductive for solving it. For AI safety advocacy, you want to convince everyone that the problem is super difficult and requires more attention and resources. But for AI safety research, you want to make the problem easy, and solve it with the attention and resources you have. In The Algorithm Design Manual, Steven Skiena writes: Being an advocacy org means you're less likely to hi
7jessicata1moWith regard to the specific misconstruals: * I don't think OP asserted that this specific plan was fixed, it was an example of a back-chaining plan, but I see how "a world-saving plan" could imply that it was this specific plan, which it wasn't. * I didn't specify which small group was taking over the world, I didn't mean to imply that it had to be MIRI specifically, maybe the comparison with Leverage led to that seeming like it was implied? * I still don't understand how I'm misconstruing Eliezer's tweets, it seems very clear to me that he's saying something about how neural nets work would be very upsetting if learned about and I don't see what else he could be saying.

Thanks, I appreciate you saying that you're sorry my experience was bad towards the end (I notice it actually makes me feel better about the situation), that you're aware of how criticizing people the wrong way can discourage speech and are correcting for that, and that you're still concerned enough about misconstruals to correct them where you see fit. I've edited the relevant section of the OP to link to this comment. I'm glad I had a chance to work with you even if things got really confusing towards the end.

Reply to Nate Soares on Dolphins

How would you feel if you sunk forty months of your life into deconfusing a philosophical issue that had huge, life-altering practical stakes for you, and the response to your careful arguments from community authorities was a dismissive "haha yeah"? Would you, perhaps, be somewhat upset?

Perhaps! But also that doesn't seem to me like what happened. The response to your careful arguments was the 1000ish words that engage with what seemed to me to be the heart of your questions, and that attempted to convey some ways it seemed you were misunderstanding my... (read more)

Thanks. I regret letting my emotions get the better of me. I apologize.

Reply to Nate Soares on Dolphins

I'm surprised you think this is "absolutely critical". Do you think I'm making a grave error in my newfound distaste for paraphyletic groupings? (My ability to notice their awkwardness felt internally like evidence that my joint-carving skills have improved over the years, ftr.) Is there some other joint-carving skill you believe I am lacking, or have lost? Or perhaps you're decrying a decay in general community epistemics, for which my thread is simply a poster-child? Or perhaps you're lamenting some general community or global decline of candor? I'm uncertain what precisely you're dismayed about, and solicit specific criticisms of my actions.

Reply to Nate Soares on Dolphins

Why? What changed?

On the object level question? Like, what changed between younger Nate who read that sequence and didn't object to that example, and older Nate who asserted (in a humorous context) that it takes staggering gymnastics to exclude dolphins from "fish"? Here's what springs to mind:

  • I noticed that there is cluster-nature to the marine-environment adaptations
  • I noticed that humans used to have a word for that cluster
  • I noticed a sense of inconsistency between including whales in "even-toed ungulates" without also including them in "lobe-finn
... (read more)
4Zack_M_Davis2mo(Circling back to the object level after a three-and-a-half-month cooldown period.) In a new post, I explain why paraphyletic categories are actually fine. [https://www.lesswrong.com/posts/vhp2sW6iBhNJwqcwP/blood-is-thicker-than-water]
-24Zack_M_Davis6mo
3Zack_M_Davis6mo(I've drafted a 3000 word reply to this, but I'm waiting on feedback from a friend before posting it.)

I noticed the subtle background forces that whisper (at least to blue tribe members in their youth) "phylogenetic classification is the one true way to organize life forms", and rejected its claim.

I still can't guess why that bothers you :/ When I try to imagine the motivations of this shadowy conspiracy of elites who quietly manipulated the anglosphere into always maintaining separate concepts for fish and cetaceans, I just see a desire to teach us about how special and cool cetaceans are.

Reply to Nate Soares on Dolphins

My epistemic status on that thread is, explicitly, "i been shitpostin since april". I do not wholeheartedly believe the argument I made; I acknowledge your counterpoints and appreciate you making them; I largely agree.

For the record, the fragment of my argument that I currently endorse is something like "be wary of those who say they are here to refine your concepts, for sometimes they are here to steal them". My current stance is that I'm not particularly entrenched on either side wrt "fish", and might push back about "fruit" and "berry" if I found those ... (read more)

Thanks for the reply! (Strong-upvoted.) I've been emotionally trashed today and didn't get anything done at my dayjob, which arguably means I shouldn't be paying attention to Less Wrong, but I feel the need to type this now in the hopes of getting it off my mind so that I can do my dayjob tomorrow.

In your epistemic-status thread, you express sadness at "the fact that nobody's read A Human's Guide to Words or w/e". But, with respect, you ... don't seem to be behaving as if you've read it? Specifically, entry #30 on the list of "37 Ways Words Can Be Wrong" i... (read more)

So8res' Shortform Feed

Crossposted from Twitter, might not engage much with comments on LW and I may or may not moderate replies.

Thread about a particular way in which jargon is great:

In my experience, conceptual clarity is often attained by a large number of minor viewpoint shifts. (A compliment I once got from a research partner went something like "you just keep reframing the problem ever-so-slightly until the solution seems obvious". <3)

Sometimes a bunch of small shifts leave people talking a bit differently, b/c now they're thinking a bit differently. The old phrasings d... (read more)

2Viliam8moThis reminds me of refactoring. Even tiny improvements in naming, especially when they accumulate, can make the whole system more transparent. (Assuming that people can agree on which direction is an "improvement".) But if I may continue with the programming analogy, the real problem is pushing the commit to the remaining million users of the distributed codebase. And not just users, but also all that literature that is already written. I like the "my model of Alice" example, because it reminds everyone in the debate of the map/territory distinction. On the other hand, there are expressions that rub me the wrong way, for example "spoon theory". Like, hey, it's basically "willpower depletion", only explained using spoons, which are just an accidental object in the story; any other object could have been used in their place, therefore it's stupid to use this word as the identifier for the concept. (On the other hand, it helps to avoid the whole discussion about whether "willpower depletion" is a scientific concept. Hey, it may or may not exist in theory, but it definitely exists in real life.) There are of course ways to abuse jargon. A typical one is to redefine the meanings of usual words (to borrow the old connotations for the new concept, or prevent people from having an easy way to express the old concept), or to create an impression of a vast trove of exclusive knowledge where in fact there is just a heap of old concepts (many of them controversial).
I'm still mystified by the Born rule

Thanks! I expect I can stare at this and figure something out about why there is no reasonable notion of "triality" in Vect (ie, no 3-way analog of vector space duality -- and, like, obviously that's a little ridiculous, but also there's definitely still something I haven't understood about the special-ness of the dual space).

ETA: Also, I'm curious what you think the connection is between the "L2 is connected to bilinear forms" and "L2 is the only Lp metric invariant under nontrivial change of basis", if it's easy to state.

FWIW, I'm mostly reading these ar... (read more)
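(One concrete way to see the "only L2 is invariant under nontrivial change of basis" point referenced above is to rotate a vector and compare Lp norms; a minimal numeric check of my own, not part of the original exchange:)

```python
import math

# Rotate (1, 0) by 45 degrees and compare Lp norms; only p = 2 is preserved.

def lp_norm(v, p):
    return sum(abs(x) ** p for x in v) ** (1 / p)

def rotate45(v):
    c = s = math.sqrt(0.5)
    x, y = v
    return (c * x - s * y, s * x + c * y)

v, w = (1.0, 0.0), rotate45((1.0, 0.0))
for p in (1, 2, 3):
    print(p, lp_norm(v, p), lp_norm(w, p))
# p=1: 1.0 vs ~1.414, p=2: 1.0 vs 1.0, p=3: 1.0 vs ~0.891
```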

I was just thinking back to this, and it occurred to me that one possible reason to be unsatisfied with the arguments I presented here is that I started off with this notion of a crossing-over point as p continuously increases. But then when you asked "ok, but why is the crossing-over point 2?", I was like "uh, consider that it might be an integer, and then do a bunch of very discrete-looking arguments that end up showing there's something special about 2", which doesn't connect very well with the "crossover point when p continuously varies" picture. If in... (read more)

9AlexMennen9moThis was what I was trying to vaguely gesture towards with the derivation of the "transpose = inverse" characterization of L2-preserving matrices; the idea was that the argument was a natural sort of thing to try, so if it works to get us a characterization of the Lp-preserving matrices for exactly one value of p, then that's probably the one that has a different space of Lp-preserving matrices than the rest. But perhaps this is too sketchy and mysterian. Let's try a dimension-counting argument. Linear transformations R^n → R^n and bilinear forms R^n × R^n → R can both be represented with n×n matrices. Linear transformations act on the space of bilinear forms by applying the linear transformation to both inputs before plugging them into the bilinear form. If the matrix A represents a linear transformation and the matrix B represents a bilinear form, then the matrix representing the bilinear form you get from this action is A^T B A. But whatever, the point is, so far we have an n^2-dimensional group acting on an n^2-dimensional space. But quadratic forms (like the square of the L2 norm) can be represented by symmetric n×n matrices, the space of which is (n+1 choose 2)-dimensional, and if B is symmetric, then so is A^T B A. So now we have an n^2-dimensional group acting on an (n+1 choose 2)-dimensional space, so the stabilizer of any given element must be at least n^2 − (n+1 choose 2) = (n choose 2) dimensional. As it turns out, this is exactly the dimensionality of the space of orthogonal matrices, but the important thing is that this is nonzero, which explains why the space of orthogonal matrices must not be discrete. Now let's see what happens if we try to adapt this argument to Lp and p-linear forms for some p≠2. With p=1, a linear transformation preserving a linear functional corresponds to a matrix A preserving a row vector φ in the sense that φA = φ. You can do a dimension-counting argument and find that there are tons of these matrices for any given row vector, but it doesn't do you any good because 1 isn't even, so preserving the linear fu
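(A tiny numeric sanity check of the dimension count above, written as my own sketch: n² minus the dimension of the symmetric matrices leaves exactly dim O(n) = n(n−1)/2.)

```python
from math import comb

# n^2 - C(n+1, 2) == C(n, 2), the dimension of the orthogonal group O(n).
for n in range(1, 8):
    assert n * n - comb(n + 1, 2) == comb(n, 2)
    print(n, comb(n, 2))
```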
I'm still mystified by the Born rule

Thanks! This seems to me like another piece of the puzzle =D

In this case, this is one that I already had (at least, well enough for the hindsight bias to kick in :-p), and it's on my list of trailheads next time I try to ground out the 2 in the Born rule. FWIW, some lingering questions I have when I take this viewpoint include "ok, cool, why are there no corresponding situations where I want to compare 3 vectorish thingies?" / "I see why the argument works for 2, but I have a sneaking suspicion that this 2 is being slipped into the problem statement in a w... (read more)

Awesome!

So, trilinear forms are a thing: for example, if you have 3 vectors, and you want to know the volume of the parallelepiped they form, that's a trilinear form. And that clearly has a "cubicness" to it, and you can do this for arbitrary numbers of vectors and covectors. The Riemann curvature tensor is perhaps the most significant one that has more than 2 (co)vectors involved. FWIW the dual space thing also seems likely to be important for my confusion about why phase space "volume" is 2-dimensional (even in super huge phase spaces)!

I would say that d... (read more)

I'm still mystified by the Born rule

Cool, thanks. Yeah, I don't have >50% on either of those two things holding up to philosophical progress (and thus, eg, I disagree that future theories need to agree with UDASSA on those fronts). Rather, happeningness-as-it-relates-to-multiple-simulations and happeningness-as-it-relates-to-the-simplicity-of-reality are precisely the sort of things where I claim Alice-style confusion, and where it seems to me like UDASSA is alleging answers while being unable to dissolve my confusions, and where I suspect UDASSA is not-even-wrong.

(In fact, you listing t... (read more)

I'm still mystified by the Born rule

<3, this is exactly the sort of thought I claim to be missing when I say I still don't know how to trace the 2 in the Born rule back to its maker. This is a step I didn't yet have. It doesn't feel like the last piece I'm missing, but it does feel like a piece -- eg, now I can focus some attention on "why precisely is this crossover point at 2 / where is that 2 coming from?". Thanks!

(And ofc there's still a question about why we use an Lp norm, and indeed why we pass our gaze along the walls in a way that factors through the shadow on that wall, but I am fairly happy rolling most of that into the "what are we to learn from the fact that reality has the quantum nature" bundle.)

A related thing that's special about the L2 norm is that there's a bilinear form ⟨⋅,⋅⟩ such that |v| carries the same information as ⟨v,v⟩.

"Ok, so what? Can't you do the same thing with any integer n, with an n-linear form?" you might reasonably ask. First of all, not quite, it only works for the even integers, because otherwise you need to use absolute value*, which isn't linear.

But the bilinear forms really are the special ones, roughly speaking because they are a similar type of object to ... (read more)
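(Spelling out the n-linear-form point in symbols -- my notation, not the comment author's:)

```latex
% L2: the norm is recovered from a bilinear form,
\[
\langle v, w \rangle = \sum_i v_i w_i, \qquad \|v\|_2^2 = \langle v, v \rangle .
\]
% For even p the analogue is a genuine p-linear form, e.g. p = 4:
\[
T(u, v, w, x) = \sum_i u_i v_i w_i x_i, \qquad \|v\|_4^4 = T(v, v, v, v);
\]
% for odd p one would need \sum_i |v_i|^p, and the absolute value breaks
% multilinearity.
```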

I'm still mystified by the Born rule

When you're using TMs to approximate physics, you have to balance the continuity of physics against the discreteness of the machines somehow. The easy thing to do is to discuss the limiting behavior of a family of machines that perform the simulation at ever-finer fidelity. I was doing this implicitly, for lack of desire to get into details.

And as I've said above, I'm not attempting to suggest that these naive approaches -- such as sampling a single classical state and reporting the positions of some things with arbitrary fidelity in the limit -- are reaso... (read more)

I'm still mystified by the Born rule

Neat! I'd bet against that if I knew how :-) I expect UDASSA to look more like a red herring from the perspective of the future, with most of its answers revealed as wrong or not-even-wrong or otherwise rendered inapplicable by deep viewpoint shifts. Off the top of my head, a bet I might take is "the question of which UTM meta-reality uses to determine the simplicity of various realities was quite off-base" (as judged by, say, agreement of both EY and PC or their surrogates in 1000 subjective years).

In fact, I'm curious for examples of things that UDASSA s... (read more)

6evhub9moI don't think I would take that bet—I think the specific question of what UTM to use does feel more likely to be off-base than other insights I associate with UDASSA. For example, some things that I feel UDASSA gets right: a smooth continuum of happeningness that scales with number of clones/amount of simulation compute/etc., and simpler things being more highly weighted.
I'm still mystified by the Born rule

How could an SI compare a deterministic theory to a probabilistic one?

The deterministic theory gets probability proportional to 2^(-length + (0 if it was correct so far else -infinity)); the probabilistic theory gets probability proportional to 2^(-length + log2(probability it assigned to the observations so far)). In other words: prior 2^-length, times the probability the theory assigned to the data, where a deterministic theory assigns probability 1 if it was right so far and 0 if it erred.
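(A toy numerical sketch of that comparison -- the description lengths, the biased-coin model, and the data below are all invented for illustration:)

```python
# Posterior weight of a hypothesis H after seeing data x_1..x_n is taken to be
# proportional to 2^(-length(H)) * P_H(x_1..x_n). A deterministic theory assigns
# the data probability 1 (if it predicted it exactly) or 0 (if it erred).

observations = [0, 1, 1, 0, 1, 1, 0, 1]

def weight(length_bits, prob_of_observations):
    return 2.0 ** (-length_bits) * prob_of_observations

# Deterministic theory: 20 bits long, happened to predict exactly this sequence.
w_det = weight(20, 1.0)

# Probabilistic theory: 12 bits long, says "each bit is 1 with probability 2/3".
p = 1.0
for x in observations:
    p *= (2 / 3) if x == 1 else (1 / 3)
w_prob = weight(12, p)

total = w_det + w_prob
print(f"relative weight, deterministic: {w_det / total:.3f}")
print(f"relative weight, probabilistic: {w_prob / total:.3f}")
```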

That said, I was not suggesting a Solomonoff inductor in which some machines were outputting bits and others were outputting probabilities.

I suspect that there's a miscommunication somewhere up the line, and my not-terribly-charitable-guess is that it ... (read more)

I'm still mystified by the Born rule

Absolutely no effect does seem pretty counterintuitive to me, especially given that we know from QM that different levels of happeningness are at least possible.

I also have that counterintuition, fwiw :-p


I have the sense that you missed my point wrt UDASSA, fwiw. Having failed once, I don't expect I can transmit it rapidly via the medium of text, but I'll give it another attempt.

This is not going to be a particularly tight analogy, but:


Alice is confused about metaethics. Alice has questions like "but why are good things good?" and "why should we car... (read more)

6evhub9moYeah—I think I agree with what you're saying here. I certainly think that UDASSA still leaves a lot of things unanswered and seems confused about a lot of important questions (embeddedness, uncomputable universes, what UTM to use, how to specify an input stream, etc.). But it also feels like it gets a lot of things right in a way that I don't expect a future, better theory to get rid of—that is, UDASSA feels akin to something like Newtonian gravity here, where I expect it to be wrong, but still right enough that the actual solution doesn't look too different.
I'm still mystified by the Born rule

Yeah yeah, this is the problem I'm referring to :-)

I disagree that you must simulate collapse to solve this problem, though I agree that that would be one way to do it. (The way you get the right random numbers, fwiw, is from sample complexity -- SI doesn't put all its mass on the single machine that predicts the universe, it allocates mass to all machines that have not yet erred in proportion to their simplicity, so probability mass can end up on the class of machines, each individually quite complex, that describe QM and then hardcode the branch predictions. See also the proof about how the version of SI in which each TM outputs probabilities is equivalent to the version where they don't.)
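(To illustrate the parenthetical about mass landing on the class of branch-hardcoding machines, here is a toy sketch. The program lengths are made up, the branch weights are chosen to be dyadic, and I'm assuming the hardcoded branch labels are encoded with a prefix code matched to those weights -- under those assumptions the class of deterministic "QM + hardcoded branch" machines collectively predicts just like a single probabilistic machine:)

```python
from fractions import Fraction

# Three "branches" with dyadic Born weights, each represented by a deterministic
# program "simulate QM, then output hardcoded branch b", where b is encoded with
# codelength = -log2(weight) bits (exact here because the weights are dyadic).
born = {"00": Fraction(1, 2), "01": Fraction(1, 4), "10": Fraction(1, 4)}
LEN_QM = 10  # made-up length, in bits, of the shared "simulate QM" part

def prior(branch):
    # 2^-(LEN_QM + codelength(branch)) = 2^-LEN_QM * born[branch]
    return Fraction(1, 2 ** LEN_QM) * born[branch]

def mixture_prob_first_bit(bit):
    # Predictive probability assigned by the whole class of deterministic
    # branch-hardcoding machines, none of which has erred yet.
    consistent = sum(prior(b) for b in born if b[0] == bit)
    return consistent / sum(prior(b) for b in born)

print(mixture_prob_first_bit("0"))  # 3/4, the total Born weight of branches starting with 0
print(mixture_prob_first_bit("1"))  # 1/4
```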

3TAG9moIf your SI can't make predictions ITFP, that's rather beside the point. "Not erring" only has a straightforward implementation if you are expecting the predictions to be deterministic. How could an SI compare a deterministic theory to a probabilistic one?
I'm still mystified by the Born rule

^_^

Also, thanks for all the resource links!

I'm still mystified by the Born rule

To be clear, the process that I'm talking about for turning a quantum state into a hypothesis is not intended to be a physical process (such as a measurement), it's intended to be a Turing machine (that produces output suitable for use by Solomonoff induction).

That said, to be clear, I don't think this is a fundamentally hard problem. My point is not "we have absolutely no idea how to do it", it's something more like "there's not a consensus answer here" + "it requires additional machinery above and beyond [the state vector + the Born rule + your home addre... (read more)

1TAG9moThen you run into the basic problem of using SI to investigate MW: SI's are supposed to output a series of definite observations. They are inherently "single world". If the program running the SWE outputs information about all worlds on a single output tape, they are going to have to be concatenated or interleaved somehow. Which means that to make use of the information, you have to identify the subset of bits relating to your world. That's extra complexity which isn't accounted for because it's being done by hand, as it were. In particular, if you just model the wave function, the only results you will get represent every possible outcome. In order to match observation, you will have to keep discarding unobserved outcomes and renormalising as you do in every interpretation. It's just that that extra stage is performed manually, not by the programme. To get an output that matches one observer's measurements, you would need to simulate collapse somehow. You could simulate collapse with a PRNG, but it won't give you the right random numbers. Or you would need to keep feeding your observations back in so that the simulator can perform projection and renormalisation itself. That would work, but that's a departure from how SI's are supposed to work. Meta: trying to mechanise epistemology doesn't solve much, because mechanisms still have assumptions built into them.
I'm still mystified by the Born rule

I agree that the problem doesn't seem too hard, and that there are a bunch of plausible-seeming theories. (I have my own pet favorites.)

5Vanessa Kosoy9moI think that virtually every specialist would give you more or less the same answer as interstice, so I don't see why it's an open question at all. Sure, constructing a fully rigorous "eyeball operator" is very difficult, but defining a fully rigorous bridge rule in a classical universe would be very difficult as well. The relation to anthropics is more or less spurious IMO (MWI is just confused), but also anthropics is solvable using the infra-Bayesian approach to embedded agency [https://www.lesswrong.com/posts/dPmmuaz9szk26BkmD/shortform?commentId=SBPzgAZgFFxtL9E64] . The real difficulty is understanding how to think about QM predictions about quantities that you don't directly observe but that your utility function depends on. However, I believe that's also solvable using infra-Bayesianism.
5interstice9moMy own most recent pet theory is that the process of branching is deeply linked to thermalization, so to find model systems we should look to things modeling the flow of heat/entropy -- e.g. a system coupled to two heat baths at different temperatures.
I'm still mystified by the Born rule

This is precisely the thought that caused me to put the word 'apparent' in that quote :-p. (In particular, I recalled the original UDASSA post asserting that it took that horn, and this seeming both damning-to-me and not-obviously-true-for-the-reason-you-state, and I didn't want to bog my comment down, so I threw in a hedge word and moved on.) FWIW I have decent odds on "a thicker computer (and, indeed, any number of additional copies of exactly the same em) has no effect", and that's more obviously in contradiction with UDASSA.

Although, that isn't the nam... (read more)

6evhub9moAbsolutely no effect does seem pretty counterintuitive to me, especially given that we know from QM that different levels of happeningness are at least possible. I think my answer here would be something like: the reason that UDASSA doesn't fully resolve the confusion here is that UDASSA doesn't exactly pick a horse in the race as much as it enumerates the space of possible horses, since it doesn't specify what UTM you're supposed to be using. For any (computable) tradeoff between “more copies = more happening” and “more copies = no impact” that you want, you should be able to find a UTM which implements that tradeoff. Thus, neither intuition really leaves satisfied, since UDASSA doesn't actually take a stance on how much each is right, instead just deferring that problem to figuring out what UTM is “correct.”
I'm still mystified by the Born rule

I agree that the Born rule is just the poster child for the key remaining confusions (eg, I would have found it similarly natural to use the moniker "Hilbert space confusions").

I disagree about whether UDASSA contains much of the answer here. For instance, I have some probability on "physics is deeper than logic" being more-true-than-the-opposite in a way that ends up tossing UDASSA out the window somehow. For another instance, I weakly suspect that "running an emulation on a computer with 2x-as-thick wires does not make them twice-as-happening" is closer ... (read more)

6evhub9moI feel like I would be shocked if running a simulation on twice-as-thick wires made it twice as easy to specify you, according to whatever the “correct” UTM is. It seems to me like the effect there shouldn't be nearly that large.
I'm still mystified by the Born rule

I agree that there's a difference between "put a delta-spike on the single classical state you sampled" and "zero out amplitude on all states not consistent with the observation you got from your sample". I disagree that using the latter to generate a sensory stream from a quantum state yields reasonable predictions -- eg, taken literally I think you're still zeroing out all but a measure-zero subset of the position basis, and I expect the momenta to explode immediately. You can perhaps get this hypothesis (or the vanilla delta spike) hobbling by trying to... (read more)
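(For concreteness, here is one way to see the "momenta explode" worry in the extreme case of an exact position spike; this gloss is mine, not part of the original exchange:)

```latex
% If the post-measurement state were an exact position spike, its momentum-space
% wavefunction would be a pure phase of constant modulus:
\[
\psi(x) = \delta(x - x_0)
\;\Longrightarrow\;
\tilde\psi(p) = \frac{1}{\sqrt{2\pi\hbar}}\, e^{-i p x_0/\hbar},
\qquad
|\tilde\psi(p)|^2 = \frac{1}{2\pi\hbar},
\]
% so the momentum distribution is flat and \langle p^2 \rangle diverges.
```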

4AlexMennen9moThe observation you got from your sample is information. Information is entropy, and entropy is locally finite. So I don't think it's possible for the states consistent with the observation you got from your sample to have measure zero.
3TAG9moYou have been assuming that all measurements are in the position basis, which is wrong. In particular, spin is its own basis. If you make a sharp measurement in one basis, you have uncertainty or lack of information about the others. That does not mean the momentum is "randomised" in some catastrophic sense. The original position measurement was not deterministic, for one thing. It is true that delta functions can be badly behaved. It's also true that they can be used in practice ... if you are careful. They are not an argument against discarding-and-renormalising, because if you don't do that at all, you get much wronger results than the results you get by rounding off small values to zero, ie, using a delta to represent a sharp Gaussian. That might be the case if you were making an infinitely sharp measurement of an observable with a real-valued spectrum, but there are no infinitely sharp measurements, and not every observable is real-valued.
Are the Born probabilities really that mysterious?

I tried to commentate, and accidentally a whole post. Short version: I think one or two of the many mysteries people tend to find swirling around the Born rule are washed away by the argument you mention (regardless of how tight the analogy to Liouville's theorem), but some others remain (including the one that I currently consider central).

Warning: the post doesn't attempt to answer your question (ie, "can we reduce the Born rule to conservation of information?"). I don't know the answer to that. Sorry.

My guess is that a line can be drawn between the two; I'm uncertain how strong it can be made.

This may be just reciting things that you already know (or a worse plan than your current one), but in case not, the way I'd attempt to answer this would be:

  1. Solidly understand how to ground out the Born rule in the inner product. (https://arxiv.org/abs/1405.7907 might be a good place to start if you
... (read more)
So8res' Shortform Feed

Crossposted from Twitter, might not engage much with comments on LW and I may or may not moderate replies.

PSA: In my book, everyone has an unlimited number of "I don't understand", "plz say that again in different words", "plz expand upon that", and "plz pause while I absorb that" tokens.

Possession of an unlimited number of such tokens (& their ilk) is one of your sacred rights as a fellow curious mind seeking to understand the world around you. Specifically, no amount of well-intentioned requests for clarification or thinking time will cause me to thi... (read more)

So8res' Shortform Feed

Crossposted from Twitter, might not engage much with comments on LW and I may or may not moderate replies.

Hypothesis: English is harmed by conventions against making up new plausible-sounding words, as this contributes to conventions like pretending numbers are names (System 1 deliberation; Type II errors) and naming things after people (Bayesian reasoning, Cartesian products).

I used to think naming great concepts after people was a bad idea (eg, "frequency decomposition" is more informative & less scary than "Fourier transformation"). I now suspect tha... (read more)

2ChristianKl10moI used to think that names like "System 1 deliberation" have to be bad. When writing the Living People policy for Wikidata I had to name two types of classes of privacy protection and wanted to avoid calling them protection class I and protection class II. Looking back I think that was a mistake because people seem to misunderstand the terms in ways I didn't expect.
7Steven Byrnes10moFunny story: the "Unscented Kalman Filter". The guy (Uhlmann) needed a technical term for the new Kalman filter he had just invented, and it would have been pretentious for him to call it an Uhlmann filter, so he looked around the room, saw an unscented deodorant on someone's desk, and went with that. Source [https://ethw.org/First-Hand:The_Unscented_Transform]
6luidic10moBoth "transistor" (transconductance and varistor) and "bit" (binary digit) come to mind as new technical words. Quoting from Jon Gertner's The Idea Factory. (Further examples of bad naming conventions: https://willcrichton.net/notes/naming-conventions-that-need-to-die/)
So8res' Shortform Feed

Crossposted from Twitter, might not engage much with comments on LW and I may or may not moderate replies. 

Hypothesis: we're rapidly losing the cultural technology to put people into contact with new ideas/worldviews of their own volition, ie, not at the recommendation of a friend or acquaintance.

Related hypothesis: it's easier for people to absorb & internalize a new idea/worldview when the relationship between them and the idea feels private. Ex: your friend is pushing a viewpoint onto you, and you feel some social pressure to find at least one ... (read more)
