# All of Benya_Fallenstein's Comments + Replies

Waterfall Truth Predicates

We should be more careful, though, about what we mean by saying that φ(x) only depends on Tr_m for m > n, since this cannot be a purely syntactic criterion if we allow quantification over the subscript (as I did here). I'm pretty sure that something can be worked out, but I'll leave it for the moment.

Waterfall Truth Predicates

I would suggest changing this system by defining to mean that no is the Gödel number of a proof of an inconsistency in ZFC (instead of just asserting that isn't). The purpose of this is to make it so that if ZFC were inconsistent, then we only end up talking about a finite number of levels of truth predicate. More specifically, I'd define to be PA plus the axiom schema

Then, it seems that Jacob Hilton's proof that the waterfalls are consistent goes through for this waterfall:

Work in ZFC and assume that ZFC is i

0Benya_Fallenstein6yWe should be more careful, though, about what we mean by saying that φ(x) only depends on Tr_m for m > n, since this cannot be a purely syntactic criterion if we allow quantification over the subscript (as I did here). I'm pretty sure that something can be worked out, but I'll leave it for the moment.
No Good Logical Conditional Probability

Hm; we could add an uninterpreted predicate symbol to the language of arithmetic, and let and . Then, it seems like the only barrier to recursive enumerability of is that 's opinions about aren't computable; this seems worrying in practice, since it seems certain that we would like logical uncertainty to be able to reason about the values of computations that use more resources than we use to compute our own probability estimates. But on the other hand, all of this makes this sound like an issue of self-reference,

Agents that can predict their Newcomb predictor

The other direction follows from the fact that the algorithm is bounded, and PA can simply show the execution trace of in steps.

Unimportant technical point: I think the length of the PA proof grows faster than this. (More precisely, the length in symbols, rather than the length in number of statements; we're almost always interested in the former, since it determines how quickly a proof can be checked or be found by exhaustive search.) The obvious way of showing in PA is to successively show for higher and higher that "after ticks, the T

An Informal Conjecture on Proof Length and Logical Counterfactuals

Next, we consider the case that PA is consistent and work through the agent’s decision. PA can’t prove , since we used the chicken rule, so since the sentence is easily provable, the sentence (ie. the first sentence that the agent checks for proofs of) must be unprovable.

It seems like this argument needs soundness of PA, not just consistency of PA. Do you see a way to prove in PA that if , then PA is inconsistent?

[edited to add:] However, your idea reminds me of my post on the odd counterfactuals of playing chicken

3SamEisenstat6yThe argument that I had in mind was that if PA⊢A()≠1, then PA⊢□⌜A()≠1⌝, so PA⊢A()=1 since PA knows how the chicken rule works. This gives us PA⊢⊥, so PA can prove that if PA⊢A()≠1, then PA is inconsistent. I'll include this argument in my post, since you're right that this was too big a jump. Edit: We also need to use this argument to show that the modal UDT agent gets to the part where it iterates over utilities, rather than taking an action at the chicken rule step. I didn't mention this explicitly, since I felt like I had seen it before often enough, but now I realize it is nontrivial enough to point out.
Reflective probabilistic logic cannot assign positive probability to its own coherence and an inner reflection principle

If you replace the inner "" by "", then the literal thing you wrote follows from the reflection principle: Suppose that the outer probability is . Then

Now, implies , which by the converse of the outer reflection principle yields , whence . Now, by the forward direction of the outer reflection principle, we have

which, by the outer reflection principle again, impl

1paulfchristiano6yHere [http://mathoverflow.net/questions/145032/an-approximate-infinite-dimensional-fixed-point-theorem] is the basic problem. I think that you can construct an appropriate liar's sentence by using a Lipschitz function without an approximate fixed point. But someone might want to check that more carefully and write it up, to make sure and to see where the possible loopholes are. I think that it may not have ruled out this particular principle, just something slightly stronger (but the two were equivalent for the kinds of proof techniques we were considering).
Paraconsistent Tiling Agents (Very Early Draft)

This is very interesting!

My main question, which I didn't see an answer to on my first read, is: If the agent proves that action leads to the goal being achieved, are there any conditions under which this implies that the goal actually is achieved? The procrastination paradox shows that this isn't true in general. Is there a class of sentences (e.g., all sentences, though I don't expect that to be true in this case) such that if then ? In other words, do we have some guarantee that we do better at actually achieving than an agent which us

Forum Digest: Corrigibility, utility indifference, & related control ideas

Categorization is hard! :-) I wanted to break it up because long lists are annoying to read, but there was certainly some arbitrariness in dividing it up. I've moved "resource gathering agent" to the odds & ends.

Identity and quining in UDT

Want to echo Nate's points!

One particular thing that I wanted to emphasize, and that I think you can see as a thread on this forum (in particular, the modal UDT work is relevant), is that it's useful to make formal toy models where the math is fully specified, so that you can prove theorems about what exactly an agent would do (or, sometimes, write a program that figures it out for you). When you write things out that explicitly, then, for example, it becomes clearer that you need to assume that a decision problem is "fair" (extensional) to get certain result

0Vanessa Kosoy6yHi Benja, thx for commenting! I agree that it's best to work on fully specified models. Hopefully, soon I will write about my own approach to logical uncertainty via complexity theory.
Meta- the goals of this forum

• Non-member contributions (comments and links) are going to become publicly visible when they have received 2 likes (from members---only members can like things).

• However, members will be able to reply to a contribution as soon as it has received 1 like. This means that if you think someone's made a useful contribution, and you want to reply to them about it, you don't have to wait for a second person to Like it before you can write that reply. (It won't be publicly visible until the contribution has two Likes, though.)

Third-person counterfactuals

Yeah, that sounds good! Of course, by the Kripke levels argument, it's sufficient to consider 's of the form . And we might want to have a separate notion of " leads to at level ", which we can actually implement in a finite modal formula. This seems to suggest a version of modal UDT that tries to prove things in PA, then if that has ambiguous counterfactuals (i.e., it can't prove for any ) we try PA+1 and so on up to some finite ; then we can hope these versions of UDT approximate optimality according to your revised version of " leads t

Third-person counterfactuals

Fixed---thanks, Patrick!

Regarding the example, I earlier defined "action leads to outcome " to mean the conjunction of and ; i.e., we check for spurious counterfactuals before believing that tells us something about what action leads to, and we only consider ourselves "fully informed" in this sense if we have non-spurious information for each . (Of course, my follow-up post is about how that's still unsatisfactory; the reason to define this notion of "fully informative" so explicitly was really to be able to say more clearly i

1orthonormal6yAh! I'd failed to propagate that somehow. Given that we're using PA+n to defeat evil problems, the true modal definition of "action i leads to outcome j" might be something like "there exists a closed formula ϕ such that N⊨ϕ, GL ⊢ϕ→(Ai→Uj), and GL ⊬ϕ→¬Ai". But that's an unnecessary complication for this post.
From halting oracles to modal logic

(Thanks for the feedback!)

Trustworthy automated philosophy?

In our chat about this, Eliezer said that, aside from the difficulty of making a human-level mathematical philosophy engine aligned with our goals, an additional significant relevant disagreement with Paul is that Eliezer thinks it's likely that we'll use low-level self-improvement on the road to human-level AI; he used the analogies of programmers using compilers instead of writing machine code directly, and of EURISKO helping Lenat. (Again, hoping I'm not misrepresenting.)

This seems like a plausible scenario to me, but I'm not convinced it argues for the

0paulfchristiano6yI agree that many parts of AI research are already automated, and by the time we have broadly human-level AI, many more will be. I would not be surprised if the great majority of tasks in current AI research are automated long before we have broadly human-level AI. I wouldn't normally describe this as "low-level self-improvement," but that seems like a semantic issue. I am skeptical of the Eurisko example (and of claims about Eurisko in general) but I don't know if it's relevant to this disagreement.
Approximability

...though while I'm guessing the variant of the reflective oracle discussed in the comment thread may be approximable, it seems less likely that a version of AIXI can be defined based on it that would be approximable.

Approximability

Sorry for the delayed reply! I like this post, and agree with much of what you're saying. I guess I disagree with the particular line you draw, though; I think there's an interesting line, but it's at "is there a Turing machine which will halt and give the answer to this problem" rather than "is there a Turing machine which will spit out the correct answer for a large enough input (but will spit out wrong answers for smaller inputs, and you don't know what number is large enough)". The latter of these doesn't seem that qualitatively different to me from "i

0Benya_Fallenstein6y...though while I'm guessing the variant of the reflective oracle discussed in the comment thread may be approximable, it seems less likely that a version of AIXI can be defined based on it that would be approximable.
Stable self-improvement as a research problem

I think the main disagreement is about whether it's possible to get an initial system which is powerful in the ways needed for your proposal and which is knowably aligned with our goals; some more about this in my reply to your post, which I've finally posted, though there I mostly discuss my own position rather than Eliezer's.

"Evil" decision problems in provability logic

Actually, drnickbone's original LessWrong post introducing evil problems also gives an extension to the case you are considering: The evil decision problem gives the agent three or more options, and rewards the one that the "victim" decision theory assigns the least probability to (breaking ties lexicographically). Then, no decision theory can put probability on the action that is rewarded in its probabilistic evil problem.
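The reward rule described above can be sketched in a few lines. This is only an illustration: the function name `rewarded_action` and the representation of the victim's mixed strategy as a list of `Fraction`s are assumptions made here, not anything from drnickbone's post.

```python
from fractions import Fraction
from typing import Sequence


def rewarded_action(victim_probs: Sequence[Fraction]) -> int:
    """Index of the action the probabilistic evil problem rewards.

    victim_probs[i] is the probability the "victim" decision theory
    assigns to action i (hypothetical representation). The problem
    rewards the least probable action; Python's min() returns the
    first minimal index, which gives the lexicographic tie-break.
    """
    return min(range(len(victim_probs)), key=lambda i: victim_probs[i])


# Example: a victim that mixes uniformly over three actions.
uniform = [Fraction(1, 3)] * 3
winner = rewarded_action(uniform)
```

Since the probabilities sum to 1, the smallest of n of them is at most 1/n, so whatever the victim does, the rewarded action receives at most 1/n of its probability mass.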

0paulfchristiano6yThis setup plays a computational trick, and as a result I don't think it violates the optimality standard I proposed. In order to decide what it should do, the CDT agent needs to think strictly longer than the UDT agent. But if the CDT agent thinks longer than the UDT agent, it's totally unsurprising that it does better! (Basically, the problem just consists of a computational question which is chosen to be slightly too complex for the UDT agent. But the CDT agent is allowed to think as long as it likes. This entire family of problems appears to be predicated on the lack of computational limits for our agents.) As a result, if the UDT agent is told what the CDT agent decides, then it can get the same performance as the CDT agent. This seems to illustrate that the CDT agent isn't doing better by being wiser, just by knowing something the UDT agent doesn't. (I wasn't actually thinking about this case when I introduced the weakened criterion; the weakening is obviously necessary for UDT with 10 years of time to compete with CDT with 11 years of time, and I included it for that reason.) Does this seem right? If so, is there a way to set up the problem that violates my weakened standard? Incidentally, this problem involves a discontinuous dependence on UDT's decision (both by the competitor and by the environment). I wonder if this discontinuous dependence is necessary?
Exploiting EDT

True. This looks to me like an effect of EDT not being stable under self-modification, although here the issue is handicapping itself through external means rather than self-modification---like, if you offer a CDT agent a potion that will make it unable to lift more than one box before it enters Newcomb's problem (i.e., before Omega makes its observation of the agent), then it'll cheerfully take it and pay you for the privilege.

A primer on provability logic

Thread for proofs of results claimed above (not limited to stuff not found in Lindström); contributions appreciated. Stuff not in Lindström includes the uniqueness of arithmetic fixed points in PA (found in the modal agents paper), and I think the version of the fixed point theorem with more than one (the latter should be provable by iterated application of the fixed point theorem for a single ).

3Jaime Sevilla Molina5yGeneralized fixed point theorem: Suppose that A_i(p_1,...,p_n) are n modal sentences such that A_i is modalized in p_n (possibly containing sentence letters other than the p_j's). Then there exist H_1,...,H_n in which no p_j appears such that GL⊢∧_{i≤n}{⊡(p_i↔A_i(p_1,...,p_n))}↔∧_{i≤n}{⊡(p_i↔H_i)}.
We will prove it by induction. For the base step, we know by the fixed point theorem that there is H such that GL⊢⊡(p_1↔A_1(p_1,...,p_n))↔⊡(p_1↔H(p_2,...,p_n)).
Now suppose that for j we have H_1,...,H_j such that GL⊢∧_{i≤j}{⊡(p_i↔A_i(p_1,...,p_n))}↔∧_{i≤j}{⊡(p_i↔H_i(p_{j+1},...,p_n))}.
By the second substitution theorem, GL⊢⊡(A↔B)→[F(A)↔F(B)]. Therefore we have that GL⊢⊡(p_i↔H_i(p_{j+1},...,p_n))→[⊡(p_{j+1}↔A_{j+1}(p_1,...,p_n))↔⊡(p_{j+1}↔A_{j+1}(p_1,...,p_{i−1},H_i(p_{j+1},...,p_n),p_{i+1},...,p_n))].
If we iterate the replacements, we finally end up with GL⊢∧_{i≤j}{⊡(p_i↔A_i(p_1,...,p_n))}→⊡(p_{j+1}↔A_{j+1}(H_1,...,H_j,p_{j+1},...,p_n)).
Again by the fixed point theorem, there is H′_{j+1} such that GL⊢⊡(p_{j+1}↔A_{j+1}(H_1,...,H_j,p_{j+1},...,p_n))↔⊡[p_{j+1}↔H′_{j+1}(p_{j+2},...,p_n)].
But as before, by the second substitution theorem, GL⊢⊡[p_{j+1}↔H′_{j+1}(p_{j+2},...,p_n)]→[⊡(p_i↔H_i(p_{j+1},...,p_n))↔⊡(p_i↔H_i(H′_{j+1},...,p_n))].
Let H′_i stand for H_i(H′_{j+1},...,p_n), and by combining the previous lines we find that GL⊢∧_{i≤j+1}{⊡(p_i↔A_i(p_1,...,p_n))}→∧_{i≤j+1}{⊡(p_i↔H′_i(p_{j+2},...,p_n))}.
By Goldfarb's lemma, we do not need to check the other direction, so GL⊢∧_{i≤j+1}{⊡(p_i↔A_i(p_1,...,p_n))}↔∧_{i≤j+1}{⊡(p_i↔H′_i(p_{j+2},...,p_n))} and the proof is finished □
An immediate consequence of the theorem is that for those fixed points H_i and every A_i, GL⊢H_i↔A_i(H_1,...,H_n). Indeed, since GL is closed under substitution, we can make the change p_i for H_i in the theorem to get that GL⊢∧_{i≤n}{⊡(H_i↔A_i(H_1,...,H_n))}↔∧_{i≤n}{⊡(H_i↔H_i)}. Since the righthand side is trivially a theorem of GL, we get the desired result.
3Jaime Sevilla Molina5yUniqueness of arithmetic fixed points: Notation: ⊡A = □A∧A. Let H be a fixed point on p of ϕ(p); that is, GL⊢⊡(p↔ϕ(p))↔(p↔H). Suppose I is such that GL⊢H↔I. Then by the first substitution theorem, GL⊢F(I)↔F(H) for every formula F(q). If F(q)=⊡(p↔q), then GL⊢⊡(p↔H)↔⊡(p↔I), from which it follows that GL⊢⊡(p↔ϕ(p))↔(p↔I). Conversely, if H and I are fixed points, then GL⊢⊡(p↔H)↔⊡(p↔I), so since GL is closed under substitution, GL⊢⊡(H↔H)↔⊡(H↔I). Since GL⊢⊡(H↔H), it follows that GL⊢(H↔I). (Taken from The Logic of Provability, by G. Boolos.)
Oracle machines instead of topological truth predicates

Thanks for expanding on your construction! I hadn't thought of the recursive construction, that's really neat.

I'm not that worried about the application to AIXI: unless I'm missing something, we can just additionally give our machines access to a double halting oracle (for ordinary Turing machines), and recover the same power. I considered doing that and didn't go with it because the stuff in my post seemed slightly more elegant, but if there's a reason to prefer the other version, it seems fine to use it.

I'm not clear on what you mean by "

Exploiting EDT

Thanks! I didn't really think at all about whether or not "money-pump" was the appropriate word (I'm not sure what the exact definition is); have now changed "way to money-pump EDT agents" into "way to get EDT agents to pay you for managing the news for them".

0danieldewey6yHm, I don't know what the definition is either. In my head, it means "can get an arbitrary amount of money from", e.g. by taking it around a preference loop as many times as you like. In any case, glad the feedback was helpful.
Oracle machines instead of topological truth predicates

About determining whether something's a query: A formulation of the statement we want to test is, (A) "for every (infinite) input on the oracle tape, and every , there is a such that with probability , the program halts in steps."

I think this is equivalent to, (B) "For every , there is a such that, for any input on the oracle tape, the program halts within timesteps with probability ."

Reason why I think this is equivalent: Clearly (B) implies (A). Suppose that (B) is false; then

0Benya_Fallenstein6yIn the other direction, suppose that φ(m,n) is a primitive recursive predicate, and consider the following probabilistic Turing machine: First, randomly choose an m∈ℕ (placing positive probability on every natural number). Then, search exhaustively for an n∈ℕ such that φ(m,n) is true. If you find one, halt and output 1. This machine is a query if and only if ∀m.∃n.φ(m,n). Together with the parent comment, this shows that having an oracle that tells you whether or not something is a query is equivalent to having an oracle for Π_2 statements (i.e., a double halting oracle).
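A rough sketch of the machine described in this comment, in Python. The names `sample_natural` and `run_machine`, the geometric distribution, and the `max_steps` budget are all assumptions added for illustration; the real machine has no step budget and simply never halts when no witness exists.

```python
import random
from typing import Callable, Optional


def sample_natural(rng: random.Random) -> int:
    """Draw m from a geometric distribution, so that every natural
    number has positive probability (one possible choice of many)."""
    m = 0
    while rng.random() < 0.5:
        m += 1
    return m


def run_machine(
    phi: Callable[[int, int], bool],
    rng: random.Random,
    max_steps: Optional[int] = None,
) -> Optional[int]:
    """Pick a random m, then search n = 0, 1, 2, ... for phi(m, n).

    Returns 1 when a witness is found. Returns None only if the
    illustrative budget max_steps runs out; with max_steps=None the
    search loops forever when no witness exists, like the real machine.
    """
    m = sample_natural(rng)
    n = 0
    while max_steps is None or n < max_steps:
        if phi(m, n):
            return 1
        n += 1
    return None
```

The machine halts with probability 1 exactly when every m has some witness n: each m is drawn with positive probability, so a single m without a witness already gives a positive probability of running forever.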
Oracle machines instead of topological truth predicates

In the other direction, suppose that φ(m,n) is a primitive recursive predicate, and consider the following probabilistic Turing machine: First, randomly choose an m∈ℕ (placing positive probability on every natural number). Then, search exhaustively for an n∈ℕ such that φ(m,n) is true. If you find one, halt and output 1. This machine is a query if and only if ∀m.∃n.φ(m,n). Together with the parent comment, this shows that having an oracle that tells you whether or not something is a query is equivalent to having an oracle for Π_2 statements (i.e., a double h

Oracle machines instead of topological truth predicates

Aagh... I'm conflicted. I really like the simplicity of having players correspond 1:1 to calls to the oracle, and their mixed strategies to the probability that the oracle returns "true", but the implication that the oracle must return the same answer if it's called twice on the same arguments during the execution of a single program interacts very badly with the intuitive picture of the applications I have in mind.

The intuitive picture is that the agents under consideration are living in a world whose laws of physics are probabilistic and allow for the co

0jessicata6yHmm...this seems like less of a problem to me. The thing we need for the Nash equilibrium equivalence is that, within an execution of a single program, equal calls to the oracle return equal results (or equivalently, you can't give the same call to the oracle twice). But we don't need to give the same call to the oracle twice in order to model the matching pennies game, because you just need one call (to determine whether your opponent plays heads with greater than 50% probability), and the two players are represented as different programs. The thing we're basing the Nash equilibrium equivalence on is the question "given fixed return values of calls, what is the probability that this program returns 1?" in order to write in utility values for our normal form game. Since we're basing everything on this process, it follows that within an execution of a program, it only makes sense to make a given call once. But nothing about this rules out having different programs reason about the same call, and having the oracle give them different answers.
Oracle machines instead of topological truth predicates

Fixed the typo, thanks!

I considered describing the probabilities of the oracle returning "true" by a different function , but it seemed too pedantic to have a different letter. Maybe that's wrong, but it still feels too pedantic. If I do things that way I probably shouldn't be writing " returns 'true' if...", though...

Oracle machines instead of topological truth predicates

Thanks! I've also been thinking about making a finite version; my main motivation was that I've been hoping to take my still-incompletely-specified AIXI variant and find an analogous variant of AIXItl, the computable version of AIXI that considers only hypotheses which terminate in time t and whose source length is at most l.

I think that would in fact be a pretty reasonable finite set of programs to use in what you're suggesting, since "we want it to terminate in time t" provides a motivation for why we would require programs to terminate surely, as opposed to al

1jessicata7yI thought about it some more and realized that this is equivalent to finding a Nash equilibrium. What we do is, we create a player for every call (M,p). This player has 2 actions (call them action 0 and action 1). The player's utilities are set up so that they always get p utility for choosing action 0. Also, given fixed actions for the other players, their utility for action 1 is set to be the probability that M[O′] returns 1 given that O′ is an oracle that returns the action of a different player corresponding to the call it receives. Now, the player's expected utility is p for action 0, and for action 1 it will be M[O′′]'s probability of returning 1 where O′′ is an oracle that returns answers to calls under the same distribution as the other players' mixed strategies. So, it will be allowed to choose action 0 iff P(M,O)<=p, action 1 iff P(M,O)>=p, and will be allowed to mix iff P(M,O)=p, as desired. It should be noted that within a program, repeated calls to the oracle must return the same value, so they should be cached. Going the other direction is easy: to find the Nash equilibrium for a game where each player has 2 actions, set up a program for each player that queries the oracle for the actions of each other player and then returns the action that maximizes expected utility. For games with more than 2 actions (say, each player has actions 1...k), we can create a different set of programs for each player: program i where i∈{1..k−1} returns 1 iff some maximum expected utility action's index is less than or equal to i. It seems to follow that the full problem is equivalent to a game with countably infinite players. I was thinking of how you would solve the full problem using some kind of oracle machine. It would at least be necessary to determine whether a program halts for every infinite input you give it, because this is necessary to detect queries.
This interacts weirdly with the requirement that queries almost surely halt; it may require something analog
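The best-response condition for the player attached to a call (M, p) in the comment above can be written down directly. This is a minimal sketch: the function name `allowed_actions` is made up here, and the probability that M returns 1 is simply passed in as a number rather than computed from an actual oracle machine.

```python
from fractions import Fraction


def allowed_actions(prob_m_returns_1: Fraction, p: Fraction) -> set:
    """Actions the player for call (M, p) may play in equilibrium.

    Action 0 always pays p; action 1 pays the probability that M
    returns 1 given the other players' mixed strategies. A best
    response may use action 0 iff that probability is <= p and
    action 1 iff it is >= p, so both are allowed (the player may
    mix) exactly when the two payoffs are equal.
    """
    actions = set()
    if prob_m_returns_1 <= p:
        actions.add(0)
    if prob_m_returns_1 >= p:
        actions.add(1)
    return actions
```

Exact `Fraction` values are used so that the equality case, where mixing is permitted, is not blurred by floating-point rounding.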
Oracle machines instead of topological truth predicates

Hm, I think it would be expectedly helpful to explain things more, but it would also take more time, and aiming much lower would take much more time. (That's part of why it's taking me so long to turn these things into research papers, although in that case the length limitations make the constraint even more difficult.) At the moment, I'm trying to get the things I'm working on out there at all, especially with something like this where I haven't yet worked through all the details of whether I can actually do my intended application based on it. After all

Using modal fixed points to formalize logical causality

A model of UDT with a halting oracle searches only for one utility value for each action. I'm guessing the other formulation just wasn't obvious at the time? (I don't remember realizing the possibility of playing chicken implicitly before Will Sawin advertised it to me, though I think he attributed it to you.)

Topological truth predicates: Towards a model of perfect Bayesian agents

Thanks! :-)

The reflective assignments correspond very directly to Nash equilibria (albeit in an -to- manner, because given any finite game, a reflective assignment contains information about that game but also information about many other things). So I wouldn't quite say that they correspond to methods of equilibrium selection---e.g., if two methods of equilibrium selection give you the same Nash equilibrium, they're not distinguishable on the level of reflective assignments. But yeah, the question of what equilibrium gets played gets outsourced to the o

0cousin_it7yNow I'm wondering how this compares to Paul's probabilistic reflection [http://intelligence.org/files/DefinabilityTruthDraft.pdf]. Are there some nice "consistency" axioms that are satisfied by one approach but not the other, or vice versa?
Using modal fixed points to formalize logical causality

However, the approach is designed so that these "spurious" logical implications are unprovable, so they don’t interfere with decision-making. The proof of that is left as an easy exercise.

I don't think this is technically true as stated; it seems to be possible that the agent proves some spurious counterfactuals as long as the outcome it does in fact obtain is the best possible one. (This is of course harmless!) Say the agent has two possible actions, and , leading to outcomes and , respectively. The latter is preferred, and thes

0orthonormal6yOoh, nice: we don't need to eliminate all spurious counterfactuals, only the malignant ones!
0cousin_it7yYes, that's correct. Thanks!
Using modal fixed points to formalize logical causality

I know this is supposed to be just introductory, but I actually think that the complete reformulation of UDT-with-a-halting-oracle in terms of modal logic is really interesting! For starters, it allows us to compare UDT and modal agents in the same framework (with the right 's, we can see this version of UDT as a modal agent). It would also be really neat if we could write an "interpreter" that allows us to write UDT as a program calling a halting oracle, and then evaluate what it does by way of modal logic.

But also, it allows us to give a nice definition

0cousin_it7yYes, modal logic seems to be the most natural setting for these kinds of ideas. Also the "chicken rule" from the usual oracle formulations is gone now, I can't remember why we needed it anymore.
The Pascal's Wager Fallacy Fallacy
Ask yourself if you would want to revive someone frozen 100 years ago. Yes. They don't deserve to die. Kthx next.

I wish that this were on Less Wrong, so that I could vote this up.

4jefftk10yIt is now.
An Especially Elegant Evpsych Experiment

Garrett, since Anonymous's reply was a little implicit, the point is that infants have a larger chance of dying before reproducing than young adults, so the expected number of future offspring increases during childhood (when at each point counting only non-deceased children).

Aron, almost; it's because they get older, and only future children are relevant. Whether they've had children won't change the value except insofar as it changes the chance for future children.

Me: ...so IIUC, we expect a large influence of random variation in the sample.

Bzzzt! Wrong.

An Especially Elegant Evpsych Experiment

Eliezer, right, thanks. And I hadn't noticed about the correlations of the subcategories...

An Especially Elegant Evpsych Experiment
Might we get an even higher correlation if we tried to take into account the reproductive opportunity cost of raising a child of age X to independent maturity, while discarding all sunk costs to raise a child to age X?

I haven't done the math, but my intuition says that upon observing the highest! correlation! ever!, surely our subjective probability must go towards a high true underlying correlation and having picked a sample with a particularly high correlation? (Conditioning on the paper not being wrong due to human error or fake, of course -- I don't ...

OB Status Update

Any chance you'd consider installing jsMath? (Client-side library rendering LaTeX math. Formatting math in plain HTML won't kill you, but there are other things you can do with the same amount of effort that will make you stronger still :-))

Sympathetic Minds
We might even cooperate in the Prisoner's Dilemma. But we would never be friends with them. They would never see us as anything but means to an end. They would never shed a tear for us, nor smile for our joys. And the others of their own kind would receive no different consideration, nor have any sense that they were missing something important thereby.

...but beware of using that as a reason to think of them as humans in chitin exoskeletons :-)

In Praise of Boredom

Robin, I suspect that despite how it may look from a high level, the lives of most of the people you refer to probably do differ enough from year to year that they will in fact have new experiences and learn something new, and that they would in fact find it unbearable if their world were so static as to come even a little close to being video game repetitive.

That said, I would agree that many people seem not to act day-to-day as if they put a premium on Eliezer-style novelty, but that seems like it could be better explained by Eliezer's boredom being a FAR value than by the concept being specific to Eliezer :-)

Newcomb's Problem and Regret of Rationality

thinks -- Okay, so if I understand you correctly now, the essential thing I was missing that you meant to imply was that the utility of living forever must necessarily be equal to (cannot be larger than) the limit of the utilities of living a finite number of years. Then, if u(live forever) is finite, p times the difference between u(live forever) and u(live n years) must become arbitrarily small, and thus, eventually smaller than q times the difference between u(live n years) and u(live googolplex years). You then arrive at a contradiction, from which you...
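The argument summarized in this comment can be spelled out as a short derivation. This is a sketch under assumptions that are not in the comment itself: the shorthand $u_n$, $u_G$, $u_\infty$ for the utilities of living $n$ years, a googolplex years, and forever; $p$ and $q$ for the two probabilities in the stated preference; the leftover probability mass giving the same outcome in both gambles, so that it cancels; and $u_\infty > u_G$.

```latex
% Stated preference, for every finite n:
%   probability p+q of n years  is worse than
%   probability q of a googolplex years plus probability p of forever.
\begin{align*}
  (p+q)\,u_n &< q\,u_G + p\,u_\infty
  &&\text{for all } n \\
  \iff\quad q\,(u_n - u_G) &< p\,(u_\infty - u_n)
  &&\text{for all } n.
\end{align*}
% If u_\infty is finite and equals \lim_{n\to\infty} u_n, then as n grows:
\begin{align*}
  p\,(u_\infty - u_n) &\longrightarrow 0, \\
  q\,(u_n - u_G) &\longrightarrow q\,(u_\infty - u_G) > 0,
\end{align*}
% so the inequality fails for all sufficiently large n: a contradiction.
```

Under those assumptions, the preference can only hold for every n if the utility of living forever is infinite or exceeds the limit of the finite-lifetime utilities.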

Newcomb's Problem and Regret of Rationality

Given how many times Eliezer has linked to it, it's a little surprising that nobody seems to have picked up on this yet, but the paragraph about the utility function not being up for grabs seems to have a pretty serious technical flaw:

There is no finite amount of life lived N where I would prefer a 80.0001% probability of living N years to an 0.0001% chance of living a googolplex years and an 80% chance of living forever. This is a sufficient condition to imply that my utility function is unbounded.

Let p = 80% and let q be one in a million. I'm pretty...

37 Ways That Words Can Be Wrong

(To be clear, the previous comment was meant as a joke, not as a serious addition to the list -- at least not as it stands :-))

37 Ways That Words Can Be Wrong

You give an absolute train wreck of a purported definition, then do your best to relive the crash over and over. Intelligence is not merely objective reason, but includes nonlinear subconscious processing, intuition, and emotional intelligence. Therefore, AGI needs quantum computing.

Expected Creative Surprises

On a whimsical note, it is reminiscent of the unpredictability of the Infinite Improbability Drive :-)

Torture vs. Dust Specks

Bogdan's presented almost exactly the argument that I too came up with while reading this thread. I would choose the specks in that argument and also in the original scenario (as long as I am not committing to the same choice being repeated an arbitrary number of times, and I am not causing more people to crash their cars than I cause not to crash their cars; the latter seems like an unlikely assumption, but thought experiments are allowed to make unlikely assumptions, and I'm interested in the moral question posed when we accept the assumption). Based on ...

How Many LHC Failures Is Too Many?

Gawk! "even if they had to deal with a terrorist attack on all of these branches, say" was supposed to come after "Surely everybody here would find an outcome undesirable where all of their future Everett branches wink out of existence." (The bane of computers. On a typewriter, this would not have happened.)

How Many LHC Failures Is Too Many?

Richard, I am going to assume ... that you assign an Everett branch in which you painless wink out of existence a value of zero (neither desirable or undesirable)

I'd rather say that people who find quantum suicide desirable have a utility function that does not decompose into a linear combination of individual utility functions for their individual Everett branches-- even if they had to deal with a terrorist attack on all of these branches, say. Surely everybody here would find an outcome undesirable where all of their future Everett branches wink out of e...

Friedman's "Prediction vs. Explanation"

Two points I'd like to comment on.

I don't think this is relevant if-- as I understood from the description-- the first scientist's theory predicted experiments 11..20 with high accuracy. In this scenario, I don't think the first scientist should have learned anything that would make them reject their previous view. This seems like an important point. (I think I understood this from Tyrrell's comment.)

Re: Theories screen off theorists

I agree-- we should pick the simpler theory-- if we're able to judge them for sim...

Friedman's "Prediction vs. Explanation"

Tyrrell, um. If "the ball will be visible" is a better theory, then "we will observe some experimental result" would be an even better theory?

Solomonoff induction, the induction method based on Kolmogorov complexity, requires the theory (program) to output the precise experimental results of all experiments so far, and in the future. So your T3 would not be a single program; rather, it would be a set of programs, each encoding specifically one experimental outcome consistent with "the ball is visible." (Which gets rid of the problem that "we will observe some experimental result" is the best possible theory :))