patrickscottshields

Comments

Decision Theories: A Semi-Formal Analysis, Part II

I'd like to cite this article (or related published work) in a research project paper I'm writing which includes application of an expected utility-maximizing algorithm to a version of the prisoner's dilemma. Do you have anything more cite-able than this article's URL and your LW username? I didn't see anything in your profile which could point me towards your real name and anything you might have published.

Decision Theory FAQ

I'm not sure which further details you are after.

Thanks for the response! I'm looking for a formal version of the viewpoint you reiterated at the beginning of your most recent comment:

Yes, if the player is allowed access to entropy that Omega cannot have then it would be absurd to also declare that Omega can predict perfectly. [...] The problem specification needs to include a clause for how 'randomization' is handled.

That makes a lot of sense, but I haven't been able to find it stated formally. Wolpert and Benford's papers (using game theory decision trees or alternatively plain probability theory) seem to formally show that the problem formulation is ambiguous, but they are recent papers, and I haven't been able to tell how well they stand up to outside analysis.

If there is a consensus that the sufficient use of randomness prevents Omega from having perfect or nearly perfect predictions, then why is Newcomb's problem still relevant? If there's no randomness, wouldn't an appropriate application of CDT result in one-boxing since the decision-maker's choice and Omega's prediction are both causally determined by the decision-maker's algorithm, which was fixed prior to the making of the decision?

There have been attempts to create derivatives of CDT that work like that. That replace the "C" from conventional CDT with a type of causality that runs about in time as you mention. Such decision theories do seem to handle most of the problems that CDT fails at. Unfortunately I cannot recall the reference.

I'm curious: why can't normal CDT handle it by itself? Consider two variants of Newcomb's problem:

  1. At run-time, you get to choose the actual decision made in Newcomb's problem. Omega made its prediction without any information about your choice or what algorithms you might use to make it. In other words, Omega doesn't have any particular insight into your decision-making process. This means at run-time you are free to choose between one-boxing and two-boxing without backwards causal implications. In this case Omega cannot make perfect or nearly perfect predictions, for reasons of randomness which we already discussed.
  2. You get to write the algorithm, the output of which will determine the choice made in Newcomb's problem. Omega gets access to the algorithm in advance of its prediction. No run-time randomness is allowed. In this case, Omega can be a perfect predictor. But the correct causal network shows that both the decision-maker's "choice" as well as Omega's prediction are causally downstream from the selection of the decision-making algorithm. CDT holds in this case because you aren't free at run-time to make any choice other than what the algorithm outputs. A CDT algorithm would identify two consistent outcomes: (one-box && Omega predicted one-box), and (two-box && Omega predicted two-box). Coded correctly, it would prefer whichever consistent outcome had the highest expected utility, and so it would one-box.

(Note: I'm out of my depth here, and I haven't given a great deal of thought to precommitment and the possibility of allowing algorithms to rewrite themselves.)

Programming the LW Study Hall

This seems like an opportunity for a startup. It could be a fun project to build startup weekend-style. The concept doesn't seem particular tied to the Less Wrong community, and (based on a couple minutes searching for "online study halls") there don't seem to be other prominent startups taking on this specific challenge.

Decision Theory FAQ

This response challenges my intuition, and I would love to learn more about how the problem formulation is altered to address the apparent inconsistency in the case that players make choices on the basis of a fair coin flip. See my other post.

Decision Theory FAQ

Thanks for this post; it articulates many of the thoughts I've had on the apparent inconsistency of common decision-theoretic paradoxes such as Newcomb's problem. I'm not an expert in decision theory, but I have a computer science background and significant exposure to these topics, so let me give it a shot.

The strategy I have been considering in my attempt to prove a paradox inconsistent is to prove a contradiction using the problem formulation. In Newcomb's problem, suppose each player uses a fair coin flip to decide whether to one-box or two-box. Then Omega could not have a sustained correct prediction rate above 50%. But the problem formulation says Omega does; therefore the problem must be inconsistent.

Alternatively, Omega knew the outcome of the coin flip in advance; let's say Omega has access to all relevant information, including any supposed randomness used by the decision-maker. Then we can consider the decision to already have been made; the idea of a choice occurring after Omega has left is illusory (i.e. deterministic; anyone with enough information could have predicted it.) Admittedly, as you say quite eloquently:

Choice is not something inherent to a system, but a feature of an outsider's model of a system, in much the same sense as random is not something inherent to a Eeny, meeny, miny, moe however much it might seem that way to children.

In this case of the all-knowing Omega, talking about what someone should choose after Omega has left seems mistaken. The agent is no longer free to make an arbitrary decision at run-time, since that would have backwards causal implications; we can, without restricting which algorithm is chosen, require the decision-making algorithm to be written down and provided to Omega prior to the whole simulation. Since Omega can predict the agent's decision, the agent's decision does determine what's in the box, despite the usual claim of no causality. Taking that into account, CDT doesn't fail after all.

It really does seem to me like most of these supposed paradoxes of decision theory have these inconsistent setups. I see that wedrifid says of coin flips:

If the FAQ left this out then it is indeed faulty. It should either specify that if Omega predicts the human will use that kind of entropy then it gets a "Fuck you" (gets nothing in the big box, or worse) or, at best, that Omega awards that kind of randomization with a proportional payoff (ie. If behavior is determined by a fair coin then the big box contains half the money.)

This is a fairly typical (even "Frequent") question so needs to be included in the problem specification. But it can just be considered a minor technical detail.

I would love to hear from someone in further detail on these issues of consistency. Have they been addressed elsewhere? If so, where?

Idea: Self-Improving Task Management Software

Task management has become a passion of mine; for the last two years or so I've been trying to build something close to what you're describing. I think it's cool that you're giving this a shot. Here are some of my thoughts:

  • Start small. Building good task management software is a hard challenge, potentially several orders of magnitude harder than you're expecting. I continually underestimated how much effort it would take to build my task management software.
  • If you want to work on this full-time, consider joining an existing team. Companies such as Asana are already in the task management space, and they have teams of software engineers and data scientists working on cool things. Joining an existing team allows you to specialize on a part of the software, whereas you might spread yourself too thin if you are responsible for all components of the project. Joining an existing team is basically what I'm trying to do now, after I decided pursuing my startup further was suboptimal. (Potential employers reading this: please contact me!)
  • Focus on the fundamental algorithms and APIs before considering presentation. Target the command line; in the browser, it's easy to get distracted by user experience issues and end up prematurely optimizing for them. Unless your software actually does the awesome things you want it to do on the technical side, it won't matter how nice its interface is. Developing for the command line forces you to focus on the actual algorithms and APIs.
  • Don't reinvent things and don't allow feature creep. If you feel like you're doing something new, do more research. Very little in the way of new algorithms, math, data structures, etc. is necessary in this area; most of the work to be done is in picking which things to use that have already been invented. Keep your code base and features small so you don't get overwhelmed by technical debt.
  • Take free online classes in algorithms, data structures, software development, machine learning, statistics, information theory, logic, AI, and planning. There's so much cool stuff out there that you might not know about, which could be useful for this sort of endeavor. For example, something I learned from Tim Roughgarden's algorithms class on Coursera is that a set of tasks with precedence constraints (e.g. constraints of the form "task X must be completed before task Y") can be represented by a directed graph. If the graph is acyclic, a topological sort can, in linear time, can give a sequence of tasks that respects all precedence constraints (if the graph is cyclic, no such sequence exists.)
  • One avenue to explore with this kind of software is data entry optimization. What optimal subset of data should be collected from the user? Data entry consumes users' time; it's suboptimal for a user with thousands of tasks to routinely update each task's parameters. I think by looking at tasks' parameters as random variables, we can use information theory and machine learning to decide when users should be asked to update various data. I wrote a paper exploring this.

One thing you have going for you is that your project is open source. That allows for a lot of little contributions from people who are interested in the work, but already have their own sources of income. That might allow the project to survive where my startup failed. I still care deeply about task management, so it's possible our work will intersect in the future. I'm now following the GitHub repository you made for this project.

Thoughts on the January CFAR workshop

For example, I assumed the median staring salary for computer scientists was a reasonable estimate for what my starting salary would be. It turns out that I can expect to make about twice that much money if I use certain job hunting techniques I learned at the workshop and optimize for money (instead of, say, cool sounding problems).

What changed your expectation of your starting salary?

How to signal curiosity?

The potential benefits from private questioning should be weighed against the cost of the information not being visible to others. I like to see Wei_Dai's questions and the responses they elicit. I think the public exchanges have significant value beyond the immediate participants.

If we live in a simulation, what does that imply?

If our simulators are human, that implies that their universe has laws of physics similar to our own. But if we're living in a simulation, I think it's more plausible that our simulators exist in a world operating under different laws of physics (e.g. they live in a universe which is more amenable to our-universe-scale simulation.) So I think other factors are in play which could lessen the probability that we are being simulated by humans, let alone our future.

If we live in a simulation, what does that imply?

Or maybe just greater means. I imagine many humans would run universe-scale simulations, if they had the means.

Load More