orthonormal

Sequences

Staying Sane While Taking Ideas Seriously

Posts (sorted by new)

orthonormal's Shortform · 9 karma · Ω · 6y · 41 comments

Comments (sorted by newest)
Monthly Roundup #33: August 2025
orthonormal · 9d

"If we were doing a better job catching cheaters presumably people would be doing it less?"

Seems trivially easy for an idiot to start an account, lose some games, get frustrated, and start asking a chess bot what to do so that they can rescue a bad position / show that bastard what's what / vicariously enjoy winning. I expect that crowd to be the majority of cheaters, and for them to be essentially inelastic to the probability with which they get caught (the first time).

My Interview With Cade Metz on His Reporting About Lighthaven
orthonormal · 15d

I believe that Cade knows perfectly well what everyone has been saying for years; he's being disingenuous because the object level doesn't matter to him, and the only important thing is ensuring that these weirdos don't get status. He's never once engaged on simulacrum level 1 with this community.

Comp Sci in 2027 (Short story by Eliezer Yudkowsky)
orthonormal · 23d

At long last, Elon Musk has created the Yandere Simulator from the classic short story Don't Create The Yandere Simulator, Among Other Things.

Elizabeth's Shortform
orthonormal · 24d

On several of these I think you might be confounding the positively correlated variables "has heard of this question" and "is on the autism spectrum", the latter of which is anticorrelated with the kind of behavior you want (intuitive, empathetic listening without needing to ask a bunch of explicit questions). I find it unlikely that asking the question makes people worse at the behavior you want.

Fighting Obvious Nonsense About AI Diffusion
orthonormal · 4mo

Zvi is arguing "X implies Y" here. Zvi happens to believe Y but disbelieve X; however, he is writing to people who think "X and not-Y", in order to nudge them to support Y.

Here X = it is good for the US to build superintelligence fast, before China does, and Y = we should have some diffusion rules making it harder for China to catch up to the USA. 

Zvi believes Z = nobody should be building superintelligence soon, and believes Z implies Y, but it is useful to show that X implies Y as well.

orthonormal's Shortform
orthonormal · 4mo · Ω

Question for @Scott Garrabrant, @TsviBT, @Andrew_Critch, @So8res, @jessicata, and anyone else who knows the answer: the logical inductor constructed in the paper is not merely computable but also primitive recursive, right? 

Seems obvious to me (because the fixed-point prices are only ever approximated, etc.), but I want to be sure I'm not missing something.

Bandwidth Rules Everything Around Me: Oliver Habryka on OpenPhil and GoodVentures
orthonormal · 4mo

To be fair, I also think it was a correct decision to focus on the Democratic Party when advocating for sane AI notkilleveryoneism policy before the 2024 election, because that was playing to our outs. Trump will do on AI whatever he is personally paid to do, and the accelerationists have more money; anyone sane but unable/unwilling to bribe Trump has no influence whatsoever with him in this administration.

(I went around saying the above before the election, by the way.)

(Sorry for the mindkilling, it's relevant on the object level.)

orthonormal's Shortform
orthonormal · 5mo · Ω

[EDIT: Never mind, this is just Kleene's second recursion theorem!]

Quick question about Kleene's recursion theorem:

Let's say F is a computable function from ℕ^N to ℕ. Is there a single computable function X from ℕ^(N−1) to ℕ such that

X(y_2, ..., y_N) = F(⌜X⌝, y_2, ..., y_N) for all y_2, ..., y_N in ℕ

(where ⌜X⌝ is the code of X in a fixed encoding), or do there need to be additional conditions?
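As the EDIT says, this is Kleene's second recursion theorem (with parameters), so no additional conditions are needed. For the flavor of why, here is a quine-style sketch in Python (my own toy encoding, not from the comment: "programs" are Python source strings, and the string src plays the role of ⌜X⌝):

```python
# Given any computable F(code, y), we build a program X with
# X(y) = F(<code of X>, y) via the standard quine trick:

template = '''\
def F(code, y):
    # toy choice of F: length of the program's own source, plus the input
    return len(code) + y

t = {t!r}
src = t.format(t=t)   # quine trick: src is this very program's source text

def X(y):
    return F(src, y)  # X applies F to its own code, as the theorem requires
'''

src = template.format(t=template)
ns = {}
exec(src, ns)
print(ns["X"](0), ns["X"](5))  # prints len(src) and len(src) + 5
```

The extra arguments y_2, ..., y_N ride along unchanged, which is why the parametrized version follows from the same construction via the s-m-n theorem.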

orthonormal's Shortform
orthonormal · 5mo · Ω

My current candidate definitions, with some significant issues in the footnotes:

A fair environment is a probabilistic function F(x_1, ..., x_N) = [X_1, ..., X_N] from an array of actions to an array of payoffs.

An agent A is a random variable

A(F, A_1, ..., A_{i−1}, A_i = A, A_{i+1}, ..., A_N)

which takes in a fair environment F[1] and a list of agents (including itself), and outputs a mixed strategy over its available actions in F.[2]

A fair agent is one whose mixed strategy is a function of subjective probabilities[3] that it assigns to [the actions of some finite collection of agents in fair environments, where any agents not appearing in the original problem must themselves be fair]. 

Formally, if A is a fair agent with a subjective probability estimator P, then A's mixed strategy in a fair environment F,

A(F, A_1, ..., A_{i−1}, A_i = A, A_{i+1}, ..., A_N),

should depend only on a finite collection of A's subjective probabilities about outcomes

{P(F_k(A_1, ..., A_N, B_1, ..., B_M) = [X_1, ..., X_{N+M}])}_{k=1}^{K}

for a set of fair environments F_1, ..., F_K and an additional set of fair[4] agents[5] B_1, ..., B_M if needed (note that not all agents need to appear in all environments).

A fair problem is a fair environment with one designated player, where all other agents are fair agents. (These type signatures are sketched in code after the footnotes.)

  1. ^

    I might need to require every F to have a default action d_F, so that I don't need to worry about axiom-of-choice issues when defining an agent over the space of all fair environments.

  2. ^

    I specified a probabilistic environment and mixed strategies because I think there should be a unique fixed point for agents, such that this is well-defined for any fair environment F. (By analogy to reflective oracles.) But I might be wrong, or I might need further restrictions on F.

  3. ^

    Grossly underspecified. What kinds of properties are required for subjective probabilities here? You can obviously cheat by writing BlueEyedBot into your probability estimator.

  4. ^

    This is an infinite recursion, of course. It works if we require each B_m to have a strictly lower complexity in some sense than A (e.g. the rank of an agent is the largest number K of environments it can reason about when making any decision, and each B_m needs to be lower-rank than A), but I worry that's too strong of a restriction and would exclude some well-definable and interesting agents.

  5. ^

    Does the fairness requirement on the B_m suffice to avert the MetaBlueEyedBot problem in general? I'm unsure.
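For concreteness, a minimal Python sketch of the type signatures above (my own illustrative encoding, not a proposed solution; it deliberately ignores the fixed-point and estimator issues flagged in the footnotes, which are the actual open problems):

```python
import random
from typing import Callable, List, Tuple

Actions = Tuple[int, ...]      # one action per player
Payoffs = Tuple[float, ...]    # one payoff per player
# A fair environment: a (possibly probabilistic) map from the action
# profile alone to payoffs, with no access to the agents themselves.
FairEnv = Callable[[Actions], Payoffs]
MixedStrategy = List[float]    # probability assigned to each available action
# An agent sees the environment and the full agent list (itself included,
# at its own index) and returns a mixed strategy.
Agent = Callable[[FairEnv, List["Agent"], int], MixedStrategy]

def noisy_matching_pennies(actions: Actions) -> Payoffs:
    # example of a probabilistic fair environment: payoffs depend only on
    # the action profile, plus 1% random noise
    win = 1.0 if actions[0] == actions[1] else -1.0
    if random.random() < 0.01:
        win = -win
    return (win, -win)
```

On this encoding, a fair problem is a FairEnv plus a designated index, with fair agents filling the other slots; all of the real difficulty lives in footnotes 3–5, i.e. in deciding which values of Agent count as fair.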

orthonormal's Shortform
orthonormal · 5mo · Ω

How do you formalize the definition of a decision-theoretically fair problem, even when abstracting away the definition of an agent as well as embedded agency? 

I've failed to find anything in our literature.

It's simple to define a fair environment, given those abstractions: a function E from an array of actions to an array of payoffs, with no reference to any other details of the non-embedded agents that took those actions and received those payoffs.

However, fair problems are more than just fair environments: we want a definition of a fair problem (and fair agents) under which, among other things:

  • The classic Newcomb's Problem against Omega, with certainty or with 1% random noise: fair
  • Omega puts $1M in the box iff it predicts that the player consciously endorses one-boxing, regardless of what it predicts the player will actually do (e.g. the player misunderstands the instructions and takes a different action than they endorsed): unfair
  • Prisoner's Dilemma between two agents who base their actions on not only each others' predicted actions in the current environment, but also their predicted actions in other defined-as-fair dilemmas: fair
    • For example, PrudentBot will cooperate with you if it deduces that you will cooperate with it and also that you would defect against DefectBot (because it wants to exploit CooperateBots); a toy sketch below makes this criterion concrete.
  • Prisoner's Dilemma between two agents who base their actions on each others' predicted actions in defined-as-unfair dilemmas: unfair
    • It would let us smuggle in unfairness from other dilemmas; e.g. if BlueEyedBot only tries Löbian cooperation against agents with blue eyes, and MetaBlueEyedBot only tries Löbian cooperation against agents that predictably cooperate with BlueEyedBot, then the Prisoner's Dilemma against MetaBlueEyedBot should count as unfair.

Modal combat doesn't need to worry about this, because all the agents in it are fair-by-construction.
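To make the PrudentBot example concrete, here is a toy depth-limited simulation in Python (my own sketch, not the modal construction from the robust-cooperation literature: real modal combat uses provability logic rather than simulation, and the optimistic base case below is a crude stand-in for the Löbian step):

```python
from typing import Callable, List

Move = str  # "C" or "D"
# a bot sees the lineup, its own index, and a recursion budget
Bot = Callable[[List["Bot"], int, int], Move]

def cooperate_bot(lineup, i, depth=0) -> Move:
    return "C"

def defect_bot(lineup, i, depth=0) -> Move:
    return "D"

def prudent_bot(lineup, i, depth=3) -> Move:
    # Cooperate iff the opponent cooperates with me AND defects against
    # DefectBot (so CooperateBot gets exploited, as described above).
    if depth == 0:
        return "C"  # optimistic base case, standing in for Löb's theorem
    j = 1 - i
    coop_with_me = lineup[j](lineup, j, depth - 1) == "C"
    vs_db = list(lineup)
    vs_db[i] = defect_bot                          # swap myself out for DefectBot
    defects_vs_db = lineup[j](vs_db, j, 2) == "D"  # terminates without the cap
    return "C" if (coop_with_me and defects_vs_db) else "D"

print(prudent_bot([prudent_bot, prudent_bot], 0))    # C: mutual cooperation
print(prudent_bot([prudent_bot, cooperate_bot], 0))  # D: exploits CooperateBot
print(prudent_bot([prudent_bot, defect_bot], 0))     # D
```

BlueEyedBot-style unfairness would enter exactly where this sketch calls lineup[j] as a black box: nothing in the type signature stops a bot from branching on the opponent's identity rather than its behavior, which is what the definition of a fair agent needs to rule out.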

Yeah, I know, it's about a decade late to be asking this question.

Posts

How I Learned To Stop Trusting Prediction Markets and Love the Arbitrage · 200 karma · 1y · 30 comments
Run evals on base models too! · 49 karma · Ω · 1y · 6 comments
Mesa-Optimizers via Grokking · 36 karma · Ω · 3y · 4 comments
Transitive Tolerance Means Intolerance · 39 karma · 4y · 13 comments
Improvement for pundit prediction comparisons · 16 karma · 4y · 4 comments
Developmental Stages of GPTs · 140 karma · Ω · 5y · 72 comments
Don't Make Your Problems Hide · 62 karma · 5y · 5 comments
Map Errors: The Good, The Bad, and The Territory · 30 karma · 5y · 4 comments
Negotiating With Yourself · 28 karma · 5y · 0 comments
[Link] COVID-19 causing deadly blood clots in younger people · 46 karma · 5y · 10 comments
Wikitag Contributions

Qualia · 5y · (+466)
Modal combat · 9y · (+281/-7)
Mathematical induction · 9y · (+421)
Modal logic · 9y · (+6)
Löb's theorem · 9y · (+38)
Löb's theorem · 9y
Peano Arithmetic · 9y · (+809)