jessicata

Jessica Taylor. CS undergrad and Master's at Stanford; former research fellow at MIRI.

I work on decision theory, social epistemology, strategy, naturalized agency, mathematical foundations, decentralized networking systems and applications, theory of mind, and functional programming languages.

Blog: unstableontology.com

Twitter: https://twitter.com/jessi_cata


Comments


Oh, to be clear, I do think that AI safety automation is a well-targeted x-risk effort conditional on the AI timelines you are presenting. (Related to Paul Christiano's alignment ideas, which are important conditional on prosaic AI.)

On EV grounds, "2/3 chance it's irrelevant because of AGI in the next 20 years" is not a huge contributor to the EV of this. Because, ok, maybe it reduces the EV by 3x compared to what it would otherwise have been. But there are factors much bigger than 3x that are relevant, such as probability of success, magnitude of success, and cost effectiveness.

Then you can take the overall cost effectiveness estimate (by combining various factors including probability it's irrelevant due to AGI being too soon) and compare it to other interventions. Here, you're not offering a specific alternative that is expected to pay off in worlds with AGI in the next 20 years. So it's unclear how "it might be irrelevant if AGI is in the next 20 years" is all that relevant as a consideration.
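
As a rough illustration of how that discount enters the estimate (all numbers below are hypothetical placeholders, not actual estimates of anything):

```python
# Back-of-the-envelope cost-effectiveness sketch. All numbers are
# hypothetical placeholders chosen only to show the structure.
p_relevant = 1 / 3    # 2/3 chance AGI in the next 20 years makes it irrelevant
p_success = 0.1       # probability the intervention succeeds at all
magnitude = 0.05      # fraction of x-risk averted conditional on success
cost = 10_000_000     # dollars

# The "too soon" discount is a fixed 3x factor; the other factors can easily
# vary by orders of magnitude between interventions, so they dominate the
# comparison.
ev_per_dollar = p_relevant * p_success * magnitude / cost
print(f"Expected risk reduction per dollar: {ev_per_dollar:.2e}")
```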

Wasn't familiar. Seems similar in that facts/values are entangled. I was more familiar with Cuneo for that.

Dunno; gym membership also feels like a form of blackmail (although preferable to the alternative forms of blackmail), while home gym reduces the inconvenience of exercising.

I'm not sure what differentiates these in your mind. They both reduce the inconvenience of exercising, presumably? Also, in my post I'm pretty clear that it's not meant as a punishment type incentive:

And it’s prudent to take into account the chance of not exercising in the future, making the investment useless: my advised decision process counts this as a negative, not a useful self-motivating punishment.

...

Generally, it seems like the problem is signaling. You buy the gym membership to signal your strong commitment to yourself. Then you feel good about sending a strong signal. And then the next day you feel just as lazy as previously, and the fact that you already paid for the membership probably feels bad.

That's part of why I'm thinking an important step is checking whether one expects the action to happen if the initial steps are taken. If not, then it's less likely to be a good idea.

The signaling / hyperstition does serve some positive function, but it can lead people to be unnecessarily miscalibrated.

  1. I was already paying attention to Ziz prior to this.
  2. Ziz's ideology is already influential. I've been having discussions about which parts are relatively correct or not correct. This is a part that seems relatively correct and I wanted to acknowledge that.
  3. If engagement with Zizian philosophy is outlawed, then only outlaws have access to Zizian philosophy. Antimemes are a form of camouflage. If people refuse to see what is in front of them, people can coordinate crimes in plain sight. (Doesn't apply so much to this post, more of a general statement)
  4. The effect you're pointing to seems very small, if it even exists, in terms of causing negative effects.

Okay, I don't think I was disagreeing except in cases of very light satisficer-type self-commitments. Maybe you didn't intend to express disagreement with the post, idk.

So far I don't see evidence that any LessWrong commentator has read the post or understood the main point.

Not disagreeing, but, I'm not sure what you are responding to? Is it something in the post?

We might disagree about the value of thinking about "we are all dead" timelines. To my mind, forecasting should be primarily descriptive, not normative; reality keeps going after we are all dead, and having realistic models of that is probably a useful input regarding what our degrees of freedom are. (I think people readily accept this in e.g. biology, where people can think about what happens to life after human extinction, or physics, where "all humans are dead" isn't really a relevant category that changes how physics works.)

Of course, I'm not implying it's useful for alignment to "see that the AI has already eaten the sun"; it's about forecasting future timelines by defining thresholds and thinking about when they're likely to happen and how they relate to other things.

(See this post, section "Models of ASI should start with realism")

I was trying to say things related to this:

In a more standard inference amortization setup one would e.g. train directly on question/answer pairs without the explicit reasoning path between the question and answer. In that way we pay an up-front cost during training to learn a "shortcut" between question and answers, and then we can use that pre-paid shortcut during inference. And we call that amortized inference.

Which sounds like supervised learning. Adam seemed to want to know how that relates to scaling up inference-time compute, so I said some ways they are related.
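
For concreteness, here is a minimal sketch of the distinction (a toy example with hypothetical data, not taken from any paper): training data that includes the explicit reasoning path versus the question/answer-only format the quoted comment calls amortized inference.

```python
# Toy illustration of the two training-data formats (hypothetical example).
question = "What is 17 * 24?"
reasoning = "17 * 24 = 17 * 20 + 17 * 4 = 340 + 68 = 408"
answer = "408"

# Explicit reasoning at inference time: the model is trained to produce the
# reasoning path itself, spending inference-time compute on every query.
chain_of_thought_example = {
    "prompt": question,
    "target": reasoning + "\nAnswer: " + answer,
}

# Amortized setup: train directly on question/answer pairs, paying the cost
# of producing answers once (at data-generation/training time) so the model
# can emit answers directly, as in ordinary supervised learning.
amortized_example = {"prompt": question, "target": answer}

print(chain_of_thought_example)
print(amortized_example)
```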

I don't know much about amortized inference in general. The Goodman paper seems to be about saving compute by caching results between different queries. This could be applied to LLMs, but I don't know of it being applied. It seems like you and Adam like this "amortized inference" concept, and I'm new to it, so I don't have any relevant comments. (Yes, I realize my name is on a paper talking about this, but I actually didn't remember the concept.)

I don't think I implied anything about o3 relating to parallel heuristics.
