Comments

I had to read some Lacan in college, putatively a chunk that was especially influential on the continental philosophers we were studying.

Same. I am seeing a trend where rats who had to spend time with this stuff in college say, "No, please don't go here, it's not worth it," and then get promptly ignored.

The fundamental reason this stuff is not worth engaging with is that it's a Rorschach test. Using it is a verbal performance. We can make analogies to Tarot cards, but in the end we're just cold reading our readers.

Lacan and his ilk aren't some low-hanging source of zero-day mind hacks for rats. Down this road lies a quagmire that is not worth the effort to traverse.

Thanks for the additions here. I'm also unsure how to gel this definition (which I quite like) with the inner/outer/mesa terminology. Here is my knuckle-dragging model of the post's implication:

target_set = f(env, agent)

So if we plug in a bunch of values for agent and hope for the best, the target_set we get might not be what we desired. This would be misalignment. The alignment task, by contrast, is more like fixing target_set and env and solving for agent.
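A toy sketch of that forward/inverse distinction, in case it helps. Everything here is hypothetical: the thresholding rule inside `f`, the integer "states", and the candidate agents are invented purely to make the shape of the problem concrete, not taken from the post.

```python
def f(env, agent):
    """Hypothetical toy dynamics: the set of states this agent ends up targeting.

    env   -- tuple of reachable states (plain integers here)
    agent -- a single threshold parameter standing in for the agent's design
    """
    return frozenset(state for state in env if state >= agent)

env = (0, 1, 2, 3, 4)
candidate_agents = range(6)

# Forward direction: plug in agents and see what target_set falls out.
outcomes = {agent: f(env, agent) for agent in candidate_agents}

# Inverse direction (the alignment task): fix env and the desired
# target_set, then solve for the agents that produce exactly that set.
desired = frozenset({3, 4})
aligned = [agent for agent in candidate_agents if f(env, agent) == desired]
misaligned = [agent for agent in candidate_agents if f(env, agent) != desired]

print(aligned)
```

In this toy only one of the six candidate agents hits the desired set, which is the "plug in and hope" failure mode in miniature.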

The stuff about mesa optimisers mainly sounds like inadequate (narrow) modelling of what env, agent and target_set are. Usually it comes from fixating on some fraction of the problem (a win-the-battle, lose-the-war problem).

Capital gains tax differs from a wealth tax in important ways. It's a tax on net-wealth-disposed-of-in-a-tax-year, or perhaps over the last couple of years for someone with an accountant.

So your proverbial founder isn't taxed a penny until they dispose of their shares.

Someone sitting on a massive pile of bonds won't be paying capital gains tax, but rather enjoying the interest on them.

I was glad to read a post like this!

The following is as much a comment about EA as it is about rationality:

"My self-worth is derived from my absolute impact on the world-- sometimes causes a vicious cycle where I feel worthless, make plans that take that into account, and feel more worthless."

If you are a 2nd year undergraduate student, this is a very high bar to set.

First, impact happens downstream, so we can't know our impact for sure until later. Depending on what we do, possibly not until after we are dead.

Second, on the assumption that impact is uncertain, it is possible to live an exemplary life and yet have near zero impact due to factors beyond our control. (You cure cancer moments before the asteroid hits.)

Third, if we pull down the veil of ignorance, it is easy to imagine people with the motivation but not the opportunity to have impact. We generally don't think such people have no worth - otherwise what is it all for? By symmetry we should not judge ourselves more harshly than others.

I find intrusive thoughts take hold when I suspect they may be true. I hope this is one which might be exorcised on the basis that it is a bad idea, not an uncomfortable truth.

The description of a particular version of expected utility theory feels very particular to me.

Utility is generally expressed as a function of a random variable, not as a function of an element of the sample space.

For instance: suppose that my utility is linear in the profit or loss from the following game. We draw one bit from /dev/random. If it is true, I win a pound, else I lose one.

Here, utility is not a function of 'the configuration of the universe'. It is a function of a bool. The bool itself may depend on (some subset of) 'the configuration of the universe', but reality maps universe to bool for us, computability be damned.
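A minimal sketch of the game, to make the type signature explicit. The function names are mine, and `os.urandom` is used as a stand-in for reading a bit from /dev/random; the fair-coin probability of 0.5 is also an assumption.

```python
import os

def draw_bit() -> bool:
    """Draw one bit from the OS entropy source (stand-in for /dev/random)."""
    return bool(os.urandom(1)[0] & 1)

def utility(win: bool) -> float:
    """Utility is linear in profit or loss: +1 pound on a win, -1 on a loss.

    Note the argument is a bool, not a description of the universe.
    """
    return 1.0 if win else -1.0

def expected_utility(p_win: float = 0.5) -> float:
    """Expectation over the random variable, assuming a fair bit by default."""
    return p_win * utility(True) + (1.0 - p_win) * utility(False)

outcome = draw_bit()
print(outcome, utility(outcome), expected_utility())
```

Whatever physical process produces the bit sits entirely inside `draw_bit`; `utility` never sees it.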

Just observing that the answer to the question, "Is there a substantial discontinuity at the 2% quantile?", should be more or less obvious from a histogram (assuming large enough N and a sufficient number of buckets).

Power law behaviour is not necessary and arguably not sufficient for "superforecasters are a natural category" to win (e.g. it should win in a population in which 2% have a Brier score of zero and the rest have a score of 1, which is not a power law).
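As a rough sketch of the check, here is the extreme population from the example above (synthetic data, not real forecaster scores). The helper name and the "largest jump between adjacent sorted scores" heuristic are my own simplification of eyeballing a histogram.

```python
def largest_gap_quantile(scores):
    """Return (quantile, gap) for the largest jump between adjacent sorted scores."""
    ordered = sorted(scores)
    # Pair each gap with the index of the score just above the gap.
    gaps = [(ordered[i + 1] - ordered[i], i + 1) for i in range(len(ordered) - 1)]
    gap, index = max(gaps)
    return index / len(ordered), gap

# Synthetic population: 2% have a Brier score of 0, the rest a score of 1.
scores = [0.0] * 20 + [1.0] * 980

quantile, gap = largest_gap_quantile(scores)
print(quantile, gap)
```

On real data the jump would be noisier, so a histogram with enough buckets is still the more honest diagnostic; this only shows what a clean discontinuity at the 2% quantile looks like.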

I like this idea generally.

Here is an elaboration on a theme I was thinking of running in a course:

If they could have a single yes / no question answered on the topic, what should most people ask?

The idea being to get people to start thinking about what the best way to probe for more information is when "directly look up the question's answer" is not an option.

This isn't something that can be easily operationalized on a large scale for examination. It is an exercise that could work in small groups.

One way to operationalize it would be to construct the group average distribution, and score the question according to (0.5 - sum(mass of states mapping to true))^2. This only works (easily) for questions like, "Is the IOC in either of Geneva or Lugano?"
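A small sketch of that scoring rule. The probabilities below are made-up group beliefs chosen only for illustration, and I am assuming lower scores are better, since a score of zero means the question splits the group's probability mass exactly in half.

```python
def question_score(avg_dist, maps_to_true):
    """Score a yes/no question against the group-average distribution.

    avg_dist     -- dict mapping each state to its group-average probability
    maps_to_true -- predicate saying whether a state answers the question "yes"
    """
    mass_true = sum(p for state, p in avg_dist.items() if maps_to_true(state))
    return (0.5 - mass_true) ** 2

# Made-up group-average beliefs about which city the IOC is in.
avg_dist = {"Geneva": 0.3, "Lugano": 0.2, "Lausanne": 0.4, "Zurich": 0.1}

# "Is the IOC in either of Geneva or Lugano?"
even_split = question_score(avg_dist, lambda city: city in {"Geneva", "Lugano"})

# "Is the IOC in Zurich?" rules out little of the group's probability mass.
lopsided = question_score(avg_dist, lambda city: city == "Zurich")

print(even_split, lopsided)
```

Under these assumed beliefs the two-city question scores better than the single-city one, because either answer removes about half of the group's uncertainty.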

:D If I could write the right 50-80 words of code per minute my career would be very happy about it.

The human-off-button doesn't help Russell's argument with respect to the weakness under discussion.

It's the equivalent of a Roomba with a zap obstacle action. Again the solution is to dial theta towards the target and hold the zap button, assuming zaps are free. It still has a closed form solution that couldn't be described as instrumental convergence.

Russell's argument requires a more complex agent in order to demonstrate the danger of instrumental convergence rather than simple industrial machinery operation.

Isnasene's point above is closer to that, but that's not the argument that Russell gives.

'and the assumption that an agent can compute a farsighted optimal policy)'

That assumption is doing a lot of work. It's not clear what is packed into it, and it may not be sufficient to prove the argument.

This misses the original point. The Roomba is dangerous, in the sense that you could write a trivial 'AI' which merely gets to choose an angle to travel along, and does so regardless of whether grandma is in the way.

But such an MDP is not going to pose an X-risk. You can write down the objective function (y - x(theta))^2, differentiate wrt theta, follow the gradient, and you'll never end up at an AI overlord. Such a system lacks any analogue of opposable thumbs, memory and a good many other things.
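To make the "follow the gradient" point concrete, here is a minimal sketch. The choice of x(theta) = cos(theta), the target y, the learning rate and the step count are all assumptions picked for illustration; nothing about a real Roomba is implied.

```python
import math

def x(theta: float) -> float:
    """Hypothetical actuator model: displacement along one axis for heading theta."""
    return math.cos(theta)

def loss(theta: float, y: float) -> float:
    """The objective (y - x(theta))^2."""
    return (y - x(theta)) ** 2

def grad(theta: float, y: float) -> float:
    """d/dtheta of (y - cos(theta))^2, which is 2 * (y - cos(theta)) * sin(theta)."""
    return 2.0 * (y - x(theta)) * math.sin(theta)

def follow_gradient(theta: float, y: float, lr: float = 0.5, steps: int = 200) -> float:
    """Plain gradient descent on theta: the whole 'policy' is one scalar."""
    for _ in range(steps):
        theta -= lr * grad(theta, y)
    return theta

theta_star = follow_gradient(theta=1.0, y=0.5)
print(theta_star, loss(theta_star, 0.5))
```

The entire search space is one angle, so there is nowhere for anything like resource acquisition or self-preservation to show up.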

Pointing at dumb industrial machinery operating around civilians and saying it is dangerous may well be the truth, but it's not the right flavour of dangerous to support Russell's claim.

So, yes, it is going to come down to a more nuanced argument.
