AI_WAIFU — LessWrong


Replying to MIRI announces new "Death With Dignity" strategy

That's great and all, but with all due respect:

Fuck. That. Noise.

Regardless of the odds of success and what the optimal course of action actually is, I would be very hard pressed to say that I'm trying to "help humanity die with dignity". Regardless of what the optimal action should be given that goal, on an emotional level, it's tantamount to giving up.

Before even getting into the cost/benefit of that attitude, in the worlds where we do make it out alive, I don't want to look back and see a version of me where that became my goal. I also don't think that if that was my goal, that I would fight nearly... (read more)


Replying to A broad basin of attraction around human values?

AI_WAIFU 4y

I disagree to an extent. The examples provided seem to me to be examples of "being stupid", which agents generally have an incentive to do something about, unless they're too stupid for that to occur to them. That doesn't mean that their underlying values will drift towards a basin of attraction.

The corrigibility thing is a basin of attraction specifically because a corrigible agent has preferences over itself and its future preferences. Humans do that too sometimes, but the examples provided are not that.

In general, I think you should expect dynamic preferences (cycles, attractors, chaos, etc.) anytime an agent has preferences over its own future preferences, and the capability to modify its preferences.

Replying to $100/$50 rewards for good references

AI_WAIFU 4y

If you have access to the episode rewards, you should be able to train an ensemble of NNs using Bayes + MCMC, with the final reward as output and the entire episode as input. Maybe using something like this: http://people.ee.duke.edu/~lcarin/sgnht-4.pdf

This gets a lot more difficult if you're trying to directly learn behaviour from rewards or vice versa, because now you need to make assumptions to derive "P(behaviour | reward)" or "P(reward | behaviour)".

Edit: Pretty sure OAI used a reward ensemble in https://openai.com/blog/deep-reinforcement-learning-from-human-preferences/ to generate candidate pairs for further data collection.

From the paper: "we sample a large number of pairs of trajectory segments of length k, use each reward predictor in our ensemble to predict which segment will be preferred from each pair, and then select those trajectories for which the predictions have the highest variance across ensemble members."
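A minimal sketch of that selection step, under stated assumptions: each ensemble member is a hypothetical callable mapping one observation to a scalar reward, a segment is a list of observations, and a member's preference is a logistic function of the difference in summed rewards. The names `preference_prob` and `select_uncertain_pairs` and the toy linear predictors are made up for illustration, not taken from the paper's code.

```python
import math
from statistics import pvariance


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def preference_prob(predictor, seg_a, seg_b):
    """P(seg_a preferred over seg_b) under a single reward predictor.

    Assumption: preference is a logistic function of the difference
    between the summed predicted rewards of the two segments.
    """
    total_a = sum(predictor(obs) for obs in seg_a)
    total_b = sum(predictor(obs) for obs in seg_b)
    return sigmoid(total_a - total_b)


def select_uncertain_pairs(ensemble, pairs, n_select):
    """Return the n_select pairs on which the ensemble disagrees most.

    Disagreement is the population variance, across ensemble members,
    of the predicted probability that the first segment is preferred.
    """
    scored = []
    for seg_a, seg_b in pairs:
        probs = [preference_prob(p, seg_a, seg_b) for p in ensemble]
        scored.append((pvariance(probs), (seg_a, seg_b)))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [pair for _, pair in scored[:n_select]]


# Toy ensemble: hypothetical linear predictors with different weights.
ensemble = [lambda obs, w=w: w * obs for w in (-1.0, 0.5, 1.0)]

identical_pair = ([1.0, 1.0], [1.0, 1.0])  # every member is indifferent
contested_pair = ([2.0, 2.0], [0.0, 0.0])  # members disagree on the sign

chosen = select_uncertain_pairs(ensemble, [identical_pair, contested_pair], 1)
```

Here `chosen` contains only the contested pair, since every member assigns probability 0.5 to the identical pair and the variance there is zero. In a real setup the predictors would be trained networks and the selected pairs would be sent out for human comparison.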

Replying to The LessWrong Team is now Lightcone Infrastructure, come work with us!

AI_WAIFU 4y

I'm not convinced, especially if this sort of underpay is a common policy across multiple orgs in the rationalist and EA communities. In a closed system with 2 people, a "fair" price will balance the opportunity cost to the person doing the work and the value both parties assign to the fence.

But this isn't a closed system. I expect that lowballing pay has a whole host of higher-order negative effects. Off the top of my head:

This strategy is not scalable. There's a limited pool of talent willing to take a pay cut because they value the output of their own work. There are probably better places to put that

AI_WAIFU 4y

The LessWrong Team is now Lightcone Infrastructure, come work with us!

Our current salary policy is to pay rates competitive with industry salary minus 30%.

What was the reasoning behind this? To me this would make sense if there was a funding constraint, but I was under the impression that EA is flush with cash.

If the following are the stated stakes:

If things go right, we can shape almost the full light cone of humanity to be full of flourishing life. Billions of galaxies, billions of light years across, for some 10^36 (or so) years until the heat death of the universe.

Then I would strongly advise against lowballing or cheaping out when it comes to talent acquisition and retention.

Replying to How might cryptocurrencies affect AGI timelines?

AI_WAIFU 5y

Here's a legitimate application: buying PornHub Premium. https://news.bitcoin.com/pornhubs-premium-services-crypto-payments-13-digital-assets-supported/.

Online payment processors are an oligopoly and can at any moment revoke a business's ability to receive online payments, even if it's not breaking the law. Thus what business is and is not permissible online is entirely up to the whims of this oligopoly and the law. Crypto provides a way around this.

I'm liking where this story is going.

Replying to Covid 2/11: As Expected

AI_WAIFU 5y

IMO 2020 wasn't a turning point, and Facebook is not special. The events that happened lately have been a predictable development in a steadily escalating trend toward censorship. I'll note that these censorship policies are widespread across every social media platform, and in fact extend well beyond social media and apply to the entire infrastructure stack. Everyone from DDoS protection services, to cloud service providers, to payment processors has been getting bolder over the course of several years about pulling plugs on people saying the wrong things or providing platforms for others to say the wrong things. Here's how I think it went down:

1. From 2010-2020, social media and other SV companies... (read more)

Replying to Making Vaccine

AI_WAIFU 5y

I wouldn't look too deeply into that. The selection process for moderators on reddit is essentially first come, first served, plus how good you are at convincing existing moderators you should join the team. As far as I can tell, this process doesn't usually select for "good" moderation, especially once a sub gets big enough that network effects make it grow despite "bad" moderation. This applies to most values of "good" and "bad".