All of Kronopath's Comments + Replies

What would we do if alignment were futile?

Do we have to convince Yann LeCun? Or do we have to convince governments and the public?

(Though I agree that the word "All" is doing a lot of work in that sentence, and that convincing people of this may be hard. But possibly easier than actually solving the alignment problem?)

What would we do if alignment were futile?

A thought: could we already have a case study ready for us?

Governments around the world are talking about regulating tech platforms. Arguably Facebook's News Feed is an AI system and the current narrative is that it's causing mass societal harm due to it optimizing for clicks/likes/time on Facebook/whatever rather than human values.

See also:

...
That's how you turn a technical field into a cesspit of social commentary and political virtue signaling. Think less AGI-Overwatch committee or GPU-export ban and more "Big business bad!", "AI racist!", "Human greed the real problem!"

All we'd have to do is to convince people that this is actually an AI alignment problem.

That's gonna be really hard. People like Yann LeCun (head of Facebook AI) see these problems as evidence that alignment is actually easy. "See, there was a problem with the algorithm, we noticed it and we fixed it, what are you so worried about? This is just a normal engineering problem to be solved with normal engineering means." Convincing them that this is actually an early manifestation of a fundamental difficulty that becomes deadly at high capability levels will be really hard.

Stop button: towards a causal solution

On Wednesday, the lead scientist walks into the lab to discover that the AI has managed to replicate itself several times over, buttons included. The AIs are arranged in pairs, such that each has its robot hand hovering over the button of its partner.

"The AI wasn't supposed to clone itself!" thinks the scientist. "This is bad, I'd better press the stop button on all of these right away!"

At this moment, the robot arms start moving like a swarm of bees, pounding the buttons over and over. If you looked at the network traffic between each computer, you'd see ...

I disagree with this, since B isn't "amount of buttons pressed and AIs shut down", but instead "this AI's button got pressed and this AI shut down". There are, as I mentioned, some problems with this utility function too, but it's really supposed to be a stand-in for a more principled impact measure.
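
To make the distinction concrete, here is a minimal sketch of the two readings of B. This is only an illustration under my own assumptions, not the formalism from the post: `AgentState`, `global_b`, and `indexical_b` are hypothetical names made up for this example.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class AgentState:
    """Hypothetical per-AI state: was its button pressed, and did it shut down?"""
    button_pressed: bool
    shut_down: bool


def global_b(world: Sequence[AgentState]) -> int:
    """Reading 1: count every AI whose button was pressed and which shut down."""
    return sum(1 for a in world if a.button_pressed and a.shut_down)


def indexical_b(world: Sequence[AgentState], me: int) -> int:
    """Reading 2: 1 only if *this* AI's button was pressed and *this* AI shut down."""
    mine = world[me]
    return int(mine.button_pressed and mine.shut_down)


# Three clones: AI 0 pressed the buttons of AIs 1 and 2, but nobody pressed AI 0's.
world = [
    AgentState(button_pressed=False, shut_down=False),
    AgentState(button_pressed=True, shut_down=True),
    AgentState(button_pressed=True, shut_down=True),
]

print(global_b(world))        # counts the other clones' shutdowns
print(indexical_b(world, 0))  # looks only at AI 0's own button
```

Under the second reading, pressing the clones' buttons does nothing for AI 0's own B, which is the point of the disagreement above. It says nothing about the other problems with this utility function.
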
Discussion with Eliezer Yudkowsky on AGI interventions

Are we sure that OpenAI still believes in "open AI" for its larger, riskier projects? Their recent actions suggest they're more cautious about sharing their AI's source code, and projects like GPT-3 are being "released" via API access only so far. See also this news article that criticizes OpenAI for moving away from its original mission of openness (which it frames as a bad thing).

In fact, you could maybe argue that the availability of OpenAI's APIs acts as a sort of pressure release valve: it allows some people to use their APIs instead of investing in d...

Your Cheerful Price

This is a fair criticism of my criticism.

I'm glad you thought so! Your criticism is very fair too. And I'm generally curious about why people 'bounce off' the "rationalist community". I'm also mostly a lurker, particularly IRL. And I think a big part of that is the kind of thing you described. But I do want to do better at being open to really trying weird ideas (and in real life too!). (I'm pretty weird to my acquaintances, friends, and family already.) I've already found this 'trick' pretty useful. I haven't had anyone offer a (radically) honest answer to my asking them for a cheerful price. I suspect that the people I've asked don't fully understand that the question is sincere and shouldn't be answered in the context of 'standard' social norms. And that's too bad! I've asked because I'm serious and sincere about wanting to remove any obstacles (or as many as possible) to us making a particular exchange.
Your Cheerful Price

To me this post may very well be a good example of some of the things that make me uncomfortable about the rationalist community, and why I so far have chosen to engage with it very minimally and mostly stay a lurker. At the risk of making a fool of myself, especially since it’s late and I didn’t read the whole post thoroughly (partly because you gave me an excuse not to halfway through), I’m going to try to explain why.

I don’t charge friends for favours, nor would I accept payment if offered. I’m not all that uncomfortable with the idea of “social capital”...

I think it's important to keep in mind a few things about this (or any other 'weird' social rule/trick/technology/norm/etc.):

1. It doesn't have to be used all the time, let alone frequently, often, or even at all!
2. It doesn't have to replace any other form of trading favors (i.e. exchanging social/friendship capital)!

It seems like you're imagining a world, or even just a single relationship/friendship, where each person is frequently, or always, using cheerful pricing instead of all of the existing social/friendship favor trading forms. But I'd be surprised if you couldn't think of any examples where this would work better than the 'social norms' you'd otherwise use.

Have you never asked a friend for a favor that involved their professional expertise? That seems like an excellent scenario for this kind of thing – to me anyways. Whereas, under 'social norms', this might require considerable exchanges of social/friendship capital, even with the 'professional' friend offering a 'friend discount', asking them to name a cheerful price signals that you value your friend's expertise and their time, and at a significant premium too. And more too that you're willing to solicit a price from them that's higher than they're willing to pay. Regular favors also have the problem of being hard to reject sometimes.

I'd expect this to be even more useful when the favor directly requires some kind of financial cost to the friend as well. I've often found that, even when there seems like there might be some kind of mutually beneficial exchange possible, the 'transaction costs' of not having a norm for simply paying for things with money can swamp the (potential) positive gains to both parties.

I find a similar dynamic to be at work even when a friend agrees to do a favor 'for free' (i.e. for $0) – their commitment to do the thing is also something like you being their employer, e.g. you can reasonably be upset if they fail to do what they agreed to do. I also don't g
Sunset at Noon

I had to double-check the date on this. This was written in 2017? It feels more appropriate to 2020, where both the literal and metaphorical fires have gotten extremely out of hand.

It becomes relevant every year around this time. :) :/ :O