Comments

I was bouncing around LessWrong and ran into this. I started reading it as though it were a normal post, but then I slowly realized ... 

I think according to typical LessWrong norms, it would be appropriate to try to engage you on the object level claims or talk about the meta-presentation as though you and I were trying to collaborate on figuring things out and how to communicate things.

But according to my personal norms and integrity, if I detect that something is actually quite off (like alarm bells going) then it would be kind of sick to ignore that, and we should actually treat this like a triage situation. Or at least a call to some kind of intervention. And it would be sick to treat this like everything is normal, and that you are sane, and I am sane, and we're just chatting about stuff and oh isn't the weather nice today. 

LessWrong is the wrong place for this to happen. This kind of "prioritization" sanity does not flourish here. 

Not-sane people get stuck on LessWrong in order to stay not-sane because LW actually reinforces a kind of mental unwellness and does not provide good escape routes. 

If you're going to write stuff on LW, it might be better to write a journal about the various personal, lifestyle interventions you are making to get out of the personal, unwell hole you are in. A kind of way to track your progress, get accountability, and celebrate wins.

Musings: 

COVID was one of the MMA-style arenas for different egregores to see which might come out 'on top' in an epistemically unfriendly environment. 

I have a lot of opinions on this that are more controversial than I'm willing to go into right now. But I wonder what else will work as one of these "testing arenas." 

I don't interpret that statement in the same way. 

You interpreted it as 'lied to the board about something material'. But to me, it also might mean 'wasn't forthcoming enough for us to trust him' or 'speaks in misleading ways (but not necessarily on purpose)' or it might even just be somewhat coded language for 'difficult to work with + we're tired of trying to work with him'. 

I don't know why you latch onto the interpretation that he definitely lied about something specific. 

I was asked to clarify my position about why I voted 'disagree' with "I assign >50% to this claim: The board should be straightforward with its employees about why they fired the CEO." 

I'm putting a maybe-unjustified high amount of trust in all the people involved, and from that, my prior is very high on "for some reason, it would be really bad, inappropriate, or wrong to discuss this in a public way." And given that OpenAI has ~800 employees, telling them would basically count as a 'public' announcement. (I would update significantly on the claim if it was only a select group of trusted employees, rather than all of them.)

To me, people seem too-biased in the direction of "this info should be public"—maybe with the assumption that "well I am personally trustworthy, and I want to know, and in fact, I should know in order to be able to assess the situation for myself." Or maybe with the assumption that the 'public' is good for keeping people accountable and ethical. Meaning that informing the public would be net helpful. 

I am maybe biased in the direction of: The general public overestimates its own trustworthiness and ability to evaluate complex situations, especially without most of the relevant context. 

My overall experience is that the involvement of the public makes situations worse, as a general rule. 

And I think the public also overestimates their own helpfulness, post-hoc. So when things are handled in a public way, the public assesses their role in a positive light, but they rarely have ANY way to judge the counterfactual. And in fact, I basically NEVER see them even ACKNOWLEDGE the counterfactual. Which makes sense because that counterfactual is almost beyond-imagining. The public doesn't have ANY of the relevant information that would make it possible to evaluate the counterfactual. 

So in the end, they just default to believing that it had to play out in the way it did, and that the public's involvement was either inevitable or good. And I do not understand where this assessment comes from, other than availability bias?

The involvement of the public, in my view, incentivizes more dishonesty, hiding, and various forms of deception. Because the public is usually NOT in a position to judge complex situations and lacks much of the relevant context (and also isn't particularly clear about ethics, often, IMO), people who ARE extremely thoughtful, ethically minded, high-integrity, etc. are often put in very awkward binds when it comes to trying to interface with the public. And so I believe it's better for the public not to be involved if they don't have to be.

I am a strong proponent of keeping things close to the chest and keeping things within more trusted, high-context, in-person circles. And of avoiding online involvement as much as possible for highly complex, high-touch situations. Does this mean OpenAI should keep it purely internal? No, they should have outside advisors, etc. Does this mean no employees should know what's going on? No, some of them should: the ones who are high-level, responsible, and trustworthy, and they can then share what needs to be shared with the people under them.

Maybe some people believe that all ~800 employees deserve to know why their CEO was fired. Like, as a courtesy or general good policy or something. I think it depends on the actual reason. I can envision certain reasons that don't need to be shared, and I can envision reasons that ought to be shared. 

I can envision situations where sharing the reasons could potentially damage AI Safety efforts in the future. Or disable similar groups from being able to make really difficult but ethically sound choices—such as shutting down an entire company. I do not want to disable groups from being able to make extremely unpopular choices that ARE, in fact, the right thing to do. 

"Well if it's the right thing to do, we, the public, would understand and not retaliate against those decision-makers or generally cause havoc" is a terrible assumption, in my view. 

I am interested in brainstorming, developing, and setting up really strong and effective accountability structures for orgs like OpenAI, and I do not believe most of those effective structures will include 'keep the public informed' as a policy. More often the opposite.

Media & Twitter reactions to OpenAI developments were largely unhelpful, specious, or net-negative for overall discourse around AI and AI Safety. We should reflect on how we can do better in the future and possibly even consider how to restructure media/Twitter/etc to lessen the issues going forward.

The OpenAI Charter, if fully & faithfully followed and effectively stood behind, including possibly shuttering the whole project down if it came down to it, would prevent OpenAI from being a major contributor to AI x-risk. In other words, as long as people actually followed this particular Charter to the letter, it is sufficient for curtailing AI risk, at least from this one org. 

The partnership between Microsoft and OpenAI is a net negative for AI safety. And: What can we do about that? 

We should consider accountability structures other than the one OpenAI tried (i.e. the non-profit / BoD). Also: What should they be?

I would never have put it as either of these, but the second one is closer. 

For me personally, I try to always have an internal sense of my inner motivation before/during doing things. I don't expect most people do, but I've developed this as a practice, and I am guessing most people can, with some effort or practice. 

I can pretty much generally tell whether my motivation has these qualities: wanting to avoid, wanting to get away with something, craving a sensation, intention to deceive or hide, etc. And when it comes to speech actions, this includes things like "I'm just saying something to say something" or "I just said something off/false/inauthentic" or "I didn't quite mean what I just said or am saying". 

Although, the motivations to really look out for are like "I want someone else to hurt" or "I want to hurt myself" or "I hate" or "I'm doing this out of fear" or "I covet" or "I feel entitled to this / they don't deserve this" or a whole host of things that tend to hide from our conscious minds. Or in IFS terms, we can get 'blended' with these without realizing we're blended, and then act out of them. 

Sometimes, I could be in the middle of asking a question and notice that the initial motivation for asking it wasn't noble or clean, and then by the end of asking the question, I change my inner resolve or motive to be something more noble and clean. This is NOT some kind of verbal sentence like going from "I wanted to just gossip" to "Now I want to do what I can to help." It does not work like that. It's more like changing a martial arts stance. And then I am more properly balanced and landed on my feet, ready to engage more appropriately in the conversation. 

What does it mean to take personal responsibility? 

I mean, for one example, if I later find out something I did caused harm, I would try to 'take responsibility' for that thing in some way. That can include a whole host of possible actions, including just resolving not to do that in the future. Or apologizing. Or fixing a broken thing. 

And for another thing, I try to realize that my actions have consequences and that it's my responsibility to improve my actions. Including getting more clear on the true motives behind my actions. And learning how to do more wholesome actions and fewer unwholesome actions, over time. 

I almost never use a calculating frame to try to think about this. I think that's inadvisable and can drive people onto a dark or deluded path 😅

I'm fine with drilling deeper but I currently don't know where your confusion is. 

I assume we exist in different frames, but it's hard for me to locate your assumptions. 

I don't like meandering in a disagreement without very specific examples to work with. So maybe this is as far as it is reasonable to go for now. 
