Eli Tyre


That's the hard part.

My guess is that training cutting-edge models and not releasing them is a pretty good play, or would have been, if there weren't huge AGI hype.

As it is, information about your models is going to leak, and in most cases the fact that something is possible is most of the secret to reverse engineering it (note: this might be true in the regime of transformer models, but it might not be true for other tasks or sub-problems). 

But on the other hand, given the hype, people are going to try to do the things that you're doing anyway, so maybe leaks about your capabilities don't make that much difference? 

This does point out an important consideration, which is "how much information needs to leak from your lab to enable someone else to replicate your results?"

It seems like, in many cases, there's an obvious way to do some task, and the mere fact that you succeeded is enough info to recreate your result. But presumably there are cases where you figure out a clever trick, and even if the evidence of your model's performance leaks, that doesn't tell the world how you did it (though it does cause maybe hundreds of smart people to start looking for how you did it, trying to discover how to do it themselves).

I think I should regard the situation differently depending on the status of that axis.

In terms of speeding up AI development, not building anything > building something and keeping it completely secret > building something that your competitors learn about > building something and generating public hype about it via demos > building something with hype and publicly releasing it to users & customers.

I think it is very helpful, and healthy for the discourse, to make this distinction. I agree that many of these things might get lumped together.

But also, I want to flag the possibility that something can be very very bad to do, even if there are other things that would have been progressively worse to do.

I want to make sure that groups get the credit that is due to them when they do good things against their incentives.

I also want to avoid falling into a pattern of thinking "well, they didn't do the worst thing, or the second worst thing, so that's pretty good!" if in isolation I would have thought that action was pretty bad / blameworthy.

As of this moment, I don't have a particular opinion one way or the other about how good or bad Anthropic's release policy is. I'm merely making the abstract point at this time.

I like the creative thinking here.

I suggest a standard here, where we can test our "emulation" against the researcher themselves, to see how much of a diff there is in their answers, and the researcher can rate how good a substitute the model is for themselves, on a number of different dimensions.
 

This continues to be one of the best and most important posts I have ever read.

I have multiple references that corroborate that.

Can you share? I would like to have a clearer sense of what happened to them. If there's info that I don't know, I'd like to see it.

I do appreciate the conciseness a lot. 

It seems like I maybe would have gotten the same value from the essay (which would have taken 5 minutes to read?) as from this image (which maybe took 5 seconds).

But I don't want to create a culture that rewards snark, even more than it already does. It seems like that is the death of discourse, in a bunch of communities.

So I'm interested in if there are ways to get the benefits here, without the costs.

Downvoted, because even though I think this is a reasonable point worth considering, I'm not excited about a LessWrong dominated by snarky memes that make points, instead of essays.

Yeah, but that's a crux. Tigers might be awesome, but they're not optimal.

I think this was excellently worded, and I'm glad you said it. I'm also glad to have read all the responses, many of which seem important and on point to me. I strong upvoted this comment as well as several of the responses.

I'm leaving this comment, because I want to give you some social reinforcement for saying what you said, and saying it as clearly and tactfully as you did. 

There wasn't actually any such thing as Security, and if there ever was it would mean that it was time to overthrow the government immediately.

I held back tears at this part.
