ryan_b

Sequences

National Institute of Standards and Technology: AI Standards

Posts (sorted by new)

ryan_b's Shortform (6 points, 6y, 59 comments)

Comments (sorted by newest)

On Dwarkesh Patel’s Podcast With Richard Sutton
ryan_b · 13d · 102

It feels to me like Sutton is too deep inside the experiential learning theory. When he says there is no evidence for imitation, this only makes sense if you imagine it strictly according to the RL theory he has in mind. He isn't applying the theory to anything; he is inside the theory and interpreting everything according to his understanding of it.

It did feel like there was a lot of talking past one another: Dwarkesh was clearly using the superintelligent behaviors everyone is interested in (doing science, math, and engineering) as his model for intelligence, while Sutton blew all of this off, only articulating quite late in the game that human infants are his model for intelligence. If this had been cleared up early, the conversation would probably have been more productive.

I have always found the concept of a p-zombie kind of silly, but now I feel like we might really have to investigate the question of an approximate i-zombie: if we have a computer that can output anything an intelligent human can, but we stipulate that the computer is not intelligent... and so on and so forth.

On the flip side, it feels kind of like a waste of time. Who would be persuaded by such a thing?

I have decided to stop lying to Americans about 9/11
ryan_b · 13d · 60

Good job saying a brave thing; on a US-based site with a plurality US membership, this was a risk. Well done.

Out of curiosity, how often do conversations about 9/11 come up? For the most part, we don't discuss it that much among ourselves except around the anniversary, though I note that it was just a couple of weeks ago, and the traditional observance is literally just to talk about where we were and what we were doing at the time, which is precisely when the observation about cheering would come up.

It may or may not surprise you that while there were basically no rooms cheering in the US, there was a substantial minority that celebrated after the fact. Mostly these were people who hated finance and globalization (which the twin towers symbolized) or something about foreign policy (imperialism, colonialism, etc.), or who saw the attacks as some form of divine punishment (for tolerating gay people or interracial marriage or what-have-you).

So, thank you for saying your piece. I appreciate the honesty.

IABIED Review - An Unfortunate Miss
ryan_b · 18d · 61

Strong upvote, I appreciate the inside-view context that you have from publishing a similar book. I bought it as a result of this review.

I cannot, alas, promise a side-by-side review. However, there are a couple of questions I am primed to look for, foremost among them right now: how much detail is invested in identifying the target audience? The impression I am getting so far is that it has been approximately defined as "not us," but a lot of complaints seem to turn on this question. I see a lot of discussion about laymen, but that's an information level, not a target audience. Come to think of it, I don't know that I have seen much discussion of target audiences at all outside of the AI policy area.

ryan_b's Shortform
ryan_b · 23d · 31

The important information I should take from a strong trend is an axis, or a dimension, rather than the default takeaway of a direction.

I have the colloquial habit of talking about a trend as a direction, leaning on the implicit metaphor of physical space for whatever domain in which the trend appears. I've only just today started to realize that, while I am pretty sure the physical space (or, probably, geography and maps) metaphor is why I speak that way, there's no reason not to lean into it as an explicit description within the abstract space for the domain. By this I mean that whatever it is we are talking about (the domain) has several important parts to it (dimensions), and taken together these form the space of the domain.

Returning to the direction v. dimension takeaway, this basically means that the important thing is the dimension (or dimensions) along which the trend moves, so it is worth looking at the opposite direction of the trend as well.

This is basically the same as the idea of taking good advice and reversing it, just applied to changes in the world instead.

Underdog bias rules everything around me
ryan_b · 2mo · 20

Boiling this down for myself a bit, I want to frame this as a legibility problem: we can see our own limitations, but outsiders' successes are much more visible than their limitations.

Yudkowsky on "Don't use p(doom)"
ryan_b · 2mo · 50

I'm inclined to look at the blunt limitations of bandwidth on this one. The first hurdle is that p(doom) can pass through tweets and shouted conversations at Bay Area house parties.

Yudkowsky on "Don't use p(doom)"
ryan_b · 2mo · 42

I also think he objects to putting numbers on things, and I avoid doing it too. A concrete example: I explicitly avoid putting numbers on things in LessWrong posts. The reason is straightforward: if a number appears anywhere in the post, about half of the conversation in the comments will be about that number, to the exclusion of the point of the post (or the lack of one, etc.). So unless numbers are indeed the thing you want to be talking about, in the sense of detailed results of specific computations, they positively distract the audience from the rest of the post.

I focused on the communication aspect in my response, but I should probably also say that I don't really track what the number is when I actually go to the trouble of computing a prior, personally. The point of generating the number is to clarify the qualitative information, and the point remains the qualitative information after I get the number; I only really start paying attention to what the number is if it stays consistent enough after doing the generate-a-number move that I recognize it as being basically the same as the last few times. Even then, I am spending most of my effort on the qualitative level directly.

I make an analogy to computer programs: the sheer fact of successfully producing an output without errors weighs much more than whatever the value of the output is. The program remains our central concern, and continuing to improve it using known patterns and good practices for writing code is usually the most effective method. Taking the programming analogy one layer further, there's a significant chunk of time where you can be extremely confident the output is meaningless; suppose you haven't even completed what you already know to be the minimum requirements, and you compile the program anyway, just to test for errors so far. There's no point in running the program all the way to an output, because you know it would be meaningless. In the programming analogy, a focus on the value of the output is a kind of "premature optimization is the root of all evil" problem.
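A minimal sketch of the compile-without-running point (the file name and the half-finished function are hypothetical, purely for illustration): Python can check that a draft parses cleanly without ever running it to an output we already know would be meaningless.

```python
# Minimal sketch: verify a half-finished program is well-formed
# without running it, since the output would be meaningless anyway.
import py_compile

draft = '''
def estimate_something(evidence):
    # TODO: the minimum requirements are not implemented yet
    cleaned = [e for e in evidence if e is not None]
    return None  # placeholder output
'''

with open("draft.py", "w") as f:
    f.write(draft)

# Raises py_compile.PyCompileError on a syntax error; it says nothing
# about whether the eventual output will be meaningful.
py_compile.compile("draft.py", doraise=True)
print("no errors so far")
```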

I do think this probably reflects the fact that Eliezer's time is mostly spent on poorly understood problems like AI, rather than on stable, well-understood domains where working with numbers is a much more reasonable prospect. But even in the case where I am trying to learn something that is well understood, just not by me, reaching for a number feels somehow opposed to the idea of hugging the query. Or in virtue language: how does the number cut the enemy?

Yudkowsky on "Don't use p(doom)"
ryan_b · 2mo · 53

I can't speak for Eliezer, but I can make some short comments about why I am suspicious of thinking in terms of numbers too quickly. I warn you beforehand that my thoughts on the subject aren't very crisp (else, of course, I could put a number on them!).

Mostly I feel like emphasizing the numbers too much fails to respect the process by which we generate them in the first place. When I go as far as putting a number on something, the point is to clarify my beliefs on the subject; the number is a summary statistic about my thoughts, not the output of a computation (I mean, it technically is, but not of a legible computation process we can inspect and/or maybe reverse). The goal of putting a number on it, whatever it may be, is not to manipulate the number with numerical calculations, any more than the goal of writing an essay is to grammatically manipulate the concluding sentence, in my view.

Through the summary statistic analogy, I think I basically disagree with the idea that numbers provide a strong upside in clarity. While I agree that numbers as a format are generally clear, they are only clear as far as the number itself goes: they communicate very little about the process by which they were reached, which I claim is the key information we want to share.

Consider the arithmetic mean. This number is perfectly clear, insofar as it means some numbers were added together and then divided by how many numbers were summed. Yet it tells us nothing about how many numbers there were, what their values were, how wide their range was, or what the possible values were; there are infinitely many variations behind just the mean. It is also true that going from no number at all to a mean screens out infinitely many possibilities, and I expect that infinity is substantially larger than the number of possibilities behind any given average. I feel like the crux of my disagreement with the idea of emphasizing numbers is that people who endorse them strongly look at the number of possibilities eliminated in the step of going from nothing to an average and think "Look at how much clarity we have gained!" whereas I look at the number of possibilities remaining and think "This is not clear enough to be useful."

The problem gets worse when numbers are used to communicate. Suppose two people meet at a Bay Area house party and tell each other their averages. If they both say "seven," they will probably assume they agree, even though it is perfectly possible for the numbers behind the two averages to have literally zero overlap. This is the point at which numbers turn actively misleading, in the literal sense that before exchanging averages the two at least knew they knew nothing, and afterward they wrongly conclude they agree.
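A minimal sketch of the point about averages (the numbers are made up for illustration): two sets with zero overlap can still report the same mean, so exchanging only the means tells the two parties nothing about whether they actually agree.

```python
# Two sets of "beliefs" with no values in common, yet the same average.
a = [1, 13]            # mean is 7
b = [6.5, 7.0, 7.5]    # mean is also 7, with zero overlap with a

def mean(xs):
    return sum(xs) / len(xs)

print(mean(a), mean(b))   # 7.0 7.0
print(set(a) & set(b))    # set() -- no shared values at all
```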

Contrast this with a more practical and realistic case where we might get two different answers on something like probabilities from a data science question. Because it's a data science question, we are already primed to ask about the underlying models and data to see why the numbers are different. We can of course do the same with the example about averages, but in that context even giving the number in the first place is a wasted step, because we gain basically nothing until we have the data and model information (where sum-of-all-n-divided-by-n is the model). By contrast, in the data science question we can reasonably infer that the models will be broadly similar, and that if they aren't, that fact by itself likely points to the cruxes between them. As a consequence, getting the direct numbers is still useful; if two data science sources give very similar answers, they likely do agree very closely.

In sum, we collectively have gigantic uncertainty about the qualitative questions of models and data for whether AI can or will cause human extinction. I claim the true value of quantifying our beliefs, the put-a-number-on-it mental maneuver, is clarifying those qualitative questions, and that is also what we really want to be talking about with other people. The trouble is that the number we put on all of this internally is what we communicate, but it does not contain the process that generated it; the conversation then invariably becomes about the numbers, which in my experience actively obscures the key information we want to exchange.

DeepSeek v3.1 Is Not Having a Moment
ryan_b · 2mo* · 30

I suspect DeepSeek is unusually vulnerable to the problem of switching hardware because my expectation for their cost advantage fundamentally boils down to having invested a lot of effort in low-level performance optimization to reduce training/inference costs.

Switching the underlying hardware breaks all this work. Further, I don't expect the Huawei chips to be as easy to optimize as the Nvidia H-series, because the H-series are built mostly the same way as Nvidia has always built them (CUDA), and Huawei's Ascend is supposed to be a new architecture entirely. Lots of people know CUDA; only Huawei's people know how the memory subsystem for Ascend works.

If I am right, it looks like they got hurt by bad timing this round the same way they benefited from good timing last round.

Edit: I finally found a reasonable description of what happened. They were programming Nvidia hardware in assembly. My hardware-switch guess is confirmed: this has wiped out their primary advantage. If they continue to fade, I think we could fairly assess them as a casualty of politics.

Elizabeth's Shortform
ryan_b · 2mo · 40

From that experience, what do you think of the learning value of being in a job you are not qualified for? More specifically, do you think you learned more from being in the job you weren't qualified for than you did in other jobs that matched your level better?

Posts

Near term discussions need something smaller and more concrete than AGI (13 points, 9mo, 0 comments)
SB 1047 gets vetoed (25 points, 1y, 1 comment)
[Question] If I ask an LLM to think step by step, how big are the steps? (7 points, 1y, 1 comment)
[Question] Do you have a satisfactory workflow for learning about a line of research using GPT4, Claude, etc? (9 points, 2y, 3 comments)
My simple model for Alignment vs Capability (7 points, 2y, 0 comments)
[Question] Assuming LK99 or similar: how to accelerate commercialization? (7 points, 2y, 5 comments)
They gave LLMs access to physics simulators (50 points, 3y, 18 comments)
[Question] What to do when starting a business in an imminent-AGI world? (25 points, 3y, 7 comments)
Common Knowledge is a Circle Game for Toddlers (60 points, 4y, 1 comment)
Wargaming AGI Development (37 points, 4y, 10 comments)