Oliver Sourbut

oliversourbut.net

  • Autonomous Systems @ UK AI Safety Institute (AISI)
  • DPhil AI Safety @ Oxford (Hertford College, CS dept, AIMS CDT)
  • Former senior data scientist and software engineer + SERI MATS

I'm particularly interested in sustainable collaboration and the long-term future of value. I'd love to contribute to a safer and more prosperous future with AI! Always interested in discussions about axiology, x-risks, s-risks.

I enjoy meeting new perspectives and growing my understanding of the world and the people in it. I also love to read - let me know your suggestions! In no particular order, here are some I've enjoyed recently:

  • Ord - The Precipice
  • Pearl - The Book of Why
  • Bostrom - Superintelligence
  • McCall Smith - The No. 1 Ladies' Detective Agency (and series)
  • Melville - Moby-Dick
  • Abelson & Sussman - Structure and Interpretation of Computer Programs
  • Stross - Accelerando
  • Simsion - The Rosie Project (and trilogy)

Cooperative gaming is a relatively recent but fruitful interest for me. Here are some of my favourites:

  • Hanabi (can't recommend enough; try it out!)
  • Pandemic (ironic at time of writing...)
  • Dungeons and Dragons (I DM a bit and it keeps me on my creative toes)
  • Overcooked (my partner and I enjoy the foodie themes and frantic real-time coordination playing this)

People who've got to know me only recently are sometimes surprised to learn that I'm a pretty handy trumpeter and hornist.

Sequences

Breaking Down Goal-Directed Behaviour

Comments

Nice! I'm very late to this. A few thoughts.

Focusing on 'full agents' might be misleading. Humans are fairly agenty, as these things go. Mobs and corporations and countries are a bit less agenty on the whole. So maybe you need some spectrum (or space) of agentness for things to be appropriately similarly-typed.

Game theory and MDPs and so on often treat the population as static. But we spawn (and dissolve) actors all the time. So a full theory here would need to build that in from the start. This has the nice property that it might be a foundation to describe a hierarchy too: mobs, corporations, and countries get spawned, dissolved, transformed, etc. all the time, after all. (Looser thought: could some agents-composed-of-parts be 'suspended' or 'transferred' to different substrate...?)

(Maybe you're looking for the word 'seasoning'...? But maybe that includes other herbs in a way you didn't want.)

With that weighting, human experts have much less frontier-taste-relevant data than they have total data... AI can learn from all copies.

Ah, yep yep. (Though humans do in fact learn (with losses) from other copies, and within a given firm/lab/community the transfer is probably quite high.)

Hmm, I think we have basically the same model, with maybe slightly different parameters (about which we're both somewhat uncertain). But I think that readers of the OP without some of that common model might be misled.

From a lot of convos and writing, I infer that many people conflate a lot of aspects of intelligence into one thing. With that view, 'intelligence explosion' is just a dynamic where a single variable, 'the intelligence' (or 'the algorithmic quality'), gets real big real fast. And then of course, because intelligence is the thing that gets you new technology, you can get all the new technology.

About this, you said,

There is then a further question, if that IE goes very far, of whether AI will generalize to chemistry etc

I'd have thought "yes" bc you achieve somewhat superhuman sample efficiency and can quickly get as much chemistry exp as a human expert

revealing that you correctly distinguish different factors of intelligence like 'sample efficiency' and 'chemistry knowledge' (I think I already knew this but it's good to have local confirmation), and that you don't think a software-only IE yields all of them.

Regarding the second sentence, it could be a misleading use of terms to call that 'generalisation'[1]. But I agree that 'sample efficiency' is among the relevant aspects of intelligence (and is a candidate for one that could be built up mostly generalisably via automated in silico training); that a relevant complement is 'chemistry (frontier research) experience'; and that a lot of each, taken together, may effectively get you chemistry research taste in addition (which can yield new chemistry knowledge if applied with suitable experimental apparatus).

I'm emphasising[2] (in my exploration post, in this thread, and the sister comment about frontier taste depreciation) that there's practically a wide gulf between 'hoover taste up from web data' and 'robotics or humans-with-headsets', in two ways. The first tops out somewhere (probably at sub-frontier) due to the depreciation of frontier research taste. The second cluster doesn't top out anywhere, but is slower and has more barriers to getting started. Is a year to exceed humanity's peak taste bold in most domains? Not sure! If a lot of in silico is possible, maybe it's doable. That might include cyber and software (and maybe narrow particular areas of chem/bio where simulation is especially good).

If you know for sure that those other bottlenecks proceed super fast, you don't necessarily need to clarify which intelligence factors you're talking about for practical purposes. But if you're not sure, I think it's worth being super clear about it where possible.


  1. I might instead prefer terminology like 'quickly implies ... (on assumptions...)' ↩︎

  2. Incidentally, the other thing I'm emphasising (but I think you'd agree?) is that on this view, R&Ds are always substantially driven by experimental throughput, with 'sample efficiency (of the combined workforce) at accruing research taste' being the main other rate-determining factor (because the steady state of research taste depends on this, and exploration quality * experimental throughput is progress). Throwing more labour at it can make your serial experimentation a bit faster, and can parallelise experimentation (with some parallelism discount), presumably with very diminishing returns. Throwing smarter labour at it (as in, better sample efficiency, and maybe faster thinking, with diminishing returns) can increase the rate, by getting more insight per experiment and choosing better experiments. ↩︎
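The rate picture in footnote 2 can be sketched as a toy calculation. This is a minimal illustration only: the function names, the discount exponent, and all the numbers are my own made-up assumptions, not anything claimed in the comment.

```python
# Toy sketch of footnote 2's rate picture:
#   progress per unit time ~= research taste * experimental throughput,
# with a parallelism discount when throwing more labour at experiments.
# All parameter values here are illustrative assumptions.

def throughput(n_workers: float, serial_rate: float = 1.0, discount: float = 0.5) -> float:
    """Experiments per unit time. discount < 1 models the parallelism
    discount: doubling workers less-than-doubles throughput."""
    return serial_rate * n_workers ** discount

def progress_rate(taste: float, n_workers: float) -> float:
    """Progress ~ exploration quality (taste) * experimental throughput."""
    return taste * throughput(n_workers)

base = progress_rate(taste=1.0, n_workers=100)         # 1.0 * 10.0 = 10.0
more_labour = progress_rate(taste=1.0, n_workers=200)  # ~14.1: only ~1.41x
smarter = progress_rate(taste=2.0, n_workers=100)      # 20.0: a full 2x
```

On these toy assumptions, doubling labour buys only ~1.41x throughput, while doubling taste (e.g. via better sample efficiency) buys a full 2x progress - matching the footnote's claim that labour has diminishing returns relative to smarter labour.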

Yes more 'data', but not necessarily more frontier-taste-relevant experience, which requires either direct or indirect contact with frontier experiments. That might be a crux.

BTW I also think novelty taste depreciates quickly as the frontier of a domain moves, so I'm less bullish on hoovering up taste from existing data/interviews for a permanent taste boost. But it might take you some way past frontier, which may or may not be sizeably impactful in a given domain.

I think we're on the same page about what factors exist.

For the intelligence explosion I think all we need is experience with AI RnD?

Well, sort of, but precisely because intelligence isn't single-dimensional, we have to ask 'what is exploding here?'. And like I said, I think you straightforwardly get speed. And plausibly you get sample efficiency (in the inference-from-given-data sense). And maybe you get 'exploratory planning', which is hypothetically one, more transferrable, factor of general exploration efficiency.

But you don't get domain-specific novelty/interestingness taste, which is another critical factor, except in the domains you can easily get lots of data on. So either you need to hoover that up from existing data, or get it some other way. That might be interviews, it might be a bunch of robotics and other actuator/sensor equipment, it might be the Shulman-style humans-with-headsets-as-actuators thing and learning by experience. But then the question becomes whether there's enough data to hoover it up real fast with your speed+efficiency learning, or whether you hit a bunch of scaling-up bottlenecks (and in which domains).

I am hopeful that one of the things we can do with just-before-the-brink AI will be to accelerate the design and deployment of such voluntary coordination contracts. Could we manage to use AI to speed-run the invention and deployment of such subsidiarity governance systems? I think the biggest challenge to this is how fast it would need to move in order to take effect in time. For a system that needs extremely broad buy-in from a large number of heterogenous actors, speed of implementation and adoption is a key weak point.

FYI FLF has a Fellowship on AI for Human Reasoning which centrally targets objectives like this (if I've understood).

I wrote a bit about experimentation recently.

You seem very close to taking this position seriously when you talk about frontier experiments and experiments in general, but I think you also need to notice that experiments come in more dimensions than that. Like, you don't learn how to be better at chemistry just by playing around with GPUs.

It's quite clear that labs want more high-quality researchers -- top talent has very high salaries, reflecting large marginal value-add.

Three objections, one obvious. I'll state them strongly, a bit devil's advocate; not sure where I actually land on these things.

Obvious: salaries aren't that high.

Also, I model a large part of the value to companies of legible, credentialed talent as being the marketing value to VCs and investors, who can't tell talent apart (even if lab leadership can) except by (rare) legible signs. This is actually a way to get more compute (and other capital). (The legible signs are rare because compute is a bottleneck! So a Matthew effect pertains.)

Finally, the utility of labs is very convex in the production of AI: the actual profit comes from time spent selling a non-commoditised frontier offering at large margin. So small AI production speed gains translate into large profit gains.

The best objection to an SIE

 

I think the compute bottleneck is a reasonable objection, but there is also the fairly straightforward objection that gaining skills takes experience, and experience takes interaction (or hoovering up from web data).

You can get experience in things like 'writing fast code' easily, so a speed explosion is fairly plausible (up to diminishing returns). But various R&Ds or human influence or whatever are much harder to get experience for. So our exploded SIs might be super fast and maybe super good at learning from experience, but out of the gate at best human expert level where relevant data are abundant, and at best human novice level where data aren't.
