Thrasymachus - LessWrong

I see the concerns as these:

1. The four corners of the agreement seem to define 'disparagement' broadly, so one might reasonably fear (e.g.) "First author on an eval especially critical of OpenAI versus its competitors", or "Policy document highly critical of OpenAI leadership decisions" might 'count'.
2. Given Altman's/OpenAI's vindictiveness and duplicity, and the previous 'safeguards' (from their perspective) which give them all the cards in terms of folks being able to realise the value of their equity, "They will screw me out of a lot of money if I do something they really don't like (regardless of whether it 'counts' per the non-disparagement agreement)" seems a credible fear.
   1. It appears Altman tried to get Toner kicked off the board for being critical of OpenAI in a policy piece, after all.
3. This is indeed moot for roles which require equity to be surrendered anyway. I'd guess most roles outside government (and maybe some within it) do not have such requirements. A conflict of interest roughly along the lines of the first two points makes impartial performance difficult, and credible impartial performance impossible (i.e. even if indeed Alice can truthfully swear "My being subject to such an agreement has never influenced my work in AI policy", reasonable third parties would be unwise to believe her).
4. The 'non-disclosure of non-disparagement' makes this worse, as it interferes with this conflict of interest being fully disclosed. "Alice has a bunch of OpenAI equity" is one thing, "Alice has a bunch of OpenAI equity, and has agreed to be beholden to them in various ways to keep it" is another. We would want to know the latter to critically appraise Alice's work whenever it is relevant to OpenAI's interests (and I would guess a lot of policy/eval/reg/etc. would be sufficiently relevant that we'd like to contemplate whether Alice's commitments colour her position). Yet Alice has also promised to keep these extra relevant details secret.

I can't help with the object level determination, but I think you may be overrating both the balance and import of the second-order evidence.

As far as I can tell, Yudkowsky is a (?dramatically) pessimistic outlier among the class of "rationalist/rationalist-adjacent" SMEs in AI safety, and probably even more so relative to aggregate opinion without an LW-y filter applied (cf.). My impression of the epistemic track-record is Yudkowsky has a tendency of staking out positions (both within and without AI) with striking levels of confidence but not commensurately-striking levels of accuracy.

In essence, I doubt there's much epistemic reason to defer to Yudkowsky more (or much more) than folks like Carl Shulman or Paul Christiano, nor maybe much more than "a random AI alignment researcher" or "a superforecaster making a guess after watching a few Rob Miles videos" (although these have a few implied premises around difficulty curves/subject matter expertise being relatively uncorrelated to judgemental accuracy).

I suggest ~all reasonable attempts at an idealised aggregate wouldn't take a hand-brake turn to extreme pessimism on finding that Yudkowsky is extremely pessimistic. My impression is the plurality LW view has shifted more from "pretty worried" to "pessimistic" (e.g. p(screwed) > 0.4) rather than agreement with Yudkowsky, but in any case I'd attribute large shifts in this aggregate mostly to Yudkowsky's cultural influence on the LW-community plus some degree of internet cabin fever (and selection) distorting collective judgement.

None of this is cause for complacency: even if p(screwed) isn't ~1, > 0.1 (or 0.001) is ample cause for concern, and resolution on values between (say) [0.1, 0.9] is informative for many things (like personal career choice). I'm not sure whether you get more yield for marginal effort on object or second-order uncertainty (e.g. my impression is the 'LW cluster' trends towards pessimism, so adjudicating whether this cluster should be over/under weighted could be more informative than trying to get up to speed on ELK). I would guess, though, that whatever distils out of LW discourse in 1-2 months will be much more useful than what you'd get right now.

My experience at and around MIRI and CFAR (inspired by Zoe Curzi's writeup of experiences at Leverage)

Thrasymachus3y70

Looking back, my sense remains that we basically succeeded—i.e., that we described the situation about as accurately and neutrally as we could have. If I'm wrong about this... well, all I can say is that it wasn't for lack of trying.

I think CFAR ultimately succeeded in providing a candid and good faith account of what went wrong, but the time it took to get there (i.e. 6 months between this and the initial update/apology) invites adverse inferences like those in the grandparent.

A lot of the information ultimately disclosed in March would definitely have been known to CFAR in September, such as Brent's prior involvement as a volunteer/contractor for CFAR, his relationships/friendships with current staff, and the events at ESPR. The initial responses remained coy on these points, and seemed apt to give the misleading impression CFAR's mistakes were (relatively) much milder than they in fact were. I (among many) contacted CFAR leadership to urge them to provide a more candid and complete account when I discovered some of this further information independently.

I also think, similar to how it would be reasonable to doubt 'utmost corporate candour' back then given initial partial disclosure, it's reasonable to doubt CFAR has addressed the shortcomings revealed given the lack of concrete follow-up. I also approached CFAR leadership when CFAR's 2019 Progress Report and Future Plans initially made no mention of what happened with Brent, nor what CFAR intended to improve in response to it. What was added in is not greatly reassuring:

And after spending significant time investigating our mistakes with regard to Brent, we reformed our hiring, admissions and conduct policies, to reduce the likelihood such mistakes reoccur.

A cynic would note this is 'marking your own homework', but cynicism is unnecessary to recommend more self-scepticism. I don't doubt the Brent situation indeed inspired a lot of soul searching and substantial, sincere efforts to improve. What is more doubtful (especially given the rest of the morass of comments) is whether these efforts actually worked. Although there is little prospect of satisfying me, more transparency over what exactly has changed - and perhaps third party oversight and review - may better reassure others.

REVISED: A drowning child is hard to find

Thrasymachus4y90

The malaria story has fair face validity if one observes the wider time series (e.g.). Further, the typical EA 'picks' for net distribution are generally seen as filling around the edges of the mega-distributors.

FWIW: I think this discussion would be clearer if framed in last-dollar terms.

If Gates et al. are doing something like last dollar optimisation, trying to save as many lives as they can allocating across opportunities both now and in the future, leaving the right now best marginal interventions on the table would imply they expect to exhaust their last dollar on more cost-effective interventions in the future.

This implies the right now marginal price should be higher than the (expected) last dollar cost effectiveness (if not, it should be reallocating some of the 'last dollars' to interventions right now). Yet this in turn does not imply we should see 50Bn of marginal price lifesaving lying around right now. So it seems we can explain Gates et al. not availing themselves of the (non-existent) opportunity to (say) halve communicable diseases for 2Bn a year worldwide (extrapolating from the right now marginal prices) without the right now marginal price being lied about or manipulated. (Obviously, even if we forecast the Gates et al. last dollar EV to be higher than the current marginal price, we might venture alternative explanations of this discrepancy besides them screwing us.)

Please Critique Things for the Review!

Thrasymachus5y80

I also buy the econ story here (and, per Ruby, I'm somewhat pleasantly surprised by the amount of reviewing activity given this).

General observation suggests that people won't find writing reviews that intrinsically motivating (compare to just writing posts, which all the authors are doing 'for free' with scant chance of reward, also compare to academia - I don't think many academics find peer review/refereeing one of the highlights of their job). With apologies for the classic classical econ joke, if reviewing was so valuable, how come people weren't doing it already? [It also looks like ~25%? of reviews, especially the most extensive, are done by the author on their own work].

If we assume there's little intrinsic motivation (I'm comfortably in the 'you'd have to pay me' camp), the money doesn't offer that much incentive. Given Ruby's numbers, suppose each of the 82 reviews takes an average of 45 minutes or so (factoring in (re)reading time and similar). If the nomination money is roughly allocated by person time spent, the marginal expected return of me taking an hour to review is something like $40. Facially, this isn't too bad an hourly rate, but the real value is significantly lower:

The 'person-time lottery' model should not be denominated by observed person-time so far, but one's expectation how much will be spent in total once reviewing finishes, which will be higher (especially conditioned on posts like this).
It's very unlikely the reward is going to be allocated proportionately to time spent (/some crude proxy thereof like word count). Thus the EV would be discounted by whatever degree of risk aversion one has (I expect the modal 'payout' for a review to be $0).
Opaque allocation also incurs further EV-reducing uncertainty, but best guesses suggest there will be Pareto-principle/tournament-style dynamics, so those with (e.g.) reasons to believe they're less likely to impress the mod team's evaluation of their 'pruning' have strong reasons to select themselves out.
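
To make the arithmetic behind the 'person-time lottery' explicit, here is a rough sketch of the first caveat only. The comment does not state the size of the prize pool, so `ASSUMED_POOL` below is a hypothetical figure back-solved from the ~$40/hour estimate (82 reviews at 45 minutes each is 61.5 hours), and the doubling of total review time is likewise an illustrative assumption rather than a forecast.

```python
def expected_hourly_return(prize_pool: float, total_review_hours: float) -> float:
    """Expected payout per hour if the pool were split in proportion to time spent."""
    if total_review_hours <= 0:
        raise ValueError("total_review_hours must be positive")
    return prize_pool / total_review_hours


REVIEWS_SO_FAR = 82        # from the comment
HOURS_PER_REVIEW = 0.75    # 45 minutes, from the comment
ASSUMED_POOL = 2460.0      # hypothetical: back-solved from the ~$40/hour figure

# Naive estimate: divide the pool by the person-time observed so far.
observed_hours = REVIEWS_SO_FAR * HOURS_PER_REVIEW
naive_rate = expected_hourly_return(ASSUMED_POOL, observed_hours)

# First caveat: the denominator should be the total time expected by the
# end of the review period. Doubling is an arbitrary illustrative choice.
projected_hours = observed_hours * 2
projected_rate = expected_hourly_return(ASSUMED_POOL, projected_hours)

print(f"observed hours so far: {observed_hours}")
print(f"naive hourly return: ${naive_rate:.2f}")
print(f"hourly return if total review time doubles: ${projected_rate:.2f}")
```

This only captures the dilution effect; the risk-aversion and tournament-dynamics points would push the effective value lower still, and are not modelled here.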

Polio and the controversy over randomized clinical trials

Thrasymachus5y40

Sure - there's a fair bit of literature on 'optimal stopping' rules for interim results in clinical trials to try and strike the right balance.

It probably wouldn't have helped much for Salk's dilemma: Polio is seasonal and the outcome of interest is substantially lagged from the intervention - which has to precede the exposure, and so the 'window of opportunity' is quickly lost; I doubt the statistical methods for conducting this were well-developed in the 50s; and the polio studies were already some of the largest trials ever conducted, so even if available these methods may have imposed even more formidable logistical challenges. So there probably wasn't a neat Pareto improvement of "Let's run an RCT with optimal statistical control governing whether we switch to universal administration" that Salk and his interlocutors could have agreed to pursue.

Polio and the controversy over randomized clinical trials

Thrasymachus5y110

Mostly I just find it fascinating that as late as the 1950s, the need for proper randomized blind placebo controls in clinical trials was not universally accepted, even among scientific researchers. Cultural norms matter, especially epistemic norms.

This seems to misunderstand the dispute. Salk may have had an overly optimistic view of the efficacy of his vaccine (among other foibles your source demonstrates), but I don't recall him being a general disbeliever in the value of RCTs.

Rather, his objection is consonant with consensus guidelines for medical research, e.g. the Declaration of Helsinki (article 8): [See also the Nuremberg Code (art 10), relevant bits of the Hippocratic Oath, etc.]

While the primary purpose of medical research is to generate new knowledge, this goal can never take precedence over the rights and interests of individual research subjects.

This cashes out in a variety of ways. The main one is a principle of clinical equipoise - one should only conduct a trial if there is genuine uncertainty about which option is clinically superior. A consequence of this is that clinical trials conducted are often stopped early if a panel supervising the trial finds clear evidence of (e.g.) the treatment outperforming the control (or vice versa) as continuing the trial continues to place those in the 'wrong' arm in harm's way - even though this comes at an epistemic cost as the resulting data is poorer than that which could have been gathered if the trial continued to completion.

I imagine the typical reader of this page is going to tend unsympathetic to the virtue ethicsy/deontic motivations here, but there is also a straightforward utilitarian trade-off: better information may benefit future patients, at the cost of harming (in expectation) those enrolled in the trial. Although RCTs are the ideal, one can make progress with less (although I agree it is even more treacherous), and the question of the right threshold for these is fraught. (There are also natural 'slippery slope' style worries about taking a robust 'longtermist' position in holding the value of the evidence for all future patients is worth much more than the welfare of the much smaller number of individuals enrolled in a given trial - the genesis of the Nuremberg Code need not be elaborated upon.)

A lot of this ethical infrastructure post-dates Salk, but this suggests his concerns were forward-looking rather than retrograde (even if he was overconfident in the empirical premise that 'the vaccine works' which drove these commitments). I couldn't in good conscience support a placebo-controlled trial for a treatment I knew worked for a paralytic disease either. Similarly, it seems very murky to me what the right call was given knowledge-at-the-time - but if Bell and Francis were right, it likely owed more to them having a more reasonable (if ultimately mistaken) scepticism of the vaccine efficacy than Salk, rather than him just 'not getting it' about why RCTs are valuable.

Neural Annealing: Toward a Neural Theory of Everything (crosspost)

Thrasymachus5y70

I'm afraid I couldn't follow most of this, but do you actually mean 'high energy' brain states in terms of aggregate neural activity (i.e. the parentheticals which equate energy to 'firing rates' or 'neural activity')? If so, this seems relatively easy to assess for proposed 'annealing prompts' - whether psychedelics/meditation/music/etc. tend to provoke greater aggregate activity than not seems open to direct calorimetry, let alone proxy indicators.

Yet the steers on this tend very equivocal (e.g. the evidence on psychedelics looks facially 'right', things look a lot more uncertain for meditation and music, and identifying sleep as a possible 'natural annealing process' looks discordant with a 'high energy state' account, as brains seem to consume less energy when asleep than awake). Moreover, natural 'positive controls' don't seem supportive: cognitively demanding tasks (e.g. learning an instrument, playing chess) seem to increase brain energy consumption, yet presumably aren't promising candidates for this hypothesised neural annealing.

My guess from the rest of the document is the proviso about semantically-neutral energy would rule out a lot of these supposed positive controls: the elevation needs to be general rather than well-localized. Yet this is a lot harder to use as an instrument with predictive power: meditation/music/etc. have foci too in the neural activity they provoke.

The unexpected difficulty of comparing AlphaStar to humans

Thrasymachus5y70

Thanks for this excellent write-up!

I don't have relevant expertise in either AI or SC2, but I was wondering whether precision might still be a bigger mechanical advantage than the write-up notes. Even if humans can (say) max out at 150 'combat' actions per minute, they might misclick, not be able to pick out the right unit in a busy and fast battle to focus fire/trigger abilities/etc, and so on. The AI presumably won't have this problem. So even with similar EAPM (and subdividing out 'non-combat' EAPM which need not be so accurate), AlphaStar may still have a considerable mechanical advantage.

I'd also be interested in how important, beyond some (high) baseline, 'decision making' is at the highest levels of SC2 play. One worry I have is although decision-making is important (build orders, scouting, etc. etc.) what decides many (?most) pro games is who can more effectively micro in the key battles, or who can best juggle all the macro/econ tasks (I'd guess some considerations in favour would be that APM is very important, and that a lot of the units in SC2 are implicitly balanced by 'human' unit control limitations). If so, unlike Chess and Go, there may not be some deep strategic insights AlphaStar can uncover to give it the edge, and 'beating humans fairly' is essentially an exercise in getting the AI to fall within the band of 'reasonably human' while still subtly exploiting enough of the 'microable' advantages to prevail.
