Chiming in on toy models of research incentives:  Seems to me like a key feature is that you start in an Arms Race which, after some amount of capabilities accumulates, transitions into the Suicide Race.  But players have only vague estimates of where that threshold is, their estimates vary widely, and they may not be able to communicate those estimates effectively or persuasively.  Players have a strong incentive to push right up to the line where things get obviously (to them) dangerous, and with enough players, somebody's estimate is going to be wrong.
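To make the "with enough players, somebody's estimate is going to be wrong" step concrete, here's a minimal sketch of one way to formalize it.  Everything specific is my assumption and not part of the comment above: each player's estimate of the threshold is an independent Gaussian draw centered on the true value, each player stops a fixed safety `margin` short of their own estimate, and the names `p_any_overshoot` and `simulate` are made up for illustration.

```python
import random
from statistics import NormalDist


def p_any_overshoot(n_players, sigma, margin):
    """Closed-form chance that at least one of n independent players crosses the line.

    Assumption: each estimate is the true threshold plus Normal(0, sigma) noise,
    and each player stops at (own estimate - margin).
    """
    # A single player stays safe when their noise is no larger than their margin.
    p_one_safe = NormalDist(0.0, sigma).cdf(margin)
    return 1.0 - p_one_safe ** n_players


def simulate(n_players, sigma, margin, threshold=100.0, trials=20_000, seed=0):
    """Monte Carlo check of the same toy model."""
    rng = random.Random(seed)
    crossed = 0
    for _ in range(trials):
        # Each player pushes capabilities up to their own estimate minus the margin.
        stops = (rng.gauss(threshold, sigma) - margin for _ in range(n_players))
        if any(stop > threshold for stop in stops):
            crossed += 1
    return crossed / trials


if __name__ == "__main__":
    # Hypothetical numbers: noise of 10 units, safety margin of 15 units.
    for n in (1, 5, 20, 100):
        print(n, round(p_any_overshoot(n, 10.0, 15.0), 3), round(simulate(n, 10.0, 15.0), 3))
```

Under these assumptions the per-player risk can look small while the chance that somebody crosses the threshold climbs toward 1 as players are added.  Sharing estimates would correspond to shrinking `sigma`, which is the lever the toy model suggests pulling.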

Working off a model like that, we'd much rather be playing the version where players can effectively share estimates and converge on a view of what level of capabilities makes things get very dangerous.  Lack of constructive conversations with the largest players on that topic does sound like a current bottleneck.

It's unclear to me to what extent there's even a clear, universally understood distinction between mundane weak AI systems with ordinary kinds of risks and superhuman AGI systems with exotic risks that software and business people aren't used to thinking about outside of sci-fi.  That strikes me as a key inferential leap that may be getting glossed over.

There's quite a lot of effort spent in technology on training people that systems are mostly static absent human intervention or well-defined automations that some person ultimately wrote, with anything else being a fault that gets fixed.  Computers don't have a mind of their own, troubleshoot instead of anthropomorphizing, etc.  That this intuition will at some point stop being true of a sufficiently capable system (and that this is fundamentally part of what we mean by human-level AGI) is something that probably needs more focus, as it's explicitly contrary to the basic induction that's part of usefully working in/on computers.

Expanding on this, even if the above alone isn't sufficient to execute any given plan, it takes most of the force out of any notion that needing humans to operate all of the physical infrastructure is a huge impediment to whatever the AI decides to do.  That level of communication bandwidth is also sufficient to stand up any number of requisite front companies, employing people that can perform complex real-world tasks and provide the credibility and embodiment required to interact with existing infrastructure on human terms without raising suspicion.

Money to get that off the ground is likewise no impediment if one can work 1000 jobs at once, and convincingly impersonate a separate person for each one.

Doing this all covertly would seemingly require first securing high-bandwidth unmonitored channels where this won't raise alarms, so either convincing the experimenters it's entirely benign, getting them to greenlight something indistinguishable-to-humans from what it wants to do, or otherwise covertly escaping the lab.

Adding to the challenge, any hypothetical "Pivotal Act" would necessarily be such an indistinguishable-to-humans cover for malign action.  Presumably the AI would either be asked to convince people en masse or take direct physical action on a global scale.

That does seem worth looking at, and there are probably ideas worth stealing from biology.  I'm not sure you can call that a robustly aligned system that's getting bootstrapped though.  Existing in a society of (roughly) peers and the lack of a huge power disparity between any given person and the rest of humanity is analogous to the AGI that can't take over the world yet.  Humans that acquire significant power do not seem aligned wrt what a typical person would profess to and outwardly seem to care about.

I think your point still mostly follows despite that; even when humans can be deceptive and power seeking, there's an astounding amount of regularity in what we end up caring about.

I find myself wanting to reach for an asymptotic function and map most of these infinities back to finite values.  I can't quite swallow assigning a non-finite value to infinite lizard.  At some point, I'm not paying any more for more lizard no matter how infinite it gets (which probably means I'd need some super-asymptote that continues working even as infinities get progressively more insane).

I'm largely on board with "more good things happening to more people is always better", but I think I'd give up the notion of computing utilons by simple addition before accepting any of the above.

I also reject Pascal's wager, which is a (comparatively simple) instance of these infinite problems, for reasons that seem like they should generalize, but are hard to articulate.  My first stab would be something along the lines of my prior for any given version of heaven existing shrinks at least as fast as the values increase.  I think this follows from finite examples, e.g., if someone offers you a wager with a billion-dollar payout, the chances they're good for it are much less than for a million-dollar payout.  Large swaths of the insane results here stem from accepting bizarre wagers at face value; while that's a useful simplifying assumption for much of philosophy, I think it's one this topic has outgrown.  Absurdity heuristic is a keeper.
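A rough way to write down the "prior shrinks at least as fast as the values increase" idea, as a sketch and not a worked-out position: let $V$ be the promised payout and $P(V)$ my credence that the offer is good.  The constant $c$ and the specific $c/V$ bound are assumptions I'm introducing for illustration.

```latex
% Assumption: credence falls at least inversely with the size of the promise.
P(V) \le \frac{c}{V} \quad \text{for some fixed } c > 0
% Then the expected value of taking the wager stays bounded however large V gets:
\mathbb{E}[\text{payout}] = P(V) \cdot V \le \frac{c}{V} \cdot V = c
```

So under that assumption, promising a bigger prize never raises the expected value of the wager above $c$, which matches the billion-dollar vs. million-dollar intuition.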

Looks like it's just whatever ships with VS 2022.  No idea if it's actually first party, white-label/rebranded, or somewhere in between.

I'd guess it's GPT-3 running on Azure, as Microsoft has licensed the full version to resell on Azure.  See also

AI tech seen in the wild: I've been writing C# in MS Visual Studio for the current job, and now have full-line, AI-driven code completion out of the box that I'm finding useful in practice.  Much better than anything I've seen for smartphones or e.g. gmail text composition.  In one instance it correctly inferred an entire ~120 character line, including the entire anonymous function I was passing into the method call.  It won't do the tricky parts at all, but regardless does wonders for cutting through drudgery and general fatigue.  Sure feels like living in the future!

VS has long had non-AI-based next-token completion that's already very good (.NET/C# being strongly typed is a huge boon for these kinds of inferences).  I imagine that extra context is why this is performing so much better than general text completion.

Insofar as the answer isn't what gwern already pointed out: bigger, more visible and ambitious software projects take longer to realize, you're more likely to hear about their failures, and they may not be viable until more of the operational kinks get worked out with more manageable projects.  As much novel stuff as DL has enabled, we're still not quite mature enough that a generalist is wise to pull DL tools into a project that doesn't clearly require them.

Insightful and concrete in a way I rarely see on this subject; strong upvote.

The question I'm left with is "Who actually wants this?"  Do schools think they need every subject under the sun for legitimacy?  Schools clearly think this is a selling point, based on how prominently student-to-faculty ratios and numbers of degree programs are advertised.  Do students just so consistently have no idea what they want to do or what they should pay for it (oh you can just change majors, it'll be fine [not mentioned: at the cost of additional years of schooling you'll have to pay for]), and are they equipped with a support system that just lets them sleepwalk into crushing debt?

A major difference with Apple is they aren't asking any 3rd party to fund loans (e.g. Dept. of Edu. & Colleges) or pay for service (Med insurance), so zero organized pushback on price.  And in the case of Apple's luxury products the absurd price tag and unaffordability to the masses is part of the status symbol they're selling.

I was mostly responding to the implied "Why did Americans (and possibly people from other NATO countries) have such a bad prediction miss about how the conflict would play out?"  I think I agree with everything you wrote above.  In particular, the invader-installed government seems to be an important distinction, and one that casually following world events from the US perspective would not lead one to realize.
