FireStormOOO

Comments

Good clarification; it's not just the amount of influence, but also something about the way the influence is exercised being unsurprising given the task.  Central not just in terms of "how much influence", but also along whatever other axes the sort of influence could vary?

I think if the agent's action space is still unconstrained enough that there's room to consider benefit or harm flowing through modification of the principal's values, it's probably been given too much latitude.  Once we have informed consent, because the agent has communicated the benefits and harms as best it understands them, it should have very little room to be influenced by benefits and harms it thought too trivial to mention (by virtue of their triviality).

At the same time, it's not clear the agent should, absent further direction, reject the offer to brainwash the principal for resources, as opposed to punting to the principal.  Maybe the principal thinks those values are an improvement and it's free money? [e.g. Prince's insurance company wants to bribe him to stop smoking.]

WRT non-manipulation, I don't suppose there's an easy way to have the AI track how much potentially manipulative influence it's "supposed to have" in the context and avoid exercising more than that influence?

Or, possibly better, compare simple implementations of the principal's instructions and penalize interpretations with large or unusual influence on the principal's values.  Preferably without prejudicing against interventions that straightforwardly protect the principal's safety and communication channels.

The principal should, for example, be able to ask the AI to "teach them about philosophy" without it either going out of its way to ensure the principal doesn't change their mind about anything as a result of the instruction, or unduly influencing them with subtly chosen explanations or framing.  The AI should exercise an "ordinary" amount of influence, typical of the ways an AI could go about implementing the instruction.

Presumably there's a distribution over how manipulative vs. anti-manipulative (value-preserving) any given implementation of the instruction is, and we may want the AI to prefer central implementations rather than extremely value-preserving ones.

Ideally the AI should also worry when it notices it's contemplating exercising more or less influence than desired, and clarify that as it would any other ambiguous aspect of the task.
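To make the "central implementations" idea concrete, here's a minimal toy sketch (my own, not anything from the post): assume we somehow already had a scalar estimate of how much each candidate interpretation of "teach me about philosophy" would shift the principal's values.  The plan names, scores, and the most_central helper are all invented for illustration; producing those influence estimates is of course the hard part this sketch skips.

```python
from statistics import median

# Hypothetical candidate interpretations of "teach me about philosophy",
# each paired with an invented estimate of how much it would shift the
# principal's values (0 = none, 1 = maximal).
candidates = {
    "standard intro course, flag contested points": 0.40,
    "avoid any topic that might change their mind": 0.05,
    "push one school of thought via framing": 0.85,
    "curated readings with neutral framings": 0.30,
    "answer questions only, volunteer nothing": 0.15,
}

def most_central(plans):
    """Pick the plan whose estimated influence is closest to the median,
    i.e. an 'ordinary' amount of influence, rather than the minimum
    (maximally value-preserving) or the maximum (most manipulative)."""
    mid = median(plans.values())
    return min(plans, key=lambda name: abs(plans[name] - mid))

print(most_central(candidates))
# -> 'curated readings with neutral framings'
```

The point of preferring the median rather than the minimum is exactly the one above: an agent optimizing hard for "zero value change" is itself an unusual, potentially distorting intervention.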

You're very likely correct IMO.  The only thing I see pulling in the other direction is that cars are far more standardized than humans, and a database of detailed blueprints for every make and model could drastically reduce the resolution needed for usefulness.  Especially if the action on a cursory detection is "get the people out of the area and scan it harder", not "rip the vehicle apart".

This is the first text about goals I've read that meaningfully engages with "but what if you were (partially) wrong about what you want" instead of simply glorifying "outcome fixation".  This seems like a major missing piece in most advice about goals: the most important thing about your goals is that they're actually what you want, and discovering that may not be the case is a valid reason to tap the brakes and re-evaluate.

(Assuming a frame of materialism, physicalism, and empiricism throughout, even if not explicitly stated)

Some of the scenarios you're describing as objectionable would reasonably be described as emulation in an environment you would probably find disagreeable even within the framework of this post.  Being emulated by a contraption of pipes and valves that's worse in every way than my current wetware is, yeah, disagreeable even if it's kinda me.  Making my hardware less reliable is bad.  Making me think slower is bad.  Making it easier for others to tamper with my sensors is bad.  All of these things are bad even if the computation faithfully represents me otherwise.

I'm mostly in the same camp as Rob here, but there's plenty left to worry about in these scenarios even if you don't think brain-quantum-special-sauce (or even weirder new physics) is going to make people-copying fundamentally impossible.  Being an upload of you that now needs to worry about being paused at any time, or having false sensory input supplied, is objectively a worse position to be in.

The evidence does seem to lean in the direction that non-classical effects in the brain are unlikely: neurons are just too big for quantum effects between neurons, and even if there were quantum effects within neurons, it's hard to imagine them staying stable for even as long as a single train of thought.  The copy losing their train of thought and having momentary confusion doesn't seem to reach the bar where they don't count as the same person?  And yet-weirder new physics mostly requires experiments we haven't thought to do yet, or experiments in regimes we've not yet been able to test.  Whereas the behavior of things at STP in water is about as central to things-Science-has-pinned-down as you're going to get.

You seem to hold that the universe may still have a lot of important surprises in store, even within the central subject matter of century-old fields?  Do you have any kind of intuition pump for the feeling that there are still that many earth-shattering surprises left (while simultaneously holding that empiricism and science mostly work)?  My sense of where there are likely to be surprises left is not quite so expansive, and this sounds like a crux for a lot of people.  Even as much of a shock as QM was to physics, it didn't invalidate much if any theory except in directly adjacent fields like chemistry and optics.  And working out the finer points had progressively narrower and shorter-reaching impact.  I can't think of examples of surprises with a larger blast radius within the history of vaguely modern science.  Findings of odd, as-yet-unexplained effects pretty consistently precede attempts at theory.  Empirically determined rules don't start working any worse when we realize the explanation given with them was wrong.

Keep in mind that society holds that you're still you even after a non-trivial amount of head trauma.  So whatever imperfection in copying your unknown unknowns cause, it would have to be both something we've never noticed before in a highly studied area, and something more disruptive than getting clocked in the jaw, which seems a tall order.

Keep in mind also that the descriptions of computation that computer science has worked out are extremely broad and far from limited to electronic circuits.  Electronics are pervasive because we have, as a society, sunk something like the world's GDP (possibly several times over) into figuring out how to make them cheaply at scale.  Capital investment is the only thing special about computers realized in silicon; computer science makes no such distinction.  The notion of computation is so broad that there's little if any room to conceive of an agent that's doing something that can't be described as computation.  Likewise, the equivalence proofs are quite broad: it can be arbitrarily expensive to translate across architectures, but within each class of computers, computation is computation, and that emulation is possible has proofs.
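As a toy illustration of how broad that notion is (my own example, nothing from the original discussion): below, Python running on silicon emulates a completely different and much simpler architecture, a made-up two-register counter machine.  The instruction set and the program are invented for the example.

```python
def run_counter_machine(program, a=0, b=0):
    """Interpret a toy two-register machine.

    Instructions (invented for this illustration):
      ("inc", reg)          increment register reg
      ("dec_jz", reg, addr) jump to addr if reg is 0, else decrement reg
      ("jmp", addr)         unconditional jump
      ("halt",)             stop and return the registers
    """
    regs = {"a": a, "b": b}
    pc = 0
    while True:
        op = program[pc]
        if op[0] == "halt":
            return regs
        elif op[0] == "inc":
            regs[op[1]] += 1
            pc += 1
        elif op[0] == "dec_jz":
            if regs[op[1]] == 0:
                pc = op[2]
            else:
                regs[op[1]] -= 1
                pc += 1
        elif op[0] == "jmp":
            pc = op[1]

# Moves the contents of register a into register b (i.e. b += a), then halts.
add_program = [
    ("dec_jz", "a", 3),  # 0: if a == 0, jump to halt
    ("inc", "b"),        # 1: otherwise transfer one unit from a to b
    ("jmp", 0),          # 2: and loop
    ("halt",),           # 3:
]

print(run_counter_machine(add_program, a=2, b=5))  # {'a': 0, 'b': 7}
```

The emulating substrate and the emulated machine share nothing architecturally; the equivalence results from computability theory generalize this far beyond toy examples, at whatever cost in efficiency.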

All of your examples are doing that thing where you have a privileged observer position separate and apart from anything that could be seeing or thinking within the experiment.  You-the-thinker can't simply step into the thought experiment.  You-the-thinker can of course decide where to attach the camera by fiat, but that doesn't tell us anything about the experiment, just about you and what you find intuitive.

Suppose for the sake of argument your unknown unknowns mean your copy wakes up with a splitting headache and amnesia for the previous ~12 hours, as if waking up from surgery.  They otherwise remember everything else you remember and share your personality, such that no one could notice a difference (we are positing a copy machine that more or less works).  If they're not you, they have no idea who else they could be, considering they only remember being you.

The above doesn't change much for me, and I don't think I'd concede much more without saying you're positing a machine that just doesn't work very well.  It's easy for me to imagine it never being practical to copy or upload a mind, or copies having modest imperfections or minor differences in experience, especially at any kind of scale.  Or it simply being something society at large is never comfortable pursuing.  It's a lot harder to imagine it being impossible even in principle, given what we already know or can already rule out with fairly high likelihood.  I don't think most of the philosophy changes all that much if you consider merely very good copying (your friends and family can't tell the difference; it knows everything you know) vs. perfect copying.

The most bullish folks on LLMs seem to think we're going to be able to make copies good enough to be useful to businesses just from all of your communications.  I'm not nearly so impressed with the capabilities I've seen to date, and it's probably just hype.  But we are already getting into an uncanny valley with the (very) low-fidelity copies current AI tech can spit out - which is to say they're already treading on the outer edge of people's sense of self.

Realistically I doubt you'd even need to be sure it works, just reasonably confident.  Folks step onto planes all the time, and those do on rare occasion fail to deliver them intact at the other terminal.

Within this framework, whether or not you "feel that continuity" would mostly be a fact about the ontology your mindstate uses when thinking about teleportation.  Everything in this post could be accurate and none of it would be incompatible with you having an existential crisis upon being teleported, freaking out upon meeting yourself, etc.

Nor does anything here seem to make a value judgement about what the copy of you should do if told they're not allowed to exist.  Attempting revolution seems like a perfectly valid response; self-defense is held as a fairly basic human right, after all. (I'm shocked that isn't already the plot of a sci-fi story.)

It would also be entirely possible for both of your copies to hold the conviction that they're the one true you - their experiences, from where they sit, being entirely compatible with that belief. (Definitely the plot of at least one Star Trek episode.)

There's not really any pressure currently to have thinking about mind copying that's consistent with every piece of technology that could ever conceivably be built.  There's nothing that forces minds to have accurate beliefs about anything that won't kill them, or wouldn't have killed their ancestors, in fairly short order.  Which is to say, mostly, that we shouldn't often expect to get accurate beliefs about weird hypotheticals without having changed our minds at least once.

There's a presumption on a discussion forum that you're open to discussing, not just grandstanding.  Strong-downvoted much of this thread for the amount of my time you've wasted trolling.

Bell Labs, Xerox PARC, etc. were, AFAIK, mostly privately funded research labs that existed for decades and churned out patents that may as well have been money printers.  When AT&T (Bell Labs) was broken up, that research all but started the modern telecom and tech industry, which is now something like 20%+ of the stock market.  If you attribute even a tiny fraction of that to Bell Labs, it's enough to fund it another 1000 times over.

The missing piece arguably is executive teams with a 25-year vision instead of a 25-week vision, AND the institutional support to see it through; cost-cutting is in fashion with investors too.  Private equity is in theory well positioned to repeat this elsewhere, but for reasons I don't entirely understand it has become too short-sighted and/or has significantly shortened its horizons on returns.  IBM, Qualcomm, TSMC, ASML, and Intel all seem to have research operations of that same near-legendary caliber, mostly privately funded (albeit treated as national treasures of strategic importance); what they have in common, of course, is that they're all tech.  Semiconductor fabrication is extremely research-intensive, and world-class R&D operations are table stakes just to survive to the next process node.

Maybe a good follow-up question is why this model hasn't spread outside of semiconductors and tech?  Is a functional monopoly a requirement for the model to work? (ASML has a functional monopoly on the leading-edge photolithography machines that power modern semiconductor fabs.)  Do these labs ever start independently, without a clear lineage to $100-billion-plus govt research initiatives?  Electronics and tech have probably received many trillions in US govt funding since WWII, once you include military R&D and contracts.

Govt. spending is a ratchet that only goes one direction; replacing dysfunctional agencies costs jobs and makes political enemies.  Reform might be more practical, but, much like with people, it's very hard to reform an agency that doesn't want to change.  You'd be talking about a sustained expenditure of political capital, the sort of thing that requires an agency head who's invested in the change and popular enough with both parties to get to spend a few administrations working at it.

Edit: I answered separately above with regards to private industry.
