cross-posted to pan narrans

Why rationalists should care (more) about free software

especially if you want to upload your brain

In the limit condition freedom of compute is freedom of thought.

As we offload more of our cognition to our computational devices we expose a new threat surface for attacks on our ability to think free of malign or otherwise misaligned influence. The parties who control the computational systems to which you have outsourced your cognition have a vector by which to influence your thinking. This may be a problem for you if their interests are not aligned with your own as they can use this power to manipulate you in service of their goals and against your own.

The fundamental operations of our brains remain difficult to reliably and effectively interfere with, primarily because of our ignorance of how to achieve this. This, however, may change as understanding of our wetware increases and subtle direct manipulations of our brain chemistry can be employed to influence our behaviour. A highly granular version of this approach is likely still quite far off but it generally feels more viscerally scary than influencing us via our technology. Surfing the web without ad-block already feels uncomfortably close to the Futurama gag about ads in your dreams. Increasingly, though, the two amount to the same thing. Indeed our technology is already doing this to us, albeit fairly crudely for now, by exploiting our reward circuits and many other subtle systematic flaws in the human psyche.

What is "free" software? Free as in liberty not as in gratuity, as in speech not beer, politically and not necessarily financially. The Free Software Foundation defines free software as adhering to the four essential freedoms, which I paraphrase here:

  0. The freedom to run the code however you wish
  1. The freedom to examine its source code so that you can understand and modify it for your own purposes
  2. The freedom to distribute the source code as is
  3. The freedom to distribute modified versions of the source code

Note that code which is 'source available' only really gets you freedom 1; depending on how the code is licenced and built, it may not get you any of the others, including freedom 0. Much ink has been spilt over the term 'open source' not going far enough as a result. Free software is often referred to by the acronyms FOSS & FLOSS (Free/Libre and open source software).

The occasionally controversial but ever prescient Richard Stallman (AKA RMS, AKA saint IGNUcius) has been banging on about the problems of proprietary software for nearly forty years at this point, having essentially predicted the abuses of today's software giants because he got a bad printer driver in the early 1980s.

The problem that Stallman saw with 'proprietary' software, i.e. software which does not meet the criteria of the four essential freedoms, is one of game theoretic incentives. Making software free serves as a pre-commitment mechanism by the software authors to not abuse the users of their software. This works by empowering users to exercise a credible threat of forking the project and cutting devs abusing their position out of the project and any associated revenue streams. Revenue from free software projects can take a number of forms e.g. premium-hosting, donations/pay-what-it's-worth schemes, & service/support agreements, though how to successfully monetise free software remains a hard problem.

As the maker of a piece of proprietary software, you are not subject to this kind of check on your power, and it is often in your interest to increase your users' lock-in to your product to make it hard for them to leave for a competitor should they become dissatisfied. The lack of transparency about how proprietary software works can also hide a multitude of sins, such as bad security practices, and provides scope for extensive surveillance of the users whilst maintaining deniability. Thus free software can serve as a solution to an alignment problem between makers and users of the software.

The speculative fiction of Cory Doctorow and of Greg Egan in 'Permutation City', along with the speculative (non-fiction?) of Robin Hanson in 'The Age of Em', have painted pictures of numerous diverse dystopian futures in which software is used to curtail individual liberties, as well as to gas-light, frame control, and otherwise manipulate or abuse people and other conscious entities.

Concerns over these potential abuses have been gaining increasing popular attention in recent years, though the emphasis has been placed on Shoshana Zuboff's concept of surveillance capitalism rather than framing the problem, as I suspect Stallman would, as having its root causes in non-free software. In particular, the popularity of the Netflix documentary 'The Social Dilemma', made in collaboration with Tristan Harris & Aza Raskin's Center for Humane Technology, has increased public awareness of the problems. Solutions, however, remain relatively unspecified.

Computing is becoming ever more ubiquitous and connected, and is beginning to be embedded in our bodies, though mostly still as medical devices for now. Whose phone numbers do you know? What about addresses, or how to travel there? How's your mental arithmetic? How good is your recall of your chat history with all your friends - would you notice if it was subtly edited in retrospect? Do you have a voice assistant? When was the last time you left your house without your phone? The more of our cognition takes place external to our brains, the more vulnerable we are to the technological capture of our thought processes by misaligned entities. If we do not take measures to ensure the alignment of software makers' interests with those of software users we invite dystopias galore.

Over the years there have been many explicit efforts by technology companies to lock general-purpose computing devices to vendor-approved applications (e.g. many game consoles & iPhones). This is often in the name of copyright protection and increasingly in recent years in the name of providing better security. 'Better security' of course begs the question, against what threat model? It's better security against malicious 3rd parties but what if I'm worried about what the 1st parties are doing? It comes down to the question of who holds the keys to the locks. I know I'd want to be the one deciding whose signing keys I trust to go in the future-TPM-analog of the computer system emulating my brain, and given their track records it's probably not Google, Apple, Amazon, Facebook (I'm sorry, Meta - rolls eyes), or Microsoft. (The basic competencies, understanding, and good, widely adopted, low-friction systems needed for individuals to be good stewards of their own private keys are a problem in the locked bootloader space as well as the cryptocurrency space.) It is worth noting that at this point in time it is almost impossible and extremely impractical to get a completely free software computer down to the firmware level.

I think a strong case could be made that a 'freedom of compute' should be enshrined in future constitutional settlements on par with freedom of speech as a protection of fundamental freedoms, in service to preserving freedom of thought. FOSS development has been discussed in the EA community as a potentially valuable intervention. Developers seem to be overrepresented in the rationalist community so maybe this is a bit of a touchy subject for any of us working on proprietary code. I'm of the opinion that we as a community should advocate for free software and that there is a certain synergy between the free software movement's goals and those of the rationality community. I'd be interested to hear contrary opinions.

Well-aligned software has the potential to massively improve our lives at both the individual and societal levels; look at what Taiwan is doing with open software in digital governance. Some of the same behavioural modification tricks currently used to sell us crap we don't need, and to immiserate us as a side effect so that we can be sold the cure, can be turned to good. Helping us to establish good habits, to break bad ones and beat akrasia. To collaborate and communicate more deeply and effectively, instead of more shallowly and ineffectually. To be understood not misunderstood, seen for who we are and not through the funhouse mirror of beautification filters. To build a fun world together, not a depressing and solipsistic one.

Disclosure: I am an associate member of the FSF, and pay them an annual membership fee & the link on 'beginning to be embedded in our bodies' is a shamelessly self-promotional link to an episode of my podcast where my co-host and I discuss embedded tech and its implications at length

especially if you want to upload your brain


If I upload my brain as a program, I am quite interested in ensuring that 'users' of that program not have the freedom to run the code however they wish, the freedom to distribute the code however they wish, or the freedom to modify the code however they wish and distribute the modified version.

I would regard the specifics of your brain as private data. The infrastructural code to take a scan of an arbitrary brain and run its consciousness is a different matter. It's the difference between application code and a config file / secrets used in deploying a specific instance. You need to be able to trust the app that's running your brain, e.g. to not feed it false inputs.

I initially assumed something similar to what you just described. However, it's plausible to me that in practice the line between "program" and "data" might be blurry here.


You probably want brain uploads to be proprietary software controlled by you or brain uploads which are a distinct agent with some degree of autonomy over their legal status/owners of themselves as proprietary software (similar to the right to bodily autonomy in meatspace).

I think what this post is pointing to is a strong desire for the stack of technologies on which a brain is uploaded to be free software, easily modifiable by the distinct agent to suit the agent's needs and purpose, and incapable of coercing the agent to some nebulous 'bad' state (think contract drafting em). A more object level framing for this is secure homes for digital people.

Thanks for the link. The problem of how to have a cryptographic root of trust for an uploaded person and how to maintain an ongoing state of trusted operation is a tricky one that I'm aware people have discussed, though it's mostly well over my cryptography pay grade. The main point I was trying to get at was not primarily about uploaded brains. I'm using them as an anchor at the extreme end of a distribution that I'm arguing we are already on. The problem an uploaded brain has of being able to trust its own cognition is one we are already beginning to experience in the aspects of our cognition that we are outsourcing.

Human brains are not just general purpose CPUs; much of our cognition is performed on the wetware equivalent of application-specific integrated circuits (ASICs). ASICs that were tuned for applications that are of waning relevance in the current environment. They were tuned for our environment of evolutionary adaptedness but the modern world presents very different challenges. By analogy it's as if they were tuned for sha256 hashing but Ethereum changed the hash function so the returns have dropped. Not to mention that biology uses terrible, dirty, hacky heuristics that would make a grown engineer cry and statisticians yell WHY! at the sky in existential dread. These leave us wide open to all sorts of subtle exploits that can be utilised by those who have studied the systematic errors we make, and if they don't share our interests this is a problem.

Note that I am regarding the specifics of an uploaded brain as personal data which should be subject to privacy protections (both at the technical and policy level) and not as code. This distinction may be less clear for more sophisticated mind upload methods which generate an abstract representation of your brain and run that. If, however, we take a conceptually simpler approach the data/code distinction is cleaner. Let's say we have an 'image' of the brain which captures the 'coordinates' (quantum numbers) of all of the subatomic particles that make up your brain. We then run that 'image' in a physics simulation which can also emulate sensory inputs to place the uploadee in a virtual environment. The brain image is data; the physics and sensory emulation engine is code. I suspect a similar reasonable distinction will continue to hold quite well for quite a while even once your 'brain' data starts being represented in a more complex data structure than an N-dimensional matrix.
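
To make the data/code split concrete, here is a minimal Python sketch. Everything in it is hypothetical: `BrainImage`, `step`, and the toy one-dimensional 'physics' are made-up stand-ins for illustration, not a claim about how an emulator would really work.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class BrainImage:
    """Private data: the state of one specific upload (hypothetical toy model)."""
    positions: Tuple[float, ...]
    velocities: Tuple[float, ...]


def step(image: BrainImage, sensory_input: float, dt: float = 0.1) -> BrainImage:
    """Engine code: the same inspectable function runs any image.

    Toy 'physics': every particle is nudged by the sensory input, then moves.
    """
    new_velocities = tuple(v + sensory_input * dt for v in image.velocities)
    new_positions = tuple(
        p + v * dt for p, v in zip(image.positions, new_velocities)
    )
    # Return a new image; the original is frozen and left untouched.
    return replace(image, positions=new_positions, velocities=new_velocities)


# The engine could be free software while each image stays private to its owner.
alice = BrainImage(positions=(0.0, 1.0), velocities=(0.0, 0.0))
alice_next = step(alice, sensory_input=1.0)
```

The point of the sketch is only that `step` can be published, audited and forked without ever exposing anyone's `BrainImage`, and that the uploadee needs to trust `step` (including what it feeds in as `sensory_input`) far more than they need to publish their own state.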

I actually think mind uploading is a much harder problem than many people seem to regard it as, indeed I think it is quite possibly harder than getting to AGI de novo in code. This is for reasons related to neurobiology, imaging technology and computational tractability of physics simulations and I can get into it at greater length if anyone is interested.

I might have misunderstood the part of the OP about 'freedom of compute'. I understood it as proposing a constitutional amendment making 'proprietary software' not a thing and mandating that literally all software be open source.

If that's not what it meant, what does it mean? Open source software is already a thing that exists, I'm not sure how else to interpret the proposed amendment.

You might be interested in this post on the EA Forum advocating for the potential of free/open-source software as an EA cause or EA-adjacent ally movement (see also my more skeptical take in the comments).

I also thought this other EA Forum post was a good overview of the general idea of altruistic software development, some of the key obstacles & opportunities, etc. It's mostly focused on near-term projects for creating software tools that can improve people's reasoning and forecasting performance, not cybersecurity for BCIs or digital people, but you might find it helpful for giving an in-depth EA perspective on software development that sometimes contrasts and sometimes overlaps with the perspective of the open-source movement:

For the time being, proprietary software doesn't seem anywhere close to being the limiting factor on my freedom of thought and speech. I experience social relationships, the media, and government being much more profound checks on my personal freedom and that of others. To be sure, they exert some of their power via software. But the proprietary nature of that software doesn't seem to be central.

Freedom of data, on the other hand, seems like it could be game-changing. Imagine if Facebook, for example, was legally obligated to make the contents of its database freely available. What if a competitor could "fork" not just the interface of Facebook, but its entire userbase? What if all hospitals were required to contribute the contents of their EMR systems to a central database - perhaps with robust anonymization features, but otherwise available to all, at least in some format? What if all businesses above a certain size were required to maintain public records of all purchases and sales above a certain cost? What if SciHub were legal?

I'm not at all convinced that this would be good. I'd want to think through the implications over the long term on a case-by-case basis, though I'm sold on SciHub. My point is that such a change strikes me as having the potential to be powerful.

The fact that they exert some of that power (an ever increasing amount) through software makes the question of the freedom of that software quite relevant to your autonomy in relation to those factors. Consider the g0v movement. When working with open government software or at least open APIs civic hackers have been able to get improvements in things like government budgetary transparency, the ease with which you can file your tax forms, the ability to locate retailers with face masks in stock etc. The ability to fork the software used by institutions, do better and essentially embarrass them into adopting the improvements because of how bad their versions are in comparison is surprisingly high leverage.

Data is its own complex problem, especially personal data, and warrants a separate discussion all of its own. In relation to free software, though, the most relevant part is open data specifications for formats and data portability between applications, so that you are free to take your data from one application to another.
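
As a small illustration of what an open format buys you, here is a hedged Python sketch in which one application exports notes and an unrelated one imports them. The schema, `FORMAT_VERSION`, and both function names are invented for the example and do not correspond to any real standard.

```python
import json
from typing import Dict, List

# Hypothetical open format: a documented, versioned JSON layout any app may read.
FORMAT_VERSION = 1


def export_notes(notes: List[Dict[str, str]]) -> str:
    """App A: serialise the user's notes to the open format."""
    return json.dumps(
        {"format_version": FORMAT_VERSION, "notes": notes}, sort_keys=True
    )


def import_notes(payload: str) -> List[Dict[str, str]]:
    """App B: an unrelated application reads the same documented format."""
    document = json.loads(payload)
    if document.get("format_version") != FORMAT_VERSION:
        raise ValueError("unsupported format version")
    return document["notes"]


my_notes = [{"title": "groceries", "body": "oat milk"}]
portable = export_notes(my_notes)
restored = import_notes(portable)
```

Because the layout is documented rather than obfuscated, switching applications costs the user a file copy rather than a reverse-engineering project.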

It seems to me that what you're worried about is a tendency toward increased gating of resources that are not inherently monopolizable. Creating infrastructure that permits inherently hard-to-gate resources, like software or technological designs, to be gated more effectively creates unnecessary opportunities for rent-seeking.

On the other hand, the traditional argument in favor of allowing such gates to be erected is that it creates incentives to produce the goods behind the gates, and we tend to reap much greater rewards in the long run by allowing gates to be temporarily erected and then torn down than we would by prohibiting such gates from ever being erected at all.

The fear is that some savvy agents will erect an elaborate system of gates such that they create, perhaps not a monopoly, but a sufficient system of gates to exact increasing fractions of created wealth over time. I think this is potentially worth worrying about, but it's not clear to me why we'd particularly focus on software as the linchpin of this dynamic, as opposed to all the other forms of gateable wealth. I think this is my primary objection to your argument.

When working with open government software or at least open APIs civic hackers have been able to get improvements in things like government budgetary transparency, the ease with which you can file your tax forms, the ability to locate retailers with face masks in stock etc.

Note that at least budgetary transparency and location of retailers with face masks are questions of data access. Sure, software is required to access that data, but it's making the data available that's key here. Forking software is useless without access to the data that software is meant to deliver. It's also useless without access to sufficient computing power. For example, if I had the source code for GPT-3, it wouldn't do me any good unless I had a sufficiently powerful supercomputer to run that code on. Furthermore, the human knowledge required to implement and maintain the code base can't just be forked even if you have access to the source code.

Data, microchips, and expertise are where bits meet atoms. Source code is just a fraction of the total capital controlled by a given company.

I care a lot about free (and open source) software.

In particular, I learned programming so I could make some changes to a tablet note-taking app I was using at school.  Open source is the reason why I got into software professionally, and causally connected to a bunch of things in my life.

Some points I have in favor of this:

  • I think having the ability to change tools you use makes using those tools better for thinking
  • In general I'm more excited about a community of 'tool-builders' rather than 'tool-users' (this goes for cognitive tools, too, which the rationality community is great at)
  • Feedback from how people use/modify things (like software) is a big ingredient in making it better and creating more value.

With that said, I think we're still in better need of analogies and thought experiments on what to do with compute, and how to deal with risk.

It's much easier to share the source code to a program than to provide everyone the compute to run it on. Compute is pretty heterogeneous, too, which makes the problem harder. It's possible that some sort of 'universal basic compute' is the future, but I am not optimistic about that coming before things like food/water/shelter/etc.

The second point is that I think it is important to consider technological downsides when deploying things like this (and sharing open source software / free software counts as deployment). In general it's easier to think of the benefits than the harms, and a bunch of the harms come from tail risks that are hard to predict, but I think this is also worth doing.

I agree that making computing resources to run code on available to all would be a more complex proposition. My point is more that if you purchase compute you should be free to use it to perform whatever computations you wish, and arbitrary barriers should not be erected to prevent you from using it in whatever way you see fit (cough Apple, cough Sony, cough cough).

As the maker of a piece of proprietary software, you are not subject to this kind of check on your power and it is often in your interest to increase lock-in to your product from your users to make it hard for them to leave for a competitor, should they become dissatisfied.

Creating network effects to lock in users is in the interest of open-source projects in the same way as it's in the interest of proprietary software. If you look at the crypto world, someone who bases their technology on Ethereum or Bitcoin can get locked in even though they could fork the software.

Thus free software can serve as a solution to an alignment problem between makers and users of the software.

A free software project that makes money by providing support has an incentive to keep the software hard enough to use, while a proprietary software vendor might see providing support as a cost center and work hard to make the software easier to use so that less support is needed.

In many cases, free software doesn't create incentives for companies to invest a lot of resources into making the software better for users in the same way that proprietary software does.

NPM recently had the developer of colors.js upload a version that broke a lot of programs because he had no incentive to be aligned with the users of his libraries.

If you want to look at whether incentives are aligned you actually have to look at the business models involved.

NPM recently had the developer of colors.js upload a version that broke a lot of programs because he had no incentive to be aligned with the users of his libraries.

And the fact that it was open source let others fork it and remove the broken commit, effectively enforcing the power that the users had over the creator. Had it been closed source (free as in beer) software, the users would have been locked out, with no recourse other than caving to the creator's demands.

That retaliation allowed the users to limit the damage because they could work around it. It however did nothing that was disadvantageous to the creator and thus there was no deterrent. The parties were not aligned to cooperate. 

I don't know, I'd say that guy torched a lot of future employment opportunities when he sabotaged his repos. Also obligatory:

There is a distinction between lock-in and the cost of moving between standards. An Ethereum developer trying to move to another blockchain tech is generally moving from one open, well documented standard to another. There is even the possibility of (semi-)automated conversion/migration tools. They are not nearly as hamstrung in moving as is the person trying to migrate from a completely undocumented and deliberately obfuscated or even encrypted format to a different one.

The incentive to make it difficult to use if you are providing support has some interesting nuances, especially with open core models, but it is somewhat cancelled out by the community incentive to make it easy for them to use. Any overt attempts to make things difficult lose the project good will with the community on which they often somewhat depend. There can be an incentive to make a product difficult to deploy yourself if you offer a hosted version, but this is often less of an issue if you are offering enterprise support packages where things other than just the convenient cloud hosting are the main value add.

Free software is not without challenges when it comes to constructing viable business models but there are some examples that are working pretty well; off the top of my head, Red Hat, SUSE, & Nextcloud.

Any overt attempts to make things difficult loses the project good will with the community on which they often somewhat depend. 

You don't need to make any overt attempt to make things difficult to use for complex programs to become difficult to use. If you don't invest effort into making them easy to use they usually become hard to use.

The incentive to make it difficult to use if you are providing support has some interesting nuances especially with open core models but it is somewhat cancelled out by the community incentive to make it easy for them to use. 

There's little community incentive to make software easy to use. Ubuntu's useability for the average person is better than that of more community-driven projects. 

Starting in 2008 (number from memory) Microsoft invested a lot in usability for Microsoft Office. As far as I know, there wasn't anything similar for LibreOffice. GIMP's usability is still about as bad as it was a decade ago.

Even though Blender basically won, its usability is awful. 

One of the key points of Inadequate Equilibria is that there are plenty of problems in our society where it's hard to create an alignment between stakeholders to get the problem solved. If we legislated that all software has to be free software, that would rule out some arrangements that are currently used effectively to get problems solved.

Maybe, but I would be interested to see that tested empirically by some major jurisdiction. I would bet that in the absence of an easy option to use proprietary software many more firms would hire developers or otherwise fund the development of features that they needed for their work, including usability and design coherence. There is a lot more community incentive to make it easy to use if the community contains more businesses whose bottom lines depend on it being easy to use. I suspect proprietary software may have us stuck in a local minimum; just because some of the current solutions produce partial alignments does not mean there aren't more optimal solutions available.

In-house software is usually worse as far as UX goes than software where the actual user of the software has to pay money.

Even if companies would care about the useability of in-house software, general usability is not something that you need for particular use-cases. A company is likely going to optimize for its own workflows instead of optimizing for the useability of the average user.

I don't see why we would expect Blender to look different if open source would be legally mandated. Blender is already used a lot commercially.

Yes, a lot of in-house software has terrible UX, mostly because it is often for highly specialised applications. It may also suffer from a limited budget, poor feedback cycles if it was made as a one-off by an internal team or contractor, a tiny target user group, lack of access to UX expertise, etc.

Companies will optimise for their own workflows no doubt, but there is often substantial overlap with common issues. Consider the work that Red Hat/IBM did on PipeWire and WirePlumber, which will soon be delivering a substantially improved audio experience for the Linux desktop as a result of work they were doing anyway for automotive audio systems.

I'm not that current with Blender but I'm given to understand there have been some improvements in usability recently as it has seen wider industry adoption and efforts have been directed at improving UX. Large firms with many people using a piece of software are motivated to fund efforts to make using it easier, as it makes on-boarding new employees easier. Though given that Blender is a fairly technical and specialist application I would not be surprised if it remained somewhat hard to use; it's not like there are no UX issues with similarly specialist proprietary apps.

Blender having a bad UX is what makes it a specialist application. If it would be easier to use, I would expect more people to do 3D printing. 

There's certainly not zero investment into improving its UX but the number of resources going into that might be 1-2 orders of magnitude higher if it would be proprietary software.

While I vaguely agree with you, this goes directly against local opinion. Eliezer tweeted about Elon Musk's founding of OpenAI, saying that OpenAI's desire for everyone to have AI has trashed the possibility of alignment in time.

Eliezer's point is well-taken, but the future might have lots of different kinds of software! This post seemed to be mostly talking about software that we'd use for brain-computer interfaces, or for uploaded simulations of human minds, not about AGI. Paul Christiano talks about exactly these kinds of software security concerns for uploaded minds here:

I'm not fundamentally opposed to exceptions in specific areas if there is sufficient reason. If I found the case that AI is such an exception convincing I might carve one out for it. In most cases however and specifically in the mission of raising the sanity waterline so that we collectively make better decisions on things like prioritising x-risks I would argue that a lack of free software and related issues of technology governance are currently a bottleneck in raising that waterline.

This sounds to me like a bunch of buzzwords thrown together. You have argued in your post that it might be useful to have free software but I have seen no argument why it's currently a bottleneck for raising the waterline. 

Apologies, but I'm unclear whether you are characterising my post or my comment as "a bunch of buzzwords thrown together"; could you clarify? The post's main thrust was to make the simpler case that the more of our cognition takes place on a medium which we don't have control over and which is subject to external interests, the more concerned we have to be about trusting our cognition. The clearest and most extreme case of this is if your whole consciousness is running on someone else's hardware and software stack. However, I'll grant that I've not yet made the case in full that this is a bottleneck in raising the sanity waterline; perhaps this warrants a follow-up post. In outline: the asymmetric power relationship and the lack of accountability, transparency, oversight or effective governance of the big proprietary tech platforms is undermining trust in our collective and indeed individual ability to discern the quality of information, and this erosion of the epistemic commons is undermining our ability to reason effectively and converge on better models. In Aumann agreement terms, common priors are distorted by amplified availability heuristics in online bubbles, and common knowledge is compromised by pseudoscience and scientific cargo cults, framed in a way that is hard to distinguish from 'the real deal'. Also, the 'honest seekers of truth' assumption is undermined by bots, trolls, and agents provocateurs masquerading as real people while acting on behalf of entities with specific agendas. You only fix this with better governance, and I contend free software is a major part of that better governance model.

The problem of raising the sanity waterline in the world that currently exists is not one of dealing with emulated consciousness. The word bottleneck only makes sense if you apply it to a particular moment in time. Before the invention of emulated consciousness, any problems with emulated consciousness are not bottlenecks.

Yes, I'm merely using an emulated consciousness as the idealised example of a problem that applies to non-emulated consciousnesses that are outsourcing cognitive work to computer systems that are outside of their control and may be misaligned with their interests. This is a bigger problem for you if you are completely emulated but still a problem if you are using computational prostheses. I say it is bottlenecking us because even its partial form seems to be undermining our ability to have rational discourse in the present.

OpenAI's desire for everyone to have AI

I didn't find the full joke/meme again, but, seriously, OpenAI should be renamed to ClosedAI.

"We need free software and hardware so that we can control the programs that run our lives, instead of having a third party control them."

"We need collective governance and monitoring arrangements to keep unfriendly AI accidents from happening."

These statements appear to be in conflict. Does anyone see a resolution?

Interesting article.  

This is often in the name of copyright protection and increasingly in recent years in the name of providing better security. 'Better security' of course begs the question, against what threat model? It's better security against malicious 3rd parties but what if I'm worried about what the 1st parties are doing?

I'd like to add that FOSS generally has better, although sometimes imperfect (the node-ipc malware, for example), security against 3rd party malware as well. This is summed up pretty well by Linus's law: "given enough eyeballs, all bugs are shallow". If we're going to have BCI in the future, it's crucial that it's done securely, and FOSS would definitely be an advantageous paradigm for that.

To clarify, it's the ability to lock your bootloader that I'm saying is better protection from 3rd parties, not the proprietary nature of many of the current locks. The Heads tools, for example, which allow you to verify the integrity of your boot image in coreboot, would be a FOSS alternative that provides analogous protection. Indeed, it's not real security if it's not out there in the open for everyone to hammer with full knowledge of how it works, with some nice big bug bounties (intentional or unintentional) on the other side to incentivise some scrutiny.
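
As a rough illustration of what "verifying the integrity of a boot image" means at its simplest, here is a minimal sketch that compares a file's SHA-256 digest against a digest recorded while the image was known to be good. This is not how any particular firmware project implements it (real measured-boot schemes typically involve hardware roots of trust and signed measurements); the file name `boot.img` and the helper names are hypothetical and exist only for this example.

```python
import hashlib
import hmac
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def image_is_untampered(path: Path, trusted_hex: str) -> bool:
    """Compare the current digest with one recorded when the image was trusted."""
    return hmac.compare_digest(sha256_of(path), trusted_hex)


# Toy demonstration with a stand-in "boot image" (hypothetical file name).
with tempfile.TemporaryDirectory() as tmp:
    image = Path(tmp) / "boot.img"
    image.write_bytes(b"known-good kernel and initrd")
    trusted = sha256_of(image)            # recorded while the image is trusted

    before = image_is_untampered(image, trusted)

    image.write_bytes(b"known-good kernel and initrd + implant")
    after = image_is_untampered(image, trusted)

print(before, after)
```

The point of doing this with open tooling is that anyone can read exactly what is being measured and how, which is the scrutiny argument made above.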


The only safe AGI software is libre AGI software? I buy that.

Given: Closed source Artificial General Intelligence requires all involved parties to have no irreconcilable differences.

Thence: The winner of a closed source race will inevitably be the party with the highest homogeneity times intelligence.

Thence: Namely, the CCP.

Given: Alignment is trivial.

Thence: The resulting AI will be evil.

Given: Alignment is difficult.

Given: It's not in the CCP's character to care.

Thence: Alignment will fail.

Based on my model of reality, closed sourcing AI research approaches the most wrong and suicidal decisions possible (if you're not the CCP). Closed groups fracture easily. Secrets breed distrust, which in turn breeds greater secrecy and smaller shards. The solution to the inevitable high conflict environment is a single party wielding overwhelming power.

Peace, freedom- civilization starts with trust. Simply building aligned AI is insufficient. Who it is aligned with is absolutely critical.

Given this, to create civilized AI, the creator must create in a civilized manner.

Who it is aligned with is absolutely critical. 

One formula for friendly AI is, that it should be the kind of moral agent a human being would become, if they could improve themselves in a superhuman fashion. (This may sound vague; but imagine that this is just the informal statement, of a precise recipe expressed in terms of computational neuroscience.) 

If one were using a particular human or class of humans as the "seed", they would then only need a slight preponderance of virtue over vice (benevolence over malevolence? reason over unreason?) to be suitable, since such a being, given the chance to self-improve, would want to increase its good and decrease its bad; and, in this scenario, the improved possible person is what the AI is supposed to align with, not the flawed actual person. 

closed sourcing AI research approaches the most wrong and suicidal decisions possible 

One of the risks of open sourcing arises from the separation between general problem-solving algorithms and the particular values or goals that govern them. If you share the code for the intelligent part of your AI, you are allowing that problem-solving power to be used for any goal at all. One might therefore wish to only share code for the ethical part of the AI, the part that actually makes it civilized. 

they would then only need a slight preponderance of virtue over vice

This assumes that morality has only one axis, which I find highly unlikely. I would expect the seed to quickly radicalize, becoming good in ways that the seed likes, and becoming evil in ways that the seed likes. Under this model, if given a random axis, the seed comes up good 51% of the time, I would expect the aligned AI to remain 51% good.

Assuming the axes do interact, if they do so inconveniently, for instance if we posit that evil has higher evolutionary fitness, or that self destruction becomes trivially easy at high levels, an error along any one axis could break the entire system.

Also, if I do grant this, then I would expect the reverse to also be true.
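
The difference between the two readings can be made concrete with a toy model. This is only a sketch under strong assumptions: axes are independent, "self-improvement" is modelled as plain amplification, and all function names are hypothetical. In the per-axis version each trait is pushed further in whatever direction the seed already leans; in the single-axis version only the net balance matters and every trait is pulled toward it.

```python
import random


def make_seed(n_axes, p_good=0.51, rng=None):
    """Random seed agent: each axis is good (+) with probability p_good."""
    rng = rng or random.Random(0)
    return [
        (1 if rng.random() < p_good else -1) * rng.uniform(0.1, 1.0)
        for _ in range(n_axes)
    ]


def amplify_per_axis(traits, factor=10.0):
    """Multi-axis reading: every trait grows in the direction it already points."""
    return [t * factor for t in traits]


def amplify_single_axis(traits, factor=10.0):
    """One-axis reading: the net balance decides the direction of every trait."""
    direction = 1 if sum(traits) >= 0 else -1
    return [direction * abs(t) * factor for t in traits]


def fraction_good(traits):
    """Share of axes on which the agent comes out good."""
    return sum(t > 0 for t in traits) / len(traits)


seed = make_seed(1000)
print("seed:        ", fraction_good(seed))
print("per-axis:    ", fraction_good(amplify_per_axis(seed)))
print("single-axis: ", fraction_good(amplify_single_axis(seed)))
```

Under the per-axis assumption the fraction of good axes is unchanged by amplification, which is the "remains 51% good" claim; under the single-axis assumption the outcome is all-or-nothing in whichever direction the net balance points, which is also why the reverse case would follow.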

One might therefore wish to only share code for the ethical part of the AI

This assumes you can discern the ethical part and that the ethical part is separate from the intelligent part.

Even given that, I still expect massive resources to be diverted from morality towards intelligence, A: because people want power and people with secrets are stronger than those without and B: because people don't trust black boxes, and will want to know what's inside before it kills them.

Thence, civilization would just reinvent the same secrets over and over again, and then the time limit runs out.

they would then only need a slight preponderance of virtue over vice

This assumes that morality has only one axis, which I find highly unlikely.

This is a good and important point. A more realistic discussion of aligning with an idealized human agent might consider personality traits, cognitive abilities, and intrinsic values as among the properties of the individual agent that are worth optimizing, and that's clearly a multidimensional situation in which the changes can interact, even in confounding ways. 

So perhaps I can make my point more neutrally as follows. There is both variety and uniformity among human beings, regarding properties like personality, cognition, and values. A process like alignment, in which the corresponding properties of an AI are determined by the properties of the human being(s) with which it is aligned, might increase this variety in some ways, or decrease it in others. Then, among the possible outcomes, only certain ones are satisfactory, e.g. an AI that will be safe for humanity even if it becomes all-powerful. 

The question is, how selective must one be, in choosing who to align the AI with. In his original discussions of this topic, back in the 2000s, Eliezer argued that this is not an important issue, compared to identifying an alignment process that works at all. He gave as a contemporary example, Al Qaeda terrorists: with a good enough alignment process, you could start with them as the human prototype, and still get a friendly AI, because they have all the basic human traits, and for a good enough alignment process, that should be enough to reach a satisfactory outcome. On the other hand, with a bad alignment process, you could start with the best people we have, and still get an unfriendly AI. 

One might therefore wish to only share code for the ethical part of the AI

This assumes you can discern the ethical part and that the ethical part is separate from the intelligent part.

Well, again we face the fact that different software architectures and development methodologies will lead to different situations. Earlier, it was that some alignment methodologies will be more sensitive to initial conditions than others. Here it's the separability of intelligence and ethics, or problem-solving ability and problems that are selected to be solved. There are definitely some AI designs where the latter can be cleanly separated, such as an expected-utility maximizer with arbitrary utility function. On the other hand, it looks very hard to pull apart these two things in a language model like GPT-3. 
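
To illustrate the separable case, here is a minimal sketch of an expected-utility maximizer in which the decision procedure and the utility function are entirely independent pieces of code. The actions, outcomes, probabilities and both utility functions are made up for illustration; nothing here is meant to represent a real AI design.

```python
from typing import Callable, Dict

Lottery = Dict[str, float]  # outcome name -> probability


def expected_utility(lottery: Lottery, utility: Callable[[str], float]) -> float:
    """Probability-weighted utility of one action's possible outcomes."""
    return sum(p * utility(outcome) for outcome, p in lottery.items())


def choose_action(actions: Dict[str, Lottery], utility: Callable[[str], float]) -> str:
    """The 'intelligent part': pick the action with the highest expected utility."""
    return max(actions, key=lambda a: expected_utility(actions[a], utility))


# Hypothetical world model: two actions with known outcome probabilities.
actions = {
    "cautious": {"small_gain": 1.0},
    "risky": {"big_gain": 0.5, "disaster": 0.5},
}


# Two interchangeable 'value parts' plugged into the same planner.
def indifferent_to_harm(outcome: str) -> float:
    return {"small_gain": 1.0, "big_gain": 10.0, "disaster": 0.0}[outcome]


def averse_to_harm(outcome: str) -> float:
    return {"small_gain": 1.0, "big_gain": 10.0, "disaster": -100.0}[outcome]


print(choose_action(actions, indifferent_to_harm))
print(choose_action(actions, averse_to_harm))
```

The same `choose_action` serves any goal it is handed, which is exactly why sharing the planner alone carries risk. A language model has no comparably clean seam where the values could be swapped out.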

My notion was that the risk of sharing code is greatest for algorithms that are powerful general problem solvers which have no internal inhibitions regarding the problems that they solve; and that the code most worth sharing, is "ethical code" that protects from bad runaway outcomes by acting as an ethical filter. 

But even more than that, I would emphasize that the most important thing is just to solve the problem of alignment in the most important case, namely autonomous superhuman AI. So long as that isn't figured out, we're gambling on our future in a big way. 
