I was thinking that evals which gate the deployment of LLMs could be something that needs multiple stakeholders to agree upon.

But really it is a general-purpose pattern.

Agreed code as a coordination mechanism

Code nowadays can do lots of things, from buying items to controlling machines. This makes code a possible coordination mechanism: if you can get multiple people to agree on what code should run in particular scenarios and situations, that code can take actions on behalf of those people that need to be coordinated.

This would require moving away from the “one person committing code and another person reviewing it” model.

This could start with many people reviewing the code; people could write their own test suites against the code, or AI agents could be deputised to review it (when that becomes feasible). Only when an agreed-upon number of people approve the code should it be merged into the main system.
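
A minimal sketch of what such a merge gate could look like, assuming a simple quorum rule; the class, reviewer names, and threshold are illustrative placeholders, not a real review system.

```python
# Hypothetical multi-party merge gate: a change is only eligible to merge
# once an agreed-upon number of reviewers (people, their test suites, or
# deputised AI agents) have approved it.
from dataclasses import dataclass, field

@dataclass
class Proposal:
    change_id: str
    required_approvals: int              # quorum agreed on in advance
    approvals: set = field(default_factory=set)

    def approve(self, reviewer: str) -> None:
        self.approvals.add(reviewer)

    def ready_to_merge(self) -> bool:
        return len(self.approvals) >= self.required_approvals

proposal = Proposal(change_id="eval-gate-v2", required_approvals=3)
for reviewer in ["alice", "bob", "carol"]:
    proposal.approve(reviewer)

if proposal.ready_to_merge():
    print(f"{proposal.change_id}: quorum reached, eligible to merge")
```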

Code would be deployed automatically using GitOps, and the people administering the servers would be audited to make sure they could not interfere with the running system without others noticing.
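
One way that audit could be automated, assuming the live system exposes the commit it is running; the status URL, repository path, and function names below are hypothetical, not part of any real deployment.

```python
# Hypothetical GitOps audit check: compare the commit the coordination
# repository agreed to deploy against the commit the live system reports,
# so administrators cannot silently swap out the running code.
import subprocess
import urllib.request

def agreed_commit(repo_path: str, branch: str = "main") -> str:
    """Commit hash the coordination process agreed to deploy."""
    out = subprocess.run(
        ["git", "-C", repo_path, "rev-parse", branch],
        capture_output=True, text=True, check=True,
    )
    return out.stdout.strip()

def running_commit(status_url: str) -> str:
    """Commit hash the live system claims to be running."""
    with urllib.request.urlopen(status_url) as resp:
        return resp.read().decode().strip()

def audit(repo_path: str, status_url: str) -> bool:
    return agreed_commit(repo_path) == running_commit(status_url)

# audit("/srv/coordination-repo", "https://example.org/status/commit")
```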

Code could replace regulation in fast-moving areas like AI. There might have to be legal contracts stipulating that you can't deploy the agreed-upon code, or use it by itself, outside of the coordination mechanism.


 

As well as thinking about the need for the place in terms of providing a space for research, it is probably worth thinking about it in terms of what it provides to the world. What subjects are currently under-represented in the world and need strong representation to guide us to a positive future? That will guide who you want to lead the organisation.

I admit that only extreme circumstances would make slavery consensual and justified. My thinking was that if existential risk were involved, you might consent to slavery to avert it. It would have to be a larger entity than a single human doing the enslaving, because I think I agree that individuals shouldn't do consequentialism. Like being a slave to the will of the people in general, assuming you can get that in some way.

I don't follow the reasoning here

So let's say the person has given up autonomy to avert existential risk; they should perhaps get something in return. Maybe they get influence, but they can't use that influence for their own benefit (as one of the deontological rules stipulates that this is disallowed). So they are stuck trying to avert existential risk with no payoff. If you unenslave them, you remove the will of the people's voice and maybe increase existential risk or s-risks.

Hmm, sorry, I went off on a bit of a tangent here. All very unlikely, agreed.

Tangential, but is there ever justified unconscious slavery? For example, if you were asked whether you consent to slavery and then had your mind wiped, might you get into a situation where the slave doesn't know they consented to it, but the slave master is justified in treating them like a slave?

You would probably need a justification for the master-slave relationship. Perhaps it needs to be hidden for a good reason? Or to create a barrier against interacting with the ethical. In order to dissolve such slavery, understanding the justifications for why it started would be important.

Proposal for a new social norm - explicit modelling

Something that I think would make rationalists more effective at convincing people is if we had explicit models of the things we care about.

Currently we are at the stage of physicists arguing that the atom bomb might ignite the atmosphere, without concrete math or models of how that might happen.

If we do this for lots of issues, and have a norm of making models composable, this would have further benefits:

  • People would use the models to make real-world decisions more accurately
  • We would create easily composable modelling frameworks that other people would use

Both would raise the status and knowledge of the rationalist community.
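
As a rough illustration of what "composable" could mean in practice, here is a minimal sketch where each model is a plain function over a shared state dictionary, so one person's model can feed another's; the specific models and numbers are illustrative placeholders, not real estimates.

```python
# Toy composable models: each model reads and extends a shared state dict,
# and composition is just function chaining.
from functools import reduce
from typing import Callable, Dict

Model = Callable[[Dict[str, float]], Dict[str, float]]

def compute_growth(state: Dict[str, float]) -> Dict[str, float]:
    # toy model: available compute grows at an assumed yearly rate
    return {**state, "compute": state["compute"] * state["compute_growth"]}

def capability(state: Dict[str, float]) -> Dict[str, float]:
    # toy model: capability as a simple function of compute
    return {**state, "capability": state["compute"] ** 0.5}

def compose(*models: Model) -> Model:
    return lambda state: reduce(lambda s, m: m(s), models, state)

world_model = compose(compute_growth, capability)
print(world_model({"compute": 100.0, "compute_growth": 1.4}))
```

One person can publish compute_growth, another capability, and a third can compose them into a larger model without rewriting either.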

Does it make sense to plan for only one possible world, or do you think that the other possible worlds are being adequately planned for and it is only the fast unilateral take-off that is currently neglected?

Limiting AI to operating in space makes sense. You might want to pay off or compensate everyone with space launch capability in some way, as there would likely be less need for it.

Some recompense for the people who paused working on AI, or were otherwise hurt in the build-up to AI, makes sense.

Also, committing ahead of time to communicating what a utopian vision of AI and humans might look like, so that the cognitive stress isn't too major, is probably a good idea.

Committing to support multilateral acts if unilateral acts fail is probably a good idea too. Perhaps even partnering with a multilateral effort so that effort on shared goals can be spread around?

Relatedly, I am thinking about improving the Wikipedia page on recursive self-improvement. Does anyone have any good papers I should include? Ideally ones with models.

I'm starting a new blog here. It is on modelling self-modifying systems, starting with AI. Criticisms welcome.

I'm wary about that one, because that isn't a known "general" intelligence architecture, so we can expect AIs to make better learning algorithms for deep neural networks, but not necessarily themselves.
