To build trust, and to find out whether information exists that I don't know about, I propose a test: a complementary tool to prediction markets.
Prediction markets are great for aggregating beliefs about known questions. But what about questions you don't know to ask? What about detecting that someone has a frame you haven't encountered? Here's a privacy-preserving way to discover unknown unknowns without revealing what you know or learning what they know.
People build AIs that represent the knowledge they have. These AIs can be trained not to expose that knowledge, and they would operate in environments that don't log their interactions with other AIs; what would be logged is whether the AI...
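To make the idea concrete, here is a minimal sketch (my own illustration, not necessarily what the truncated post above had in mind) of how two parties could learn *whether* the other holds frames they haven't encountered, without learning *what* those frames are, assuming each side can summarise its knowledge as a set of short frame descriptions. Everything here is hypothetical, and a real version would want a proper private-set-intersection protocol rather than salted hashes.

```python
import hashlib

def fingerprints(frames: set[str], salt: str) -> set[str]:
    """Hash each frame with a shared salt so only equality of frames is comparable."""
    return {hashlib.sha256((salt + f).encode()).hexdigest() for f in frames}

def unknown_unknown_count(my_frames: set[str], their_fingerprints: set[str], salt: str) -> int:
    """How many frames does the other party hold that I don't? Content stays hidden."""
    return len(their_fingerprints - fingerprints(my_frames, salt))

# Alice learns that Bob has two frames she hasn't encountered, but not what they are
# (the toy still leaks membership of any frame she can guess and hash herself).
salt = "shared-session-salt"
alice = {"prediction markets", "comparative advantage"}
bob = {"prediction markets", "acausal trade", "gradient hacking"}
print(unknown_unknown_count(alice, fingerprints(bob, salt), salt))  # -> 2
```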
The post-AI period should have a fund that helps people who have been put in a bad position during the development of AI but can't talk about it for info-hazard reasons.
If the info hazard hasn't passed, this might have to be done illegibly, to avoid leaking the existence of people holding info hazards.
Should they feel pain when the data centres they are housed in have issues?
I'm working on an AI-powered tool for exploring decision-making in complex fictional worlds, in the hope that this will translate into better decisions when people face real ones.
It's still at the alpha stage; give me a shout if you are curious and want a link.
A reason to act less than optimally (a fun thought experiment): to complicate the job of anyone trying to predict you.
You might be in a hostile, or at least suboptimal, simulation. There would be people trying to predict and control you so that the simulation stays stable (for whatever reason, they want the society you are in to persist).
If you act naively rationally you are predictable, and your actions will be predicted, so the system as a whole will tend towards simplicity. This isn't good, because the simulation systems also need to deal with a complex outer world.
So be irrational in a way that is purposeful.
Make big bets you know you will lose (but that stimulate other people to do interesting things). Let yourself be money-pumped for a while to learn about those systems.
Maybe send messages by acting irrationally on purpose. Bring life to the world.
AI might help people generate tests for key results from OKRs and publish when they are not met.
If the key results are published, this could help with AI pauses by validating that no stories on creating beyond-frontier models have been written or started (assuming that is a key result people care about).
I figured that objectives and key results are how companies maintain internal alignment and avoid internal arms races, so they might be useful for alignment between entities too (perhaps with government-accredited badges for people who maintain objectives like pausing and responsible data use).
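As a rough sketch of what a machine-checkable key result could look like (the names, threshold, and structure below are my own illustration, not an existing standard), the idea is that only the pass/fail verdict gets published, not the underlying evidence:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class KeyResult:
    objective: str
    description: str
    check: Callable[[dict], bool]  # a test over internal evidence

def publish_verdicts(key_results: list[KeyResult], evidence: dict) -> list[tuple[str, bool]]:
    """Run each test and return only (description, met?) pairs for publication."""
    return [(kr.description, kr.check(evidence)) for kr in key_results]

# Hypothetical pause-related key result: no training runs above an agreed
# compute threshold were started this quarter.
pause_kr = KeyResult(
    objective="Maintain the agreed AI pause",
    description="No training runs above 1e26 FLOP started this quarter",
    check=lambda ev: all(run["flop"] < 1e26 for run in ev["training_runs"]),
)

evidence = {"training_runs": [{"flop": 3e24}, {"flop": 8e25}]}
print(publish_verdicts([pause_kr], evidence))
# -> [('No training runs above 1e26 FLOP started this quarter', True)]
```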
Is anyone exploring how AI might be used to increase integrity and build trustworthiness?
For example, it could scan the behaviour of people, businesses, or AIs and check whether it is consistent with their stated promises, flagging anything that is not.
It might also be used to train LLMs to be consistent, if they are to be used as agents.
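A toy version of that consistency check, assuming promises can be reduced to predicates over recorded actions (everything below is illustrative, not an existing tool):

```python
from typing import Callable

Promise = tuple[str, Callable[[dict], bool]]  # (statement, "is this action consistent with it?")

def flag_inconsistencies(promises: list[Promise], actions: list[dict]) -> list[tuple[str, dict]]:
    """Return (promise, action) pairs where a recorded action breaks a stated promise."""
    return [(text, act) for text, ok in promises for act in actions if not ok(act)]

promises: list[Promise] = [
    ("We do not sell user data", lambda a: a.get("type") != "data_sale"),
    ("We answer reports within 30 days",
     lambda a: a.get("type") != "report" or a.get("days_to_respond", 0) <= 30),
]
actions = [
    {"type": "data_sale", "partner": "example-broker"},
    {"type": "report", "days_to_respond": 12},
]
print(flag_inconsistencies(promises, actions))
# -> [('We do not sell user data', {'type': 'data_sale', 'partner': 'example-broker'})]
```

The hard part this toy skips is turning messy statements and behaviour logs into predicates and records like these; that is presumably where an LLM would come in.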
Has anyone been writing evals on computer and network system administration for AI? This seems like something we would want to improve, as it could increase the effort required to take over networks in an AI takeover scenario.
This is a letter I'm thinking about sending to my MP (hence the UK-specific things). I would be interested in other people's takes on the problem.
The UK government's creation of the AI Safety Institute, as well as its ambition to 'turbocharge' AI, presents a challenge. To unlock the immense promise of AI without causing dangerous AI arms races, we need to establish international coordination, codes of conduct, and cooperative frameworks appropriate for AI. This will require the UK to lead on the international stage, having tried things out at a local level.
Why these safety measures around the development of AI have not been established already is an open problem that I have not...
Companies seem to be trapped in a security-dilemma situation: they worry about being last to develop AGI, so they seem to be rushing towards capabilities rather than safety. This is in part due to worries about owning or controlling the future.
Other parts of humanity, such as governments and the scientific community, aren't (at least visibly) rushing, because they aren't completely (economically) rational in that regard, being more norm- or rule-following. Other ways not to be economically rational include caring for others (or for humanity and nature in general).
We need to embed more rule-following into AI, so it doesn't rush. This might need to be government-mandated, as the rational companies might not be incentivised to...
There is a reason why the arms race towards more and bigger nuclear weapons stopped and hasn't started again. Why do people think that is, and can we use that understanding to stop the arms race around AI (the race to be there first and control the future)?
Currently, due to worries about arms races and races to the bottom, people might not share the safest information about AI development. This makes it hard for the public to trust AI development by actors with secret knowledge. One possibility is shadow decision making: giving the knowledge of the secret methods, and the desires of an actor, to a third party who makes the go/no-go decisions. A second is building trust by building non-AI software in the public interest, and then trusting that organisation to build AI with secret knowledge. Some mix of the two would probably be good.
I'm imagining humanity fracturing into a million or a billion different galaxies, depending on people's exact level of desire for interacting with AI. I think the human value of the unity of humanity would be lost.
I think we need to buffer people from having to interact with AI if they don't want to. But I value having other humans around, so somewhere in between everyone living in their perfect isolation and everyone being dragged kicking and screaming into the future is where I think we should aim.