LESSWRONG

installgentoo
6220

Comments

Sorted by
Newest
No wikitag contributions to display.
LLM+Planners hybridisation for friendly AGI
installgentoo 1y 1 0

Do you want to make a demo with DSPy + the GPT-4 API + Fast Downward?

But why would the AI kill us?
installgentoo 2y 1 -3

You can make AI care about us with this one weird trick:

1. Train a separate agent-action reasoning network. For LLM tech this means training on completing interaction sentences, think "Alice pushed Bob. ___ fell due to ___", with a tokenizer that generalizes agents (Alice and Bob) into generic tokens {agent 1, ..., agent n} plus a "self agent" token. We then replace the various Alices and Bobs in action sentences with the generic agent tokens, and train the network to guess the consequences or prerequisites of actions in real situations, which you can get from any text corpus.

2. Prune anything that has to do with agent reasoning from the parent LLM. Any reasoning about agents has to go through the agent reasoning network.

3. Wherever the Q-learning cost is computed, we replace an agent token in the agent reasoning net's input with the "self" token, rerun the network, and take the higher of the two costs. Repeat this for every agent token. The agent is then forced to treat harm to anyone else as harm to itself.
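A toy sketch of steps 1 and 3, just to make the substitution concrete. Everything here is made up for illustration: the token strings `<self>` and `<agent_k>`, the helper names `anonymize` and `empathic_cost`, and especially `toy_cost`, which is a hand-written stand-in for whatever learned cost the agent reasoning network would actually produce. It is not a real tokenizer or a Q-learning implementation.

```python
from typing import Callable, Dict, List, Optional, Sequence

SELF = "<self>"  # hypothetical "self agent" token


def anonymize(tokens: Sequence[str], agents: Sequence[str],
              self_name: Optional[str] = None) -> List[str]:
    """Step 1: replace concrete agent names with generic agent tokens."""
    mapping: Dict[str, str] = {}
    out: List[str] = []
    for tok in tokens:
        if self_name is not None and tok == self_name:
            out.append(SELF)
        elif tok in agents:
            # Number agents in order of first appearance.
            mapping.setdefault(tok, f"<agent_{len(mapping) + 1}>")
            out.append(mapping[tok])
        else:
            out.append(tok)
    return out


HARM_VERBS = {"pushed", "hit"}  # toy assumption, not a real harm model


def toy_cost(tokens: Sequence[str]) -> float:
    """Stand-in for the learned cost: 1.0 if the self agent is harmed."""
    for i in range(len(tokens) - 1):
        if tokens[i] in HARM_VERBS and tokens[i + 1] == SELF:
            return 1.0
    return 0.0


def empathic_cost(tokens: Sequence[str],
                  cost_fn: Callable[[Sequence[str]], float]) -> float:
    """Step 3: for each agent token, substitute the self token, rerun the
    cost, and keep the highest cost seen (including the original)."""
    worst = cost_fn(tokens)
    others = sorted({t for t in tokens if t.startswith("<agent_")})
    for agent in others:
        swapped = [SELF if t == agent else t for t in tokens]
        worst = max(worst, cost_fn(swapped))
    return worst


sentence = anonymize("Alice pushed Bob".split(), agents=["Alice", "Bob"])
plain = toy_cost(sentence)                    # nobody is "self", so no cost
empathic = empathic_cost(sentence, toy_cost)  # Bob-as-self is harmed
```

With the toy cost, the plain sentence costs nothing because no token is the self agent, while the empathic version picks up the cost of the variant where the pushed agent is replaced by the self token.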

Emulated empathy. I'm concerned that this might produce a basilisk, though. It will also be inherently weak against predatory AI, since it will be forced to account for the predator's wellbeing too. It might still work if these agents are deployed autonomously and allowed to grow exponentially. I dunno. I think we'll all die this year, knowing humanity.

7 LLM+Planners hybridisation for friendly AGI
1y
2
1 Empathy bandaid for immediate AI catastrophe
2y
2