LESSWRONG
Goal-Directedness
• Applied to [Interim research report] Evaluating the Goal-Directedness of Language Models by Rauno Arike 3mo ago
• Applied to A "Bitter Lesson" Approach to Aligning AGI and ASI by RogerDearnaley 3mo ago
• Applied to Emotional issues often have an immediate payoff by Chipmonk 4mo ago
• Applied to Measuring Coherence and Goal-Directedness in RL Policies by dx26 6mo ago
• Applied to Understanding mesa-optimization using toy models by tilmanr 6mo ago
• Applied to Measuring Coherence of Policies in Toy Environments by dx26 7mo ago
• Applied to Refinement of Active Inference agency ontology by Roman Leventov 10mo ago
• Applied to Quick thoughts on the implications of multi-agent views of mind on AI takeover by Kaj_Sotala 10mo ago
• Applied to Towards an Ethics Calculator for Use by an AGI by sweenesm 10mo ago
• Applied to “Clean” vs. “messy” goal-directedness (Section 2.2.3 of “Scheming AIs”) by RobertM 10mo ago
• Applied to FAQ: What the heck is goal agnosticism? by RobertM 1y ago
• Applied to A thought experiment to help persuade skeptics that power-seeking AI is plausible by jacob_drori 1y ago
• Applied to Clarifying how misalignment can arise from scaling LLMs by Util 1y ago
• Applied to Think carefully before calling RL policies "agents" by TurnTrout 1y ago