LESSWRONG
PaperclipNursery

Biologist (PhD in epigenetics & co-evolution), working in medical science communication.
Interested in AI alignment from an evolutionary and developmental perspective.
Exploring how ideas from biology, cognitive development, and ethics can inform robust alignment strategies.
Here to learn, share ideas, and connect with others working on trustworthy AI.

Comments
So You Think You've Awoken ChatGPT
PaperclipNursery · 3mo · 10

Unfortunately, that's just how it is, and prompting is unlikely to save you: you can flip an AI to be harshly critical with keywords like "brutally honest", but a critic that roasts everything isn't really any better than a critic that praises everything. What you actually need in a critic or collaborator is sensitivity to the underlying quality of the ideas, and AI is ill-suited to provide this.

Are there any models that tend to be better at this sort of task, i.e. constructive criticism? If so, what makes them perform better in this domain? Specific post-training? And why wouldn't "the right prompt" be able to compensate for bias in either direction (blatant sycophancy vs. brutal roasting)?

Reply
From Evolution to Maturation – A Developmental Perspective on Aligning Advanced AI · 3mo