LESSWRONG

Rafael Cosman

Comments
Someone already tried "Chaos-GPT"
Rafael Cosman, 2y

I'm curious whether a good "hero-GPT" or "alignment-research-support-GPT" could be useful today or with slightly improved tech. Of course, having something like this run autonomously is not without risk, but it might be quite valuable/important in the sub-critical AI era.

Creating a truly formidable Art
Rafael Cosman, 3y

Hey Valentine, I really like this post. I think it hits on some key things that traditional LW culture was missing for a while. I was wondering if you've ever encountered The Conscious Leadership Group (https://conscious.is/). They explicitly train some techniques similar to what you're describing here (as well as some quite different ones).

Prototype of Using GPT-3 to Generate Textbook-length Content
Rafael Cosman, 3y

Cool, thanks for sharing! Hadn't heard of Metaphor before.

Prototype of Using GPT-3 to Generate Textbook-length Content
Rafael Cosman, 3y

I might be able to code up an 'editing' pass to catch things like that!

Prototype of Using GPT-3 to Generate Textbook-length Content
Rafael Cosman, 3y

:)

Consider using reversible automata for alignment research
Rafael Cosman, 3y

I have spent some time playing with reversible CAs (cellular automata) and can confirm that they are very interesting. They are a great example of how provable high-level properties (things like conservation of gliders) can come out of low-level properties (reversibility).
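
For readers who want something concrete to poke at, here is a minimal sketch of one well-known way to build a reversible CA: the second-order construction, where the next row is the local rule applied to the current row, XORed with the previous row. The comment above does not say which reversible CAs were used, so the choice of construction, the use of elementary rule 30 as the local rule, the ring boundary, and all function names are assumptions made for illustration. The sketch only demonstrates reversibility (running backward recovers the starting rows exactly); it does not attempt to show glider conservation.

```python
from typing import List, Tuple

State = List[int]  # one row of cells, each 0 or 1, on a ring


def local_rule(left: int, center: int, right: int, rule: int = 30) -> int:
    """Elementary-CA lookup: the 3-cell neighborhood selects one bit of `rule`."""
    return (rule >> (4 * left + 2 * center + right)) & 1


def _rule_row(cells: State, rule: int) -> State:
    """Apply the local rule to every cell, wrapping around at the edges."""
    n = len(cells)
    return [
        local_rule(cells[(i - 1) % n], cells[i], cells[(i + 1) % n], rule)
        for i in range(n)
    ]


def step(prev: State, curr: State, rule: int = 30) -> Tuple[State, State]:
    """Forward step: next = f(curr) XOR prev. Returns (curr, next)."""
    nxt = [f ^ p for f, p in zip(_rule_row(curr, rule), prev)]
    return curr, nxt


def step_back(curr: State, nxt: State, rule: int = 30) -> Tuple[State, State]:
    """Backward step: prev = f(curr) XOR next. Returns (prev, curr)."""
    prev = [f ^ x for f, x in zip(_rule_row(curr, rule), nxt)]
    return prev, curr


if __name__ == "__main__":
    # Hypothetical starting pair: an empty row followed by a single live cell.
    start_prev = [0] * 16
    start_curr = [0] * 16
    start_curr[8] = 1

    older, newer = start_prev, start_curr
    for _ in range(50):
        older, newer = step(older, newer)
    for _ in range(50):
        older, newer = step_back(older, newer)

    # Reversibility: undoing 50 steps recovers the starting pair exactly.
    assert (older, newer) == (start_prev, start_curr)
```

The reversibility here is a low-level property that holds for any choice of local rule, because XOR with the previous row can always be undone. That makes it a convenient setting for asking which higher-level properties follow from it.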

Jailbreaking ChatGPT on Release Day
Rafael Cosman, 3y

This is absolutely hilarious, thank you for the post. 

What is wrong with this approach to corrigibility?
Rafael Cosman, 3y

Great answer, thanks!

What an actually pessimistic containment strategy looks like
Rafael Cosman, 3y

Thanks for the post! I think asking AI capabilities researchers to stop is pretty reasonable, but I think we should be especially careful not to alienate the people closest to our side. For example, consider how Protestants and Catholics fought even though they agreed on so much.

I like focusing on our common ground and using that to win people over.

Posts

Prototype of Using GPT-3 to Generate Textbook-length Content (3y, 2 karma, 8 comments)
What is wrong with this approach to corrigibility? (Question, 3y, 7 karma, 8 comments)