Matthew Khoriaty's Shortform

by Matthew Khoriaty
21st Feb 2025
24 comments, sorted by top scoring
[-]Matthew Khoriaty 4mo

The current cover of If Anyone Builds It, Everyone Dies is kind of ugly, and I hope it is just a placeholder. At least one of my friends agrees. Book covers matter a lot!

I'm not a book cover designer, but here are some thoughts:

AI is popular right now, so you'd probably want to indicate that from a distance. The current cover has "AI" half-faded in the tagline.

Generally the cover is not very nice to look at. 

Why are you de-emphasizing "Kill Us All" by hiding it behind that red glow?

I do like the font choice, though. No-nonsense and straightforward.

 @Eliezer Yudkowsky @So8res 

[-]whestler 3mo

I work as a designer (but not a cover designer) and I agree. This should be redesigned. 

Straight black-and-white text isn't a great choice here; it makes me think of science fiction and amateur publications rather than a serious book about technology, philosophy, and consequences. For covers which have done well in this space, take a look at the Waterstones best sellers for science and tech.

[-]cubefox 3mo

Yeah. It is probably even more important for the cover to look serious and "academically respectable" than for it to look maximally appealing to a broad audience. It shouldn't give the impression of a science fiction novel or a sensationalist crackpot theory. An even more negative example of this kind (in my opinion) is the American cover of The Beginning of Infinity by David Deutsch.

[-]TsviBT 3mo

My version:

Probably too understated, but it's the sort of thing I like.

Google Drawings link if anyone wants to copy and modify: https://docs.google.com/drawings/d/10nB-1GC_LWAZRhvFBJnAAzhTNJueDCtwHXprVUZChB0/edit

[+]Canaletto 3mo (comment collapsed)
[-]Vale 3mo

Extremely quickly thrown together concept.

[image: If-Anyone.png]

[-]cubefox 3mo

Not sure about the italics, but I like showing Earth this way from space. It drives home a sense of scale.

[-]Jack Payne 3mo

Here are a couple of suggestions:

https://imgur.com/X5yyp9N

https://imgur.com/kxn99Uj

https://imgur.com/UCm4n0W

https://imgur.com/GXQLuOS

[-]Kajus 3mo

I also took a stab at this idea. Here is my cover; the empty part on the left is for the back.

[-]Kajus 3mo

I think that Nate Soares and Yudkowsky aren't really well-known names, so the cover should do some name-dropping (the current one doesn't).

[-]Zachary 3mo

It is truly terrible.

[-]tryhard1000 4mo

I actually find the font a bit hard to read: my System 1 brain took a noticeable split second (I'd estimate about 0.8 seconds) longer to process the words' semantic meanings than it does with normal, all-lowercase text, or even with the titles of the other book covers at the Amazon link. This took long enough that I could see myself (i.e. my System 1) glossing over this book entirely when scrolling/looking through a page of books, being drawn to more immediately legible items. 

Although the above might just be a quirk of my personal attention/processing style, I wonder if it's worth experimenting with the font. I suspect my experience was due in part to the heavy font weight, since the title's characters look less immediately distinguishable (and more blobby) than at lower weights. There are also a few very narrow spaces between adjacent words that probably make the words harder to tell apart at a glance. As mentioned above, the topic of AI also isn't immediately clear from the title, which I'd worry might lose domain-interested readers who don't parse it semantically.

[-]Canaletto 3mo

I ran the prompt a few times through different image generators, and I actually liked this one. It's the same kind of palette, but with a "photo" of a sunset sky in the background and a thinner font. Might be a good starting point as a prototype.

Link to the image; it just looks better if you squint a bit (link).

The prompt was: "The ominous cover of "If Anyone Builds it, Everyone Dies" book by Eliezer Yudkowsky and Nate Soares. On black background, grey clouds, illuminated by red light from the ground which is not visible."

[-]Kajus 3mo

This one is extremely good. I already made one with simple black/red gradients, but I like this one much more. I can mix mine and yours together to create a grammatically correct one.

[-]Canaletto 3mo

Do it!

[-]Kajus 3mo

If you go to Amazon, most of the books in that section look similar.

[-]Matthew Khoriaty 3mo

I'd say that Empire of AI, AI Snake Oil, and The Age of AI are good book covers, and that Genesis and More Everything Forever are bad covers. 

[-]Matthew Khoriaty 4mo

Scalable oversight is an accessible and relatable kind of idea. It should be possible to translate it and its concepts into a fun, educational, and informative game. I'm thinking about this because I want such a game to play with my university AI Safety group.

[-]Matthew Khoriaty 6mo

RL techniques (reasoning + ORPO) have had incredible success on reasoning tasks. It should be possible to apply them to any task with a failure/completion reward signal (provided the signal isn't too noisy and the model can sometimes succeed).

Is it time to make the automated Alignment Researcher?

Task: write LessWrong posts and comments. Reward signal: get LessWrong upvotes.

More generally, what is stopping people from making RL forum posters on e.g. Reddit that will improve themselves? A rough sketch of the loop I have in mind is below.
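
For concreteness, here is a minimal sketch treating net upvotes as the scalar reward in a plain REINFORCE update. Everything named here (generate, fetch_votes, logprob) is a hypothetical stand-in for a real model API and forum client, not an existing interface.

```python
# Minimal sketch, assuming net upvotes can be read back as a scalar reward.
# `generate`, `fetch_votes`, and `logprob` are hypothetical stand-ins for a
# real model API and a forum client; nothing here is an existing library call.
import torch

def reinforce_step(model, optimizer, prompts, generate, fetch_votes, logprob):
    """One REINFORCE update: sample a post per prompt, score it by votes.
    Assumes len(prompts) > 1 so the reward normalization is well-defined."""
    rewards, logps = [], []
    for prompt in prompts:
        post = generate(model, prompt)              # sample a post from the policy
        rewards.append(float(fetch_votes(post)))    # net upvotes, after some delay
        logps.append(logprob(model, prompt, post))  # summed token log-probs (tensor)
    r = torch.tensor(rewards)
    r = (r - r.mean()) / (r.std() + 1e-6)           # normalize: a crude baseline
    loss = -(r * torch.stack(logps)).mean()         # REINFORCE objective
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In practice the feedback here is sparse, delayed, and noisy, which is exactly the regime where plain REINFORCE struggles, so you would want large batches and a better baseline.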

[-]Viliam 6mo

"More generally, what is stopping people from making RL forum posters on e.g. Reddit that will improve themselves?"

Could be a problem with not enough learning data: you get banned for making bad comments before you get enough feedback to learn how to write good ones. Also, people don't necessarily upvote based on your comment alone; they may also take your history into account (if you were annoying in the past, they may get angry even about a mediocre comment, while if you were nice in the past, they may be more forgiving). And comments happen in a larger context: a comment that is downvoted in one forum could be upvoted in a different forum, or in the same forum but a different thread, or maybe even the same thread on a different day (for example, if your comment just says what someone else already said before).

Maybe someone is already experimenting with this on Facebook, but the winning strategy seems to be reposting cute animal videos, or posting an AI generated picture of a nice landscape with the comment "wow, I didn't know that nature is so beautiful in <insert random country>". (At least these seem to be over-represented in my feed.)

"Task: write LessWrong posts and comments. Reward signal: get LessWrong upvotes."

Sounds like a good way to get banned. But as a thought experiment, you might start at some place where people judge content less strictly, and gradually move towards more difficult environments? Like, before LW, you should probably master the "change my view" subreddit. Before that, probably Hacker News. I am not sure about the exact progression. One problem is that the easier environments might teach the model actively bad habits that would later prevent it from succeeding in the stricter environments.

But, to state the obvious, this is probably not a desirable thing, because the model could get high LW karma by simply exploiting our biases, or just by posting a lot (once its average comment is received positively).

[-]Matthew Khoriaty 6mo

The Facebook bots aren't doing R1- or o1-style reasoning about the context before making an optimal reinforcement-learned post. It's probably just bandits, or humans making a trash-producing algorithm that works and letting it loose.

Agreed that I should try Reddit first. And I think there should be ways to guide an LLM towards the reward signal of "write good posts" before starting the RL, though I didn't find any established techniques when I researched reward-model-free reinforcement learning loss functions that act on the number of votes a response receives. (What I mean: searching DPO's citations for "vote" gives lots of results, though none of them have many citations.)
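
As a concrete sketch of the kind of reward-model-free setup I was looking for (assuming PyTorch), one could turn vote counts into DPO-style preference pairs: within a thread, treat every higher-voted comment as "chosen" over every lower-voted one. The pairing rule and all names here (make_pairs, beta, the toy log-probs) are my own illustrative assumptions, not something taken from those papers.

```python
# A sketch under the assumptions above: vote counts become DPO preference pairs.
import torch
import torch.nn.functional as F

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """DPO loss over a batch of preference pairs.

    logp_*     : summed log-probs of the chosen (w) / rejected (l) comment
                 under the policy being trained
    ref_logp_* : the same quantities under a frozen reference model
    """
    logits = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    return -F.logsigmoid(logits).mean()

def make_pairs(comments):
    """Turn (text, votes) records from one thread into preference pairs:
    each higher-voted comment is 'chosen' over each lower-voted one."""
    ranked = sorted(comments, key=lambda c: c["votes"], reverse=True)
    return [(w, l)
            for i, w in enumerate(ranked)
            for l in ranked[i + 1:]
            if w["votes"] > l["votes"]]

# Toy usage; the log-probs are made-up numbers standing in for model outputs.
pairs = make_pairs([{"text": "a", "votes": 12}, {"text": "b", "votes": -3}])
loss = dpo_loss(torch.tensor([-42.0]), torch.tensor([-40.0]),
                torch.tensor([-41.5]), torch.tensor([-40.2]))
print(pairs, loss.item())
```

One nice property: this only uses relative vote orderings within a thread, which sidesteps the question of how many absolute upvotes a "good" post gets in different venues.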

[-]cubefox 6mo

Reinforcement learning is very sample-inefficient compared to supervised learning, so it mostly only works if you have some automatic way of generating both training tasks and rewards, one that scales to millions of samples.

[-]Matthew Khoriaty 6mo

DeepSeek R1 used 8,000 samples; s1 used 1,000 offline samples. That really isn't all that much.

[-]cubefox 6mo

s1 is apparently using supervised learning:

"We seek the simplest approach to achieve test-time scaling and strong reasoning performance. First, we curate a small dataset s1K of 1,000 questions paired with reasoning traces (...). After supervised finetuning the Qwen2.5-32B-Instruct language model on s1K (...)."

But 8,000 samples, as in R1, is a lot less than I thought.
