Kajus
Karma: 3,922

Posts

7 · What is the theory of change behind writing papers about AI safety? [Question] · 7mo · 1 comment
1 · Kajus's Shortform · 2y · 48 comments

Wikitag Contributions

No wikitag contributions to display.

Comments (sorted by newest)
Kajus's Shortform
Kajus · 13d · 10

I am creating a comparative analysis of cross-posted posts on LW and EAF. Make your bets! 

I will pull all the posts that were posted on both LW and the EAF and compare how different topics get different amounts of karma and comments (and maybe different comment sentiment) as a proxy for how interested people are and how much they agree with different claims. Make your bets and see if the results tell you anything new! I suspect that LW users are much more interested in AI safety and less often vegan; they care less about animals and are more skeptical of utility maximization. Career-related posts will fare better on the EAF, while posts on rationality (as in the art of rationality) will do better on LW. Productivity posts will get more engagement on the EAF.

It won't be possible to check all of these bets, since the number of cross-posted posts is not that large and they are concentrated in specific topics.
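For reference, roughly the kind of script I have in mind is below. This is a minimal sketch: it assumes both sites expose the standard ForumMagnum GraphQL endpoint at /graphql, and the field and view names (posts, baseScore, commentCount, "top") are from memory and may need adjusting against the live schema. Exact title matching is only a crude proxy for "cross-posted."

```python
# Sketch: pull top posts from LW and the EA Forum and compare scores for
# posts that appear on both. Field names are assumptions, not verified.
import requests

QUERY = """
{
  posts(input: {terms: {view: "top", limit: 500}}) {
    results { title baseScore commentCount }
  }
}
"""

def fetch_posts(base_url):
    # Both forums run ForumMagnum, which serves GraphQL at /graphql.
    resp = requests.post(f"{base_url}/graphql", json={"query": QUERY})
    resp.raise_for_status()
    return {p["title"]: p for p in resp.json()["data"]["posts"]["results"]}

lw = fetch_posts("https://www.lesswrong.com")
eaf = fetch_posts("https://forum.effectivealtruism.org")

# Naive join on exact title as a proxy for cross-posting; this misses
# retitled cross-posts and catches coincidental title collisions.
for title in sorted(lw.keys() & eaf.keys()):
    print(f"{title!r}: LW karma {lw[title]['baseScore']} "
          f"({lw[title]['commentCount']} comments) vs "
          f"EAF karma {eaf[title]['baseScore']} "
          f"({eaf[title]['commentCount']} comments)")
```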


⿻ Plurality & 6pack.care
Kajus · 1mo · 123

I briefly read the 6pack.care website and your post. It sounds to me like an idea supplementary to existing AI safety paradigms, not one that solves the core problem of aligning AIs. Looking at your website, I see it already assumes the AI is mostly aligned, and issues with rogue AIs are not mentioned in the risks section. For example:

A midsize city is hit by floods. The city launches a simple chatbot to help people apply for emergency cash. Here is what attentiveness looks like in action:

  • Listening. People send voice notes, texts, or visit a kiosk. Messages stay in the original language, with a clear translation beside them. Each entry records where it came from and when.
  • Mapping. The team (and the bot) sort the needs into categories: housing, wage loss, and medical care. They keep disagreements visible — renters and homeowners need different proofs.
  • Receipts. Every contributor gets a link to see how their words were used and a button to say “that’s not what I meant.”

and so on. 

Kajus's Shortform
Kajus · 1mo · 30

I applied for Thomas Kwa's SPAR stream, but I have some doubts about the direction of the research, so I'm posting them here to get feedback. Kwa wants to train models to produce something close to neuralese as reasoning traces, then evaluate white-box and black-box monitoring against those traces. It seems obvious to me that once a model switches to neuralese we already know something is wrong, so why test our monitors against neuralese?

Rauno's Shortform
Kajus · 2mo · 10

Wikipedia?

Tomás B.'s Shortform
Kajus · 2mo · 74

Source on the ED risk?

Daniel Kokotajlo's Shortform
Kajus · 3mo · 20

Do you want to stop worrying?

Raemon's Shortform
Kajus · 3mo · 83

I think that on most websites only about 1-10% of users actually post anything. I suspect the number of people having those weird interactions with LLMs (and stopping before posting) is 10 to 10,000 times larger than what we see here (most likely around 100 times).
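To make the arithmetic concrete (the input numbers here are made up, just to show the multiplier):

```python
# Fermi sketch: if only a small fraction of affected users ever post,
# each visible report implies many silent cases. Inputs are illustrative.
visible_reports = 50        # hypothetical number of posts we actually see
posting_fraction = 0.01     # assumed share of affected users who post (1%)
implied_total = visible_reports / posting_fraction
print(implied_total)        # 5000.0 implied affected users, a 100x multiplier
```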

tailcalled's Shortform
Kajus · 4mo · 10

Why?

Kajus's Shortform
Kajus · 4mo · 30

The goals we set for AIs in training are proxy goals. We humans also set proxy goals: we use KPIs, and we talk about solving alignment, ending malaria (proxies for increasing utility and saving lives), budgets, and so on. We can somehow focus on proxy goals while maintaining a higher-level goal at the same time. How is this possible? How can we teach AI to do that?

We’re in Deep Research
Kajus · 4mo · 10

So I got what I wanted. I tried the Zed code editor and, well... it's free and very agentic. I haven't tried Cursor, but I think it might be on the same level.
