LESSWRONG
sanyer
Comments

What training data should developers filter to reduce risk from misaligned AI? An initial narrow proposal
sanyer 1mo* 10

Regarding the results of empirical evaluations of AIs' scheming-relevant capabilities, I think we could do even better than simple removal: replace the real results with synthetic ones that intentionally mislead the AIs about their own capabilities. If the model is good at scheming, we could make it believe it is bad at it, and vice versa. I think this could be quite feasible, because rather than generating fully synthetic data, which may be easy to spot as synthetic, you only need to replace some specific results.

China proposes new global AI cooperation organisation
sanyer 3mo 81

A link to the original article would be appreciated.

Consider chilling out in 2028
sanyer 4mo 30

How many people are working on test-time learning? How feasible do you think it is?

jenn's Shortform
sanyer 5mo 32

I see. Why do you have this impression that the default algorithms would do this? Genuinely asking, since I haven't seen convincing evidence of this.

jenn's Shortform
sanyer 5mo 10

I don't know; the obviously wrong things you see on the internet seem to differ a lot depending on your recommendation algorithm. The strawmanny SJW takes you list are mostly absent from my feed. In contrast, I see LOTS of absurd right-wing takes.

Guide To The Less Wrong Editor
sanyer 5mo 10

The links to subsections in the table of contents seem to be broken.

LessWrong Feed [new, now in beta]
sanyer 5mo 30

It's a Galaxy A54.

I'm not sure how to share screenshots on mobile on LW 😅

LessWrong Feed [new, now in beta]
sanyer 5mo 32

The idea seems cool, but the feed doesn't work well on my phone. It cuts off the sides of the text, which makes things unreadable. (I have a Samsung.)

European Links (18.05.25)
sanyer 6mo 52

"Now, the EU itself needs some reforms badly, namely, as Draghi report suggests, relaxing the regulation, but there seems no political will to do that. At least, last time I’ve checked I have still seen those annoying “accept cookies” banners alive and kicking."

This is not true; there is a lot of political will for deregulation and simplification (see e.g. here). Everyone is talking about it in Brussels. 

I assume the point about "accept cookies" banners was a joke, but just in case it wasn't: it takes time for regulations to change, so the fact that we still see "accept cookies" banners is no evidence that the EU is not taking deregulation seriously. (Whether getting rid of those banners or other GDPR rules would boost competitiveness is a separate question; I suspect it wouldn't.)

Also, IMO the most important reforms we need are not about regulation, but about harmonizing standards across the EU and creating a true single market.

Wei Dai's Shortform
sanyer 6mo 10

I would expect higher competence in philosophy to reduce overconfidence, not increase it? The more you learn, the more you realize how much you don't know.

Posts

President of European Commission expects human-level AI by 2026 (6mo, 35 karma, 4 comments)
On epistemic autonomy (1y, 11 karma, 0 comments)
Two LessWrong speed friending experiments (1y, 52 karma, 3 comments)
Announcing the Double Crux Bot (2y, 53 karma, 11 comments)