LESSWRONG

Michael Liu

I like to lurk. 


Comments

Beware General Claims about “Generalizable Reasoning Capabilities” (of Modern AI Systems)
Michael Liu · 3mo

Really good and well-written post. While I think it is always good to provide vigorous and strong evidence against Gary Marcus (and others) when he defaults to promoting claims that confirm his bias, I wonder if there are better long-term solutions to this issue. Normal everyday people who don't follow AI, or who show only modest interest in learning more about it, will likely never see these posts or have even heard of LW/Alignment Forum/80k/etc. I do think that the lack of public mainstream content is part of the issue, but I'm sure that there are lots of difficulties that I am naive to. Also, I sense that there is a distrust of people working in the field as salesmen trying to hype up AI or feed their ego about how important their work is, so that might not be the best solution either. Still, I'm curious to hear whether anyone is working towards concrete ideas for bringing informed, strongly evidence-backed claims about AI progress to the mainstream (think national TV), or what the biggest roadblocks are that make it hard or impossible.

Gemini Diffusion: watch this space
Michael Liu · 3mo

Agreed. I highly recommend this blog post (https://sander.ai/2024/09/02/spectral-autoregression.html) for concretely understanding why autoregressive and diffusion models are so similar, despite seeming so different.  

So how well is Claude playing Pokémon?
Michael Liu · 6mo

According to the creator of "Claude plays Pokemon", Claude's internal knowledge can often do more harm than good when it comes to successfully navigating the game. In the system prompt, Claude is specifically told not to trust its instincts and to rely on the memories in its context. See (starting @ 20:28): 
