Really nice comment that I also happen to agree with. As a programmer working with Claude Code and Cursor every day, I have yet to see AI systems achieve "engineering taste", which seems far easier than the "research taste" discussed by the OPs. In my experience, these systems cannot perform medium-term planning and execution of tasks, even those that are clearly within distribution.
Perhaps the apparent limitations relate to the independent probability of things going wrong when you aren't maintaining a consistent world-model / in-line learning and feedback.
For example, even if 0.90 of your actions are correct, and any single incorrect action can independently tank the task, then your probability of success after 6 actions is only 0.9^6 ≈ 0.53.
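As a rough illustration of that compounding effect, here is a minimal sketch, assuming each step succeeds independently with the same probability p (an assumption of this toy model, not something the comment claims precisely):

```python
# Toy model: probability a multi-step task succeeds when each step
# independently succeeds with probability p (illustrative assumption).
def task_success_probability(p: float, n_steps: int) -> float:
    return p ** n_steps

for n in (1, 3, 6, 12):
    print(n, round(task_success_probability(0.9, n), 2))
# 1 0.9, 3 0.73, 6 0.53, 12 0.28
```

Even with a fairly high per-step accuracy, the chance of getting through a longer sequence of actions without derailing drops off quickly.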