Auto-Enhance: Developing a meta-benchmark to measure LLM agents’ ability to improve other agents
by Sam F. Brown, Basil Labib, Codruta (Coco) Lugoj, and Sai Sasank Y
Summary
* Scaffolded LLM agents are, in principle, able to execute arbitrary code to achieve the goals they have been set. One such goal could be self-improvement.
* This post outlines our plans to build a benchmark to measure the ability of LLM agents to modify and improve other LLM...
Jul 22, 2024