This is a quick update that METR (formerly ARC Evals) is recruiting for four positions. I encourage you to err on the side of applying to positions that interest you even if you’re unsure about your fit! We’re able to sponsor US visas for all the roles below except Research Assistant, and all applications are rolling with no set closing date.

  • Engineering Lead and Senior Software Engineer. You’ll work on our internal platform for evaluating model capabilities (think: 100 Docker containers running agents in parallel against different tasks). The work is technically fascinating and you get to be on the cutting edge of what models can do, as well as collaborate with our partners (e.g. major world governments).
  • Human Data Lead. High-quality feedback on agent behavior is a key bottleneck to improving agent performance, and you’ll manage this data generation process by recruiting and managing skilled contractors.
  • Research Assistant. You'll help our Model Evaluation Researchers test model capabilities by designing and implementing tasks, testing agent designs, and reviewing agent performance. Many of our research assistants from earlier this year are now full-time researchers, and we both found that experience useful to gauge fit for a longer-term work relationship. This is a full-time, fully-remote role that requires substantial overlap with North American Pacific Time working hours.


If you know anyone who’d be a good fit, please let them know about these roles or recommend that we reach out to them! If we reach out to and hire a candidate because you filled out this referral form, we will pay you a referral bonus of 5,000 USD. (See the referral form for conditions.)

Raemon

(Might be good to say ‘formerly Arc Evals’ in the title since not everyone will have heard about the name change)