Justin Olive

Comments
Incorrect Baseline Evaluations Call into Question Recent LLM-RL Claims
Justin Olive · 3mo · 20

FYI: this post was included in the Last Week in AI Podcast - nice work!

My team at Arcadia Impact is currently working on standardising evaluations.

I notice that you cite instance-level results (Burnell et al., 2023) as a recommendation. Is there anything else you think the open-source ecosystem should be doing?

For example, we're thinking about:

  • Standardised implementations of evaluations (i.e. Inspect Evals)
  • Aggregating and reporting results for evaluations (see beta release of the Inspect Evals Dashboard)
  • Tools for automated prompt variation and elicitation*

*This statement in particular got me thinking about this: "These include using correct formats and better ways to parse answers from responses, using recommended sampling temperatures, using the same max_output_tokens, and using few-shot prompting to improve format-following."
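
For concreteness, here's a minimal plain-Python sketch of what standardising those elicitation details might look like: a shared generation config, a few-shot prefix to improve format-following, and a lenient answer parser. This isn't the Inspect Evals API; the constants, the ANSWER: format, and the helper names are illustrative assumptions rather than anything from the post.

```python
import re

# Hypothetical standardised generation settings, held constant across models
# so that differences in elicitation don't masquerade as capability gaps.
GENERATION_CONFIG = {
    "temperature": 0.0,         # the benchmark's recommended sampling temperature
    "max_output_tokens": 1024,  # same max_output_tokens for every model
}

# Few-shot examples demonstrating the expected answer format, so that
# format-following failures aren't scored as capability failures.
FEW_SHOT_PREFIX = (
    "Q: What is 2 + 2?\nANSWER: 4\n\n"
    "Q: What is the capital of France?\nANSWER: Paris\n\n"
)


def build_prompt(question: str) -> str:
    """Prepend the few-shot examples and ask for a parseable final answer."""
    return f"{FEW_SHOT_PREFIX}Q: {question}\nANSWER:"


def parse_answer(response: str) -> str | None:
    """Extract the model's final answer rather than string-matching the whole response.

    Takes the last ANSWER: tag if present, and falls back to the last non-empty
    line so minor format deviations aren't automatically marked incorrect.
    """
    matches = re.findall(r"ANSWER:\s*(.+)", response, re.IGNORECASE)
    if matches:
        return matches[-1].strip()
    lines = [line.strip() for line in response.splitlines() if line.strip()]
    return lines[-1] if lines else None
```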

A concerning observation from media coverage of AI industry dynamics
Justin Olive · 3y · 10

Hi Dave, thanks for the great input from an insider perspective.

Do you have any thoughts on whether risk-aversion (either around product risk or misalignment risk) might be contributing to a migration of talent towards lower-governance zones?

If so, are there any effective ways to combat this that don't translate to accepting higher levels of risk?

Posts

  • Feedback request: `eval-crypt` a simple utility to mitigate eval contamination [Question] (8 points, 2mo, 4 comments)
  • LASR Labs Spring 2025 applications are open! (38 points, 1y, 0 comments)
  • A concerning observation from media coverage of AI industry dynamics (8 points, 3y, 3 comments)