An Empirical Review of the Animal Harm Benchmark
Summary: The Animal Harm Benchmark (AHB) is one of only two publicly available benchmarks for measuring LLM bias against non-human animals. This work examines whether AHB 2.0 is well-calibrated, asking three questions: (Q1) Does a score of 0 correspond to maximum risk and 1 to minimum risk? (Q2) Do higher...
Mar 116