Harmfulness Directions in OLMo
Introduction

This work was conducted as part of the MARS 4.0 program, supervised by Lorenzo Pacchiardi, with Hannes Whittingham and Mikhail Mironov as research managers. The core empirical work was carried out by Bryan Maruyama and Daniele Pace. In this technical report, we treat harmfulness as a composition of subcategories...