TL;DR: We have a dataset for conceptual reasoning, and you can request access if you would like to use it for AI safety (or related) research. We consider the dataset half-baked, and it will likely become much more useful over the next few months. At the same time, we think it's very high quality compared to typical AI datasets and currently the best available dataset of this kind, so we want to make it available to mission-aligned projects now. We also have half-baked prompts for making models better at critiquing conceptual reasoning, which you can request as well.
Our group consists of Caspar Oesterheld, Emery Cooper, and me. Ethan Perez is advising us on this project.
Motivation/context: We are working on eliciting conceptual reasoning capabilities in LLMs, where conceptual reasoning refers to reasoning about questions or problems for which we don't have (access to) ground truth and there is no (practically feasible) agreed-upon methodology for arriving at the correct conclusion. Philosophy is the prototypical example, but forecasting of far-future events and many AI safety questions also fall into this category. Our motivation is to bring forward the point at which we can use AI assistants for conceptual safety-relevant research, relative to AIs' general capabilities. As part of this project, we are building a conceptual reasoning dataset and developing prompts for eliciting models' full conceptual reasoning abilities.
The dataset: The idea behind our dataset is that, in conceptual domains, it's easier to evaluate the quality of contextualised arguments than that of bottom-line conclusions.
The prompts: We have done extensive prompt optimization to elicit models' ability to rate critiques accurately (i.e., similarly to the human raters). We have just started prompt engineering to elicit models' ability to write high-quality critiques (our dataset and LLM judges are very helpful for speeding up this process).
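To make the evaluation loop a bit more concrete, here is a minimal sketch of one way to score how closely an LLM judge's critique ratings track human ratings. This is not our actual pipeline; the agreement metric (Spearman rank correlation), the 1-10 rating scale, and the example numbers are all assumptions for illustration.

```python
from scipy.stats import spearmanr

def rater_agreement(model_ratings, human_ratings):
    """Spearman rank correlation between an LLM judge's critique ratings
    and the corresponding human ratings (higher rho = closer agreement)."""
    rho, p_value = spearmanr(model_ratings, human_ratings)
    return rho, p_value

# Hypothetical example: ratings of five critiques on a 1-10 scale.
model_ratings = [7, 3, 8, 5, 6]
human_ratings = [6, 2, 9, 5, 7]
rho, p = rater_agreement(model_ratings, human_ratings)
print(f"Spearman rho = {rho:.2f} (p = {p:.3f})")
```

With an agreement score like this as the target, candidate prompts for the LLM judge can be compared against the human ratings in the dataset, which is roughly why the dataset speeds up prompt engineering.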
Paper: You can find a more detailed preliminary paper draft about our dataset here. This paper also further details the limitations of the dataset in its current form.
Access: To request access, you first have to read our data sharing policy. Once you've done so, you can confirm this and request access in this form. If you or your organisation are quite well known in the AI safety community, your (organisation's) name is all we need from you in the form and you can stop reading here.
We will initially be conservative with granting access since we don't have the capacity to properly evaluate access requests and also haven't decided how we want to share the dataset in the long term. We will usually consider access requests only if:
Unfortunately, we cannot currently commit to assessing requests if this would require substantial effort on our side (such as reading and judging a research proposal). If you're unsure whether you fit into a/b/c, feel free to just submit a bare-bones response and leave a note that you're happy to share more!