jungofthewon

coo @ ought.org. by default please assume i am uncertain about everything i say unless i say otherwise :)

Comments

How do we prepare for final crunch time?

Access

Alignment-focused policymakers / policy researchers should also be in positions of influence. 

Knowledge

I'd add a bunch of human / social topics to your list e.g. 

  • Policy 
  • Every relevant historical precedent
  • Crisis management / global logistical coordination / negotiation
  • Psychology / media / marketing
  • Forecasting 

Research methodology / Scientific “rationality,” Productivity, Tools

I'd be really excited to have people use Elicit with this motivation. (More context here and here.)

Re: competitive games of introducing new tools, we ran an internal Elicit vs. Google speed test to see which tool was more efficient for finding answers or mapping out a new domain in 5 minutes. We're broadly excited to structure and support competitive knowledge work and optimize research this way.

The case for aligning narrowly superhuman models

This is exactly what Ought is doing as we build Elicit into a research assistant using language models / GPT-3. We're studying researchers' workflows and identifying ways to productize or automate parts of them. In that process, we have to figure out how to turn GPT-3, a generalist by default, into a specialist that is a useful thought partner for domains like AI policy. We have to learn how to take feedback from the researcher and convert it into better results within a session, per person, per research task, and across the entire product. Another spin on it: we have to figure out how researchers can use GPT-3 to become expert-like in new domains.

We’re currently using GPT-3 for classification, e.g. “take this spreadsheet and determine whether each entity in Column A is a non-profit, government entity, or company.” Some concrete examples of alignment-related work that have come up as we build this:

  • One idea for making classification work is to have users generate explanations for their classifications. Then have GPT-3 generate explanations for the unlabeled objects. Then classify based on those explanations. This seems like a step towards “have models explain what they are doing.”
  • I don’t think we’ll do this in the near future but we could explore other ways to make GPT-3 internally consistent, for example:
    • Ask GPT-3 why it classified Harvard as a “center for innovation.”
    • Then ask GPT-3 if that reason is true for Microsoft.
      • Or just ask GPT-3 if Harvard is similar to Microsoft.
    • Then ask GPT-3 directly if Microsoft is a “center for innovation.”
    • And fine-tune results until we get to internal consistency.
  • We eventually want to apply classification to the systematic review (SR) process, or some lightweight version of it. In the SR process, there is one step where two human reviewers identify which of 1,000-10,000 publications should be included in the SR by reviewing the title and abstract of each paper. After narrowing it down to ~50, two human reviewers read the whole paper and decide which should be included. Getting GPT-3 to skip these two human processes but be as good as two experts reading the whole paper seems like the kind of sandwiching task described in this proposal.
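
A rough sketch of the internal-consistency check from the nested bullets above, in case it helps make the idea concrete. Everything here is an assumption for illustration: `ask` is a hypothetical stand-in for a language-model call (prompt in, text out), the prompt wording is made up, and `fake_ask` is a canned toy model rather than GPT-3. It only detects an inconsistency; the fine-tuning step is not shown.

```python
from typing import Callable, Dict

# Hypothetical stand-in for a language-model call: prompt in, text out.
Ask = Callable[[str], str]


def is_yes(answer: str) -> bool:
    """Treat any answer beginning with 'yes' (case-insensitive) as affirmative."""
    return answer.strip().lower().startswith("yes")


def check_consistency(ask: Ask, labeled: str, other: str, label: str) -> Dict[str, object]:
    """Compare the model's indirect and direct judgments about `other`."""
    # Step 1: ask why the already-labeled entity received its label.
    reason = ask(f"Why is {labeled} a {label}?")
    # Step 2: ask whether that same reason holds for the other entity.
    reason_applies = is_yes(
        ask(f"Reason: {reason}\nDoes this reason also apply to {other}? Answer yes or no.")
    )
    # Step 3: ask directly whether the other entity deserves the label.
    direct_label = is_yes(ask(f"Is {other} a {label}? Answer yes or no."))
    return {
        "reason": reason,
        "reason_applies": reason_applies,
        "direct_label": direct_label,
        # Consistent when the indirect and direct answers agree.
        "consistent": reason_applies == direct_label,
    }


def fake_ask(prompt: str) -> str:
    """Canned toy model (an assumption, not real model output)."""
    if prompt.startswith("Why"):
        return "It produces a lot of new research."
    if "also apply" in prompt:
        return "Yes"
    return "No"


result = check_consistency(fake_ask, "Harvard", "Microsoft", "center for innovation")
print(result["consistent"])  # prints False: the toy model contradicts itself
```

Cases where `consistent` is false would be the ones to collect as candidates for further feedback or fine-tuning.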

We'd love to talk to people interested in exploring this approach to alignment!

Open & Welcome Thread – March 2021

Ought is building Elicit, an AI research assistant using language models to automate and scale parts of the research process. Today, researchers can brainstorm research questions, search for datasets, find relevant publications, and brainstorm scenarios. They can create custom research tasks and search engines. You can find demos of Elicit here and a podcast explaining our vision here.

We're hiring for the following roles:

Each job description contains sample projects from our roadmap. 

Research is one of the primary engines by which society moves forward.  We're excited about the potential language models and ML have for making this engine orders of magnitude more effective. 

100 Tips for a Better Life

"Remember that you are dying."

Embedded Interactive Predictions on LessWrong

When Elicit has nice argument mapping (it doesn't yet, right?) it might be pretty cool and useful (to both LW and Ought) if that could be used on LW as well. For example, someone could make an argument in a post, and then have an Elicit map (involving several questions linked together) where LW users could reveal what they think of the premises, the conclusion, and the connection between them.

Yes, that is very aligned with the types of things we're interested in!!

Embedded Interactive Predictions on LessWrong

Lots of uncertainty but a few ways this can connect to the long-term vision laid out in the blog post:

  1. We want to be useful for making forecasts broadly. If people want to make predictions on LW, we want to support that. We specifically want some people to make lots of predictions so that other people can reuse the predictions we house to answer new questions. The LW integration generates lots of predictions and funnels them into Elicit.  It can also teach us how to make predicting easier in ways that might generalize beyond LW. 
  2. It's unclear how exactly the LW community will use this integration but if they use it to decompose arguments or operationalize complex concepts, we can start to associate reasoning or argumentative context with predictions. It would be very cool if, given some paragraph of a LW post, we could predict what forecast should be embedded next, or how a certain claim should be operationalized into a prediction. Continuing the takeoffs debate and Non-Obstruction: A Simple Concept Motivating Corrigibility start to point at this. 
  3. There are versions of this integration that could involve richer commenting in the LW editor.
  4. Mostly it was a quick experiment that both teams were pretty excited about :)

Embedded Interactive Predictions on LessWrong

I see what you're saying. This feature is designed primarily to support tracking changes in predictions over longer periods of time, e.g. for forecasts with years between creation and resolution. (You can even download a csv of the forecast data to run analyses on it.)

It can get a bit noisy, like in this case, so we can think about how to address that. 
