TLDR: Full task-completion training data will soon be available, leading to much more capable agentic AI.

Capabilities predictions:

  • Agentic AIs for all white-collar tasks
  • Creative problem solving might be solved

Business predictions:

  • MS/GOOG try to capture all the data (web browsing, web/native application interactions)
  • (Maybe) creepy attempts at better un-black-boxing of the humans doing the problem solving:
    • watching the user's face
    • eye tracking

Caveats:

  • TOS/legal limits on data retention/use for training
  • security (EG: prompt injection) for finished systems

TLDR_END

Microsoft is introducing Copilot for Office 365. Google is integrating AI into its products. The economic motivation for automating white-collar labor hours is obvious, even if only to sell the automation to workers directly. Where can the needed training data be gathered? Companies on the Microsoft tech stack (Windows (OS), Edge (browser), Outlook (email), Office 365 (documents)) are fully set up to capture all human-computer interactions. Google can do the same via Chrome/ChromeOS for web-application-based workflows.

Ways to make better agentic AIs:

  1. Do the task unreliably and let a human select among end results
    • Microsoft low code/no code
    • basic approval feedback system (EG: ChatGPT thumbs up)
  2. Watch people doing a task to generate a problem-solving log for that task

(1) leads to refinement and reliability improvements for existing capabilities. (2) leads to new capabilities altogether.
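Approach (1) can be sketched as a best-of-n loop. Everything here is a made-up placeholder (the candidate generator, the rating callback); the point is just that each human selection doubles as approval-feedback training data:

```python
def generate_candidates(task, n=4):
    # Hypothetical stand-in for an unreliable model: repeated runs give
    # several differing end results for the same task.
    return [f"{task}: draft {i}" for i in range(n)]

def best_of_n(task, rate, n=4):
    # Approach (1): run the model n times, let a human (the `rate`
    # callback) pick the best end result, and keep every
    # (task, result, rating) triple as feedback training data.
    candidates = generate_candidates(task, n)
    ratings = [rate(c) for c in candidates]
    best = max(range(n), key=lambda i: ratings[i])
    feedback_log = list(zip([task] * n, candidates, ratings))
    return candidates[best], feedback_log

# Simulated human rater: here it simply prefers later drafts.
winner, log = best_of_n("summarise report",
                        rate=lambda c: int(c.split()[-1]), n=3)
```

Note that the feedback log only refines what the generator can already produce, which is why (1) improves reliability rather than creating new capabilities.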

Useful characteristics of problem-solving data:

  • homogeneous
    • real time human computer interaction logs
    • no missing steps (except those that occur inside human brain)
  • tagged by user
    • can use "user vector" to condition model output as in image models conditioned by text
    • learn from the dumbest, act like the smartest
    • less RL fine tuning required
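The "user vector" idea above can be sketched in a few lines. All shapes and the linear policy are invented for illustration; the point is that the same conditioning trick used in text-conditioned image models lets you train on everyone's logs but steer behaviour toward a chosen user at inference time:

```python
import numpy as np

rng = np.random.default_rng(0)

n_users, user_dim, obs_dim, act_dim = 100, 16, 32, 8
user_table = rng.normal(size=(n_users, user_dim))   # learned per-user "user vectors"
W = rng.normal(size=(obs_dim + user_dim, act_dim))  # toy one-layer policy

def policy(obs, user_id):
    # Action logits conditioned on both the observation and the user.
    # Swapping in an expert's vector at inference time is the
    # "learn from the dumbest, act like the smartest" move.
    x = np.concatenate([obs, user_table[user_id]])
    return x @ W

obs = rng.normal(size=obs_dim)
novice_logits = policy(obs, user_id=3)
expert_logits = policy(obs, user_id=42)  # same observation, different behaviour
```

Because the user identity is an input rather than something to be unlearned, less RL fine-tuning is needed to separate good demonstrations from bad ones.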

Capability Predictions

This leads to capable narrow-domain agents:

  • creativity is still hard, and 99%+ of humans aren't doing creative problem solving in day-to-day life
  • capturing the 1% could have huge impacts
    • can do RL or conditioning to elicit creative problem solving behavior
    • curation is easier than creation, allowing lower-skill humans to give feedback
      • EG:"Rate the automated tech support assistant"
    • This approach tops out somewhere unless RSI/takeoff occurs somehow
  • skill synergies may arise (EG:the model writes code to automate tasks rather than completing them one at a time)
    • this would be the "creative" part that leads to true AGI when fully generalized.

real world example:

  • tech support
  • need to troubleshoot software/hardware issues
  • data collection:
    • support agent (actions/chat)
    • "did they solve my problem" user ratings.
      • supplement by extracting info from chat logs
  • gather training data from existing remote support techs doing remote desktop type support
  • train capability into model via imitation learning
  • deploy model to small fraction of users and gather KPIs to determine readiness
  • gradually replace human agents with AI
    • switch from imitation to reinforcement learning
  • synergies with all sorts of other skills
    • programming, CLI usage, network exploration etc.
  • synergies with access to other corporate data:
    • internal network information
    • server logs

Copilot-type systems will get more capable until they become "autopilot" systems. Jobs switch to curating AI outputs, with supervision dropping off over time. Even if the machine can't "pilot" at all, it can still learn.

Business predictions:

  • Changes to tech giant TOSes to allow much more data collection.
  • Tech giants move to capture business segments, providing full solutions to generic problems (EG:accounting, customer service, IT support) or creating a commoditised market for solutions to placate regulators. Privacy/security concerns lead to centralisation (see Zvi's AI post for more).
