Emerging · agents

Computer Use

Models that drive a real screen, mouse, and keyboard like a human.

First observed 2024-10-22 · 412 signals

Trend Dynamics

Updated daily

  • Velocity: +92
  • Maturity: 18
  • Signals: 412 (all-time observed)

Definition

Computer use is a capability class in which a multimodal model perceives a graphical user interface via screenshots and emits low-level pointer, keyboard, and scroll actions to operate arbitrary software.
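The perceive-then-act cycle in this definition can be sketched as a small loop. This is a minimal illustration under stated assumptions, not any vendor's API: `FakeDesktop`, `scripted_policy`, `run_episode`, and the three action types are hypothetical names invented here. The fake desktop returns a text description in place of a real screenshot, and a scripted function stands in for the multimodal model.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Union


# Hypothetical low-level action vocabulary: pointer, keyboard, scroll.
@dataclass(frozen=True)
class Click:
    x: int
    y: int


@dataclass(frozen=True)
class TypeText:
    text: str


@dataclass(frozen=True)
class Scroll:
    dy: int


Action = Union[Click, TypeText, Scroll]


class FakeDesktop:
    """Stand-in for a real screen: a single text field at a fixed position."""

    FIELD = (100, 40)

    def __init__(self) -> None:
        self.focused = False
        self.field_text = ""
        self.scroll_y = 0

    def screenshot(self) -> str:
        # A real environment would return pixels; this returns a description.
        return f"focused={self.focused} text={self.field_text!r}"

    def execute(self, action: Action) -> None:
        if isinstance(action, Click):
            # Clicking exactly on the field focuses it; anywhere else unfocuses.
            self.focused = (action.x, action.y) == self.FIELD
        elif isinstance(action, TypeText):
            # Keystrokes only land if the field has focus.
            if self.focused:
                self.field_text += action.text
        elif isinstance(action, Scroll):
            self.scroll_y += action.dy


def scripted_policy(observation: str) -> Optional[Action]:
    """Hypothetical stand-in for the model: maps an observation to one action."""
    if "focused=False" in observation:
        return Click(*FakeDesktop.FIELD)
    if "text=''" in observation:
        return TypeText("hello")
    return None  # Nothing left to do.


def run_episode(
    env: FakeDesktop,
    policy: Callable[[str], Optional[Action]],
    max_steps: int = 10,
) -> List[Action]:
    """Observe, choose one action, execute it, and repeat until done."""
    trace: List[Action] = []
    for _ in range(max_steps):
        action = policy(env.screenshot())
        if action is None:
            break
        env.execute(action)
        trace.append(action)
    return trace
```

Calling `run_episode(FakeDesktop(), scripted_policy)` clicks the field and then types into it. The step cap is a design choice in this sketch: a loop driven by model output needs a bound so that a policy that never signals completion cannot run forever.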

Why It Matters

Computer use removes the integration tax on enterprise AI: the long tail of legacy software with no API becomes accessible the moment the model can see and click.

Signals Feeding This Trend

  • Computer-use benchmark scores (OSWorld, WebArena): 38%
    benchmark
  • Browser/desktop agent product launches: 32%
    release
  • Vision model GUI-grounding research: 30%
    research

Companies Involved

  • Anthropic
  • OpenAI
  • Google DeepMind
  • Adept
  • Microsoft

Timeline

  1. 2024-10

    Anthropic introduces Computer Use beta in Claude 3.5 Sonnet.

  2. 2025-01

    OpenAI ships Operator, a consumer browser-use product.

  3. 2025-Q2

    Major RPA vendors (UiPath, Automation Anywhere) reposition around vision-driven agents.

Predictions

  • 12 months · medium confidence

    Computer-use agents pass 70% on a major end-to-end office-task benchmark.
