The Next Evolution in Design: Agent Computer Interfaces

Part of my thesis work at Chapter One collaborating with Jamesin!

Jun 27, 2024

As a rising senior in college, I’ve been thinking a lot about the value of learning Computer Science and what the AI revolution means for the future of work in my generation. The job market is flooded with software engineers, and it can feel like software engineering (SWE) will become obsolete as agents improve. We are not there yet, though: although Devin performs significantly better than models like Claude 2 or GPT-4, it resolves only 13.86% of the issues in SWE-bench (Cognition Labs).

From HCI to ACI?

I have been fascinated by a paper from researchers at Princeton University titled “SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering.” It demonstrates that interfaces tailor-made for language model agents can make those agents more effective at writing code, navigating repositories, and running tests.

Human-computer interaction (HCI) and user experience design focus on how seamlessly humans can navigate and work with interfaces. For example, IDEs like VS Code were thoughtfully designed to make programmers more productive. Similarly, agent-computer interfaces (ACIs) aim to let agents navigate these spaces with the same ease.

Notably, SWE-agent, which is a language model agent working through an agent-computer interface, solves 10.7 percentage points more benchmark tasks than the same kind of agent working in a regular Linux shell.
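
To make the idea more concrete, here is a minimal sketch of one kind of agent-friendly command: a file viewer that returns a small, numbered window of a file instead of dumping the whole thing the way `cat` would in a shell. This is only an illustration of the concept, not code from the SWE-agent paper. The `FileViewer` class, its method names, and the 100-line window are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class FileViewer:
    """Hypothetical agent-facing file viewer (illustrative, not SWE-agent's code).

    Instead of printing an entire file, it returns a fixed-size window of
    numbered lines plus a short summary of what lies outside the window.
    """
    lines: list[str]
    window: int = 100  # lines shown per observation (assumed default)
    top: int = 0       # zero-based index of the first visible line

    def goto(self, line_number: int) -> str:
        """Move the window so it starts at the given 1-indexed line."""
        last_start = max(0, len(self.lines) - self.window)
        self.top = min(max(0, line_number - 1), last_start)
        return self.render()

    def scroll_down(self) -> str:
        """Advance the window by one full page."""
        return self.goto(self.top + 1 + self.window)

    def render(self) -> str:
        """Build the text observation the agent would receive."""
        end = min(self.top + self.window, len(self.lines))
        header = f"[lines {self.top + 1}-{end} of {len(self.lines)}]"
        above = f"({self.top} more lines above)"
        below = f"({len(self.lines) - end} more lines below)"
        body = [f"{i + 1}: {self.lines[i]}" for i in range(self.top, end)]
        return "\n".join([header, above, *body, below])


# Usage: a 250-line file viewed 100 lines at a time.
viewer = FileViewer([f"x = {i}" for i in range(1, 251)])
first_page = viewer.goto(1)
second_page = viewer.scroll_down()
```

The design choice here is that every observation is short, carries line numbers the agent can refer to in a later edit, and says how much of the file is still out of view, so the agent does not have to guess or re-read the whole file.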

Currently there are only a handful of companies working on agents that can navigate browsers, including Zeta Labs, Browserbase, and Browserless. One company I am particularly excited about is Zeta Labs, based in San Francisco and backed by Nat Friedman, Daniel Gross, and Earlybird.

Zeta Labs is developing Jace, an AI assistant that can browse the web to make reservations and even start its own company. Although its browsing is still slow and not always reliable, Jace can set up an LLC for a math tutoring company, send emails, and put together problem sheets.

Search has found product-market fit in the enterprise, but agents are even more powerful: beyond retrieving information, they can automate repetitive tasks and write code. The magic is that agents can also simulate team structures and work for you. Imagine one agent acting as the project manager while other agents code, design, and ship a product.

I can see an entire new market opening up for Agent Computer Interfaces, but it may take some time until agents perform better.

Accuracy aside, we are still early in the wave of agents because there is work left to do on integrating agents into browsers and other infrastructure. Early projects that popularized agents, like BabyAGI and AutoGPT, targeted general use cases and didn’t stick. Future agentic products will find product-market fit with specific use cases and instructions.

We have only scratched the surface of what agents can do, and how they can work for us.

Natalie’s Substack
