Browser Harness Empowers LLMs with Full Web Freedom

A new open-source project called Browser Harness aims to give large language models (LLMs) maximum autonomy when interacting with web browsers. The developers, frustrated by the limitations of existing frameworks, have created a system that removes constraints and allows LLMs to perform virtually any browser task they’re trained for.

The core problem is handling edge cases in complex browser interactions. Traditional frameworks often rely on heuristics (predefined rules) to manage situations such as cross-origin iframes or native file popups. When a heuristic misfires, the result can be a “silent failure”: the LLM believes it has performed an action when nothing actually happened.

Browser Harness addresses this by giving the LLM direct access to the Chrome DevTools Protocol (CDP) and letting it add new tools on demand, including writing its own functions when needed. In one instance, the LLM wrote its own upload function using DOM.setFileInputFiles after realizing the existing toolset lacked one.
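To make the upload example concrete, here is a minimal sketch of the kind of function an LLM might write for itself. It only builds the JSON message that would go over the CDP websocket; it does not open a connection. The function names (cdp_message, upload_file_message) are hypothetical and are not taken from the Browser Harness codebase. The method name DOM.setFileInputFiles and its files and nodeId parameters come from the CDP DOM domain.

```python
import itertools
import json

# CDP messages carry a client-chosen integer id so replies can be matched up.
_next_id = itertools.count(1)


def cdp_message(method, params=None, session_id=None):
    """Serialize one CDP command as JSON (hypothetical helper)."""
    msg = {"id": next(_next_id), "method": method, "params": params or {}}
    if session_id is not None:
        # With flattened sessions, the target session rides at the top level.
        msg["sessionId"] = session_id
    return json.dumps(msg)


def upload_file_message(node_id, paths, session_id=None):
    """Build the command that attaches local files to an <input type=file>.

    node_id is assumed to have been resolved earlier, for example through
    DOM.querySelector.
    """
    params = {"files": list(paths), "nodeId": node_id}
    return cdp_message("DOM.setFileInputFiles", params, session_id)


if __name__ == "__main__":
    print(upload_file_message(42, ["/tmp/report.pdf"]))
```

Because the command sets the files on the input element directly, it sidesteps the native file popup that a click-based approach would have to handle.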

The project provides a minimal foundation consisting of:

  • A daemon that maintains the CDP websocket connection
  • Basic helper functions (that can be extended)
  • Skill.md documentation explaining how to use the system
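
The "helpers that can be extended" idea can be sketched as a small registry that starts nearly empty and accepts new functions at runtime. This is an illustration under stated assumptions, not the project's actual design: the Helpers class, its register and call methods, and the send callback are all hypothetical names. In a real setup, send would forward the command over the websocket that the daemon keeps open.

```python
from typing import Any, Callable, Dict

# Assumed transport shape: takes a CDP method name and params, returns the reply.
Send = Callable[[str, Dict[str, Any]], Dict[str, Any]]


class Helpers:
    """Hypothetical registry of helper functions that can grow at runtime."""

    def __init__(self, send: Send) -> None:
        self._send = send
        self._tools: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str, fn: Callable[..., Any]) -> None:
        # Adding a tool is just storing a function; nothing is predefined.
        self._tools[name] = fn

    def call(self, name: str, *args: Any, **kwargs: Any) -> Any:
        if name not in self._tools:
            # Fail loudly rather than silently doing nothing.
            raise KeyError(f"no helper named {name!r}; add one with register()")
        return self._tools[name](self._send, *args, **kwargs)


def navigate(send: Send, url: str) -> Dict[str, Any]:
    """A basic helper: load a URL through the CDP Page domain."""
    return send("Page.navigate", {"url": url})
```

A missing helper raises an error instead of being papered over, so the model gets a clear signal that it needs to write and register the function itself.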

This approach contrasts with other agent frameworks like Playwright MCP or Agent Browser, which offer predefined toolsets. The developers argue that their solution provides greater flexibility and context for LLMs operating in complex web environments.

Notable Achievements with Browser Harness:

  • Played a full game of chess against Stockfish (a chess engine)
  • Set a world record score in Tetris
  • Figured out how to draw a heart using JavaScript

The project is designed to be installed by asking Claude Code to set it up, with a prompt such as: “Set up https://github.com/browser-use/browser-harness for me.” The developers are also seeking community input on what to call this new paradigm of LLM-web interaction; suggestions such as “dialect”, or other terms that reflect the approach, are welcome.

What do you think about giving LLMs more direct control over browsers? Share your thoughts in the comments!