What this groups
Agents that operate interfaces
These entries cover browser agents, GUI agents, computer-use models, desktop-control stacks, and automation-oriented browser infrastructure.
AI Resources
A focused LifeHubber front door for agents, models, and tooling that point AI systems at browsers, desktops, or visual interfaces.
Computer-use tools can touch accounts, files, websites, and private data. Use this as a discovery reference, then inspect permissions, terms, and review steps before trying anything important.
Why it matters
Reading an answer is different from letting software click, type, browse, or control a desktop. Check the project scope and boundaries carefully.
Caveat
Start with throwaway tasks, test accounts, and manual review steps before connecting anything private, paid, or hard to undo.
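The manual review step above can be sketched as a small approval gate that sits between an agent's proposed actions and their execution. This is a minimal illustration in Python, not taken from any project listed here: `Action`, `RISKY_KINDS`, `review_gate`, and `ask_human` are hypothetical names, and the set of "risky" action kinds is an assumption to adapt to whatever tool is being tested.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical action kinds treated as hard to undo; adjust per tool.
RISKY_KINDS = {"submit_form", "purchase", "delete_file", "send_message"}


@dataclass
class Action:
    kind: str    # e.g. "click", "type", "purchase"
    target: str  # human-readable description of what is touched


def review_gate(actions: List[Action],
                approve: Callable[[Action], bool]) -> List[Action]:
    """Return only the actions that are allowed to run.

    Low-risk actions pass through unchanged; risky ones are kept
    only if the approve callback returns True for them.
    """
    allowed = []
    for action in actions:
        if action.kind not in RISKY_KINDS or approve(action):
            allowed.append(action)
    return allowed


def ask_human(action: Action) -> bool:
    """Prompt on the terminal; anything other than 'y' is a refusal."""
    reply = input(f"Allow {action.kind} on {action.target}? [y/N] ")
    return reply.strip().lower() == "y"
```

Passing `ask_human` as the `approve` callback gives an interactive check before anything hard to undo runs. Defaulting to refusal keeps a throwaway test from turning into a real purchase or deletion by accident.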
Discovery reference
Use this page as a starting point for inspection, not as an endorsement, recommendation, guarantee, or safety review. Open the source materials before relying on details such as setup, terms, limits, privacy, access, or costs.
Curated from AI Resources
browser-use/browser-use
A browser automation framework for AI agents that can navigate websites, click elements, type into pages, use custom tools, and run browser tasks through code or CLI workflows.
iFurySt/open-codex-computer-use
A computer-use MCP service for AI agents and MCP clients, with macOS, Linux, and Windows paths; setup commands for Codex, Claude Code, Gemini CLI, and opencode; a Codex skill path; command-call examples; and local validation commands.
trycua/cua
A computer-use agent stack with Cua Driver for background desktop control, agent-ready sandboxes, CuaBot, Cua-Bench, Lume, SDKs, MCP support, and model integrations.
Hcompany/holo31
An H Company computer-use model family for web, desktop, and mobile automation, with 0.8B, 4B, 9B, and 35B-A3B sizes plus FP8, NVFP4, and Q4 GGUF checkpoint paths for local or edge-oriented deployment.
bytedance/UI-TARS-desktop
A ByteDance GUI-agent desktop app and multimodal agent stack for local or remote computer and browser operation, with vision-language model control, CLI/Web UI paths, and MCP-oriented tooling.
nico-martin/gemma4-browser-extension
An independent Chrome extension experiment for running an on-device browser agent with Transformers.js, WebGPU, Gemma 4, page RAG, tab tools, and semantic history search.
alibaba/page-agent
A JavaScript in-page GUI agent for controlling web interfaces with natural language, aimed at browser-based workflows and interface automation.
allenai/molmoweb
A multimodal web agent from Ai2 that can navigate browser tasks from natural-language instructions.
lightpanda-io/browser
A headless browser designed with AI automation use cases in mind.