Open Source Alternatives
Copyright © 2026 All Rights Reserved.

page-agent

Automate browser workflows with a GUI agent that operates websites through natural-language tasks. MIT-licensed TypeScript project for web agents.

20.6K stars · TypeScript · MIT · Active this week
Contents
  1. Who page-agent is for
  2. The problem it solves
  3. How it solves it
  4. Strengths and trade-offs
  5. page-agent vs alternatives
  6. Tech stack
  7. FAQ
  8. Similar open-source tools
TL;DR

PageAgent (page-agent) is an MIT-licensed TypeScript library for adding a natural-language GUI agent directly inside a web page. It uses client-side JavaScript and text-based DOM control, and offers npm installation, script-tag demos, bring-your-own LLM configuration, and optional Chrome extension or MCP paths.

Who page-agent is for

SaaS copilots

Use PageAgent to let users ask a product to click, fill, filter, or navigate inside a web app.

Skip if:

You only need a support chat widget that answers text questions.

Smart form filling

Teams can turn repetitive ERP, CRM, or admin form flows into natural-language actions.

Skip if:

Your forms require strict server-side approvals before each field update.

Accessibility and guided operation

Natural-language commands can help users operate complex interfaces without finding every control manually.

Skip if:

Your app cannot safely expose UI actions to an embedded agent.

The problem it solves

Web users often know what they want but not which controls to operate. PageAgent addresses that gap by letting a product team embed an agent that reads the page structure and performs UI actions from natural-language instructions.

How it solves it

In-page JavaScript agent

The core integration runs inside the web page without requiring a browser extension, Python service, or headless browser.

Text-based DOM control

PageAgent works from DOM structure instead of relying on screenshots or multimodal LLMs for core page understanding.
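To make the idea of text-based control concrete, here is a minimal sketch of turning a page tree into indexed text lines that a language model could reason over. It is an illustration of the general technique only, not PageAgent's actual algorithm: `ElementLike`, `INTERACTIVE_TAGS`, and `describeInteractiveElements` are hypothetical names, and a plain object tree stands in for the real DOM so the example runs outside a browser.

```typescript
// Minimal stand-in for a DOM element so the sketch runs outside a browser.
// (Hypothetical type; a real integration would walk actual DOM nodes.)
interface ElementLike {
  tag: string;
  text?: string;
  attrs?: Record<string, string>;
  children?: ElementLike[];
}

// Assumed set of tags worth exposing to the model as actionable controls.
const INTERACTIVE_TAGS = new Set(["a", "button", "input", "select", "textarea"]);

// Walk the tree depth-first and emit one indexed line per interactive element.
// A model can then refer to a control by its index instead of by pixels.
function describeInteractiveElements(root: ElementLike): string[] {
  const lines: string[] = [];
  const visit = (el: ElementLike): void => {
    if (INTERACTIVE_TAGS.has(el.tag)) {
      const label = el.text ?? el.attrs?.placeholder ?? el.attrs?.name ?? "";
      lines.push(`[${lines.length}] <${el.tag}> ${label}`.trimEnd());
    }
    for (const child of el.children ?? []) visit(child);
  };
  visit(root);
  return lines;
}

const page: ElementLike = {
  tag: "form",
  children: [
    { tag: "input", attrs: { name: "email", placeholder: "Email" } },
    { tag: "div", children: [{ tag: "button", text: "Sign in" }] },
  ],
};

// summary is ["[0] <input> Email", "[1] <button> Sign in"]
const summary = describeInteractiveElements(page);
```

A compact text listing like this is small enough to send to a text-only model, which is the practical appeal of working from structure rather than screenshots.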

Bring your own model

Developers can configure model, base URL, and API key values when creating a PageAgent instance.
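As a rough sketch of that configuration step, the helper below collects the three values and checks them before they are handed to an agent constructor. The option names `model`, `baseURL`, and `apiKey` mirror the wording above but their exact spelling is an assumption, and `buildAgentConfig` is a hypothetical helper, not part of PageAgent.

```typescript
// Hypothetical shape for the three values named above; the exact option
// names PageAgent expects are an assumption here.
interface AgentModelConfig {
  model: string;
  baseURL: string;
  apiKey: string;
}

// Validate raw settings (for example from a build-time settings object) and
// return a typed config, throwing early on missing or malformed values.
function buildAgentConfig(raw: Record<string, string | undefined>): AgentModelConfig {
  const model = raw.model?.trim();
  const baseURL = raw.baseURL?.trim();
  const apiKey = raw.apiKey?.trim();

  if (!model) throw new Error("model is required");
  if (!apiKey) throw new Error("apiKey is required");
  if (!baseURL || !/^https?:\/\//.test(baseURL)) {
    throw new Error("baseURL must be an http(s) URL");
  }
  // Strip trailing slashes so later path joins stay predictable.
  return { model, baseURL: baseURL.replace(/\/+$/, ""), apiKey };
}

const config = buildAgentConfig({
  model: "example-model",                 // placeholder model name
  baseURL: "https://llm.example.com/v1/", // placeholder endpoint
  apiKey: "YOUR_API_KEY",                 // placeholder key
});
```

Because the agent runs client-side, any key passed this way is visible in the browser, so pointing the base URL at a proxy you control may be preferable to shipping a provider key directly.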

Fast evaluation paths

The README provides a one-line script tag demo and an npm install path for application integration.

Optional multi-page control

A Chrome extension and beta MCP server can extend PageAgent beyond the core in-page use case.

Strengths and trade-offs

Strengths

  • Fits product-owned UI: PageAgent is useful when a team controls the page and wants a natural-language layer inside that experience.
  • Less infrastructure than headless automation: The core library avoids running a separate browser automation service for simple in-product tasks.
  • Clear automation boundary: The README explicitly says PageAgent is designed for client-side web enhancement, not server-side automation.

Trade-offs

  • Needs web-app integration: PageAgent is not a drop-in support inbox. Developers need to install and configure it inside the web application.
  • Single-page core by default: Multi-page workflows require the optional Chrome extension or MCP path rather than only the base in-page library.
  • Model behavior still matters: Teams must choose, configure, and monitor the LLM that interprets user commands and page state.

page-agent vs alternatives

Compared to Intercom

Intercom is a customer messaging and support product with inboxes, help content, and customer engagement workflows. PageAgent is a developer library for controlling a web interface from natural language inside the page. Use Intercom when the goal is support communication; use PageAgent when the goal is letting users operate your product UI through an embedded agent.

What it's built on

Tech stack detected from GitHub.

Languages: JavaScript, TypeScript
Frameworks: React
FAQ

What is PageAgent?

PageAgent is an in-page JavaScript GUI agent that lets users control web interfaces with natural language.

Does PageAgent require a browser extension?

No. The core README says PageAgent needs only in-page JavaScript. A Chrome extension is optional for multi-page tasks.

What license does PageAgent use?

GitHub metadata reports PageAgent as MIT licensed.

How do you install PageAgent?

The README shows npm install page-agent, then importing PageAgent and calling agent.execute with a natural-language command.
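The calling pattern described above can be sketched as a thin wrapper around `execute`. Only the `execute` method comes from the text; the `AgentLike` interface, the `TaskResult` type, the `runTask` helper, and the assumption that `execute` returns a promise are hypothetical, and a stand-in agent is used so the example runs without the package installed.

```typescript
// Only the method named in the README summary above; its return type is
// not stated there, so it is treated as unknown in this sketch.
interface AgentLike {
  execute(command: string): Promise<unknown>;
}

type TaskResult =
  | { ok: true; value: unknown }
  | { ok: false; error: string };

// Run one natural-language command and convert failures into a value the
// UI can render, instead of letting the rejection escape.
async function runTask(agent: AgentLike, command: string): Promise<TaskResult> {
  const trimmed = command.trim();
  if (trimmed.length === 0) return { ok: false, error: "empty command" };
  try {
    return { ok: true, value: await agent.execute(trimmed) };
  } catch (err) {
    return { ok: false, error: err instanceof Error ? err.message : String(err) };
  }
}

// Stand-in agent for local testing; a real integration would pass a
// PageAgent instance here instead.
const fakeAgent: AgentLike = {
  execute: async (command) => `done: ${command}`,
};
```

Wrapping the call this way keeps model or page errors from surfacing as unhandled rejections in the host application.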

Similar open-source tools

cognee

Persistent memory for AI agents across sessions

25.2K · Python · Apache-2.0

deer-flow

Build super agents with DeerFlow's powerful framework

75.4K · Python · MIT

iptv

A collaborative database for TV channels

128.7K · TypeScript · Unlicense

iroh

Connect devices seamlessly without relying on the cloud.

10.5K · Rust · Apache-2.0

orca

The ultimate IDE for coding agents

4.7K · TypeScript · MIT

CLI-Anything

Empower AI agents with agent-native CLIs

43.9K · Python · Apache-2.0

Repository

Stars: 20.6K
Forks: 1.8K
License: MIT
Latest: v1.10.0
Last commit: 3 days ago
Last verified: Jun 29, 2026
Repo: alibaba/page-agent

Additional details

Language: TypeScript
Open issues: 47
Contributors: 31
First release: 2025

Categories

Customer Support, Marketing & Growth, Web Development, AI & Machine Learning, Developer Tools

Tags

AI Agents, Customer Support, Workflow Automation, Chatbots, UI/UX Design