Browser Control
Use this skill when the task requires controlling a real browser session, interacting with page elements, or extracting in-page data from live websites across macOS, Linux, or Windows.
Tools
- `browser_open`
  - Open a URL in Chromium/Chrome/Edge/Firefox/WebKit.
  - Returns page snapshot fields (`title`, `url`, `readyState`, short text preview).
- `browser_click`
  - Click an element by CSS selector on the active tab.
  - Use after opening a page or when navigating interactive UI.
- `browser_type`
  - Type into a field selected by CSS selector.
  - Supports optional clear + optional submit.
- `browser_eval`
  - Execute custom JavaScript in page context.
  - Use `mode="expression"` for simple reads (for example `document.title`).
  - Use `mode="function"` for statement blocks and `return` a structured object.
- `browser_extract`
  - Extract page data in one of: `text`, `html`, `links`, `forms`.
  - Optional `selector` scopes extraction to a specific section.
- `browser_close`
  - Close one session or all sessions created by this skill.
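The two `browser_eval` modes are easiest to tell apart side by side. The sketch below is illustrative only: the `script` field name is an assumption (the skill's actual argument names are not listed here), and `runLocally` with its fake `document` is a local stand-in used to check that both scripts parse and run, not a description of how the skill evaluates them.

```javascript
// Hypothetical argument shapes for browser_eval; `script` is an assumed field name.
const readTitle = {
  mode: "expression", // a single expression whose value is the result
  script: "document.title",
};

const summarizePage = {
  mode: "function", // a statement block that must `return` its result
  script: `
    const links = Array.from(document.querySelectorAll("a[href]"));
    return {
      title: document.title,
      linkCount: links.length,
      firstLinks: links.slice(0, 5).map((a) => a.href),
    };
  `,
};

// Local stand-in for the page context (assumption, for illustration only):
// an expression is wrapped in `return (...)`, a function body is used as-is.
function runLocally(call, fakeDocument) {
  const body =
    call.mode === "expression" ? `return (${call.script});` : call.script;
  return new Function("document", body)(fakeDocument);
}

// Minimal fake document exposing only what the two scripts touch.
const fakeDocument = {
  title: "Example Domain",
  querySelectorAll: () => [
    { href: "https://example.com/a" },
    { href: "https://example.com/b" },
  ],
};

const titleResult = runLocally(readTitle, fakeDocument);
const summaryResult = runLocally(summarizePage, fakeDocument);
console.log(titleResult);
console.log(summaryResult);
```

Returning a plain object of strings, numbers, and arrays keeps the result easy to serialize; DOM nodes themselves generally do not survive the trip out of the page.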
Operating Pattern
- Start with `browser_open` for the target URL.
- Use `browser_extract` with `mode="text"` or `mode="links"` to map the page quickly.
- Use `browser_click` and `browser_type` for navigation and form flow.
- Use `browser_eval` for custom DOM reads/actions that the standard tools do not cover.
- Re-run `browser_extract` (or `browser_eval`) to verify the resulting page state.
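The pattern above can be sketched as an ordered plan of tool calls. Everything beyond the tool names, `mode`, and `selector` is an assumption made for illustration: the `url`, `text`, `clear`, and `submit` argument names, the example URL and selectors, and the `followsPattern` helper, which is hypothetical and only checks that a plan opens a page first and ends with a verification read.

```javascript
// A login-style flow expressed as data. Argument names such as `url`, `text`,
// `clear`, and `submit` are assumed; check the skill's schema for the real ones.
const steps = [
  { tool: "browser_open", args: { url: "https://example.com/login" } },
  { tool: "browser_extract", args: { mode: "forms" } }, // map the page
  { tool: "browser_type", args: { selector: "#username", text: "demo-user", clear: true } },
  { tool: "browser_type", args: { selector: "#password", text: "demo-pass", submit: true } },
  { tool: "browser_extract", args: { mode: "text", selector: "main" } }, // verify
];

// Hypothetical helper: true when the plan starts with browser_open and
// finishes with a read (browser_extract or browser_eval) that verifies state.
function followsPattern(plan) {
  if (plan.length < 2) return false;
  const first = plan[0].tool;
  const last = plan[plan.length - 1].tool;
  return (
    first === "browser_open" &&
    (last === "browser_extract" || last === "browser_eval")
  );
}

console.log(followsPattern(steps));
```

Ending on a read rather than an action is the point of the last step: a click or submit can fail silently, so the final extract is what confirms the page actually changed.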
Constraints
- Requires Playwright runtime (`playwright` package) and browser engines.
- Install once per environment:
  - `npm install playwright`
  - `npx playwright install chromium firefox webkit`
- Default behavior is headed (visible browser window). Toggle `headless` from the dashboard skill config when needed.
- For Linux desktopless environments, use `headless=true` or run with a virtual display.
- `browser_eval` runs arbitrary page JavaScript; use only task-relevant scripts and avoid unsafe actions.
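For the Linux case, one way to decide between headed and headless is to check whether a display server is advertised in the environment. This is a minimal sketch under that assumption: `shouldRunHeadless` is a hypothetical helper, not part of the skill, and it relies on the common convention that X11 sessions set `DISPLAY` and Wayland sessions set `WAYLAND_DISPLAY`.

```javascript
// Hypothetical helper: suggest headless only on Linux when no display is advertised.
// On other platforms it keeps the skill's headed default.
function shouldRunHeadless(platform = process.platform, env = process.env) {
  if (platform !== "linux") return false;
  // X11 sessions normally set DISPLAY; Wayland sessions set WAYLAND_DISPLAY.
  const hasDisplay = Boolean(env.DISPLAY || env.WAYLAND_DISPLAY);
  return !hasDisplay;
}

console.log(shouldRunHeadless("linux", {}));                 // true
console.log(shouldRunHeadless("linux", { DISPLAY: ":99" })); // false
console.log(shouldRunHeadless("darwin", {}));                // false
```

A virtual display such as Xvfb typically sets `DISPLAY`, so under this check a headed browser would still be chosen when one is running.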