Action Types
When the AI processes your request, it responds with one or more actions. Each action is a structured instruction that the extension executes on the page. This page documents every action type, its fields, and when the AI uses it.
How actions work
The AI's response is a JSON object with an actions array. Each action has a
type field and additional fields depending on the type. The extension executes
actions in order.
{
  "actions": [
    { "type": "navigate", "target": "/account" },
    { "type": "show-message", "message": "Navigating to your account page." }
  ]
}
The AI can return multiple actions in a single response. For example, it might navigate to a page, then show a message explaining what it did. Actions are executed sequentially.
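The in-order execution described above can be sketched as a plain loop over the actions array. This is an illustration only: ActionLike, runActions, and the handler map are hypothetical names, the handlers are synchronous for brevity, and throwing on an unknown type is an assumption rather than documented behavior.

```typescript
// Hypothetical, trimmed-down action shape used only for this sketch.
type ActionLike = { type: string; message?: string; [field: string]: unknown };
type Handler = (action: ActionLike) => void;

// Runs each action in array order and returns the types that were executed.
function runActions(actions: ActionLike[], handlers: Record<string, Handler>): string[] {
  const executed: string[] = [];
  for (const action of actions) {
    const handler = handlers[action.type];
    if (!handler) {
      // Assumption: an unrecognized type aborts the run.
      throw new Error(`Unknown action type: ${action.type}`);
    }
    handler(action); // finishes before the next action starts
    executed.push(action.type);
  }
  return executed;
}

// Stand-in handlers that only record what they were asked to do.
const log: string[] = [];
const order = runActions(
  [
    { type: "navigate", target: "/account" },
    { type: "show-message", message: "Navigating to your account page." },
  ],
  {
    navigate: (a) => log.push(`navigate ${String(a.target)}`),
    "show-message": (a) => log.push(`message ${a.message ?? ""}`),
  },
);
```

A real executor would likely be asynchronous, since navigation and network calls do not complete immediately, but the ordering guarantee is the same.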
navigate
Navigates the browser to a different URL. The AI uses this to move between pages, either using routes from a recipe or URLs it discovers in the DOM.
| Field | Type | Description |
|---|---|---|
| type | "navigate" | Action type identifier. |
| target | String | The URL path to navigate to (e.g., /account, /products?q=shoes). |
| message | String (optional) | A message to show the user explaining the navigation. |
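The table does not say how a path-style target becomes a full URL. One plausible reading, sketched here, is that it is resolved against the current page URL the same way a relative link would be. The resolveTarget helper and the example.com origin are illustrative assumptions, not part of the extension.

```typescript
// Assumption: targets are resolved against the current page URL.
// The standard URL constructor performs this resolution.
function resolveTarget(target: string, currentUrl: string): string {
  return new URL(target, currentUrl).href;
}

const here = "https://example.com/demos/ginko/home";
const absolute = resolveTarget("/demos/ginko/deposit", here);
const withQuery = resolveTarget("/products?q=shoes", here);
// absolute  -> "https://example.com/demos/ginko/deposit"
// withQuery -> "https://example.com/products?q=shoes"
```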
When the AI uses it
- User asks to go to a specific page: "Go to settings"
- User asks to do something on a different page: "Check my balance" (navigates to account page first)
- As part of a multi-step task that spans pages
Example
{
  "type": "navigate",
  "target": "/demos/ginko/deposit",
  "message": "Going to the deposit page."
}
click
Clicks on a DOM element identified by a CSS selector. Used for buttons, links, checkboxes, and any clickable element.
| Field | Type | Description |
|---|---|---|
| type | "click" | Action type identifier. |
| selector | String | CSS selector for the element to click. |
| message | String (optional) | A message explaining what was clicked. |
When the AI uses it
- User asks to press a button: "Submit the form"
- User asks to click a link: "Click on the cart icon"
- To trigger UI state changes: opening menus, expanding sections, etc.
Example
{
  "type": "click",
  "selector": "#deposit-form button[type='submit']",
  "message": "Clicking the submit button on the deposit form."
}
execute-js
Executes arbitrary JavaScript code on the page. This is the most powerful and flexible action type. The AI uses it for typing into inputs, manipulating DOM state, reading values, and any interaction that doesn't fit neatly into click or navigate.
| Field | Type | Description |
|---|---|---|
| type | "execute-js" | Action type identifier. |
| code | String | JavaScript code to execute in the page context. The code runs via eval() in the content script. |
| message | String (optional) | A message explaining what the code does. |
When the AI uses it
- Filling in form fields: setting .value and dispatching input events
- Scrolling the page to reveal content
- Reading text content from specific elements
- Complex DOM manipulation that requires multiple steps
- Triggering framework-specific events (React, Vue, etc.)
Example
{
  "type": "execute-js",
  "code": "const el = document.querySelector('#amount'); el.value = '5000'; el.dispatchEvent(new Event('input', { bubbles: true }));",
  "message": "Entering 5000 into the amount field."
}
Dispatch events (input and change) after setting values so that frontend frameworks (React, Vue, etc.) detect the change. Simply setting .value is not enough for most modern apps.
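The set-then-dispatch pattern above can be captured in a small helper that builds the code string for an execute-js action. This is a sketch under assumptions: buildFillCode is a hypothetical name, and using JSON.stringify to escape the selector and value is one reasonable way to keep quotes in either from breaking the generated code.

```typescript
// Hypothetical helper: builds the `code` string for an execute-js action
// that fills a single input and fires both input and change events.
function buildFillCode(selector: string, value: string): string {
  return [
    // JSON.stringify yields a quoted, escaped JavaScript string literal.
    `const el = document.querySelector(${JSON.stringify(selector)});`,
    `el.value = ${JSON.stringify(value)};`,
    `el.dispatchEvent(new Event('input', { bubbles: true }));`,
    `el.dispatchEvent(new Event('change', { bubbles: true }));`,
  ].join(" ");
}

const fillAction = {
  type: "execute-js",
  code: buildFillCode("#amount", "5000"),
  message: "Entering 5000 into the amount field.",
};
```

Like the example above, the generated code does not guard against a selector that matches nothing. Some frameworks may also need more than these two events, so treat this as a starting point.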
show-message
Displays a text message to the user in the gyoza chat. This is the AI's way of communicating information, answering questions, or explaining what it's doing. No DOM interaction occurs.
| Field | Type | Description |
|---|---|---|
| type | "show-message" | Action type identifier. |
| message | String | The text to display to the user. |
When the AI uses it
- Answering a question: "What is my balance?" — reads the page, replies with the amount
- Explaining an action: "I just navigated to the deposit page"
- Providing context or instructions
- Translating page content for the user
Example
{
  "type": "show-message",
  "message": "Your current balance is 150,000 JPY. The most recent transaction was a deposit of 20,000 JPY on March 15."
}
highlight-ui
Visually highlights a DOM element by adding a temporary colored border or overlay. Used to draw the user's attention to a specific part of the page without clicking or modifying it.
| Field | Type | Description |
|---|---|---|
| type | "highlight-ui" | Action type identifier. |
| selector | String | CSS selector for the element to highlight. |
| message | String (optional) | A message explaining what's being highlighted. |
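A temporary highlight implies remembering the element's previous style and restoring it afterwards. The following is a minimal sketch of that idea, assuming the highlight is drawn with the CSS outline property. The Styled interface, the highlight function, and the color are hypothetical, and the extension may use a border or an overlay instead.

```typescript
// Minimal structural type so the sketch also runs outside a browser.
interface Styled {
  style: { outline: string };
}

// Applies the highlight and returns a function that restores the old style.
function highlight(el: Styled, color: string = "orange"): () => void {
  const previous = el.style.outline;
  el.style.outline = `3px solid ${color}`;
  return () => {
    el.style.outline = previous;
  };
}

const fakeElement: Styled = { style: { outline: "" } };
const removeHighlight = highlight(fakeElement);
// A caller could schedule the cleanup, e.g. setTimeout(removeHighlight, 2000).
```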
When the AI uses it
- User asks where something is: "Where is the logout button?"
- Guiding the user through a process: "Click the button I highlighted"
- Pointing out specific information on the page
Example
{
  "type": "highlight-ui",
  "selector": ".nav-logout",
  "message": "The logout button is in the top-right navigation area. I've highlighted it for you."
}
fetch
Makes an HTTP request to an API endpoint. The AI uses this when a recipe includes
<api-endpoints> or when it determines that calling an API directly is more
efficient than interacting with the UI.
| Field | Type | Description |
|---|---|---|
| type | "fetch" | Action type identifier. |
| url | String | The URL to request. |
| method | String (optional) | HTTP method: GET, POST, PUT, PATCH, DELETE. Defaults to GET. |
| message | String (optional) | A message explaining the API call. |
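The GET default and the list of allowed methods can be sketched as a small normalization step. The FetchAction interface and toRequest function are hypothetical names. Uppercasing the method and rejecting anything outside the listed five are assumptions about how strictly the field is handled.

```typescript
// Only url, method, and message come from the table above.
interface FetchAction {
  type: "fetch";
  url: string;
  method?: string;
  message?: string;
}

const ALLOWED_METHODS = ["GET", "POST", "PUT", "PATCH", "DELETE"];

// Normalizes a fetch action into the pieces needed to issue the request.
function toRequest(action: FetchAction): { url: string; method: string } {
  const method = (action.method ?? "GET").toUpperCase(); // defaults to GET
  if (ALLOWED_METHODS.indexOf(method) === -1) {
    // Assumption: methods outside the documented list are rejected.
    throw new Error(`Unsupported HTTP method: ${method}`);
  }
  return { url: action.url, method };
}

const request = toRequest({ type: "fetch", url: "/api/cart" });
// In a browser this could be issued as fetch(request.url, { method: request.method }).
```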
When the AI uses it
- Recipe lists API endpoints and the task is best done via API
- Fetching data that isn't visible on the current page
- Submitting data programmatically (e.g., adding an item to cart via API)
Example
{
  "type": "fetch",
  "url": "/api/cart",
  "method": "GET",
  "message": "Checking the current cart contents."
}
clarify
Asks the user for clarification when the request is ambiguous. The AI presents a question and optionally a list of choices for the user to pick from.
| Field | Type | Description |
|---|---|---|
| type | "clarify" | Action type identifier. |
| message | String | The question to ask the user. |
| options | String[] (optional) | A list of choices the user can pick from. Displayed as buttons in the widget. |
When the AI uses it
- Ambiguous request: "Transfer money" — the AI asks "To which account?"
- Multiple options: "Open a product page" — the AI shows a list of products to choose
- Missing required information: "Deposit some money" — the AI asks for the amount
Example
{
  "type": "clarify",
  "message": "How much would you like to deposit?",
  "options": ["5,000 JPY", "10,000 JPY", "50,000 JPY", "Other amount"]
}
Action response schema
The complete schema for the AI's JSON response, expressed as TypeScript types. All fields are validated using Zod on the engine side.
// Action types
type ActionType = "navigate" | "click" | "execute-js" | "show-message"
  | "highlight-ui" | "fetch" | "clarify";
// Single action
interface Action {
  type: ActionType;
  target?: string;    // navigate: URL path
  selector?: string;  // click, highlight-ui: CSS selector
  code?: string;      // execute-js: JavaScript code
  message?: string;   // user-facing message
  url?: string;       // fetch: request URL
  method?: string;    // fetch: HTTP method
  options?: string[]; // clarify: choice options
}
// Full AI response
interface ActionResponse {
  actions: Action[];        // at least one action
  extraRequests?: string[]; // optional: request more page context
}
Extra requests
The AI can optionally include an extraRequests array in its response. This tells
the extension to gather additional page context and send it in the next turn. This is useful
when the AI needs more information to complete the task.
| Request type | Description |
|---|---|
| buttonsSnapshot | A list of all buttons currently visible on the page. |
| linksSnapshot | A list of all links on the page with their text and href. |
| formsSnapshot | A snapshot of all forms and their fields. |
| inputsSnapshot | A list of all input fields with their current values. |
| textContentSnapshot | The full text content of the page body. |
| fullPageSnapshot | A complete DOM snapshot of the page (expensive, used as last resort). |
Extra requests enable a conversational loop: the AI performs an action, then asks for more context about the resulting page state to inform its next action.
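The extension's side of this loop can be sketched as a lookup from request names to collector functions. The request names come from the table above. The Collectors type, gatherContext, and the stand-in collectors are hypothetical, and skipping unknown request names is an assumption.

```typescript
const REQUEST_TYPES = [
  "buttonsSnapshot",
  "linksSnapshot",
  "formsSnapshot",
  "inputsSnapshot",
  "textContentSnapshot",
  "fullPageSnapshot",
] as const;
type RequestType = (typeof REQUEST_TYPES)[number];
type Collectors = Record<RequestType, () => unknown>;

// Gathers only the snapshots the AI asked for, ready to send on the next turn.
function gatherContext(extraRequests: string[], collectors: Collectors): Record<string, unknown> {
  const context: Record<string, unknown> = {};
  for (const name of extraRequests) {
    if ((REQUEST_TYPES as readonly string[]).indexOf(name) === -1) {
      continue; // assumption: unknown request names are ignored
    }
    context[name] = collectors[name as RequestType]();
  }
  return context;
}

// Stand-in collectors; real ones would inspect the DOM.
const fakeCollectors: Collectors = {
  buttonsSnapshot: () => ["Deposit", "Withdraw"],
  linksSnapshot: () => [{ text: "Account", href: "/account" }],
  formsSnapshot: () => [],
  inputsSnapshot: () => [],
  textContentSnapshot: () => "",
  fullPageSnapshot: () => "",
};
const nextTurnContext = gatherContext(["buttonsSnapshot", "linksSnapshot", "unknownSnapshot"], fakeCollectors);
```

Collecting only the requested snapshots keeps each turn small, which fits the table's note that fullPageSnapshot is expensive and meant as a last resort.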