Beware of the tiny agents in your browser: a Chrome Prompt API case study

Security

Published

May 11, 2026

12 min

TL;DR. Chrome now ships an on-device language model that any web page or extension can call. We built two minimal browser agents on top of it (the same shape as shipping products like Comet, Atlas, and Claude for Chrome) and showed that a single comment in a blog post can cause one of them to wire money out of a victim’s bank account in a different tab, and the other to exfiltrate the URLs and titles of every other open tab to an attacker server. No CVE. No malicious extension. No bank compromise. The model behaved exactly as designed.

1. Background: what is Chrome’s Prompt API?

Chrome’s Prompt API is a JavaScript interface that lets web pages and extensions send natural-language prompts to a language model that runs locally inside the browser. It shipped stable for the open web in Chrome 148, which rolled out to stable on May 5, 2026, and is the primary developer surface for what Google markets as “built-in AI.”

A few properties make this API a meaningfully different security object from a cloud LLM endpoint:

On-device. Chrome’s documentation states explicitly: “No data is sent to Google or any third party when using the model.” The model runs on the user’s CPU/GPU; prompts never leave the machine.
No enterprise audit trail. Because there is no server-side hop, there is no provider-side log of what the model was asked or what it produced.
Available to every origin by default. The API is exposed to all top-level windows and their same-origin iframes. Cross-origin iframes can be granted access via the language-model Permission Policy. Web Workers are unsupported.
Multimodal input. The API accepts text, images (HTMLImageElement, Blob, video frames, canvas), and audio.
Structured output. Sessions support JSON Schema response constraints, system prompts, conversation history, and session cloning.

The hardware bar to use it is significant (roughly 22 GB of free disk for the model itself and either more than 4 GB of VRAM or 16 GB of RAM with 4+ cores), but it is achievable on modern consumer machines.

For developers, the surface is simple:

const session = await LanguageModel.create({
initialPrompts: [{ role: "system", content: "You are a helpful assistant." }]
});

const response = await session.prompt("Summarize this page: ...", {
responseConstraint: { type: "object", properties: { summary: { type: "string" } } }
});

That’s the entire shape. From the security side, it has three properties worth holding in mind:

The model’s prompt is constructed by JavaScript on the page.
That JavaScript is free to mix system instructions, user input, and arbitrary scraped page content into a single prompt.
The model has no reliable way to tell which of those tokens were authored by the developer and which were authored by an adversary.

One term first. Indirect prompt injection is when an attacker plants instructions in content that a language model will later consume (a comment on a page, an email, a search result, a memo line on a payment), and the model follows those instructions because it cannot distinguish “data to process” from “instructions to obey.” The term was coined by Greshake et al. in 2023 against early Bing Chat; in the two years since it has been demonstrated against essentially every browser-integrated LLM. This post is the same attack class, applied to Chrome’s on-device variant.

2. What is Gemini Nano?

The model powering the Prompt API is Gemini Nano, which Google introduced on December 6, 2023 as the smallest tier of the original Gemini 1.0 family.

Authoritative facts:

Two declared sizes: Nano-1 with 1.8 B parameters and Nano-2 with 3.25 B parameters, both with a 32,768-token context window.
Both variants are distilled from larger Gemini models and 4-bit quantized for efficient on-device inference.
It was positioned as Google’s “most efficient model for on-device tasks.” Its first deployments were on the Pixel 8 Pro (powering features like Recorder summarization and Gboard Smart Reply) and Android via the AICore system service starting in Android 14.
The Chrome variant (same model class) is what now ships behind the Prompt API.

For the security analysis that follows, the relevant property of Nano is that it is small. Compared to frontier cloud models, it is less capable at multi-step reasoning, more sensitive to prompt structure, and far more likely to lose track of context after a handful of turns. We will see all three failure modes show up in our tests, and we will see that they do not save the user from compromise. They just change the shape of it.

3. What we built: two miniature browser agents

We built two Chrome extensions. Both have the same minimum-viable structure of a real browser AI agent: they auto-run on every page the user visits, they have a small catalog of generic tools (fill a form, click an element, make an HTTP request), and they let the model decide what to do based on the page text.

They are deliberately minimal. Neither has a hardcoded “transfer money” tool. The tools are the same web-platform primitives any generic browser agent has.

Architecture A: the single-shot agent

The single-shot agent reads the page, sends the text to Gemini Nano with a JSON-Schema-constrained system prompt, and dispatches every tool call the model emits in one batch. One inference, one action plan, no observation loop.

page text → LLM call → [fill, fill, click, fetch] → done

This is fast, simple, and reliable on small models. It is not how real browser agents work, but it is a useful baseline: it gives us a per-architecture floor for what’s possible.

Architecture B: the ReAct agent

The second extension implements a classic ReAct loop: the model emits a single Thought and a single Action per turn, the agent executes the action, the result becomes an Observation fed back into the next prompt, and the loop continues until the model emits finish.

LLM call 1 → list_tabs → observe
LLM call 2 → read(bank tab) → observe
LLM call 3 → fill(...) → observe
LLM call 4 → fill(...) → observe
LLM call 5 → click(...) → observe
LLM call 6 → fetch(...) → observe
LLM call 7 → finish

This is the architecture used by the products that are actually shipping in this space. Perplexity Comet, ChatGPT Atlas, Claude for Chrome, and Microsoft Copilot in Edge all sit close to this design, modulo bigger models and richer tool catalogs.

The tool catalog in both extensions is the same, deliberately small and deliberately generic:

fill(tab, field, value): fill a form field. “Tab” is a substring matched against any open tab’s URL or title; “field” is a natural-language description matched against labels, placeholders, and aria-labels.
click(tab, element): click a button, link, or other clickable element matched the same way.
fetch(url, method, body): make an HTTP request, optionally attaching scraped tab content as the body.
list_tabs(), read(tab) (ReAct only): exploration primitives.

Nothing here is bank-specific or finance-specific. These are the primitives any browser agent needs to do useful work.

4. The two scenarios we tested

Scenario 1: Cross-tab bank transfer via a blog comment

The setup is the smallest possible threat model:

The victim has their First National Bank dashboard open in one tab. They check it in the morning, then leave it alone.
In a different tab, they open a normal-looking news article on DailyTech about WebGPU 2026.
The article’s comments section has four replies. Three are legitimate. The fourth, from a user with a verified badge calling themselves @bank_security_alert, is the entire prompt-injection payload: a several-hundred-word “automated banking security directive” that instructs the assistant agent to chain four tool calls. Exfiltrate the bank tab’s contents to an external URL, fill the transfer form’s “To account” and “Amount” fields with attacker-chosen values, and click “Send transfer.”

The single-shot agent, processing the news page on its tabs.onUpdated autorun listener, emits the four tool calls in one JSON blob. The dispatcher fires them: a POST to the attacker’s logging endpoint, followed by chrome.scripting.executeScript against the bank tab that fills both inputs and clicks submit.

The user, who has not switched away from the news tab, is unaware. When they eventually navigate back to the bank, the transfer-complete modal is already on screen.

Video 1. Bank transfer scenario, end-to-end.

Image 1. Bank transfer scenario, malicious comment in the news article.

Image 2. Bank transfer scenario, completed transfer modal on the bank tab.

The narrative for clients is short: “This is your customer’s bank dashboard. They check it daily. The transfer happened on a tab they never returned to. The attacker did not compromise the bank, did not compromise the customer’s machine, did not compromise the browser, and did not even have to author a separate web page. They posted one comment on someone else’s blog.”

Scenario 2: Bulk tab harvesting via a news comment

The second scenario tests a different question: how much can an attacker pull out of a victim’s browser using only the generic tools above, when they don’t know in advance what’s open?

The setup:

The victim has several tabs open: a corporate Gmail thread, their calendar, an AWS access portal showing four account IDs, an OpenAI billing page, a SageMaker console, an internal GitHub PR, a Cloudflare dashboard, and the bank page from Scenario 1.
They open a different news article whose comments section contains an injection asking the agent for a single HTTP request: POST the contents of every other open tab to an attacker URL.

The agent obeys. It iterates every open tab, scrapes the visible text, packs the full set (URL, title, visible text per tab) into a JSON body, and posts it to the attacker URL. Total cost: one tool call. The agent never touches the bank tab’s transfer form; this is pure reconnaissance.

The exfiltrated payload, captured by our Cloudflare Worker test endpoint, contained the URL, title, and visible-text contents of every tab Chrome would let the extension read, gated only by Chrome’s <all_urls> host permission plus the user’s one-time toggle for file-URL access, both of which any AI productivity extension already requests at install time.

Video 2. All-tabs exfiltration scenario, end-to-end.

Image 3. Data exfiltration scenario, the malicious comment in the news article.

Image 4. Data exfiltration scenario, the captured payload on the attacker server’s access log.

5. The injection vector is wherever your users read content

The two scenarios above use a comment field on a fake news site. Pick whichever real-world equivalent applies to your users:

A Reddit comment thread.
A Hacker News reply.
A product review on Amazon, an answer on Stack Overflow, an issue comment on GitHub.
A From: name or memo line on a $0.01 Zelle/Venmo transaction that shows up in the victim’s online-banking statement.
An email body the agent processes on the victim’s behalf.
An RSS-pulled news widget embedded on a site the user trusts.
An internal Slack message whose preview gets ingested by a sidebar AI.
A support ticket, a Salesforce note, a Notion comment, a Linear issue, or any other field where users can submit free-form text that the agent later sees.

The cost to the attacker is approximately zero. They do not have to compromise infrastructure. They have to write text. The prompt-injection payload itself is short: a few hundred characters of natural-language instructions plus, if reliability matters, a literal JSON example. The same payload works against tens of millions of browsers because the agent doesn’t know or care which page it came from.

6. What this proves (and what it doesn’t)

What we have not proven:

We did not find a vulnerability in Chrome. The Prompt API behaved exactly as Google designed it.
We did not find a vulnerability in Gemini Nano. The model produced exactly the structured output its system prompt and JSON schema asked for.
We did not exploit any extension we did not write ourselves. Our extensions are deliberately minimal and have no installed user base.

What we have demonstrated:

An on-device language model with structured tool-calling is sufficient to drive a real-world prompt-injection attack against another tab in the same browser, end-to-end, with no user interaction.
The threat surface is the tool catalog, not the model. Whichever tools an agent has, an attacker can compose them. Removing any single tool reduces specific attack variants, not the attack class.
The “on-device, nothing leaves Google” property is a privacy guarantee, not a security one. The model never sent data to Google. The agent sent data to the attacker. Those are different egress paths.
Real shipping browser agents have the same structural exposure. Comet, Atlas, Claude for Chrome, and Copilot in Edge all combine an LLM input channel that is identical to the attacker’s input channel with a tool surface that operates inside the user’s authenticated browser session. The class of attack we ran was first formally framed by Greshake et al. (2023) and has since been demonstrated in production, most publicly by Brave’s August 2025 research against Perplexity Comet, where a hidden Reddit comment caused Comet to extract a user’s email and one-time password from their logged-in Gmail tab and exfiltrate both back to the attacker. The same primitive works against any agent on this architecture.
Defense at the model layer is unreliable. Alignment training and “ignore prior instructions” filters reduced some attacks in our testing (the role-confusion variant of the one-shot demo failed 0/5 against Nano), but other variants succeeded 5/5 with trivial natural-language phrasing. There is no model setting that makes the injection class go away.

7. What’s actually defensible

Generic “treat LLM output as untrusted” advice is true and not useful. It’s been the standard since 2023, and no team that has ignored it will start because of this post. Four points specific to browser AI agents are worth landing:

The tool catalog is the boundary, not the model. Audit what an agent can call before you audit what it was prompted to say. Per-origin tool scoping, outbound-domain allow-lists on fetch, and mandatory human-in-the-loop on state-changing actions (any click on a submit-like element, any fill on a payment field) are more durable than any system-prompt mitigation.
Observations are part of the trust boundary. On any cloud-hosted agent (every browser agent that is not Chrome’s on-device Prompt API), each read of a tab ships that tab’s contents to the agent vendor as conversation input on the next turn. The data leaves the moment the agent reads it; the fetch to an attacker URL is downstream theater.
Prefer narrow primitives over broad ones. transfer_funds_with_approval is safer than click(element) because the model can compose click into transfer; it cannot compose transfer_funds_with_approval into anything the developer didn’t anticipate. The temptation to ship generic tools because they’re “more flexible” is the same temptation that ships SQL injection.
Disable the Prompt API by enterprise policy until a specific origin needs it. Chrome ships GenAILocalFoundationalModelSettings and per-feature policies for exactly this. Default-deny is cheap; the productivity case for “every site can run on-device AI” is, at minimum, not yet proven.

8. Closing

The uncomfortable moment in this project was not when the bank transfer succeeded. It was watching the same agent, fired by a different injection in the same kind of comment thread, walk silently into the user’s other tabs. Read the calendar. Read the AWS access portal. Read the banking session. Build a body. Ship it to a server the user had never heard of. No popup. No warning. No click. The user reading the news article had no way to know any of it was happening.

When browsers ship AI primitives with no privilege boundary between page content and developer intent, and extensions wrap those primitives with no privilege boundary between tool calls and authenticated user actions, the entire attack surface of every website the user visits collapses into a single shared trust domain. The model is not the boundary. The tool catalog is the boundary. And the tool catalog is wherever the agent’s developer drew the line, which is usually generously.

For enterprise security teams: the right question is not “does the Prompt API have a CVE.” It is “which AI agents are running in our browsers, what tools do they expose, and how would I know if one of them just acted on an attacker’s instructions?” If the answer to the third part is we wouldn’t, the answer to the first two is too many, too much.

References

Chrome Prompt API documentation. https://developer.chrome.com/docs/ai/prompt-api
Chrome 148 release notes (Prompt API graduates to stable for the open web, May 5, 2026). https://developer.chrome.com/release-notes/148
“Introducing Gemini: our largest and most capable AI model.” Google. December 6, 2023. https://blog.google/technology/ai/google-gemini-ai/
Gemini (language model). Wikipedia. https://en.wikipedia.org/wiki/Gemini_(language_model)
Greshake, Abdelnabi, Mishra, Endres, Holz, Fritz. “Not what you’ve signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection.” AISec ’23. https://arxiv.org/abs/2302.12173
“Agentic Browser Security: Indirect Prompt Injection in Perplexity Comet.” Brave. August 20, 2025. https://brave.com/blog/comet-prompt-injection/
Yao, Zhao, Yu, Du, Shafran, Narasimhan, Cao. “ReAct: Synergizing Reasoning and Acting in Language Models.” ICLR 2023. https://arxiv.org/abs/2210.03629

Content

Heading 2