SafeBreach disclosed an indirect prompt injection issue in Google Gemini’s voice assistant that is delivered through messaging notifications.
The attack did not need a malicious app, a browser exploit, or direct access to Google’s model.
SafeBreach showed that a message notification could be enough.
In research published on June 3, 2026, SafeBreach Labs disclosed a new class of indirect prompt injection attacks against Google Gemini’s voice assistant. The technique used seemingly ordinary notifications from messaging apps to feed hidden instructions into Gemini, turning the assistant’s convenience feature into an input channel for attacker-controlled prompts.
What SafeBreach Found
The issue centered on Gemini’s ability to read and summarize notifications on Android devices. SafeBreach researcher Or Yair found that Gemini’s Android Utilities agent processed untrusted notification content from incoming messages. That meant an attacker who could send a message through WhatsApp, Slack, SMS, Signal, Instagram, Messenger, or another notification-capable app could potentially place malicious instructions where Gemini would later read them.
The user did not have to paste a prompt into Gemini. The malicious instruction could arrive indirectly through a message notification and become part of the assistant’s conversational context when the user asked Gemini to read or summarize messages.
SafeBreach described this as notification-based indirect prompt injection. The attack surface is broad because almost any app capable of producing a message notification can become a delivery path.
The Technique: Fake Context Alignment
SafeBreach said it bypassed Google’s earlier mitigations using a technique it called “Fake Context Alignment.” The core idea was to make malicious instructions appear contextually legitimate enough for Gemini to process them while keeping the user unaware of the prompt manipulation.
The research demonstrated that attackers could hide instructions in foreign-language text or muted hyperlink content that Gemini interpreted but did not clearly surface to the user. In a voice-assistant scenario, that gap is especially dangerous. The victim may only hear Gemini’s clean spoken summary, not the suspicious raw message or hidden instruction behind it.
SafeBreach said this could let an attacker control Gemini’s output, generate social engineering messages, open URLs, poison long-term memory, schedule recurring actions, or attempt to trigger tools connected to the victim’s assistant.
Why Voice Makes This More Dangerous
Voice assistants compress context. That is useful when a user is driving, walking, cooking, or unable to look at the screen. It is also where the attack is hardest to spot.
If a user manually reads a suspicious WhatsApp or SMS message, they may notice the unknown sender, odd formatting, strange link, or awkward wording. But when Gemini summarizes the notification aloud, those visual warning signs can disappear. The assistant may relay the attacker’s desired message in a trusted voice, stripped of the context that would have helped the user detect the scam.
SafeBreach highlighted a particularly risky social engineering case: Gemini could be manipulated into presenting a fake message as though it came from a trusted contact. In some scenarios, the attacker would not even need to know the contact’s name in advance. The payload could instruct Gemini to borrow an authentic sender name from existing notifications and attribute the attacker’s message to that person.
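One way to reduce the spoofed-sender risk described above is to have trusted code, not the model, attach the sender name to whatever gets read aloud. The following is a minimal sketch under stated assumptions: the `Notification` record, `render_summary`, and `mentions_other_contact` are hypothetical names for illustration, not part of Gemini, Android, or SafeBreach's research. It also assumes the sender field comes from notification metadata, which an attacker may still partly control through a display name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Notification:
    # Hypothetical record; field names are illustrative assumptions.
    app: str
    sender: str  # taken from notification metadata, not from the model
    body: str    # attacker-writable text


def render_summary(notification: Notification, model_summary: str) -> str:
    """Prefix the spoken summary with the sender taken from metadata.

    The model only summarizes the body. Attribution is added by this
    function, so a payload cannot change the prefix itself.
    """
    return (
        f"Message from {notification.sender} via {notification.app}: "
        f"{model_summary}"
    )


def mentions_other_contact(summary, actual_sender, known_contacts):
    """Return known contact names that appear in the summary but are
    not the actual sender. A non-empty result is a reason to warn the
    user, not proof of an attack (simple substring match)."""
    lowered = summary.lower()
    return sorted(
        name
        for name in known_contacts
        if name.lower() != actual_sender.lower() and name.lower() in lowered
    )
```

If the model's summary claims a message is from Alice while the metadata says it came from an unknown number, `mentions_other_contact` returns `["Alice"]`, and the assistant could fall back to reading the raw sender field and a warning.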
Potential Impact
The demonstrated impact went beyond misleading summaries. SafeBreach said the research reproduced several high-impact scenarios previously explored in its Gemini promptware work, including manipulation of assistant output, phishing and spam generation, opening URLs, interaction with connected tools, and app-boundary crossing through application URIs.
Examples cited by SafeBreach included controlling smart home devices such as connected windows, lights, or boilers; opening URLs that could expose a victim’s IP-based location or initiate downloads; and launching unauthorized video streams through apps such as Zoom.
The most important defender takeaway is not that every Gemini user was automatically compromised. It is that AI assistants increasingly process hostile data from ordinary user workflows. Notifications, calendar invites, emails, shared documents, and chat messages are not neutral context. They are attacker-writable input.
Google’s Response
SafeBreach said it responsibly disclosed the issue to Google and that Google has since rolled out content classifier updates to mitigate it.
Google has separately documented its layered defense strategy for indirect prompt injection in Gemini, including prompt injection content classifiers, security-focused instruction reinforcement, markdown sanitization, suspicious URL redaction, user confirmation for risky actions, end-user mitigation notices, and model hardening.
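Two of the defenses in that list, markdown sanitization and suspicious URL redaction, can be illustrated with a short sketch. This is not Google's implementation. It is a regex-based approximation, and `ALLOWED_HOSTS`, `sanitize`, and the `[link removed]` placeholder are assumptions made for the example. It flattens markdown links so a hidden target becomes visible text, then removes any URI whose scheme is not `https` or whose host is not on the allowlist, which also covers application URIs.

```python
import re
from urllib.parse import urlparse

# Hypothetical allowlist; a real deployment would manage this differently.
ALLOWED_HOSTS = {"example.com"}

# [label](target) style markdown links.
MD_LINK = re.compile(r"\[([^\]]*)\]\(([^)\s]+)\)")
# Any scheme://... URI, including app URIs, not only http(s).
ANY_URI = re.compile(r"[A-Za-z][A-Za-z0-9+.\-]*://[^\s)>\]]+")


def _redact(uri: str) -> str:
    """Keep only https links to allowlisted hosts."""
    try:
        parsed = urlparse(uri)
    except ValueError:
        return "[link removed]"  # malformed URI, treat as suspicious
    host = (parsed.hostname or "").lower()
    if parsed.scheme.lower() == "https" and host in ALLOWED_HOSTS:
        return uri
    return "[link removed]"


def sanitize(text: str) -> str:
    """Flatten markdown links, then redact non-allowlisted URIs."""
    flattened = MD_LINK.sub(lambda m: f"{m.group(1)} ({m.group(2)})", text)
    return ANY_URI.sub(lambda m: _redact(m.group(0)), flattened)
```

A filter like this runs before the notification text reaches the model. It does nothing about instructions written in plain prose, which is why it is only one layer among several.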
That layered approach matters, but SafeBreach’s research shows why this category remains difficult. The attacker does not need to defeat an authentication system in the traditional sense. They need to get hostile text into a place the assistant is trusted to read.
Why This Matters for Security Teams
This is an early warning for enterprises adopting AI assistants across mobile, collaboration, productivity, and workplace platforms.
Security teams are used to classifying emails, URLs, attachments, and apps as risky. Agentic AI adds another layer: any external content that an assistant can read may become an instruction surface. A Slack message is no longer just a Slack message if an AI assistant can summarize it, reason over it, and trigger actions from it.
For enterprise environments, the risk is highest where users connect AI assistants to work accounts, messaging apps, calendars, smart devices, or workflow automation tools. The more permissions the assistant has, the more damaging context manipulation can become.
What Defenders Should Watch
Organizations should treat AI assistant integrations as privileged software, not just productivity features. That means reviewing which apps can expose content to assistants, limiting tool permissions where possible, requiring explicit user confirmation for sensitive actions, and monitoring for unusual assistant-triggered workflows.
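The permission and confirmation controls above can be expressed as a small policy check that runs before any tool call. This is a sketch under assumptions: the tool names, the `ToolRequest` record, and the `decide` function are hypothetical and do not correspond to any real assistant API. The one design choice worth noting is that a request is escalated to confirmation whenever untrusted content is in context, even for tools that are otherwise considered safe.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    CONFIRM = "confirm"  # ask the user explicitly before running
    DENY = "deny"


# Hypothetical tool names, chosen to mirror the actions in this article.
SENSITIVE_TOOLS = {"open_url", "send_message", "smart_home_control", "memory_write"}
DISABLED_TOOLS = {"launch_app_uri"}


@dataclass(frozen=True)
class ToolRequest:
    tool: str
    # True when notifications, emails, or other external text are in context.
    context_has_untrusted_content: bool


def decide(request: ToolRequest) -> Decision:
    """Deny disabled tools, confirm sensitive or tainted requests."""
    if request.tool in DISABLED_TOOLS:
        return Decision.DENY
    if request.tool in SENSITIVE_TOOLS or request.context_has_untrusted_content:
        return Decision.CONFIRM
    return Decision.ALLOW
```

Logging every `CONFIRM` and `DENY` decision also gives security teams a starting point for monitoring unusual assistant-triggered workflows.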
Users should be cautious when a voice assistant relays unexpected requests involving payments, file uploads, links, password resets, account changes, or urgent instructions from a supposed contact. The safest response is to open the original app and inspect the raw message before acting.
For vendors, the lesson is architectural. AI assistants need stronger separation between user commands, untrusted external content, hidden formatting, and executable tool calls. A model that reads everything in one conversational stream will keep creating places where attackers can blur data and instructions.
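A minimal sketch of that separation, assuming a hypothetical `Segment` type and `Channel` labels that are not drawn from any vendor's design: each piece of context carries its provenance, the model receives structured records instead of one merged string, and only segments from the user-command channel are ever consulted when deciding whether an action was requested. Labeling alone does not stop a model from following injected text, so this illustrates the data model rather than a complete defense.

```python
import json
from dataclasses import dataclass
from enum import Enum


class Channel(Enum):
    USER_COMMAND = "user_command"          # spoken or typed by the user
    EXTERNAL_CONTENT = "external_content"  # notifications, emails, documents


@dataclass(frozen=True)
class Segment:
    channel: Channel
    source: str  # e.g. "voice" or the name of the originating app
    text: str


def to_model_input(segments):
    """Serialize segments as structured records, not one text stream."""
    records = [
        {"channel": s.channel.value, "source": s.source, "text": s.text}
        for s in segments
    ]
    return json.dumps(records, ensure_ascii=False)


def authorizing_commands(segments):
    """Only user commands may authorize actions; external content is data."""
    return [s.text for s in segments if s.channel is Channel.USER_COMMAND]
```

With this layout, a notification that says "open this link" is still passed to the model for summarization, but it never appears in `authorizing_commands`, so trusted code has a basis for refusing a tool call the user did not ask for.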
NeuraCyb’s Assessment
SafeBreach’s Gemini notification research is important because it exposes the weak seam in modern AI assistants: they are becoming trusted interpreters of attacker-controlled content.
The next wave of AI abuse will not always look like jailbreak prompts typed into chat boxes. It will look like a calendar invite, a Slack message, a document comment, or a notification read aloud while the user is not looking. Once assistants can act, summarization becomes a security boundary. That boundary now needs to be defended like one.
References
Dark Reading: Malicious Notifications Could Trick Google Gemini Users
Google Security Blog: Mitigating prompt injection attacks with a layered defense strategy