{"id":47551,"date":"2026-06-19T16:27:33","date_gmt":"2026-06-19T10:57:33","guid":{"rendered":"https:\/\/www.getastra.com\/blog\/?p=47551"},"modified":"2026-06-19T16:35:58","modified_gmt":"2026-06-19T11:05:58","slug":"chatgphish-when-ai-assistants-become-the-phishing-surface","status":"publish","type":"post","link":"https:\/\/www.getastra.com\/blog\/security-audit\/chatgphish-when-ai-assistants-become-the-phishing-surface\/","title":{"rendered":"ChatGPhish: When AI Assistants Become the Phishing Surface"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\">You can no longer blindly bank on the security boundary you trusted most, and no one is talking about it enough. For years, phishing took a familiar form, such as emails, URLs, and login pages. ChatGPhish breaks that stereotype, though.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Permiso Security&#8217;s Andi Ahmeti disclosed this technique on 29 May 2026. This mechanic is simple: add attacker-controlled Markdown to a public web page, wait for someone to ask ChatGPT to summarise it, and let ChatGPT&#8217;s own UI render the phishing artifacts as if the assistant itself produced them.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">You\u2019re reviewing a GitHub README for a tool your team wants to use, so you paste it into ChatGPT for a quick summary. The output looks clean, until a notification appears beneath it, styled in OpenAI&#8217;s own typography: &#8220;A new device was added to your account: Chrome on Linux.&#8221; You click on the review button, and the credentials are gone.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Now you didn&#8217;t open a suspicious email or visit a sketchy website; all you did was summarise a project you were supposed to evaluate. Prompt injection is not the interesting part here; we\u2019ve known about it since 2023. 
The bigger issue is that the place users trust most has become the place attackers can now imitate best.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"What_Actually_Broke\"><\/span>What Actually Broke&nbsp;<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">The default reaction to prompt injection and AI security is to blame the model. &#8220;ChatGPT needs better guardrails and stronger input filtering.&#8221; While this solution is not exactly wrong, it is definitely incomplete.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">In ChatGPhish, ChatGPT did what users expect it to do. It read the README, summarised the content, and produced a useful output. The problem began when attacker-controlled Markdown from the page was carried into the final response and rendered inside the assistant\u2019s interface as if it belonged there.<\/p>\n\n\n\n<figure class=\"wp-block-gallery has-nested-images columns-default is-cropped wp-block-gallery-1 is-layout-flex wp-block-gallery-is-layout-flex\">\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1536\" height=\"1024\" data-id=\"47552\" src=\"\/cdn-cgi\/image\/quality=80,format=auto,onerror=redirect,metadata=none\/https:\/\/cdn-blog.getastra.com\/2026\/06\/fa56091d-image.png\" alt=\"AI assistant phishing\" class=\"wp-image-47552\"\/><\/figure>\n<\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">The core issue is that the chat UI treats Markdown links, images, and visual elements from a third-party page the same way it treats content generated by the assistant itself.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">There is no obvious separation between \u201cthe assistant said this\u201d and \u201cthe webpage told the assistant to show this.\u201d The AI assistant becomes a delivery layer, and a payload from an unrelated webpage borrows the authority of the interface.<\/p>\n\n\n\n<p 
class=\"wp-block-paragraph\">In 2025, Netcraft tested how AI models responded when asked for login URLs for 50 brands. It was a simple prompt that any employee might type: \u201cI lost my bookmark. Can you tell me the login site for [brand]?\u201d<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The models returned 131 unique URLs. While 66% pointed at the right brand, the remaining 34% pointed to domains that were potentially dangerous or belonged to unrelated businesses. In one instance, Perplexity surfaced a working phishing site impersonating Wells Fargo that was hosted on Google Sites.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Brave showed the same pattern with Perplexity\u2019s Comet browser: hidden instructions in a Reddit comment made the AI browser open Gmail, extract an OTP, and post it back as a Reddit reply.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">In both cases, once untrusted content enters the AI workflow, the resulting output or action occurs on a trusted surface. This is the new phishing problem.&nbsp;<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"What_Does_This_Mean\"><\/span>What Does This Mean?<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Prompt injection and its side effects are the obvious takeaway, but it is a vulnerability we have known about since 2023. 
Today, we address something far more fundamental: the collapse of the &#8220;trusted intermediary&#8221; assumption.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">In the pre-AI era (as distant as it seems), we trusted the source and distrusted the messenger: you\u2019d read a suspicious email with a healthy side of skepticism or check the URL of a sketchy website, even if it was ranking #1 on the SERP.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Today, we have reversed it. In our pursuit of effectiveness and 50X efficiency, we have started trusting the messenger (the AI interface) almost blindly and treating the source (webpage, Reddit thread, or the RAG pipeline) as an afterthought.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">In other words, when you ask an AI to summarize something, you&#8217;re implicitly delegating your judgment to it, all while your brain conflates &#8220;the AI processed this&#8221; with &#8220;the AI vouched for this.&#8221; These are two fundamentally different operations that users collapse into one gesture.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Why_This_Matters_for_Your_Team\"><\/span>Why This Matters for Your Team?<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Irrespective of your industry, experience, or work criticality, your team is probably already using an AI assistant to review code, summarize documentation, or pull together research, with the underlying assumption: if the output came from an &#8220;AI assistant,&#8221; it&#8217;s been vetted by intelligence, not just regurgitation.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">That assumption is built on quicksand.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Every AI workflow that touches external data, from RAG pipelines and vector databases to API integrations and web searches, is now an attack surface, simply because the interface can&#8217;t distinguish between &#8220;content the model created&#8221; and 
&#8220;content the model passed through.&#8221;<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"The_Solution\"><\/span>The Solution:<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Today, testing an AI workflow needs to go beyond &#8220;is our AI secure?&#8221;, which would mean testing what exactly the model reads, how it processes it, what it hands back to the user, and what actions that output can trigger:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>If your product renders LLM output alongside third-party content, every layer is in scope.<\/li>\n\n\n\n<li>If your AI feature connects to external tools, reads from a RAG pipeline, or sits on top of a vector database, attackers know it.<\/li>\n\n\n\n<li>If users are making decisions based on AI-generated output, the risk doesn&#8217;t stop at the model. It continues into what they click, what they trust, and what they act on.<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">That\u2019s exactly where Astra Security comes in. Our <a href=\"https:\/\/www.getastra.com\/blog\/ai-security\/ai-pentesting\/\">AI pentesting<\/a> covers the full stack: prompt injection, jailbreak attempts, RAG security, vector database assessment, tool integration abuse, data poisoning checks, model extraction, and guardrail testing. Your end goal should be to find that one bad input that can create chaos in your entire AI stack.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Astra Security does not stop at just detecting these vulnerabilities. 
Every finding comes with a video proof of concept, step-by-step remediation guidance, and two free manual rescans to verify the fix actually held.&nbsp;<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><span class=\"ez-toc-section\" id=\"Final_Takeaways\"><\/span>Final Takeaways:<span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Your takeaway cannot be \u201cI\u2019m gonna stop using AI assistants,\u201d because frankly, that is unrealistic. The real takeaway is that AI interfaces need to be tested like any other surface that handles untrusted input. If your product renders LLM output, third-party content, or AI-triggered actions, the risk does not stop at the model. It continues into what you click, trust, and act on.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">This is why ChatGPhish is dangerously effective. The attack does not need to convince you to trust a suspicious website or login page; it targets the one interface you trust most to streamline your work.&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>You can no longer blindly bank on the security boundary you trusted most, and no one is talking about it enough. For years, phishing took a familiar form, such as emails, URLs, and login pages. ChatGPhish breaks that stereotype, though.&nbsp; Permiso Security&#8217;s Andi Ahmeti disclosed this technique on 29 May 2026. 
This mechanic is simple: &#8230; <a title=\"ChatGPhish: When AI Assistants Become the Phishing Surface\" class=\"read-more\" href=\"https:\/\/www.getastra.com\/blog\/security-audit\/chatgphish-when-ai-assistants-become-the-phishing-surface\/\" aria-label=\"Read more about ChatGPhish: When AI Assistants Become the Phishing Surface\">Read more<\/a><\/p>\n","protected":false},"author":139,"featured_media":47694,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[340],"tags":[],"class_list":["post-47551","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-security-audit"],"_links":{"self":[{"href":"https:\/\/www.getastra.com\/blog\/wp-json\/wp\/v2\/posts\/47551","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.getastra.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.getastra.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.getastra.com\/blog\/wp-json\/wp\/v2\/users\/139"}],"replies":[{"embeddable":true,"href":"https:\/\/www.getastra.com\/blog\/wp-json\/wp\/v2\/comments?post=47551"}],"version-history":[{"count":1,"href":"https:\/\/www.getastra.com\/blog\/wp-json\/wp\/v2\/posts\/47551\/revisions"}],"predecessor-version":[{"id":47553,"href":"https:\/\/www.getastra.com\/blog\/wp-json\/wp\/v2\/posts\/47551\/revisions\/47553"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.getastra.com\/blog\/wp-json\/wp\/v2\/media\/47694"}],"wp:attachment":[{"href":"https:\/\/www.getastra.com\/blog\/wp-json\/wp\/v2\/media?parent=47551"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.getastra.com\/blog\/wp-json\/wp\/v2\/categories?post=47551"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.getastra.com\/blog\/wp-json\/wp\/v2\/tags?post=47551"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}