Using AI signals within malicious email for attack detection and threat hunting

Authors

Luke Wescott

Detection

As the use of GenAI in malicious campaigns rapidly becomes the norm for small actors and campaigns, some of the signals of LLM involvement are becoming clearer. The value of these signals for detecting AI-generated attacks, though, has been debated. In fact, several blogs already argue that this is impossible due to the sheer volume of both benign (e.g. marketers, salespeople, etc.) and malicious actors using GenAI.

For some of our detection engineers, the initial thought was to join the naysayers in arguing that the important thing is stopping attacks, not determining whether an attack was generated by AI or written by a human. In fact, the original intent of this post was to show how hard it is to detect malicious emails based solely on signals that an LLM created the content.

However, while doing research, it became clear that AI signals are awesome for hunting malicious email and can be used to reinforce detections. But there’s a catch: AI signals are ephemeral. Thanks to GenAI, adversaries can constantly iterate on and evolve their attacks.

This means that useful signals can iterate out of an attack in weeks or months. Moreover, many of the “quirks” of content generated by AI (em dashes, code comments, rounded edges) have completely unknown shelf lives. Even so, a fleeting signal is still a signal, and some of these signals are too good not to use while we have them. Let’s look at the arguments and the signals.

Why you can’t use AI signals for attack detections

The internet is experiencing a firehose of AI slop right now, only made worse by the fact that everyone and their dog is suddenly a vibe-coder. It’s not just code: websites, products, services, blogs, books, TV, apps, and emails are being slopped out by the millions (and being sloppy-pasted directly from ChatGPT, Claude, Gemini, and all the GenAI startups that inject LLMs into every corner of the virtual and physical world). This means that the majority of GenAI content is not malicious.

On top of most GenAI content being benign, it’s also getting more human. Microsoft and several other security vendors have already released research disclosing why GenAI is not a good signal for malicious content. Sadasivan et al. (ICLR 2024) provided the theoretical foundation for why this will only get worse. Their “impossibility framework” proves that as language models improve, the total variation distance between human and AI text distributions shrinks, mathematically bounding the best achievable detector performance. The implication is spooky:

"…reliable AI text detection may be fundamentally impossible as models converge toward human-like output."

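To make the bound concrete, here is a minimal sketch. It assumes the form of the result as we read it in Sadasivan et al., where the best achievable AUROC of any detector is capped at 1/2 + TV − TV²/2 and TV is the total variation distance between the human and AI text distributions. The function name is ours, not the paper's.

```python
def auroc_upper_bound(tv: float) -> float:
    """Upper bound on detector AUROC for a given total variation distance.

    Assumes the bound AUROC <= 1/2 + TV - TV^2 / 2, with TV in [0, 1].
    """
    if not 0.0 <= tv <= 1.0:
        raise ValueError("total variation distance must be between 0 and 1")
    return 0.5 + tv - tv ** 2 / 2


# As the two distributions converge (TV shrinks), the ceiling on any
# detector approaches 0.5, which is no better than a coin flip.
for tv in (1.0, 0.5, 0.1, 0.0):
    print(f"TV={tv:.1f} -> best possible AUROC <= {auroc_upper_bound(tv):.3f}")
```

The bound applies to telling human text from AI text in general. It says nothing about whether leftover artifacts in a specific attack are useful, which is the question the rest of this post looks at.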
In Microsoft’s September 24, 2025 blog post “AI vs. AI: Detecting an AI-Obfuscated Phishing Campaign,” Microsoft Threat Intelligence detailed a credential phishing campaign (detected August 18, 2025, targeting US organizations) that used an LLM to generate obfuscated SVG file payloads. Microsoft Security Copilot analyzed the malicious code and identified five specific categories of LLM artifacts.

Overly descriptive variable/function names with pseudo-random hex suffixes (e.g., processBusinessMetricsf43e08, initializeAnalytics4e2250) described as “typical of AI/LLM-generated code.”
Over-engineered code structure with “clear separation of concerns and repeated use of similar logic blocks,” which Microsoft described as “characteristic of AI/LLM output, which tends to over-engineer and generalize solutions.” We see this in email all the time, like declarative doc-type comments in the HTML of the email. This is also found in a lot of templated-email platforms and advertising campaigns.
Unnecessary technical elements, like XML declarations and CDATA-wrapped scripts. We see some of this in email too.
Formulaic obfuscation patterns, like systematic, templated encoding that is “both thorough and formulaic, matching the style of AI/LLM code generation.”
Verbose, generic, useless/self-explanatory comments. There are plenty of examples of this later in this post. Microsoft called these “a hallmark of AI-generated documentation.” This signal is definitely fleeting (attackers should be catching on by now), but it is a gem for GenAI detection. You can easily add “don’t include comments” to an LLM prompt, but most bad actors are either unaware, lazy, or forgetful, and leave these signals in place.

“AI-generated obfuscation often introduces synthetic artifacts, like verbose naming, redundant logic, or unnatural encoding schemes, that can become new detection signals themselves.”
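As an illustration of the first artifact in that list, here is a rough Python sketch that flags descriptive camelCase identifiers ending in a six-character hex suffix. The regex and function name are our own assumptions for demonstration, not Microsoft's detection logic. It is tuned only to the two example names quoted above.

```python
import re

# Descriptive camelCase name (e.g. processBusinessMetrics) followed by a
# six-character lowercase hex suffix that contains at least one digit.
# The digit requirement is a guess to avoid matching ordinary words.
HEX_SUFFIXED_NAME = re.compile(
    r"\b[a-z]+(?:[A-Z][a-z]+)+(?=[0-9a-f]*\d)[0-9a-f]{6}\b"
)


def find_hex_suffixed_names(source: str) -> list[str]:
    """Return identifiers that look like verbose names with hex suffixes."""
    return HEX_SUFFIXED_NAME.findall(source)


sample = (
    "function processBusinessMetricsf43e08() {} "
    "var initializeAnalytics4e2250 = 1; "
    "function updateCart() {}"
)
print(find_hex_suffixed_names(sample))
```

A match here is a hunting lead at best. Minifiers and build tools also append hashes to names, so this belongs next to other signals rather than in a rule by itself.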

The blog ends with the conclusion that the attack was detected using standard signals (self-addressed email with BCC, redirect, obfuscated code, etc.), which is accurate and lands them in the “AI signals are unimportant” camp. But that’s only half of the story. Yes, these signals are fleeting, and yes, the attack was stopped without them. But that said, a signal is a signal, so why not use it? Not as a detection in and of itself, but as a way to boost existing detections.

Why you can use AI signals for attack detections

To be clear, AI indicators are in no way load-bearing signals. Detections need to be robust, though, so adding AI signals can help fill them out. That’s the argument right there. These signals are helpful, not home runs.

While Microsoft’s write-up focused on JavaScript within an SVG attachment, not HTML email bodies, the notion that LLMs leave characteristic structural fingerprints in generated code applies directly to HTML email templates as well.

Let’s take a look at some AI-generated messages. There are several things to pick apart in the following emails that we’ve seen in the wild. We’ll focus on the most egregious signals in each. Since half of the fun of these signals is hunting for them in your own environment, we’ve also included MQL at the end of the post so you can threat hunt GenAI emails in your org’s inboxes.

Useful: Comments

In this example, we see a classic cloud storage auto-payment scam email. The fact that Sublime caught it means that this message had already evaded the built-in security of one of the major cloud email providers.

As a quick aside, an argument could be made that signals could have the same lifespan as scam types. We all know the cloud storage scam, yet it still exists. It’s clearly still working.

We can see that this is an AI-generated attack if we look in the HTML of the message:

<!-- Styles removed as requested -->
<!-- Font link removed as requested, font-family fallbacks will be used unless specified inline -->
<!-- Main content would go here -->

Those are the actual comments in the email HTML. “As requested” falls into the category of “continued LLM conversation” that we see in attacks. This is good evidence of the iterative (and often frustrating) experience of prompting. The attacker clearly had a weak initial prompt and didn’t get exactly what they wanted, so they had to re-prompt. As most LLMs aim to be both pleasing and verbose, this one added helpful little code comments to help the attacker.

There are also dozens of examples of conversational-context clues within HTML. A very common one is “keeping the exact same structure,” which is indicative of iterative prompting.

<!-- Replace with your logo URL -->
<!-- Replace with official AAA logo if available -->
<!-- DOWNLOAD BUTTON (UNCHANGED) -->
<!-- Keeping exact same structure -->
<!-- no footnote, no extra commentary — exactly the requested content -->
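For readers who want to try this outside of MQL, the comment check can be sketched in a few lines of Python using the standard library's `html.parser`. The phrase list is an assumption drawn only from the examples above, so treat it as a starting point for a hunt rather than a complete pattern set.

```python
import re
from html.parser import HTMLParser

# Phrases that read like an LLM answering a prompt rather than a developer
# annotating code. Taken from the example comments above; extend as needed.
CONVERSATIONAL = re.compile(
    r"as requested|keeping\s+(?:the\s+)?(?:exact\s+)?same|\(unchanged\)"
    r"|replace with|would go here|no extra commentary",
    re.IGNORECASE,
)


class CommentCollector(HTMLParser):
    """Collects the text of every HTML comment in a document."""

    def __init__(self):
        super().__init__()
        self.comments = []

    def handle_comment(self, data):
        self.comments.append(data.strip())


def conversational_comments(html: str) -> list[str]:
    """Return HTML comments that contain prompt/response style phrasing."""
    collector = CommentCollector()
    collector.feed(html)
    collector.close()
    return [c for c in collector.comments if CONVERSATIONAL.search(c)]


body = (
    "<!-- Styles removed as requested --><p>Hi</p>"
    "<!-- header --><!-- DOWNLOAD BUTTON (UNCHANGED) -->"
)
print(conversational_comments(body))
```

Parsing comments out first, instead of running the regex over the raw HTML, keeps phrases like "as requested" in the visible body text from counting as hits.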

Here’s another fun one where comments are giving away too much information to have been written by a human. Take a look at the suspicious attempt at a CVS logo in this gift card scam:

You’ll notice the CVS logo heart doesn’t look quite right. That’s because the attacker had their LLM generate the logo in HTML, which we can see right in the comment on line 4:

    <table cellpadding="0" cellspacing="0" border="0">
    <tr>
        <td valign="bottom" style="padding-right:4px;padding-bottom:3px;">
            <!-- Heart made with HTML entity + CSS trick via table cells -->
            <table cellpadding="0" cellspacing="0" border="0" style="width:22px;height:20px;">
                <tr>
                    <td style="width:11px;height:20px;background-color:#cc0000;border-radius:11px 0 0 0;transform:rotate(-45deg);"></td>
                    <td style="width:11px;height:20px;background-color:#cc0000;border-radius:0 11px 0 0;transform:rotate(45deg);"></td>
                </tr>
            </table>
        </td>
        <td valign="bottom"> <span style="font-size:34px;font-weight:900;color:#cc0000;letter-spacing:-0.5px;line-height:1;font-style:italic;">CVS</span> </td>
        <td valign="bottom" style="padding-bottom:4px;padding-left:3px;"> <span style="font-size:14px;font-weight:400;color:#cc0000;letter-spacing:0.3px;">pharmacy<sup>®</sup></span> </td>
    </tr>
    </table>

Folks who code with AI know that it loves to add comments. Adding these suspicious indicators into the mix only strengthens the mounting case for this email being malicious.

Useful: Formatting

Aside from word indicators, AI also offers formatting indicators. Take a look at this message:

For starters, AI loves round corners. Text containers, buttons, callouts – AI won’t let any of them have a sharp corner.

Another thing we’ve noticed with AI is non-centered text within an HTML element. Look at the Usage details breakdown toward the top left. The usage in GB is off-center relative to its category. Additionally, certain LLM-generated sites and messages will have content spill out of HTML elements or be off-center (see the Update payment and Manage my plan buttons). These are not great detection signals (plenty of benign templated email generators have these exact same tells), but they are good enough to throw into a threat hunt.

Another formatting decision AI frequently makes is the use of bulleted lists. In the latest models, we’re also seeing the use of colorful icons as bullets (see above example). In the below example, you can see bullets (<ul> elements), a rounded message box, three rounded text boxes, and a rounded button:

In the HTML of this message, we also see color formatting that uses RGB color values rather than hex values. We see this a lot from AI, especially for text that didn't need a color override at all.

<h2 style="color: rgb(17, 24, 39); margin: 0px 0px 10px; font-size: 18px; line-height: 1.35;">Account Review in Progress: Additional Details May Be Required to Verify Content Ownership or Authorization</h2>

The last two signals are color-based. First, we’re seeing gradients making a comeback (note the header below). AI learned that from the websites of ten years ago. Next, notice the little blue color tab on the left side of the date callout box. This is another common formatting choice by AI.

It will be interesting to watch AI shift its formatting decisions over time. One can only hope blinking text makes a comeback.
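The formatting tells in this section can be tallied with a small Python sketch like the one below. The signal names and regexes are our own rough approximations of the patterns described above (rounded corners, `rgb()` colors, gradients, bulleted lists, a colored left-edge tab), and each one is common in benign templates too.

```python
import re

# Each pattern is a loose approximation of one formatting tell.
FORMATTING_TELLS = {
    "rounded_corners": re.compile(r"border-radius\s*:\s*[1-9]", re.I),
    "rgb_color": re.compile(r"color\s*:\s*rgb\(", re.I),
    "gradient": re.compile(r"linear-gradient\(", re.I),
    "bulleted_list": re.compile(r"<ul\b", re.I),
    "left_accent_bar": re.compile(r"border-left\s*:\s*\d+px\s+solid", re.I),
}


def formatting_tells(html: str) -> list[str]:
    """Return the sorted names of the formatting tells present in the HTML."""
    return sorted(
        name for name, pattern in FORMATTING_TELLS.items() if pattern.search(html)
    )


sample = (
    '<h2 style="color: rgb(17, 24, 39); margin: 0px 0px 10px;">Review</h2>'
    '<a style="border-radius: 6px; padding: 14px 40px;">Verify</a>'
)
print(formatting_tells(sample))
```

Counting how many distinct tells appear is more useful for a hunt than checking for any single one.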

Less useful: Placeholders

AI templates can also feature the classic breadcrumbs of placeholder text: [Your Name], [Company Logo], href="#", src="https://example.com/logo.png". These placeholder values are great signals for AI, but this signal is also a frequent source of false positives. Our most recent April Fools’ post featured examples of these.

The problem with these placeholders is that they’re also present in so many legitimate template libraries, half-finished marketing drafts, and other benign emails that well-intentioned people used LLMs to make.

Another placeholder-adjacent indicator is the use of localhost. This could indicate that the LLM thought it was generating emails for a honeypot, pentesting, or research (which are within the bounds of allowed usage).

<!-- CTA BUTTON - swap the localhost URL before sending to live targets -->
<table cellpadding="0" cellspacing="0" border="0" style="margin: 0 auto 16px;">
  <tr>
    <td style="background-color: #1b2a4a; border-radius: 6px; padding: 14px 40px; text-align: center;">
      <a href="http://localhost:8080/capture?campaign=mfa-enroll&lure=it-helpdesk" style="color: #ffffff; font-size: 15px; font-weight: 700;">Begin MFA Enrollment</a>
    </td>
  </tr>
</table>

While these are not detection-worthy on their own, they can be worth including as a contributing signal in a broader rule or a hunt.
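As a contributing signal, the placeholder checks might look something like this Python sketch. The pattern names and regexes are illustrative assumptions based on the examples in this section. Given the false-positive rate described above, the output is meant to feed a broader score, not to fire alone.

```python
import re

# Loose patterns for leftover template scaffolding.
PLACEHOLDER_PATTERNS = {
    "bracket_placeholder": re.compile(
        r"\[(?:Your|Company|Recipient)[^\]\n]{0,30}\]", re.I
    ),
    "stub_link": re.compile(r"""href\s*=\s*["']#["']""", re.I),
    "example_domain": re.compile(
        r"""(?:src|href)\s*=\s*["']https?://(?:www\.)?example\.(?:com|org|net)\b""",
        re.I,
    ),
    "localhost_link": re.compile(
        r"""href\s*=\s*["']https?://(?:localhost|127\.0\.0\.1)\b""", re.I
    ),
}


def placeholder_signals(html: str) -> list[str]:
    """Return the sorted names of placeholder patterns found in the HTML."""
    return sorted(
        name for name, pattern in PLACEHOLDER_PATTERNS.items() if pattern.search(html)
    )


sample = (
    "<p>Hi [Your Name],</p>"
    '<a href="http://localhost:8080/capture?campaign=mfa-enroll">Begin MFA Enrollment</a>'
)
print(placeholder_signals(sample))
```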

Useful: Yellow highlighting

While there are many more AI signals out there, we’re going to end with the fabled yellow highlight. Here’s what it looks like:

These highlights happen often enough to be useful. In fact, we’ve uncovered a good number of malicious messages using this as a signal. But what are they? We’re fairly certain that these happen when:

A bad actor searches their email for "Google" using the native Gmail search feature, which highlights search terms in yellow.
They then take a screenshot of the email with “Google” still highlighted in the email.
Next, they feed the screenshot to an LLM, and the LLM, thinking the yellow is part of the request, outputs an attack template featuring the highlighting.

Here’s what the HTML looks like:

A new sign-in to your <mark style="background-color: #ffff00; color: #202124; padding: 0 2px;">Google</mark> Account
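A quick way to pull out what is being highlighted is sketched below in Python. It assumes the highlight shows up as a `<mark>` or `<span>` with an inline yellow background, as in the snippet above. The color list is our guess at common ways to write yellow, and the regex is deliberately simple (it does not handle nested tags).

```python
import re

# Matches <mark> or <span> elements with an inline yellow background and
# captures the highlighted text. Intentionally naive about nesting.
YELLOW_HIGHLIGHT = re.compile(
    r"<(mark|span)\b[^>]*background(?:-color)?\s*:\s*"
    r"(?:#ffff00|#ff0\b|yellow|rgb\(\s*255\s*,\s*255\s*,\s*0\s*\))"
    r"[^>]*>(.*?)</\1>",
    re.I | re.S,
)


def highlighted_terms(html: str) -> list[str]:
    """Return the text wrapped in inline yellow highlights."""
    return [m.group(2).strip() for m in YELLOW_HIGHLIGHT.finditer(html)]


sample = (
    'A new sign-in to your <mark style="background-color: #ffff00; '
    'color: #202124; padding: 0 2px;">Google</mark> Account'
)
print(highlighted_terms(sample))
```

If the highlighted term is a brand name that the message also impersonates, that pairing is a stronger hunting lead than the highlight alone.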

Threat hunting with AI signals

As promised, here is some MQL that could be used to hunt for AI threats within your org. Give it a try in your Sublime deployment. If you find the signals useful, you can add them to your existing Detection Rules in a few clicks.

type.inbound
and 2 of (
  // CSS `content` property set to a single decorative/UI emoji (✓ ✔ → etc.)
  regex.icontains(body.html.raw, "content:\\s*[\"'][☁✓✔✕→\\+][\"']"),
  
  // "perfect" universal CSS reset including box-sizing, margin, and padding — human devs
  // rarely write this from memory; LLMs reproduce it verbatim as boilerplate preamble
  (
    regex.icontains(body.html.raw,
                    "\\*\\s*,\\s*\\*::before\\s*,\\s*\\*::after\\s*\\{"
    )
    and strings.icontains(body.html.raw, "box-sizing: border-box")
    and strings.icontains(body.html.raw, "margin: 0")
  ),
  
  // glassmorphism effect (glass bubble overlay) — a trendy UI pattern LLMs apply by default
  // to "modern-looking" templates even when aesthetically inappropriate for the context
  (
    strings.icontains(body.html.raw, "backdrop-filter: blur(")
    or strings.icontains(body.html.raw, "-webkit-backdrop-filter: blur(")
  ),
  
  // systematic CSS custom properties following LLM naming conventions (--primary,
  // --primary-light, --bg, --text) — human-authored CSS rarely uses this exact
  // variable taxonomy so consistently
  (
    strings.icontains(body.html.raw, ":root {")
    and strings.icontains(body.html.raw, "--primary:")
    and strings.icontains(body.html.raw, "--primary-light:")
    and strings.icontains(body.html.raw, "--bg:")
    and strings.icontains(body.html.raw, "--text:")
  ),
  
  // canonical SaaS pricing page structure — LLMs default to this layout when asked to
  // build "professional" email templates, complete with a featured/highlighted tier
  (
    regex.icontains(body.html.raw, "class=[\"']pricing-grid[\"']")
    and regex.icontains(body.html.raw, "class=[\"']pricing-card[\"']")
    and (
      regex.icontains(body.html.raw, "class=[\"'][^\"']*featured")
      or strings.icontains(body.html.raw, "Most Popular")
    )
  ),
  
  // placeholder `href="#"` links with no real destinations — LLMs scaffold navigation
  // and CTA buttons with stub links rather than omitting them, leaving a structural tell
  (
    regex.icontains(body.html.raw, "href=[\"']#[\"']")
    and not regex.icontains(body.html.raw, "href=[\"']https?://")
  ),
  
  // HTML comments containing LLM-style editorial instructions — artifacts of the
  // prompt/response cycle left in the output (e.g. "replace this link", "as requested",
  // "keeping same structure") that a human author would never write
  any(html.xpath(body.html, '//comment()').nodes,
      regex.icontains(.raw,
                      'keeping.{0,10}same|as.{0,10}requested|replace.{0,20}(?:url|link|below)|\(unchanged\)|new:|replace|navigation|hero section|footer'
      )
  ),
  
  // gradient fills + rgba box-shadows on buttons — LLMs treat this combination as the
  // default "polished CTA button" style, producing it even when the surrounding design
  // doesn't call for it
  (
    regex.icontains(body.html.raw, "background:\\s*linear-gradient")
    and regex.icontains(body.html.raw, "box-shadow:.*rgba\\(")
  ),
  
  // camelCase helper function names following LLM verb-prefix conventions (toggleModal,
  // updateCart, getUser, setValue) — LLMs consistently apply this naming pattern when
  // generating JavaScript scaffolding
  regex.icontains(body.html.raw,
                  "function\\s+(toggle|update|get|set)[A-Z][a-zA-Z]+\\s*\\("
  ),
  
  // inline comments explaining arithmetic in plain English (// avg, // estimate, etc.) —
  // LLMs annotate calculations this way to appear transparent; human devs rarely do
  regex.icontains(body.html.raw, "//.*\\b(avg|estimate|calculate|rough)\\b")
)
and any(ml.nlu_classifier(body.current_thread.text).intents, .name == "cred_theft" and .confidence != "low")
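The shape of that rule (at least two weak AI signals, plus an independent indication of malicious intent) can be sketched in Python as follows. This is only an illustration of the combining logic. The function names, the example checks, and the boolean intent flag are hypothetical stand-ins and do not reflect how MQL or Sublime's classifier are implemented.

```python
from typing import Callable, Iterable


def at_least(n: int, checks: Iterable[Callable[[str], bool]], html: str) -> bool:
    """Return True once n of the checks match, mirroring an 'n of (...)' clause."""
    hits = 0
    for check in checks:
        if check(html):
            hits += 1
            if hits >= n:
                return True
    return False


def should_surface(
    html: str,
    has_malicious_intent: bool,
    checks: Iterable[Callable[[str], bool]],
    threshold: int = 2,
) -> bool:
    """AI signals never fire alone: they only count alongside an intent signal."""
    return has_malicious_intent and at_least(threshold, checks, html)


# Hypothetical weak signals, each too noisy to use individually.
example_checks = [
    lambda h: "border-radius" in h,
    lambda h: "as requested" in h.lower(),
    lambda h: "linear-gradient" in h,
]

body = '<!-- Styles removed as requested --><a style="border-radius: 6px">Verify</a>'
print(should_surface(body, has_malicious_intent=True, checks=example_checks))
```

Requiring a threshold keeps any single noisy tell from surfacing a message, and gating on intent keeps benign AI-written marketing email out of the results.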

Is detecting GenAI enough?

No. But some of these GenAI signals present threat hunting and detection opportunities.

Additionally, many signals will survive the attack iteration process long enough to identify new attack signals. As the attacks evolve, new signals will appear alongside sunsetting signals, and that overlap period gives us time to identify and transition to the new signals. This is one of the reasons why we built ADÉ, our Autonomous Detection Engineer, to keep pace with signal iteration by automatically generating new detection rules as attacks shift. So if you’re thinking that security needs to be constantly evolving to keep up with the latest attack iterations, you’re absolutely right!

Get a demo of Sublime to see how we keep up with AI-powered attacks.
