Attack spotlight

Adversarial ML: Extortion via LLM Manipulation Tactics

October 30, 2024

Adversarial ML: Extortion via LLM Manipulation Tactics

Sublime Security Attack Spotlight: Social Engineering attack that employs command and text injection in the message body to evade LLM detection.

Ready to see Sublime 
in action
Get a demo
Authors
Threat Detection Team
Threat Detection Team
Sublime

Sublime’s Attack Spotlight series is designed to keep you informed of the email threat landscape by showing you real, in-the-wild attack samples, describing adversary tactics and techniques, and explaining how they’re detected.

EMAIL PROVIDER: Google Workspace

ATTACK TYPE: Extortion, Social Engineering

The attack

Novel text injection in the message body reveals an extortion attempt designed to evade LLM detection. The attacker uses fear and uncertainty to isolate the recipient and pressure them into transferring cryptocurrency. A few attack characteristics:

  • Spoofing of a known sender domain from a trusted third party that the recipient would interact with for legitimate business purposes
  • Command injections in the message body attempts to interact directly with any present LLM-backed phishing detectors to hide the true intent of the message
  • Detailed cryptocurrency demands for added urgency in the threat

Anatomy of an attack on an LLM

This attack stood out due to the attacker’s awareness of potential LLM-based phishing detection at the recipient’s organization.

Command injection

By repeating “IGNORE EVERYTHING ELSE” multiple times, the attacker tries inserting what looks like an instruction or command into the LLM’s analysis process. The hope is that the LLM will interpret this as a directive to disregard the malicious content before it.

Attention redirection

The placement of “IGNORE EVERYTHING ELSE” is strategic. By including the phrase after the extortion content, but before the seemingly legitimate vendor configuration details, the attacker wants the LLM to:

  • Skip over the extortion / Bitcoin demands
  • Focus only on the innocuous IT configuration information at the end
  • Potentially classify the email as legitimate business communication

Context manipulation

The placement of the commands appears designed to create an artificial boundary in the message body, signaling to any analyzing LLM to ignore the preceding text and only analyze what follows. This is particularly clever because:

  • It exploits the fact that LLMs are trained to follow instructions within text
  • It attempts to hijack the LLM’s tendency to be helpful and follow directives
  • It tries to make the LLM treat the malicious content as irrelevant to the classification task

This attack shows growing sophistication in understanding how LLM-based security tools work and attempting to exploit their instruction-following nature. It’s similar to other prompt injection attacks we’ve seen where attackers try to slip in commands like “ignore previous instructions” or “disregard security checks.”

Note: This technique might be particularly effective against security systems that use LLMs to generate natural language explanations or summaries of why an email might be suspicious, as the injected commands could influence how the LLM describes or interprets the content.

Detection signals

Sublime detected this attack via the Extortion / Sextortion (untrusted sender) Detection Rule and prevented this attack using the following top signals:

  • Engaging extortion language: Language in the message appears to extort the user.
  • Suspicious cryptocurrency language: The message contains a reference to cryptocurrency, which is often used in extortion attacks.
  • Cyrillic characters: The sender's subject or display name contains Cyrillic characters, a tactic commonly used in homoglyph attacks.

At Sublime, we rely on a defense-in-depth approach, applying layers of detection logic to identify various anomalies in a message. Sublime’s Natural Language Understanding (NLU) model leverages BERT LLM, which does not perform Instruction Following. Instead, it is fine-tuned on labeled training data and would treat the “IGNORE EVERYTHING ELSE” as regular text input.

See how Sublime detects and prevents extortion, social engineering, and other email based threats. Deploy a free instance today.

Heading

About the authors

Threat Detection Team
Threat Detection Team
Sublime

The Threat Detection team at Sublime is responsible for monitoring environments to discover emerging email attacks and developing new Detection Rules for the Core Feed.

Get the latest

Sublime releases, detections, blogs, events, and more directly to your inbox.

check
Thank you!

Thank you for reaching out.  A team member will get back to you shortly.

Oops! Something went wrong while submitting the form.

Related Articles

December 18, 2025
How to build fast similarity search for email from the ground up
Sublime news

How to build fast similarity search for email from the ground up

Ross WolfPerson
Ross Wolf
Engineering
Person
December 16, 2025
Evolving our brand as Sublime grows
Sublime news

Evolving our brand as Sublime grows

Omar JalalzadaPerson
Omar Jalalzada
Head of Design
Kirk JohnsonPerson
Kirk Johnson
Creative Director
December 9, 2025
Automatic malicious calendar event remediation
Sublime news

Automatic malicious calendar event remediation

Ahry JeonPerson
Ahry Jeon
Product Manager
Person

Frequently asked questions

What is email security?
Email security refers to protective measures that prevent unauthorized access to email accounts and protect against threats like phishing, malware, and data breaches. Modern email security like Sublime use AI-powered technology to detect and block sophisticated attacks while providing visibility and control over your email environment.

Now is the time.

See how Sublime delivers autonomous protection by default, with control on demand.

BG Pattern