Threat Detection
March 24, 2023
Message Query Language enables defenders to share protections against email attacks and powers all rules, insights, and hunts for Sublime.
Sublime is the world’s first open email security platform that lets anyone write, run, and share rules in a universal domain-specific language (DSL) to block email-borne attacks, hunt for threats, and more. In our previous post, we shared how Sublime provides protection against email attacks and enables defenders to share detection rules with others.
In this post, we’re introducing Message Query Language (MQL), the language that drives all rules, insights, and hunts for Sublime. MQL is the same language used by our Detection and ML teams to stop emerging threats like Business Email Compromise (BEC), HTML smuggling, credential phishing, and other email attacks before they cause damage.
With MQL, defenders can also write their own tailored rules for attacks they’re seeing, modify any existing rule written by the Sublime team, use rules written by peers in the community, and transparently understand why a message was flagged in the first place.
If you’re familiar with other languages for detection like YARA, Sigma, Snort/Suricata, or Event Query Language (EQL), then you’ll feel right at home with MQL. If not, don’t worry! We designed MQL to be intuitive, flexible, and easy to use.
You can try our hands-on Email Detection Engineering and Threat Hunting Labs, where you'll be guided through the rule creation process while hunting simulated real-life attacks.
When the Sublime Platform processes an email message, it first sees the message in the archaic, text-based EML format. EML is the standard for email, but as a plain-text standard it’s challenging to work with. Even with standards such as RFC 5322, not all email conforms, and the plain-text format makes detection logic difficult to write.
Instead of dealing with raw text, Sublime parses the format into a highly structured schema, the Message Data Model (MDM), specifically with detection in mind. There’s no need to wrangle complex regular expressions just to search headers or the body.
The relevant fields are easy to find and use: the MDM organizes attachments, body, headers, recipients, and various other fields into a single document that is easily represented as JSON.
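To make the idea of a structured document concrete, here is a minimal Python sketch that parses a raw EML string with the standard library’s email package and emits a JSON-friendly dictionary. The field names (sender, recipients, subject, body, attachments) and the to_structured helper are illustrative assumptions only; they are not the actual MDM schema, and this is not how Sublime parses messages.

```python
import email
import json
from email import policy

# A tiny hand-written message; real-world EML is far messier.
RAW_EML = """\
From: Alice <alice@example.com>
To: bob@example.org
Subject: Quarterly invoice
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Please see the invoice at https://example.com/invoice
"""


def to_structured(raw: str) -> dict:
    """Parse raw EML text into a nested, JSON-friendly dict (hypothetical layout)."""
    msg = email.message_from_string(raw, policy=policy.default)
    sender = msg["From"].addresses[0]
    body = msg.get_body(preferencelist=("plain", "html"))
    return {
        "sender": {
            "display_name": sender.display_name,
            "email": sender.addr_spec,
            "domain": sender.domain,
        },
        "recipients": [addr.addr_spec for addr in msg["To"].addresses],
        "subject": str(msg["Subject"]),
        "body": body.get_content().strip() if body is not None else "",
        "attachments": [
            {
                "file_name": part.get_filename(),
                "size": len(part.get_payload(decode=True) or b""),
            }
            for part in msg.iter_attachments()
        ],
    }


if __name__ == "__main__":
    print(json.dumps(to_structured(RAW_EML), indent=2))
```

Once the message is a nested document like this, a check such as "the sender's domain is example.com" becomes a field lookup rather than a regular expression over raw headers.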
For example, the MDM enables you to check whether hyperlinks have mismatched display vs target URLs or to retrieve a specific hyperlinked top-level domain (TLD).
Similarly, the MDM’s parsed headers let you easily describe SPF, DMARC, or DKIM failures to detect spoofs, or mismatched MAIL FROM and ENVELOPE FROM values.
This schema is used by MQL when writing email detections. For example, typing the MQL snippet type.inbound uses the MDM’s type object and .inbound boolean field to describe inbound email messages. More on syntax in the next section.
Inbound messages that contain at least one PDF attachment over 10MiB:
type.inbound
and any(attachments,
    .file_type == "pdf" and .size > 10 * 1024 * 1024
)
We designed MQL to be simple to read and write. Let’s dissect the above query to get a feel for the syntax:
type.inbound – Retrieve the field from the MDM, type -> inbound. This is only true on incoming messages to a mailbox.
and – Boolean AND between two terms. MQL uses plain English words like and instead of symbols like &&.
any(attachments, ...) – Check if at least one attachment on the MDM matches some criteria. In MQL, there are several functions to check arrays, such as any, all, and distinct. In an array function, fields on a nested item are referenced with a preceding dot (.).
. (dot) – Access a nested item. The leading . indicates that a field is relative to a nested item, not root fields on the MDM.
.file_type == "pdf" – Has a PDF file type.
.size > 10*1024*1024 – Has a file size greater than 10 MiB. We can use arithmetic operations to perform calculations on the fly with MQL.
The remaining core syntax, such as strings, literals, comments, and lists, is designed to be intuitive. See the MQL syntax docs for a deeper dive.
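For readers who think in general-purpose languages, the PDF attachment query can be modeled roughly in Python. This is only a sketch of the semantics described here, assuming hypothetical Attachment and Message classes; it is not how the Sublime Platform evaluates MQL.

```python
from dataclasses import dataclass, field


@dataclass
class Attachment:
    file_type: str
    size: int  # size in bytes


@dataclass
class Message:
    inbound: bool
    attachments: list[Attachment] = field(default_factory=list)


TEN_MIB = 10 * 1024 * 1024  # same arithmetic as in the MQL query


def matches(msg: Message) -> bool:
    """Inbound AND at least one attachment that is a PDF larger than 10 MiB."""
    return msg.inbound and any(
        # Each `a` plays the role of the leading dot in the MQL array function.
        a.file_type == "pdf" and a.size > TEN_MIB
        for a in msg.attachments
    )
```

Note that both conditions must hold for the same attachment: a small PDF next to a large ZIP does not match.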
All of Sublime’s novel detection capabilities are exposed via MQL in the same way: functions. Want to search for a substring or evaluate a regular expression? There’s a function for that. Check domain age via WHOIS? Function for that. Grab a screenshot from a URL and check if it looks like credential phishing? There’s a function for that, too.
There are a handful of top-level functions for the most common operations. The remaining functions are grouped in modules, which keeps them organized and easier to find. To do something with strings, type strings. and autocomplete will list what’s available (more on the rule editor later!). As of writing, these are the functions available:
all
any
distinct
filter
map
coalesce
length
File analysis functions, starting with file.:
file.explode
file.oletools
Regular expressions, starting with regex.:
regex.contains
regex.icontains
regex.match
regex.imatch
Strings functions, starting with strings.:
strings.concat
strings.contains
strings.icontains
strings.ends_with
strings.iends_with
strings.levenshtein
strings.ilevenshtein
strings.like
strings.ilike
strings.starts_with
strings.istarts_with
Machine learning functions, starting with ml.:
ml.macro_classifier
ml.nlu_classifier
And finally, we saved a few of our favorite new functions for last, currently under beta.:
beta.linkanalysis
beta.whois
Here’s a modified snippet of MQL from a Callback phishing rule that searches a ZIP file for images or PDFs, which are scanned for text with OCR. On the scanned text, this rule performs NLU to check if it contains text resembling a callback scam with high confidence.
It might sound complicated, but it’s actually just a few lines of MQL!
type.inbound
and any(attachments, .file_extension == "zip"
    and any(file.explode(.),
        .file_extension in~ ("pdf", "jpg", "jpeg", "png")
        and any(ml.nlu_classifier(.scan.ocr.raw).intents,
            .name == "callback_scam"
            and .confidence == "high"
        )
    )
)
The Sublime Platform also maintains Lists, which are a collection of strings or items that can be accessed from any rule. Builtin lists are automatically maintained by the Sublime platform, providing immediate context globally or historically for your environment. For anything else, you can create and manage custom lists in your Dashboard or via API.
To reference a list in MQL, include it with in or an array function, such as any.
Check that a sender’s domain is in the Tranco 1 Million: sender.email.domain.domain in $tranco_1m
Check that a sender has never sent emails to your organization before: sender.email.email not in $sender_emails
Check for a sender domain that’s highly similar to a domain that belongs to your organization (modified from our Lookalike sender domain rule):
type.inbound
and any($org_domains,
    strings.levenshtein(sender.email.domain.domain, .) == 1
)
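To see why an edit distance of exactly 1 is a useful lookalike signal, here is a small Python sketch of the classic dynamic-programming Levenshtein distance. The levenshtein and is_lookalike helpers and the sample domains are hypothetical, and the sketch makes no claim about how strings.levenshtein is implemented in Sublime.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character inserts, deletes, or substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of `a`
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(
                min(
                    prev[j] + 1,         # delete ca
                    curr[j - 1] + 1,     # insert cb
                    prev[j - 1] + cost,  # substitute (or keep) the character
                )
            )
        prev = curr
    return prev[-1]


def is_lookalike(sender_domain: str, org_domains: set[str]) -> bool:
    """True if the sender domain is exactly one edit away from any org domain."""
    return any(levenshtein(sender_domain, d) == 1 for d in org_domains)
```

Requiring a distance of exactly 1, rather than at most 1, means the organization’s own domain (distance 0) never matches, while a single swapped, dropped, or added character does.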
Lists automatically synced with sublime-security/static-files on GitHub:
$alexa_1m
$disposable_email_providers
$file_extensions_common_archives
$file_extensions_macros
$free_email_providers
$free_file_hosts
$free_subdomain_hosts
$majestic_million
$suspicious_tlds
$tranco_1m
$umbrella_1m
$umbrella_1m_tld
$url_shorteners
Dynamically maintained lists from historical messages, used to maintain patterns of communication:
$sender_domains
$sender_emails
$recipient_emails
$recipient_domains
Dynamically maintained lists, which are synced with your upstream email provider:
$org_display_names
$org_domains
$org_slds
In addition to strings, lists can also contain more complex objects, like users in a group from a cloud email provider. For example, $org_vips is automatically created and is easily configured to point to any Azure AD group or Google Group.
Here’s a snippet of MQL from a VIP Impersonation Rule that looks for sender display names matching someone in the VIP list, with an urgent tone, from a new sender:
type.inbound
and sender.email.email not in $sender_emails
and any($org_vips, .display_name == sender.display_name)
and any(ml.nlu_classifier(body.html.inner_text).entities,
    .name == "urgency"
)
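The way this rule combines list lookups with a classifier result can be modeled in a few lines of Python. This is a sketch under stated assumptions: the is_vip_impersonation helper and its parameters are hypothetical, the lists are plain Python sets, and the NLU classifier is not modeled at all (its entity names are simply passed in).

```python
def is_vip_impersonation(
    inbound: bool,
    sender_email: str,
    sender_display_name: str,
    known_sender_emails: set[str],
    vip_display_names: set[str],
    entity_names: list[str],
) -> bool:
    """Return True only when every condition of the rule sketch holds."""
    # Stands in for: sender.email.email not in $sender_emails
    is_new_sender = sender_email not in known_sender_emails
    # Stands in for: any($org_vips, .display_name == sender.display_name)
    matches_vip = sender_display_name in vip_display_names
    # Stands in for: any(<classifier entities>, .name == "urgency")
    has_urgency = "urgency" in entity_names
    return inbound and is_new_sender and matches_vip and has_urgency
```

Because the conditions are joined with and, a known sender using a VIP’s display name, or a new sender with no urgent language, does not match.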
A language is only as good as its tools, which is why we’ve deliberately designed the MQL editor for all phases of detection engineering. The MQL editor uses the same core as Visual Studio Code, which makes it familiar to users, and enables features that are crucial to development and testing.
When writing rules in Sublime, you’ll quickly find all the features you expect from a mature IDE.
The editor puts Detection Engineering front and center. On the Rule creation page, attach or generate an EML to validate that your MQL detects what it’s supposed to. It’s easy to iterate quickly with Test Rule and see the editor highlight the parts of the rule that matched. If the rule resulted in a complete match, you’ll see that oh-so-satisfying Message flagged ✅, indicating that the rule is flagging the intended email.
To ensure that your Rule doesn’t mistakenly flag the wrong message, simply pop open the Backtest tab to run the rule over the last 24 hours of messages to see any matching results. With Test Rule and Backtest, you can quickly get a sense of the efficacy of a rule without ever needing to enable it live in production.
That just scratches the surface of what the MQL editor can do.
That’s a peek at some of the capabilities that set Message Query Language apart and how it was designed specifically to detect behavior in an email environment. With a low barrier to entry and a simple syntax, MQL gives defenders the tools they need to secure their email environments.
Try out Message Query Language using the free online EML analyzer.