Sublime is the world’s first open email security platform that lets anyone write, run, and share rules in a universal domain-specific language (DSL) to block email-borne attacks, hunt for threats, and more. In our previous post, we shared how Sublime provides protection against email attacks and enables defenders to share detection rules with others.
In this post, we’re introducing Message Query Language (MQL), the language that drives all rules, insights, and hunts for Sublime. MQL is the same language used by our Detection and ML teams to stop emerging threats like Business Email Compromise (BEC), HTML smuggling, credential phishing, and other email attacks before they cause damage.
With MQL, defenders can also write their own tailored rules for attacks they’re seeing, modify any existing rule written by the Sublime team, use rules written by peers in the community, and transparently understand why a message was flagged in the first place.
If you’re familiar with other languages for detection like YARA, Sigma, Snort/Suricata, or Event Query Language (EQL), then you’ll feel right at home with MQL. If not, don’t worry! We designed MQL to be intuitive, flexible, and easy to use.
Message Data Model
When the Sublime Platform processes an email message, it first sees the message in the archaic, text-based EML format. EML is the standard for email, but as a plain-text format it’s challenging to work with. Even with standards such as RFC 5322, not all email conforms, which makes writing detection logic against the raw text difficult.
Instead of dealing with raw text, Sublime parses the format into a highly structured schema, the Message Data Model (MDM), specifically with detection in mind. There’s no need to wrangle complex regular expressions just to search headers or the body.
That makes it easy to find and use the relevant fields. The MDM organizes attachments, body, headers, recipients, and various other fields as distinct parts of a single document that is easily represented as JSON.
For example, the MDM enables you to check whether a hyperlink’s display URL and target URL are mismatched, or to retrieve the top-level domain (TLD) of a specific hyperlink.
Similarly, the MDM’s parsed headers let you easily describe SPF, DMARC, or DKIM failures to detect spoofs, or a mismatch between the From header and the envelope sender (SMTP MAIL FROM).
This schema is used by MQL when writing email detections. For example, typing the MQL snippet type.inbound uses the MDM’s type object and .inbound boolean field to describe inbound email messages. More on syntax in the next section.
Inbound messages that contain at least one PDF attachment over 10MiB:
type.inbound
and any(attachments, .file_type == "pdf" and .size > 10 * 1024 * 1024)
We designed MQL to be simple to read and write. Let’s dissect the above query to get a feel for the syntax:
- type.inbound: Retrieve the field from the MDM, type -> inbound. This is only true on incoming messages to a mailbox.
- and: Boolean AND between two terms. MQL uses plain English words like and instead of symbols like &&.
- any(attachments, ...): Check if at least one attachment on the MDM matches some criteria. In MQL, there are several functions to check arrays, such as any, all, and distinct. In an array function, fields on a nested item are referenced with a preceding dot (.).
- Leading dot (.): Access a nested item. The leading . indicates that a field is relative to a nested item, not to the root fields on the MDM.
- .file_type == "pdf": Has a PDF file type.
- .size > 10*1024*1024: Has a file size greater than 10 MiB. We can use arithmetic operations to perform calculations on the fly with MQL.
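For readers who think in general-purpose languages, the same logic can be approximated in Python. This is only an illustrative analogue, not how Sublime evaluates MQL: the dictionary below is a hypothetical stand-in for the MDM that uses just the field names mentioned in this post (type.inbound, attachments, file_type, size), and has_large_pdf is a made-up helper name.

```python
# Hypothetical stand-in for a parsed message; a real MDM document has many more fields.
message = {
    "type": {"inbound": True},
    "attachments": [
        {"file_type": "png", "size": 48_000},
        {"file_type": "pdf", "size": 12 * 1024 * 1024},
    ],
}

TEN_MIB = 10 * 1024 * 1024


def has_large_pdf(msg: dict) -> bool:
    """Rough analogue of:
    type.inbound and any(attachments, .file_type == "pdf" and .size > 10 * 1024 * 1024)
    """
    return msg["type"]["inbound"] and any(
        # Each `att` plays the role of the nested item that MQL refers to with a leading dot.
        att["file_type"] == "pdf" and att["size"] > TEN_MIB
        for att in msg["attachments"]
    )


print(has_large_pdf(message))
```

Note how the generator expression has to name each attachment (att), whereas MQL’s leading dot refers to the current item implicitly.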
The remaining core syntax, such as strings, literals, comments, and lists, is designed to be intuitive. See the MQL syntax docs for a deeper dive.
All of Sublime’s novel detection capabilities are exposed via MQL in the same way: functions. Want to search for a substring or evaluate a regular expression? There’s a function for that. Check domain age via WHOIS? Function for that. Grab a screenshot from a URL and check if it looks like credential phishing? There’s a function for that, too.
There are a handful of top-level functions for the most common operations. The remaining functions are grouped in modules, which keeps them organized and easier to find. To do something with strings, type strings. and autocomplete will list what’s available (more on the rule editor later!). As of writing, these are the modules available:
- File analysis functions, starting with file.
- Regular expressions, starting with regex.
- Strings functions, starting with strings.
- Machine learning functions, starting with ml.
And finally, we saved a few of our favorite new functions for last; these are currently in beta.
Here’s a modified snippet of MQL from a Callback phishing rule. It searches a ZIP file for images or PDFs and scans them for text with optical character recognition (OCR). On the scanned text, the rule then applies natural language understanding (NLU) to check whether it resembles a callback scam with high confidence.
It might sound complicated, but it’s actually just a few lines of MQL!
and any(attachments, .file_extension == "zip"
.file_extension in~ ("pdf", "jpg", "jpeg", "png")
.name == "callback_scam"
and .confidence == "high"
The Sublime Platform also maintains Lists, which are collections of strings or other items that can be accessed from any rule. Built-in lists are automatically maintained by the Sublime Platform, providing immediate context globally or historically for your environment. For anything else, you can create and manage custom lists in your Dashboard or via the API.
To reference a list in MQL, include it with in or an array function, such as any.
Check that a sender’s domain is in the Tranco 1 Million:
sender.email.domain.domain in $tranco_1m
Check that a sender has never sent emails to your organization before:
sender.email.email not in $sender_emails
Check for a sender domain that’s highly similar to a domain that belongs to your organization (modified from our Lookalike sender domain rule):
strings.levenshtein(sender.email.domain.domain, .) == 1
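To show what an edit distance of exactly 1 means in practice, here is a small Python sketch of the classic Levenshtein algorithm, combined with an any-style check over a list of domains. It illustrates the general technique only and makes no claim about how strings.levenshtein is implemented; org_domains and is_lookalike are hypothetical names chosen for this example.

```python
from typing import Iterable


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete a character from a
                current[j - 1] + 1,      # insert a character into a
                previous[j - 1] + cost,  # substitute (free if the characters match)
            ))
        previous = current
    return previous[-1]


def is_lookalike(sender_domain: str, org_domains: Iterable[str]) -> bool:
    """True when the sender domain is exactly one edit away from any org domain."""
    return any(levenshtein(sender_domain, d) == 1 for d in org_domains)


# Hypothetical organization domains, for illustration only.
org_domains = ["example.com"]
print(is_lookalike("examp1e.com", org_domains))  # one substitution: l -> 1
print(is_lookalike("example.com", org_domains))  # distance 0, an exact match
```

Requiring a distance of exactly 1, rather than at most 1, keeps the organization’s own domain (distance 0) from matching.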
Lists that are automatically synced with sublime-security/static-files on GitHub:
Dynamically maintained lists from historical messages, used to maintain patterns of communication:
Dynamically maintained lists, which are synced with your upstream email provider:
In addition to strings, lists can also contain more complex objects, like users in a group from a cloud email provider. For example, $org_vips is automatically created and is easily configured to point to any Azure AD group or Google Group.
Here’s a snippet of MQL from a VIP Impersonation rule that looks for sender display names matching someone in the VIP list, with an urgent tone, from a new sender:
and sender.email.email not in $sender_emails
and any($org_vips, .display_name == sender.display_name)
.name == "request"
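The way these conditions combine can be sketched in Python. Everything here is an assumption made for illustration: the sets and lists stand in for $sender_emails and $org_vips, the sample names and addresses are invented, and nlu_results is a hypothetical precomputed list of classifier outputs with a name field, since the snippet above does not show which array .name == "request" is evaluated against.

```python
# Hypothetical stand-ins for the lists referenced in the rule.
sender_emails = {"alice@example.com", "bob@example.com"}  # plays the role of $sender_emails
org_vips = [{"display_name": "Dana Chief"}]               # plays the role of $org_vips


def looks_like_vip_impersonation(sender: dict, nlu_results: list) -> bool:
    """Combine the three checks from the snippet: new sender, VIP name match, request."""
    is_new_sender = sender["email"]["email"] not in sender_emails
    matches_vip = any(
        vip["display_name"] == sender["display_name"] for vip in org_vips
    )
    has_request = any(result["name"] == "request" for result in nlu_results)
    return is_new_sender and matches_vip and has_request


# A first-time sender using a VIP's display name.
suspicious_sender = {
    "display_name": "Dana Chief",
    "email": {"email": "dana.chief@freemail.example"},
}
print(looks_like_vip_impersonation(suspicious_sender, [{"name": "request"}]))
```

All three conditions must hold at once, so a VIP writing from an address the organization has already seen would not match in this sketch.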
A language is only as good as its tools, which is why we’ve deliberately designed the MQL editor for all phases of detection engineering. The MQL editor uses the same core as Visual Studio Code, which makes it familiar to users, and enables features that are crucial to development and testing.
When writing rules in Sublime, you’ll quickly find all the features you expect from a mature IDE:
- debugger to evaluate functions
- diagnostics to recognize possible logical errors
- errors, hints, and warnings
- function signature support
- syntax highlighting
The editor puts Detection Engineering front and center. On the Rule creation page, attach or generate an EML to validate that your MQL detects what it’s supposed to. It’s easy to iterate quickly with Test Rule and see the editor highlight the parts of the rule that matched. If the rule resulted in a complete match, you’ll see that oh-so-satisfying Message flagged ✅, indicating that the rule flags the intended email.
To ensure that your Rule doesn’t mistakenly flag the wrong message, simply pop open the Backtest tab to run the rule over the last 24 hours of messages to see any matching results. With Test Rule and Backtest, you can quickly get a sense of the efficacy of a rule without ever needing to enable it live in production.
That just scratches the surface of what the MQL editor can do.
That’s a peek at some of the capabilities that set Message Query Language apart and how it was designed specifically to detect behavior in an email environment. With a low barrier to entry and a simple syntax, MQL puts defenders in control, with the tools they need to secure their email environments.
Stay tuned for more blog posts where we’ll demonstrate how to use MQL to prevent real, trending threats.
Try out Message Query Language now using the free online EML analyzer.