Medium Severity

Link: Multistage landing - Scribd document

Labels

Credential Phishing

Evasion

Natural Language Understanding

Computer Vision

Optical Character Recognition

URL screenshot

Description

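Detects inbound messages that contain exactly one distinct link to a Scribd document whose embedded links look suspicious, particularly links that target Microsoft services or use evasion techniques. The rule inspects each link embedded in the document for suspicious TLDs, free subdomain hosts, Microsoft-themed keywords in the path or query string (including base64-encoded values), phishing or CAPTCHA verdicts from link analysis, and redirect chains that end on a well-known domain. It also runs OCR on a screenshot of the Scribd page and flags credential theft language. Messages from highly trusted sender domains are excluded unless they fail DMARC authentication.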
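The Microsoft-themed keyword checks that the rule below applies to each embedded link can be approximated outside of MQL. The following is a minimal Python sketch, not Sublime's implementation: the function names are hypothetical, `ilike` only mimics the glob behavior of `strings.ilike`, and the regex-based decoder is an assumed stand-in for `beta.scan_base64`, whose real behavior may differ.

```python
import base64
import binascii
import re
from fnmatch import fnmatchcase
from urllib.parse import urlsplit

# Keyword globs copied from the rule's strings.ilike calls.
KEYWORD_GLOBS = ("*o365*", "*office365*", "*microsoft*")

# Assumption: treat any run of 16+ base64 characters as a candidate token.
B64_RUN = re.compile(r"[A-Za-z0-9+/_-]{16,}={0,2}")


def ilike(value: str, *globs: str) -> bool:
    """Case-insensitive glob match, similar in spirit to strings.ilike."""
    lowered = value.lower()
    return any(fnmatchcase(lowered, g.lower()) for g in globs)


def decoded_base64_runs(text: str) -> list[str]:
    """Decode base64-looking runs, skipping any that are not valid UTF-8."""
    decoded = []
    for run in B64_RUN.findall(text):
        # Accept both the standard and the URL-safe alphabet, then re-pad.
        normalized = run.replace("-", "+").replace("_", "/").rstrip("=")
        padded = normalized + "=" * (-len(normalized) % 4)
        try:
            raw = base64.b64decode(padded, validate=True)
            decoded.append(raw.decode("utf-8"))
        except (binascii.Error, UnicodeDecodeError):
            continue
    return decoded


def has_microsoft_keyword(url: str) -> bool:
    """Check the path, the query string, and base64 values in the query."""
    parts = urlsplit(url)
    if ilike(parts.path, *KEYWORD_GLOBS) or ilike(parts.query, *KEYWORD_GLOBS):
        return True
    return any(
        ilike(text, *KEYWORD_GLOBS) for text in decoded_base64_runs(parts.query)
    )
```

A call such as `has_microsoft_keyword("https://example.test/Microsoft/login")` matches on the path alone. The base64 branch only matters when the keyword is hidden inside an encoded query value, which is the case the rule's third keyword check is written for.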

Sublime Security

Created May 14th, 2025 • Last updated Jan 12th, 2026

Feed Source

Sublime Core Feed

Source

GitHub

type.inbound
// exactly one distinct link to a Scribd document
and length(distinct(filter(body.links,
                           .href_url.domain.root_domain in ("scribd.com")
                           and strings.istarts_with(.href_url.path, "/document")
                    ),
                    .href_url.url
           )
) == 1
and any(body.links,
        .href_url.domain.root_domain == "scribd.com"
        and strings.istarts_with(.href_url.path, "/document")
        and (
          // target the embedded links via XPath
          any(html.xpath(ml.link_analysis(.).final_dom,
                         '//a[@class="ll"]/@href'
              ).nodes,
              strings.parse_url(.raw).domain.tld in $suspicious_tlds
              or strings.parse_url(.raw).domain.domain in $free_subdomain_hosts
              or strings.parse_url(.raw).domain.root_domain in $free_subdomain_hosts
              // Microsoft-themed keywords in the URL path, observed in credential theft URLs
              or strings.ilike(strings.parse_url(.raw).path,
                               "*o365*",
                               "*office365*",
                               "*microsoft*"
              )
              // Microsoft-themed keywords in the query string, observed in credential theft URLs
              or strings.ilike(strings.parse_url(.raw).query_params,
                               "*o365*",
                               "*office365*",
                               "*microsoft*"
              )
              // Microsoft-themed keywords in base64-encoded query values, observed in credential theft URLs
              or any(beta.scan_base64(strings.parse_url(.raw).query_params),
                     strings.ilike(., "*o365*", "*office365*", "*microsoft*")
              )
              or ml.link_analysis(strings.parse_url(.raw), mode="aggressive").credphish.disposition == "phishing"
              or ml.link_analysis(strings.parse_url(.raw), mode="aggressive").credphish.contains_captcha
              or strings.icontains(ml.link_analysis(strings.parse_url(.raw),
                                                    mode="aggressive"
                                   ).final_dom.display_text,
                                   "I'm Human"
              )
              // redirects and lands on a well-known (Tranco top 10k) domain, seen in evasion attempts
              or (
                length(ml.link_analysis(strings.parse_url(.raw),
                                        mode="aggressive"
                       ).redirect_history
                ) > 0
                and ml.link_analysis(strings.parse_url(.raw), mode="aggressive").effective_url.domain.root_domain in $tranco_10k
              )
          )
          // credential theft language on the main Scribd page
          or any(ml.nlu_classifier(beta.ocr(ml.link_analysis(.,
                                                             mode="aggressive"
                                            ).screenshot
                                   ).text
                 ).intents,
                 .name == "cred_theft" and .confidence != "low"
          )
        )
)
// negate highly trusted sender domains unless they fail DMARC authentication
and (
  (
    sender.email.domain.root_domain in $high_trust_sender_root_domains
    and not headers.auth_summary.dmarc.pass
  )
  or sender.email.domain.root_domain not in $high_trust_sender_root_domains
)
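The sender condition that closes the rule is easy to misread, so here is a small Python sketch of its truth table. It is an illustration only: the function names are hypothetical, and it assumes the DMARC result is a plain boolean, whereas in MQL `headers.auth_summary.dmarc.pass` may be missing.

```python
def passes_sender_gate(root_domain: str, dmarc_pass: bool, high_trust: set[str]) -> bool:
    """Mirror the rule's final clause: the message stays eligible for a match
    unless it comes from a highly trusted domain AND passes DMARC."""
    trusted = root_domain in high_trust
    return (trusted and not dmarc_pass) or not trusted


def passes_sender_gate_simplified(root_domain: str, dmarc_pass: bool, high_trust: set[str]) -> bool:
    """Logically equivalent form of the same clause."""
    return not (root_domain in high_trust and dmarc_pass)


if __name__ == "__main__":
    trusted_domains = {"example.com"}  # hypothetical stand-in for $high_trust_sender_root_domains
    for domain in ("example.com", "other.test"):
        for dmarc in (True, False):
            result = passes_sender_gate(domain, dmarc, trusted_domains)
            print(f"{domain:12} dmarc_pass={dmarc!s:5} eligible={result}")
```

Only one of the four combinations is excluded: a highly trusted domain that passes DMARC. A spoofed message that claims a trusted domain but fails DMARC is still evaluated.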
