Trust, then autonomy: A new framework for evaluating security AI

1. Introduction

In this current phase of AI proliferation, the word “autonomous” has shifted from a meaningful descriptor to a marketing buzzword. This is because autonomy now sits on a spectrum, spanning AI assistants that require human input at every step to fully autonomous agents that operate end-to-end.

The spectrum measures functionality, but it's also a measure of trust. An AI that needs constant human feedback has been granted less trust A truly autonomous agent operates at the highest level of trust. But in either case, vendors can, and often do, confidently describe their AI as autonomous.

This means that the burden of moving a tool along the autonomy spectrum is on the buyer, not the vendor. Not even the most capable AI can be granted autonomy without extensive testing over time. Even if the capability exists, no AI should ever be given full autonomy right out of the gate – no matter what a vendor promises.

The use case for the AI further complicates the spectrum. For the rest of this guide, we’ll focus on AI built for security. Earning trust is hard for humans and AI alike. The stakes are just too high for trust to be handed out freely.

With that in mind, this guide is going to take a look at the current state of autonomy in security AI, the spectrum it sits on, and a framework for the trust-based path to autonomy you need while evaluating solutions

2. The promise of security AI over three years of RSAC

RSAC is a leading security conference and it was the perfect place to watch the evolution of the security AI narrative over the past few years. The changes seen year to year mirror the language used to advance AI along the autonomy spectrum.

‍

RSAC 2024
AI-powered everything

RSAC 2025
Agentic AI everywhere

RSAC 2026
AI moves from feature  
to foundation

AI chatter on the floor

Copilot
Summarization
Assistant
“Agent” barely mentioned

Agentic AI
Governance
Human-in-the-loop

Autonomous SOC
Multi-agent Workflows
Reasoning and Evidence
Adversaries caught up too

Security AI funding

Orgs that want security + continuity/archiving in one suite

Medium–Low (limited flexibility for rapid changes)

High (tuning and operational overhead)

Message from AI vendors

AI is a force multiplier

Agentic AI is a category

Agentic SOC is foundational

Implicit AI trust

Low: 
human required

Medium: human-in-the-loop

High: autonomous future

RSAC 2024
AI-powered everything

AI chatter on the floor

Copilot
Summarization
Assistant
“Agent” barely mentioned

Security AI funding

Orgs that want security + continuity/archiving in one suite

Message from AI vendors

AI is a force multiplier

Implicit AI trust

Low: human required

RSAC 2025
Agentic AI everywhere

AI chatter on the floor

Agentic AI
Governance
Human-in-the-loop

Security AI funding

Medium–Low (limited flexibility for rapid changes)

Message from AI vendors

Agentic AI is a category

Implicit AI trust

Medium: human-in-the-loop

RSAC 2026
AI moves from feature  to foundation

AI chatter on the floor

Autonomous SOC
Multi-agent Workflows
Reasoning and Evidence
Adversaries caught up too

Security AI funding

High (tuning and operational overhead)

Message from AI vendors

Agentic SOC is foundational

Implicit AI trust

High: autonomous future

The issue, though, is that the language is changing faster than the engineering. While vendors are touting autonomy, the products they’re selling typically require a human in the loop, landing their AI in the “assisted” or “guided” area of the spectrum (see below). The promise of autonomy currently exceeds the capabilities of most AI products, leaving a large trust gap.

3. The AI autonomy spectrum

Before continuing, we need to define the autonomy spectrum.

Each of these levels corresponds directly to trust in the AI model. The best AI products give organizations controls to dial autonomy up as trust grows, so teams can deploy now and expand autonomy as evidence accumulates. This decreases friction and promotes adoption by letting organizations implement and increment.

An important thing to note is that full autonomy cannot be reached without trust. In fact, the only difference between Level 4 and 5 is trust. Full autonomy is not a feature that can be bought. It is the end result of thorough testing that proves an AI can be trusted without human oversight.

Before we get into the framework your organization can use when evaluating autonomous AI, there is one thing you need to keep in mind: None of the levels are wrong.

Not every organization wants full autonomy
Just because an agent isn’t capable of full autonomy doesn’t mean it can’t be a powerful boost to productivity.

A well-designed Level 2 will always beat a poorly designed Level 4.

4. Trust framework for evaluating security AI

As we’ve established, trust is a key part of the autonomy spectrum. Building trust takes time. Building trust within security is even more difficult due to the stakes. Security AI can’t just be evaluated against known threats. It needs to be evaluated against the threats you’re facing  in your environment. And it needs to be tested against threats it hasn't seen before.

Security is inherently adversarial and adaptive, with attackers constantly testing your perimeter and adjusting their tactics and techniques for your environment. So even if the security AI being sold to you has been trained on billions of data points, how many of those data points are representative of the attacks you’re seeing in your environment? Trust  doesn’t come with model complexity. It comes through rigorous evaluation.

4.1 Evaluate the vendor before evaluating the product

As seen over recent years, security AI vendors are selling autonomy beyond the level of what they’ve engineered. While looking at different vendors, be sure to evaluate where they are in terms of capabilities that support autonomy. If your goal is full autonomy, but your vendor can only offer you up to Level 3, they aren’t the right fit. Here are some simple indicators to look for and phrasing to be aware of.

‍

Crawl

Walk

Run

Maturity level

Intuition-based

Rigorous and repeatable

Continuously improving

Indicators

No eval strategy
Demo-driven results
No version comparison

Defined benchmarks
Tracked metrics
Reproducible methodology

Feedback for evals
Benchmarked against human
Improvement over time

What vendors will say

“Our AI has caught X number of real threats.”

“Here is our methodology. Here is how we measure performance.”

“Here is our performance curve over 18 months.”

Crawl

Maturity level

Intuition-based

Indicators

No eval strategy
Demo-driven results
No version comparison

What vendors will say

“Our AI has caught X number of real threats.”

Walk

Maturity level

Rigorous and repeatable

Indicators

Defined benchmarks
Tracked metrics
Reproducible methodology

What vendors will say

“Here is our methodology. Here is how we measure performance.”

Run

Maturity level

Continuously improving

Indicators

Feedback for evals
Benchmarked against human
Improvement over time

What vendors will say

“Here is our performance curve over 18 months.”

Testing and evaluation takes time and effort you can't afford to spend on the wrong vendor.  So on top of the above maturity matrix, here is a non-exhaustive list of questions to ask before evaluating their security AI. If a vendor cannot answer these in detail, you have your answer.

Where does your product sit on the autonomy scale and what happens at the boundary?

Ask your vendor where they fit in our 1–5 spectrum of autonomy. Do they really mean that their agent can operate autonomously, but only within strict boundaries (Level 4), or do they mean humans must be involved in any decisions (Level 2)?

If the AI reaches its defined limitations, what happens? Does it transition from AI autonomy to human control? Are there hard stops that could cause late-night alerts requiring immediate attention, or are there fallback instructions that allow the agent to continue on a path within bounds?
What does your AI get wrong and how will I know when it happens in my environment?
‍Where does the vendor’s agent struggle? What use cases will require human supervision and which will be fine without oversight? What are the signs that an AI has made the wrong decision or performed the wrong operation? Can safeguards be put in place that are specific to these use cases?

Vendors will share their known knowns. They share known unknowns less willingly. Keep probing until every concern is answered.
If it makes a wrong call at 3 a.m., what's the blast radius and how do I roll it back?
‍After your vendor explains what could go wrong, they need to explain how to handle major issues (which they may promise will never happen). Will their Solution/Sales Engineering team provide clear documentation on how to contain the fallout from issues created by decisions their AI makes? This documentation should be specific to your environment, not just generic knowledge base articles.

This documentation should cover all the possibly affected systems and people, data flows showing the full radius, and remediation/rollback instructions for impacted systems. It should also list known causes of the mistake.
Has your AI been red-teamed internally and externally?

Security is only a design until a team actually attempts to hack the AI. Does the vendor have a red team? Have they run internal attack campaigns on the AI to see if it can be influenced or controlled via prompt injection or other methods? Has the vendor contracted external, unbiased pentesters to hack their AI?

Security AI systems protect and have access to your company’s most important information. You need an AI that’s been battle-hardened before you hop on your first demo call.
Can I see your evaluation methodology, not just the results?
‍
There’s a reason math professors make you show your work. Right answers can come from wrong reasoning. Apply that same reasoning to a different scenario and it falls apart. Your vendor’s AI needs to show its chain of thought, not just the final answer.

Ask how easy it is to see and correct the AI's reasoning. If you can see the logic, you can trust the result. And when you spot bad reasoning, you can hunt down where else it shows up and fix it.
If a regulator or lawyer asks why the AI made a specific decision, what can I show them?

This is an important follow up to the previous question. In that question, you’re asking the vendor to prove that you can trust their AI. In this question, you’re asking the vendor to verify that their AI won’t put your company into regulatory or legal exposure. When an employee does something that violates a contract or compliance standard, they can explain why. Can the vendor’s AI do the same?

Accountability is a trait applied to humans and their companies. The only accountability an AI can have is its audit trail; every step of its reasoning, recoverable after the fact. With that log, experts can determine if your company or your vendor is at fault for any incidents. Sales cycles can be long, but due diligence can be even longer.

4.2 The trust path to autonomy

Autonomy can’t be declared by the vendor you’ve selected. It is granted incrementally by the customer through evidence. Here are three simple steps to build trust:

Prove it works in your environment, not just demos
‍Every security AI will looks effective and autonomous in a demo, but that’s  not your environment. You need to perform structured testing before even considering deploying in production. This process will start with Level 2 autonomy and could progress as far as Level 3. It’s important to not rush through levels, though, as non-production environments will not face the variability that comes with real-world usage against novel attacks.
Show how well it works in production, as well as why
‍Once the security AI is live, it needs to show operational evidence of its effectiveness, not just benchmarks. The how well can be attributed to efficacy. What is the catch rate? What are the false positive and false negative rates?
‍
Equally important is why it’s working. The security AI needs to be transparent, explainable, and auditable in order to be trusted:
‍
- Transparent: How decisions are made must be visible.
- Explainable: Every decision must come with justified reasoning.
- Auditable: Every action must be reconstructable.
Prove it works in your environment, not just demos

Every security AI will looks effective and autonomous in a demo, but that’s  not your environment. You need to perform structured testing before even considering deploying in production. This process will start with Level 2 autonomy and could progress as far as Level 3. It’s important to not rush through levels, though, as non-production environments will not face the variability that comes with real-world usage against novel attacks.

4.3 Aligning the trust path with the autonomy spectrum

No matter which level of autonomy you’re trying to reach, there are simple ways to evaluate trust. With modern security AI solutions, you will likely start at Level 2 evaluation methods, but it’s important to not skip any levels along the path.

‍

_{LEVEL 1} Assisted

_{LEVEL 2} Guided

_{LEVEL 3} Supervised

_{LEVEL 4} Conditional

_{LEVEL 5} Full autonomy

How you know
it works

Benchmark results

Eval harness

Red team attestation

Model monitoring

Proven by Level 4 testing

What it shows while running

Output logs

Reasoning citations

Guardrail policies

Audit trails

How the human stays in control

Decision traces

Approval gates

Audit trails

How the human stays in control

n/a

4.4 What this looks like in practice with Sublime

Consider two common security AI use cases: an agent that helps investigate threats and an agent that helps generate detections. Both may be described as “agentic,” but they earn trust in very different ways.

An investigative system like ASA (Sublime’s Autonomous Security Analyst) is judged by the quality of its reasoning, evidence, and escalation behavior. A detection-generation system like ADÉ (Sublime’s Autonomous Detection Engineer) must also prove that its outputs are reliable in production and do not create unnecessary noise or blind spots. In both cases, the question is not whether the AI looks capable in a demo, but whether the evidence justifies giving it more autonomy over time.⁠⁠

As trust is earned, both agents provide options for moving along the autonomy spectrum, from guided to supervised to conditional to fully autonomous.

5. Trust needs to be earned

Apply this framework, and you'll know exactly when to extend more autonomy to your security AI. 

To learn how Sublime’s AI agents can be evaluated with this framework, get a live demo with one of our security experts.

Back to Resources

Trust, then autonomy: A new framework for evaluating security AI

1. Introduction

2. The promise of security AI over three years of RSAC

RSAC 2024 AI-powered everything

AI chatter on the floor

Security AI funding

Message from AI vendors

Implicit AI trust

RSAC 2025 Agentic AI everywhere

AI chatter on the floor

Security AI funding

Message from AI vendors

Implicit AI trust

RSAC 2026 AI moves from feature to foundation

AI chatter on the floor

Security AI funding

Message from AI vendors

Implicit AI trust

3. The AI autonomy spectrum

4. Trust framework for evaluating security AI

4.1 Evaluate the vendor before evaluating the product

Crawl

Maturity level

Indicators

What vendors will say

Walk

Maturity level

Indicators

What vendors will say

Run

Maturity level

Indicators

What vendors will say

4.2 The trust path to autonomy

4.3 Aligning the trust path with the autonomy spectrum

LEVEL 1 Assisted

How you know it works

What it shows while running

How the human stays in control

LEVEL 2 Guided

How you know it works

What it shows while running

How the human stays in control

LEVEL 3 Supervised

How you know it works

What it shows while running

How the human stays in control

LEVEL 4 Conditional

How you know it works

What it shows while running

How the human stays in control

LEVEL 5 Full autonomy

How you know it works

What it shows while running

How the human stays in control

4.4 What this looks like in practice with Sublime

5. Trust needs to be earned

Now is the time

RSAC 2024
AI-powered everything

RSAC 2025
Agentic AI everywhere

RSAC 2026
AI moves from feature  to foundation

_{LEVEL 1}Assisted

_{LEVEL 2}Guided

_{LEVEL 3} Supervised

_{LEVEL 4} Conditional

_{LEVEL 5} Full autonomy