
When AI Gets Out of the Way: Why Agentic SOCs Are Replacing the Copilot Era

AI copilots added a layer of work on top of existing work. Agentic SOCs replace the work entirely, investigating every alert end-to-end in minutes instead of hours.
Published on March 4, 2026

Suffolk County's IT team got 960 alerts per day from 28 different tools. They investigated maybe 40% of them. When attackers moved against the network, the breach cost $25 million to remediate.

The alerts were already in the system. The team just didn't have time to look at them.

That gap is measurable, and so is the difference between the two approaches to closing it.

Copilot SOCs vs. Agentic: Why SOAR-Style AI Still Requires Manual Work

Vendors bolted AI onto legacy SOC platforms. Their pitch was simple: AI summarizes alerts. Analysts decide.

What changed: analysts now spend time evaluating AI summaries instead of investigating threats. The AI doesn't reduce work. It adds a layer of work on top of the original work.

A copilot tells you what to do. You still have to do it. You still have to verify it. You still navigate five tools to pull context. You still burn out.

This is just SOAR with a language model attached. Legacy SOAR promised automation through playbooks. Copilot SOCs promise automation through AI suggestions. Both require analysts to execute the work. Both fail to address the core problem: investigation speed.

Agentic SOC: How Multi-Agent Architecture Investigates Alerts End-to-End

An agentic SOC investigates alerts without waiting for human approval at each step.

  • One agent pulls endpoint logs: process execution, file modifications, network connections, DLL loads. 
  • Another correlates identity and access patterns: recent logins, privilege changes, group membership modifications, and MFA events. 
  • A third enriches indicators against threat intel: file hashes, domains, IPs, email headers. 
  • A fourth builds a timeline: which events happened first, how long between events, and what happened across systems simultaneously. 
  • A fifth scores risk against organizational context: is the user traveling, is the system new to the environment, is this on an isolated network segment, does the user normally access this data?

These run in parallel. 

Each agent sees the others' results in real time. The endpoint agent marks a file as suspicious. The timeline agent immediately correlates that with network activity. The identity agent checks whether the user who touched that file has abnormal access patterns.
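
The fan-out described above can be sketched in a few lines. Everything here is illustrative: the agent names, their return fields, and the alert shape are assumptions for the sketch, not Strike48's actual interfaces.

```python
import asyncio

# Hypothetical micro-agents: each investigates one domain and returns
# its own findings. Names and fields are illustrative only.
async def endpoint_agent(alert):
    return {"agent": "endpoint", "suspicious_files": ["dropper.tmp"]}

async def identity_agent(alert):
    return {"agent": "identity", "abnormal_logins": 1}

async def threat_intel_agent(alert):
    return {"agent": "intel", "known_bad_iocs": 0}

async def timeline_agent(alert):
    return {"agent": "timeline", "event_span_minutes": 12}

async def triage_agent(alert):
    return {"agent": "triage", "scope": "single-host"}

async def investigate(alert):
    # All five agents run concurrently, instead of one analyst
    # stepping through five tools in sequence.
    results = await asyncio.gather(
        endpoint_agent(alert), identity_agent(alert),
        threat_intel_agent(alert), timeline_agent(alert),
        triage_agent(alert),
    )
    # Each agent's output stays in its own record, so a bad finding
    # in one domain does not contaminate the others.
    return {r["agent"]: r for r in results}

findings = asyncio.run(investigate({"id": "ALERT-1"}))
```

The point of the dict keyed by agent is the isolation discussed later: each finding can be inspected or overridden without touching the rest.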

Investigation takes 3-4 minutes instead of 60-90 minutes. Organizations see 70-80% improvement in response time. That matters. In a 960-alert-per-day environment, it's the difference between investigating all alerts or sampling 40%.

The analyst gets a completed investigation with a full evidence trail: what happened, when it happened, how the events connect, and what the risk is. They can override any finding, escalate for more scrutiny, or reverse decisions. But the system has already done the grunt work that used to take an hour and a half.

GraphRAG vs. Vector RAG: Graph Structure for Accurate Security Investigations

Large language models hallucinate. They're pattern-matching systems trained on text. When you ask an LLM a question, it doesn't search your data. It generates a plausible-sounding answer.

In security operations, that's unacceptable. If the system says a file hash is known to be malicious, you need to know that's actually true.

Retrieval-Augmented Generation (RAG) forces the LLM to answer only from data it retrieved. It grounds answers in facts instead of probabilistic guesses.

GraphRAG organizes that data as a graph instead of flat text chunks.

A graph is nodes (users, machines, IPs, files, events) connected by edges (relationships: user logged in from IP, process executed file, domain communicated with IP, service account accessed database).

When an alert comes in, GraphRAG starts with a search but then traverses the graph following relationships.

Example 1: Unusual login from an unfamiliar geography. 

Traditional RAG retrieves the past 10 logins and similar anomalies from other organizations. GraphRAG retrieves the same starting point, then follows the graph. It finds machines the user accessed, other users with access to those machines, the user's org unit structure, recent access logs going back 60 days, MFA authentication history, corporate travel approvals, and VPN connection logs. It discovers the user submitted a travel request to Prague last week. The login came from Eastern Europe at the right time. It's legitimate travel. The alert is suppressed with a full evidence trail retained.
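
Example 1's traversal can be sketched with a toy graph. The entities, relationship names, and `traverse` helper below are hypothetical, and real GraphRAG indexes are far larger, but the mechanic is the same: start at the alert's entity and follow edges to collect hard facts.

```python
from collections import deque

# Toy security graph: nodes are entities, edges are typed relationships.
edges = {
    "user:alice": [("logged_in_from", "ip:203.0.113.7"),
                   ("member_of", "group:finance"),
                   ("approved", "travel:prague")],
    "ip:203.0.113.7": [("geolocated_in", "geo:eastern-europe")],
    "group:finance": [("grants_access", "share:deal-files")],
}

def traverse(start, max_hops=2):
    """Collect the relationship context GraphRAG would hand the LLM."""
    seen, context = {start}, []
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue
        for relation, neighbor in edges.get(node, []):
            context.append((node, relation, neighbor))
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, depth + 1))
    return context

# The travel approval surfaces as a retrieved fact, not a guess:
facts = traverse("user:alice")
```

A flat vector search over log text might never surface the travel approval, because it isn't textually similar to a login alert; the graph reaches it in one hop.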

Example 2: Data exfiltration pattern. 

User copies 500MB of sensitive files to USB in 20 minutes. Traditional RAG sees the file access volume is high and suggests escalation. GraphRAG traverses deeper:

  • It follows the graph to find the user's peer group, who else accesses these files.
  • It checks if 500MB in 20 minutes is abnormal for this user historically. It checks if the user's role normally involves bulk data exports.
  • It looks for correlations with recent privilege escalation or group membership changes.
  • It finds the user was recently added to a data export group for a legitimate business project. The volume is high but within normal parameters for that role. It's a false positive that traditional RAG would have escalated.
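
The historical-baseline check in the second bullet reduces to simple statistics. The numbers below are invented for the sketch; the point is that 500MB only looks alarming until it is compared against the peer group's history.

```python
from statistics import mean, stdev

# Illustrative historical export volumes (MB) for the user's peer group
# in the data-export role. All values are made up for this sketch.
peer_history = [420, 510, 380, 610, 455, 530, 490]
observed_mb = 500

mu, sigma = mean(peer_history), stdev(peer_history)
z = (observed_mb - mu) / sigma

# Within ~2 standard deviations of the peer baseline: high volume,
# but normal for the role -> suppress rather than escalate.
verdict = "escalate" if abs(z) > 2 else "suppress"
```

Here the observed 500MB sits well inside the peer distribution, so the alert is suppressed with the baseline retained as evidence.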

The graph forces precision. A node either connects to another node or it doesn't. A user either has a travel approval or doesn't. A machine either shows signs of compromise or it doesn't. The LLM can still misinterpret relationships. But it can't fabricate facts that don't exist in the graph.

In Lettria's benchmark, GraphRAG achieved 80% correct answers versus 50.83% with vector RAG, an improvement of nearly 30 percentage points. In production, that means fewer false escalations wasting analyst time and fewer missed threats ignored because analysts lost confidence in the system's findings.

Why Monolithic LLM Chains Fail: Bounded Micro-Agents for Trustworthy AI Response

Vendors understand GraphRAG works. Most implementations still fail because of how they're built.

Monolithic LLM chains feed retrieved information into a single model. That single model interprets all the evidence and makes the decision. One hallucination anywhere contaminates every downstream finding. If the model misinterprets graph data about user behavior, that error propagates through the entire investigation. The system gains one powerful model and loses all isolation of errors.

Security teams rightfully fear this. An AI system that makes wrong decisions in production causes real damage:

  • Isolate the wrong machine → legitimate database server goes offline → transactions fail → revenue stops
  • Revoke the wrong user session → executive loses work-critical access → business deal stalls → customer lost
  • Block legitimate traffic → application can't reach its service dependencies → users see errors → support team gets flooded

One wrong decision at scale costs hundreds of thousands of dollars. Analysts demand the ability to understand why each decision was made and the authority to override or reverse it immediately.

The alternative is bounded micro-agents. Each agent handles a specific domain. Endpoint agent analyzes logs and network data. Identity agent reviews authentication and access. Threat intel agent enriches indicators. Triage agent evaluates risk. Each has clear inputs and outputs.

One agent's error stays isolated. If the endpoint agent misinterprets a process execution pattern, the identity agent still has clean data. The timeline agent still has clean data. The system shows what each agent found independently. An analyst can review the endpoint agent's interpretation while trusting the identity agent's findings. They can spot the error and correct it.
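
One way to picture that isolation is to keep each agent's verdict in its own record, so an analyst can overturn a single finding without touching the rest. The `Finding` type and verdict values below are illustrative, not a real schema.

```python
from dataclasses import dataclass, field

@dataclass
class Finding:
    agent: str
    verdict: str                 # "malicious" | "benign" | "unknown"
    evidence: list = field(default_factory=list)
    overridden: bool = False

findings = [
    Finding("endpoint", "malicious", ["powershell.exe, obfuscated args"]),
    Finding("identity", "benign", ["MFA passed", "known device"]),
    Finding("timeline", "benign", ["no lateral movement observed"]),
]

def override(findings, agent, new_verdict):
    # Correcting one agent's verdict leaves the others untouched:
    # the error is contained to the record that held it.
    for f in findings:
        if f.agent == agent:
            f.verdict, f.overridden = new_verdict, True
    return findings

# Analyst decides the endpoint agent misread a legitimate admin script.
override(findings, "endpoint", "benign")
```

Contrast this with a monolithic chain, where the same correction would mean rerunning the whole investigation, because every downstream conclusion inherited the bad interpretation.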

That containment of errors is what makes autonomous response possible. That's what makes auditable AI actually trustworthy.

Audit Trails and Explainability: How Agentic Systems Enable Trust

Security teams need to trace every decision back to actual data. Can you show why the system blocked an IP? What data led to that decision? Can you reverse a decision if it was wrong? Can you explain the finding to compliance auditors?

Agentic systems with audit trails enable this. You can see: endpoint agent found process X, identity agent found authentication pattern Y, timeline agent found event sequence Z, and based on all three findings, confidence was 94%. If the analyst disagrees with that conclusion, they can see exactly where the system's reasoning diverged from fact. 

  • Did the endpoint agent misidentify a legitimate process?
  • Did the identity agent misinterpret a VPN login as suspicious?
  • Did the timeline agent miss context?

The audit trail shows all of it.
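
A minimal audit record for the 94%-confidence example above might look like the following. The field names are assumptions for the sketch; the requirement is only that every decision links back to the per-agent findings behind it and is marked reversible.

```python
import json
import time

def audit_record(alert_id, findings, confidence, decision):
    """Append-only record tying a decision to the evidence behind it."""
    return json.dumps({
        "alert": alert_id,
        "findings": findings,      # what each agent saw, independently
        "confidence": confidence,
        "decision": decision,
        "ts": time.time(),
        "reversible": True,
    })

record = audit_record(
    "ALERT-42",
    {"endpoint": "process X", "identity": "auth pattern Y",
     "timeline": "event sequence Z"},
    0.94,
    "escalate",
)
```

Because each agent's contribution is a separate field, an auditor (or a compliance reviewer) can see exactly which finding drove the confidence score.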

Copilot systems that just suggest actions don't provide this. An AI summary says "user accessed sensitive files abnormally." The analyst either trusts it or doesn't. If the system escalates incorrectly, there's no trail to understand why. No way to correct it for future similar cases. No way to prove to compliance that the decision was data-driven.

Strike48: Five Specialized Agents for Autonomous Alert Investigation

Strike48 investigates alerts end-to-end. Five agents work simultaneously over GraphRAG indexes built from your log environment. Each agent has full access to raw logs when needed.

Triage agent: receives alert, determines investigation scope or suppression recommendation.
Endpoint agent: analyzes process execution, file modification, network logs, privilege escalation, code injection.
Identity agent: reviews authentication patterns, access anomalies, group membership changes, MFA failures, and permission changes.
Threat intelligence agent: enriches indicators against known-malicious databases, scores severity, traces infrastructure connections, and finds similar infrastructure in your environment.
Timeline agent: correlates events across all systems, establishes sequence, identifies gaps in logs that suggest tampering.

Investigation output goes to analysts with full evidence. Each agent's findings are auditable. An analyst can see why the endpoint agent flagged a process, why the identity agent flagged an authentication pattern, what the threat intel agent found.

For high-confidence alerts (confidence > 90%, multiple agents agree), Strike48 executes response autonomously: isolate endpoint (network isolation or quarantine VLAN), revoke session (active and potential future logins), block IP (firewall rules), disable account (credential invalidation), block domain (DNS/proxy rules). All actions are logged to the audit trail. All actions are reversible within seconds.

For ambiguous cases (60-89% confidence or conflicting agent findings), the system escalates to an analyst with the completed investigation attached. The analyst makes the call.

For low-risk suppressions (< 60% confidence or clear benign patterns), the system suppresses the alert with evidence retained. An analyst can review the suppression rationale in the audit logs.
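
The three confidence bands above collapse into a small routing function. The thresholds mirror the text; the function itself is a sketch, not Strike48's actual logic.

```python
def route(confidence, agents_agree):
    """Map investigation confidence to one of the three outcomes above."""
    if confidence > 0.90 and agents_agree:
        return "autonomous_response"   # isolate, revoke, block; reversible
    if confidence >= 0.60:
        return "escalate_to_analyst"   # completed investigation attached
    return "suppress_with_evidence"    # rationale retained in audit logs
```

Note that a high-confidence verdict with a dissenting agent still routes to an analyst: agreement across agents is a precondition for autonomous action, not just a tiebreaker.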

Autonomous SOC Implementation: A Four-Week Transition From Manual Investigation

  • Week 1: Audit mode. System investigates alerts and makes recommendations without taking action. Analysts review the system's work daily, spot-check reasoning, build confidence in findings. No analyst workload reduction yet. SOC is validating the system.
  • Week 2: System handles low-risk suppression decisions autonomously (confidence < 60%, benign patterns). Analysts stop seeing those alerts entirely. Analysts focus on escalations and medium-confidence cases (60-89%). Workload drops 20-30%. Analysts begin seeing time freed up to actually think about alerts instead of just triage them.
  • Week 3: First-level response actions like endpoint isolation, with analyst approval. System isolates a machine, analyst approves within 2-3 minutes, machine stays isolated while team investigates. Endpoint containment now happens in minutes instead of hours. Malware doesn't have time to spread or exfiltrate. Analysts begin experiencing what faster response actually feels like.
  • Week 4: Analysts freed from triage entirely. They focus on threat hunting (proactively searching for compromises the alerts missed), detection tuning (making alert rules smarter so fewer false positives), and investigation work that requires expertise (complex multi-stage attacks, supply chain compromises). The work analysts actually want to do.

Most teams move from investigating 30-40% of alerts to all of them in this window. Response times drop 70-80%. By week 4 or 5, analyst burnout stops. Teams that were losing people at 30% annual turnover suddenly retain people. New hires don't burn out in month three.

SIEM and EDR Integration: Automated Response Execution in Minutes

Strike48 connects directly to your SIEM, EDR, identity platform, and email gateway. It reads events in real-time, correlates signals across tools, and executes response actions within minutes.

Scenario 1: Insider threat. 

A user in the finance department accessed sensitive deal files outside normal hours (11 PM), from a new geography (Ukraine), without travel approval in the system. Normally logs in 9-5, US-based. Identity agent confirms no recent password changes, MFA working correctly, but the last login was from a VPN IP in Ukraine. 

The endpoint agent shows no malware, and legitimate business applications are running. Threat intel agent finds no infrastructure connections to known-bad IPs. Timeline agent correlates: the user's peer in accounting also accessed similar files 30 minutes later from the same VPN, suggesting shared knowledge. Risk score 87%. The system escalates with full evidence. The analyst reviews the timeline and checks the employee profile: the user's only travel mention was Prague, in a Slack message two weeks ago, which doesn't match a Ukraine VPN login. The analyst approves session revocation and account suspension. Strike48 revokes all active sessions in 12 seconds, disables password login in 8 seconds, and blocks the account from accessing cloud storage in 6 seconds. The team begins a formal investigation. The user's access to sensitive files is contained.

Without agentic response automation, analysts would have spent 45-60 minutes pulling logs from the identity system, EDR, email gateway, and file-access logs. By that time, the user could have exported the entire deal database to personal cloud storage.

Scenario 2: Malware execution. 

EDR detects an unusual process spawned from a Microsoft Word document. 

Normally, Word doesn't spawn processes. Endpoint agent analyzes: process is powershell.exe, spawned with an obfuscated command line, and an immediate network connection to an IP never seen before in the environment. File modification logs show .tmp files created in the temp directory, a common malware staging area. 

Timeline agent correlates: document was received via email 3 minutes ago, sender external, subject line spoofing internal executive. Threat intel agent finds IP is known C2 infrastructure, 14 recent campaign reports filed. Confidence 96%. System immediately isolates the machine (removes it from the network while preserving RDP access for the analyst), blocks the malicious IP organization-wide, and terminates the process. EDR confirms isolation. 

Analyst takes over the investigation. Zero minutes of analyst time spent on initial investigation.

Investigation Speed Comparison: 11 Minutes vs. 2+ Hours MTTR

| Phase | Traditional SOC | Agentic SOC |
| --- | --- | --- |
| Alert to investigation start | 30 minutes (alert sits in queue waiting for analyst availability) | 2 minutes (system immediately begins parallel investigation) |
| Investigation to escalation | 60 minutes (analyst manually pulls logs from 5-7 tools, correlates events, makes risk determination) | 3 minutes (all five agents complete analysis, consolidate findings, determine confidence) |
| Escalation to response decision | 30 minutes (escalation goes to manager, manager coordinates with team, business context gathering) | 5 minutes (analyst reviews the completed investigation with evidence trail, makes go/no-go call) |
| Response decision to action | 45 minutes (analyst or SOC team manually isolates machine, revokes credentials, updates firewall) | 1 minute (automated execution: network isolation, session revocation, IP blocking, account disabling) |
| Total MTTR | 2+ hours (in a 960-alert-per-day environment, the analyst may not see the alert for hours while triaging other incidents) | 11 minutes (even at 960 alerts per day, the analyst sees a completed investigation within 6 minutes) |

A malware download + lateral movement typically takes 10-15 minutes. A data exfiltration to cloud storage takes 20-30 minutes. An insider copying the entire database takes 15-25 minutes, depending on size.

In a traditional SOC, you're still on the phone with the manager when lateral movement completes. In an agentic SOC, the machine is already isolated.

Agentic vs. Copilot: Speed, Automation, and Analyst Retention

Copilot: suggests actions. Analyst still does the work.
Agentic: executes investigation. Analyst reviews evidence.

Copilot: improves suggestion quality. Alert fatigue remains.
Agentic: improves investigation speed. Fatigue decreases.

Copilot: analyst as validator. Analysts experience burnout.
Agentic: analyst as decision-maker. Burnout stops.

Suffolk County received 960 alerts per day. They investigated 40% of them. When the breach happened, the alerts were already in the system.

See how Strike48 automates end-to-end alert investigation, in minutes.