CyberSOCEval: CrowdStrike and Meta Set the New Benchmark for AI in the SOC

The fight to stay ahead of cyber adversaries just took a leap forward. This week, CrowdStrike and Meta announced the launch of CyberSOCEval, a groundbreaking open-source benchmark suite designed to evaluate the real-world performance of AI—especially large language models (LLMs)—within the Security Operations Center (SOC).

Why CyberSOCEval, and Why Now?

Security operations leaders face a continual onslaught of alerts and ever-shifting attack techniques. Meanwhile, AI adoption is accelerating, especially for SOC automation, incident response, and threat analysis. Yet until now, there has been no standardized way to measure whether an AI solution actually delivers tangible defense against modern threats in operational settings. CyberSOCEval fills that gap.

Developed atop Meta’s CyberSecEval framework and integrating CrowdStrike’s world-class adversary intelligence, CyberSOCEval offers practical, scalable benchmarks. These test models with real adversary tactics, expert-designed reasoning tasks, and pressure-tested scenarios drawn straight from the front line. The result: organizations and vendors finally have a level playing field to separate hype from operational reality.

How Does CyberSOCEval Work?

CyberSOCEval benchmarks LLMs across core SOC workflows, including:

  • Malware Analysis: Evaluates how effectively models dissect and understand malicious code.
  • Incident Response: Measures decision-making and investigative skills using attacker tradecraft as observed in sophisticated, real intrusions.
  • Threat Analysis Comprehension: Assesses the ability to parse, correlate, and react to complex threat intelligence reports.

By running models through its open, adversary-driven evaluation suite, organizations can independently validate which AI systems truly enhance detection, response, and analyst productivity—rather than just generating impressive demos.
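To make the benchmarking flow concrete, here is a minimal, generic sketch of how an LLM might be scored against a set of multiple-choice SOC questions. This is an illustration only, not CyberSOCEval's actual API: the `BenchmarkItem` and `score_model` names, and the question format, are hypothetical assumptions for this example.

```python
# Illustrative sketch only: a generic multiple-choice scoring loop.
# All names here are hypothetical, not part of CyberSOCEval itself.
from dataclasses import dataclass


@dataclass
class BenchmarkItem:
    question: str      # e.g. a threat-intel comprehension question
    choices: list[str]  # candidate answers
    answer: str         # correct choice label, e.g. "B"


def score_model(items: list[BenchmarkItem], ask_model) -> float:
    """Return the fraction of items a model answers correctly.

    `ask_model` is any callable mapping a prompt string to a choice label,
    so the same harness can compare different LLMs on identical tasks.
    """
    correct = 0
    for item in items:
        # Render the question and lettered choices into a single prompt.
        prompt = item.question + "\n" + "\n".join(
            f"{label}. {text}" for label, text in zip("ABCD", item.choices)
        )
        if ask_model(prompt).strip().upper() == item.answer:
            correct += 1
    return correct / len(items) if items else 0.0
```

The key design point such a harness shares with CyberSOCEval is that the ground truth is fixed and adversary-derived, so any model's score is directly comparable to any other's on the same task set.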

The Industry Impact

This benchmark suite brings several key benefits to security teams and AI developers:

  • Transparency: Model performance and weaknesses are now objectively visible, accelerating trust and adoption.
  • Operational Readiness: Security teams can select and tailor AI solutions that actually support risk reduction, not just vendor marketing claims.
  • Continuous Improvement: The open-source nature invites global collaboration, ensuring benchmarks evolve as threats and AI technologies do.

As AI becomes a staple of adversary playbooks, strengthening defenders’ AI tools with real-world benchmarks is nothing less than essential.

Where to Get CyberSOCEval

CyberSOCEval is now available to the AI and security community via Meta’s CyberSecEval framework. Security researchers, SOC engineers, and AI developers are encouraged to use and contribute to the benchmarks, supporting an evidence-driven future for AI in cyber defense.
