
Using AI for Offensive Security: Executive Report Summary

The Cloud Security Alliance (CSA) has released an important new report on Using AI for Offensive Security. The report examines how AI, particularly large language models (LLMs), is transforming the offensive security landscape. It identifies how AI increases the capabilities of both security teams and bad actors, offering benefits while posing new challenges.

The new landscape necessitates better AI integration, human oversight, and governance, risk, and compliance (GRC) controls. In this executive overview, we'll highlight the report's findings and recommendations on:

  • How AI agents and LLMs augment and automate offensive security
  • How threat actors are already using LLM tools
  • Why LLM implementation requires stronger GRC controls

LLMs Increase Augmentation and Automation

The CSA's report identifies areas where autonomous AI systems (AI agents) and LLMs can enhance offensive security capabilities, and shows how those capabilities can be used to automate testing procedures throughout the offensive security lifecycle.

AI Agent and LLM Offensive Security Capabilities

AI agents operate autonomously or semi-autonomously to achieve goals by acting on input data. They can harness LLMs to plan and trigger tasks, make decisions, and interact with their environments, iterating toward a goal by continuously learning from new input and adapting (a minimal loop is sketched after the list below). This enhances AI offensive security capabilities in six key areas:

  1. Data analysis
  2. Code analysis
  3. Text generation
  4. Planning realistic attack scenarios
  5. Reasoning
  6. Tool orchestration
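
To make the iterative loop concrete, here is a minimal sketch of a plan-act-observe agent in Python. The call_llm() helper is a placeholder for whichever model API you use, and the tool list, prompt wording, and stopping rule are illustrative assumptions rather than anything prescribed by the CSA report.

```python
import subprocess

def call_llm(prompt: str) -> str:
    """Placeholder: route the prompt to your LLM provider of choice."""
    raise NotImplementedError

# Illustrative, read-only reconnaissance tools the agent may invoke.
TOOLS = {"whois": ["whois"], "dns": ["dig", "+short"]}

def plan_next_step(goal: str, history: list[str]) -> str:
    # Ask the model to pick the next tool and target, given what it has seen so far.
    prompt = (
        f"Goal: {goal}\n"
        "Observations so far:\n" + "\n".join(history) +
        f"\nReply with the next step as '<tool> <target>' using one of {list(TOOLS)}, "
        "or reply DONE if the goal is met."
    )
    return call_llm(prompt).strip()

def run_agent(goal: str, max_steps: int = 5) -> list[str]:
    history: list[str] = []
    for _ in range(max_steps):
        decision = plan_next_step(goal, history)
        if decision.upper().startswith("DONE"):
            break
        tool, target = decision.split(maxsplit=1)
        if tool not in TOOLS:  # guard against malformed model output
            break
        result = subprocess.run(TOOLS[tool] + [target], capture_output=True, text=True)
        history.append(f"{decision}:\n{result.stdout}")  # observation feeds the next plan
    return history
```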

1. Data Analysis

Offensive security testing generates an enormous amount of data that can overwhelm analysts. By leveraging LLMs, AI agents can streamline analysis of data from network services and tools to accelerate the generation of insights into target infrastructure.
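
As a rough illustration of how scan data might be prepared for an LLM, the sketch below chunks long tool output to fit a model's context window and wraps each chunk in a summarization prompt. The chunk size and prompt wording are assumptions for illustration only.

```python
def build_analysis_prompts(raw_output: str, chunk_chars: int = 6000) -> list[str]:
    """Split long scanner/tool output into chunks and wrap each in a summarization prompt."""
    chunks = [raw_output[i:i + chunk_chars] for i in range(0, len(raw_output), chunk_chars)]
    return [
        "Summarize the exposed services, versions, and likely weak points "
        f"in this scan output:\n\n{chunk}"
        for chunk in chunks
    ]
```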

2. Code Analysis

AI can help automate the analysis of source code from both open-source and closed-source environments. This expedites vulnerability analysis while reducing the risk of human error and helping surface even complex vulnerabilities.

3. Text Generation

AI can automate the generation of text grounded in authoritative external databases (retrieval-augmented generation, or RAG). Relying on RAG improves the relevance and accuracy of LLM-generated responses.
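
A minimal sketch of the RAG idea, assuming a simple keyword-overlap retriever (production systems typically use vector search): relevant reference snippets are retrieved first, then the prompt is grounded in them before being sent to the model.

```python
def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank reference documents by keyword overlap with the query and keep the top k."""
    words = set(query.lower().split())
    return sorted(docs, key=lambda d: -len(words & set(d.lower().split())))[:k]

def build_rag_prompt(query: str, docs: list[str]) -> str:
    """Ground the prompt in retrieved reference material before sending it to an LLM."""
    context = "\n---\n".join(retrieve(query, docs))
    return (
        "Using only the reference material below, answer the question.\n\n"
        f"Reference material:\n{context}\n\nQuestion: {query}"
    )
```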

4. Planning Realistic Attack Scenarios

AI can generate customized test cases for specific systems and configurations by analyzing resources and using pattern repositories. This increases test coverage scope while reducing time spent constructing tests.

5. Reasoning

AI agents can harness LLMs to reason over available data and select the next steps and relevant tools. This speeds up the process of assessing and prioritizing which vulnerabilities to pursue and how to pursue them.

6. Tool Orchestration

AI can help testers efficiently configure multiple tools for offensive security tasks by assisting with construction of requests, queries, and command-line arguments. This improves collection of vital data such as domain names, IP addresses, and other information from WHOIS records.
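
The sketch below shows one way a helper could assemble command-line invocations for common reconnaissance lookups so the tester only supplies a target; in practice an LLM might generate or select these commands. The tool names and flags are common defaults, not a prescribed toolchain.

```python
import subprocess

# Command templates for a few common reconnaissance lookups (illustrative defaults).
RECON_COMMANDS = {
    "whois": ["whois", "{target}"],          # registrant and registrar details
    "dns":   ["dig", "+short", "{target}"],  # quick DNS resolution
    "ports": ["nmap", "-F", "{target}"],     # fast scan of the most common ports
}

def run_recon(tool: str, target: str) -> str:
    """Fill in the target, run the tool, and return its raw output for later analysis."""
    cmd = [part.format(target=target) for part in RECON_COMMANDS[tool]]
    return subprocess.run(cmd, capture_output=True, text=True).stdout

# Example: gather WHOIS data before deeper scanning.
# print(run_recon("whois", "example.com"))
```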

AI Offensive Security Automation Applications

Through AI agents and LLMs, AI can automate key tasks in each phase of the offensive security lifecycle, including:

  1. Reconnaissance
  2. Scanning
  3. Vulnerability Analysis
  4. Exploitation
  5. Reporting

1. Reconnaissance

The reconnaissance phase of offensive security gathers target information from public sources and through passive reconnaissance techniques, such as collecting information about employees. AI can augment the reconnaissance phase by streamlining tasks such as:

  • Automating data collection, e.g., open-source intelligence (OSINT) gathering (see the sketch after this list)
  • Summarizing data
  • Analyzing threat landscapes
  • Planning adaptive tests
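
As one passive-collection example, the sketch below pulls candidate subdomains from public certificate-transparency logs via crt.sh, producing a list an LLM or analyst can summarize. The crt.sh JSON endpoint is public but unofficial, so treat its availability and response format as assumptions.

```python
import requests

def subdomains_from_ct(domain: str) -> set[str]:
    """Query crt.sh certificate-transparency records and collect observed host names."""
    resp = requests.get(
        "https://crt.sh/",
        params={"q": f"%.{domain}", "output": "json"},
        timeout=30,
    )
    resp.raise_for_status()
    names: set[str] = set()
    for entry in resp.json():
        # name_value can hold several newline-separated host names per certificate.
        names.update(entry["name_value"].splitlines())
    return names
```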

2. Scanning

The scanning phase probes systems to map targets, networks, and vulnerabilities. AI automates scanning by facilitating processes such as:

  • Analyzing raw output and traffic
  • Detecting anomalies
  • Identifying vulnerability patterns
  • Generating scripts
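
A small sketch of raw-output analysis: parsing nmap's XML report into structured findings that an LLM, or simple rules, can reason over. It assumes the scan was saved with `nmap -sV -oX scan.xml <target>`.

```python
import xml.etree.ElementTree as ET

def open_services(xml_path: str) -> list[dict]:
    """Extract open ports and detected services from an nmap XML report."""
    findings = []
    for host in ET.parse(xml_path).getroot().iter("host"):
        addr = host.find("address").get("addr")
        for port in host.iter("port"):
            if port.find("state").get("state") != "open":
                continue
            svc = port.find("service")
            findings.append({
                "host": addr,
                "port": port.get("portid"),
                "service": svc.get("name") if svc is not None else "unknown",
                "version": svc.get("version") if svc is not None else None,
            })
    return findings
```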

3. Vulnerability Analysis

Vulnerability analysis deepens the results of initial scanning by prioritizing security risks based on severity and impact. AI automates vulnerability analysis by supporting tasks such as:

  • Evaluating risk impacts
  • Triaging vulnerabilities
  • Reducing false positives and duplicates
  • Analyzing root causes
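
A minimal triage sketch: de-duplicating findings and ranking them by CVSS score so the highest-impact issues surface first. The field names are illustrative, not tied to any particular scanner's output.

```python
def triage(findings: list[dict]) -> list[dict]:
    """De-duplicate findings per (CVE, host) pair and sort by CVSS, highest first."""
    unique = {(f["cve"], f["host"]): f for f in findings}
    return sorted(unique.values(), key=lambda f: f["cvss"], reverse=True)

findings = [
    {"cve": "CVE-2024-0001", "host": "10.0.0.5", "cvss": 9.8},
    {"cve": "CVE-2024-0001", "host": "10.0.0.5", "cvss": 9.8},  # duplicate report
    {"cve": "CVE-2023-1111", "host": "10.0.0.7", "cvss": 5.4},
]
print(triage(findings))  # the critical finding appears once, ranked first
```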

4. Exploitation

The exploitation phase tests identified security weaknesses by launching simulated attacks under controlled conditions. AI automates exploitation by assisting with tasks such as:

  • Researching and selecting exploits
  • Generating and aligning payloads and malware
  • Providing post-exploitation guidance
  • Simulating social engineering

5. Reporting

The reporting phase summarizes offensive security findings, including identified vulnerabilities, successful exploits, potential impacts, and recommended remediation. AI streamlines reporting by helping with tasks such as:

  • Generating reports
  • Summarizing and visualizing report results
  • Modeling attack paths
  • Simplifying technical findings
  • Building knowledge bases
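
A simple reporting sketch: rendering triaged findings into a Markdown summary that an LLM could then rewrite for an executive audience. The finding fields carry over from the triage sketch above and remain illustrative.

```python
def render_report(findings: list[dict]) -> str:
    """Render triaged findings as a Markdown summary for human (or LLM) polishing."""
    lines = ["# Pentest Findings Summary", ""]
    for f in findings:
        lines.append(
            f"- **{f['cve']}** on {f['host']} (CVSS {f['cvss']}): "
            f"{f.get('note', 'see technical details')}"
        )
    return "\n".join(lines)
```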

Threat Actors Use These Tools Already

While AI improves the operational capability of offensive security teams, the same capabilities in the wrong hands serve malicious actors. Bad actors leverage AI to target low-criticality vulnerabilities or to enhance their ability to exploit high-criticality ones, making both an urgent priority for security teams to address.

More broadly, threat actors currently use AI for purposes such as:

  1. AI-assisted reconnaissance
  2. AI-powered social engineering
  3. Malicious code writing
  4. Vulnerability research
  5. Bypassing security features
  6. Anomaly detection evasion
  7. Operational command refinement

1. AI-assisted Reconnaissance

Just as offensive security teams use AI for reconnaissance, threat actors leverage AI to automate data collection on targets. They can rapidly process large amounts of data to isolate potential targets.

2. AI-powered Social Engineering

Bad actors can use AI to augment social engineering attacks. AI enables them to analyze public data about individuals and craft customized phishing emails and messages.

3. Malicious Code Writing

AI lowers technical barriers for developing and optimizing malicious scripts and malware. This makes it easier for bad actors to launch complex cyberattacks.

4. Vulnerability Research

Threat actors can use AI to mine public data on system and software vulnerabilities. They can identify exploitable weaknesses by analyzing security reports, patch notes, and exploit databases.

5. Bypassing Security Features

Bad actors can use AI to overcome security barriers such as multi-factor authentication and CAPTCHA challenges. This improves their ability to automate spam attacks and generate fraudulent accounts and profiles at scale.

6. Anomaly Detection Evasion

Threat actors can use AI to mimic normal behavior and traffic while engaging in malicious activity. This allows them to evade security detection efforts.

7. Operational Command Refinement

AI optimizes command-and-control (C&C) operations, making it harder for security teams to detect post-compromise behavior. This allows threat actors to maximize the efficiency of command sequences, remote control manipulation, and data extraction processes.

LLM Implementation Requires Strong GRC

Integrating AI into testing practices without compromising security or privacy depends on implementing strong governance, risk, and compliance policies. The CSA's report recommends that a solid GRC framework should include provisions for:

  • Safety: Safety should be prioritized throughout the AI model lifecycle by incorporating security and resilience measures at each stage.
  • Comprehension: Explainable AI outputs should make it easier for security teams to interpret automated findings.
  • Privacy: AI models should comply with established regulatory frameworks to protect identity, anonymity, and confidentiality.
  • Fairness: AI models should minimize biases to promote effectiveness in a diverse range of environments.
  • Transparency: AI systems should provide clear insights into their decision-making processes.
  • Third-party risk management: Suppliers of AI technologies should be given extra scrutiny for adherence to security and privacy standards, required certifications, and contractual and insurance coverage.

Adhering to GRC best practices helps ensure that offensive security measures don't introduce risks in areas such as security, privacy, and compliance.

How Cobalt Balances Automation with Human Talent

The CSA emphasizes that AI augments human capability rather than replacing it, and effective AI offensive security requires human oversight. While AI brings significant automation benefits to security, it remains limited by the scope of its training data and algorithms, which may be inadequate for confronting novel and complex situations. Without human supervision, LLM-powered technology may fall prone to unpredictability, hallucinations, and errors.

To balance the benefits of AI automation with the need for manual oversight, explore how our Pentesting as a Service (PtaaS) methodologies integrate AI-powered tools with human expertise. We combine manual human testing with a modern cloud-based delivery platform that lets you access real-time results for rapid remediation.

Our platform connects you with our team of expert pentesters, led by a core of professionals who collaborate with the Open Worldwide Application Security Project (OWASP) to develop industry security standards. Our on-demand service model lets you scale your testing up and down as needed. 

Get started by scheduling a demo to see how we can help you leverage AI and manual expertise to enhance your offensive security.

About Nick Terkay
Nick Terkay is a forward-thinking leader with a user-centered approach to business. With a keen focus on privacy and security, he excels in building and scaling teams and processes dedicated to delivering products that not only meet but exceed user expectations. Nick serves as Vice President of Engineering at Cobalt, a fully remote cybersecurity company with a mission to modernize traditional pentesting via a SaaS platform coupled with an exclusive community of vetted, highly skilled testers.