
This chapter is from the book


Case Study: Using MegaVul to Build an AI-Powered Vulnerability Detector

A global software company faced the challenge of securing a rapidly growing codebase across hundreds of microservices. Traditional static analysis tools were generating high false positive rates and missing subtle vulnerabilities, overwhelming security teams and slowing development velocity. The organization needed a more intelligent, automated way to detect vulnerabilities early—ideally, during code review or even before code was merged.

The security engineering team adopted MegaVul, a large-scale vulnerability dataset containing over 17,000 labeled vulnerable functions and 320,000 nonvulnerable functions, mined from 9,000+ real-world vulnerability fix commits. The MegaVul dataset can be found at https://github.com/Icyrockton/MegaVul.

The team fine-tuned a transformer-based code model on MegaVul’s function-level data, leveraging its large labeled pool of vulnerable and nonvulnerable samples. For more complex vulnerabilities, they trained a graph neural network (GNN) variant using MegaVul’s control-flow and data-flow graph representations, allowing the model to reason about code semantics beyond syntax. The resulting model was deployed as a pre-commit hook and integrated into CI/CD pipelines to provide near-real-time feedback to developers.
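
Before fine-tuning, the team would need to load MegaVul’s labeled functions and split them into training and evaluation sets. The following is a minimal Python sketch assuming a JSON list of records with `func` and `is_vul` fields; the field names are assumptions for illustration, so check the MegaVul repository for the actual schema.

```python
import json
import random

def load_megavul(path):
    """Load MegaVul-style records as (function_source, label) pairs.
    The `func` and `is_vul` field names are assumptions, not the
    dataset's confirmed schema."""
    with open(path) as f:
        records = json.load(f)
    return [(r["func"], int(r["is_vul"])) for r in records]

def train_test_split(samples, test_ratio=0.2, seed=42):
    """Shuffle and split function/label pairs before fine-tuning."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_ratio))
    return shuffled[:cut], shuffled[cut:]
```

The split pairs would then be tokenized and fed to whichever transformer the team chose for fine-tuning.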

As new vulnerabilities were discovered internally, the team contributed them back into their own version of the MegaVul training set, continuously improving detection accuracy.

The company achieved strong results:

  • 40 Percent Reduction in False Positives: The fine-tuned model produced far fewer spurious alerts than the organization’s previous static analysis tool.

  • 25 Percent Faster Code Reviews: Developers spent less time triaging irrelevant alerts and more time fixing real issues.

  • Early Catch Rate Increase: The team detected three times more high-severity vulnerabilities before production, significantly reducing remediation costs.

  • Scalable Security: The model was able to analyze millions of lines of code nightly, something previously impossible without dramatically increasing headcount.

Combining sequence-based and graph-based learning delivered deeper semantic understanding, catching vulnerabilities missed by pattern-matching tools. Continuous retraining keeps detection capabilities aligned with the organization’s evolving codebase and new threats.

Automated Incident Response

AI-driven security platforms increasingly offer automated or semi-autonomous incident response actions, often under the umbrella of SOAR (security orchestration, automation, and response). In practice, this means that when an alert fires, an AI agent can automatically take containment steps such as isolating a host from the network, disabling a user account, or deploying a firewall block—all in seconds, which is far faster than a human could react during an ongoing attack. These actions are typically guided by playbooks (predefined response workflows), but AI makes them smarter by tailoring the response to the situation. For instance, if an endpoint is confirmed via AI analysis to be infected with malware, an agent can immediately quarantine the machine and retrieve relevant logs for forensic analysis. Reinforcement learning is often used to optimize these response policies: An RL agent in a SIEM can learn which responses effectively mitigate threats with minimal disruption.
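
The alert-to-containment logic described above can be sketched as a simple playbook function. The alert fields, confidence threshold, and action names below are illustrative assumptions, not any vendor’s API.

```python
from dataclasses import dataclass, field

@dataclass
class Alert:
    host: str
    threat_type: str      # e.g., "malware", "phishing"
    confidence: float     # AI model confidence, 0.0-1.0

@dataclass
class Response:
    actions: list = field(default_factory=list)

def run_playbook(alert, quarantine_threshold=0.95):
    """Map an alert to containment actions, mirroring a simple
    predefined response workflow (playbook)."""
    resp = Response()
    if alert.threat_type == "malware" and alert.confidence >= quarantine_threshold:
        # High-confidence infection: isolate and collect forensics.
        resp.actions += [f"quarantine:{alert.host}", f"collect_logs:{alert.host}"]
    else:
        # Lower confidence: escalate for human review instead of acting.
        resp.actions.append(f"escalate:{alert.host}")
    return resp
```

A real SOAR platform would dispatch these actions to EDR and network tooling; the branch structure is what encodes the playbook.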

Over time, and through many incidents, it refines a policy like “if ransomware behavior is detected, kill the process and back up affected files,” with the highest reward being stopping the attack quickly. Such adaptive learning ensures that automated responses improve and adapt to new attack patterns. Many modern endpoint detection and response (EDR) and extended detection and response (XDR) solutions offer autonomous or one-click containment powered by AI analysis of threats.
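The reinforcement learning idea can be illustrated with a tiny bandit-style Q-learning loop. The states, actions, and rewards below are toy stand-ins for a real SIEM environment, chosen so that stopping the attack quickly earns the highest reward.

```python
import random
from collections import defaultdict

# Tabular Q-learning over a single incident state and a few response
# actions. Everything here is illustrative, not a real SIEM's model.
ACTIONS = ["kill_process", "isolate_host", "ignore"]

def simulate_reward(state, action):
    """Toy environment: killing a ransomware process stops it fastest."""
    if state == "ransomware" and action == "kill_process":
        return 10.0
    if action == "isolate_host":
        return 3.0   # works, but disrupts the user more
    return -5.0       # ignoring an active threat is penalized

def train_policy(episodes=2000, alpha=0.1, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = defaultdict(float)
    for _ in range(episodes):
        state = "ransomware"
        if rng.random() < epsilon:                        # explore
            action = rng.choice(ACTIONS)
        else:                                             # exploit
            action = max(ACTIONS, key=lambda a: q[(state, a)])
        reward = simulate_reward(state, action)
        q[(state, action)] += alpha * (reward - q[(state, action)])
    return q

q = train_policy()
best = max(ACTIONS, key=lambda a: q[("ransomware", a)])
```

After training, `best` converges on the action with the highest expected reward, which is how a policy like “if ransomware behavior is detected, kill the process” emerges from experience rather than a hand-written rule.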

Adaptive Security Mechanisms

Autonomous AI agents can also proactively manage the security posture of an environment. This means adjusting configurations, rules, or resource allocations on the fly in response to changes in risk. A practical example is an AI agent monitoring cloud infrastructure that might automatically tighten access controls or spin up additional decoy systems (honeypots) if it senses an increased threat level (say, an influx of scanning from a certain region). Another example: Network intrusion prevention systems (IPS) with AI might dynamically rewrite firewall or router rules when an attack is detected, then remove or relax them once the threat subsides, thus optimizing security without permanent manual rule changes. These agents essentially implement an adaptive defense—continuously balancing usability and security. The reinforcement learning approach is well suited here: The agent receives rewards for maintaining security (blocking attacks) and minimizing impact on normal operations; thus, it learns an optimal adaptive strategy. We see early forms of this in technologies like software-defined networks (SDNs) where AI can reroute or throttle traffic during attacks, or in cloud security posture management tools that autocorrect risky configurations. Over time, you can imagine a more fully autonomic security system where many lower-level decisions (patching a server, adding an IAM policy, revoking a certificate) are handled by AI agents based on policies and real-time threat intelligence.
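
The “tighten, then relax” behavior of an adaptive IPS can be sketched as a rule table with automatic expiry. The class and method names below are illustrative, not a real firewall’s API.

```python
import time

class AdaptiveFirewall:
    """Toy adaptive rule table: blocks are added on detection and
    relaxed automatically once the threat window passes."""

    def __init__(self):
        self.rules = {}  # source_ip -> expiry timestamp

    def block(self, ip, duration=300.0, now=None):
        """Block an IP for `duration` seconds from `now`."""
        now = time.time() if now is None else now
        self.rules[ip] = now + duration

    def expire(self, now=None):
        """Remove rules whose threat window has passed, so temporary
        blocks never become permanent manual rule changes."""
        now = time.time() if now is None else now
        self.rules = {ip: t for ip, t in self.rules.items() if t > now}

    def is_blocked(self, ip, now=None):
        now = time.time() if now is None else now
        return ip in self.rules and self.rules[ip] > now
```

An RL layer on top of this would learn the block durations and thresholds that best balance attack containment against disruption to normal traffic.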

Examples of Autonomous Cyber Defense

A milestone in autonomous cyber defense was DARPA’s Cyber Grand Challenge in 2016, where fully automated systems competed to find and patch vulnerabilities in real time without human input. The winning system, “Mayhem,” demonstrated that machines can autonomously scan software for bugs, develop exploits or patches, and apply them on the fly. This system proved the concept that AI agents can conduct both attack and defense tasks at machine speed, which is now spurring new research.

Today, companies like Darktrace (with its Antigena response system) show rudimentary autonomous defense in action; Antigena can independently decide to slow down or stop a likely compromised connection or device, buying time for human review.

Cloud providers also use autonomous agents: for example, Amazon GuardDuty findings can trigger Lambda functions (typically routed through EventBridge rules) that automatically shut down compromised instances once certain threat criteria are met—an AI-driven workflow under the hood. On the offensive side (for defensive purposes), tools like automated penetration testers have emerged; these are essentially bots that use AI planning to work through an attack kill chain and see how far they can get, revealing gaps for organizations to fix.
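
A handler in that GuardDuty-to-Lambda pattern might look like the sketch below. The event field paths follow the GuardDuty finding format as documented, but verify them against current AWS docs; the EC2 client is injected so the decision logic can be tested without AWS (in production it would be `boto3.client("ec2")`).

```python
def handler(event, ec2_client, severity_threshold=7.0):
    """Stop the EC2 instance named in a high-severity GuardDuty finding.
    Field paths are based on the GuardDuty finding format; confirm
    against current AWS documentation before deploying."""
    detail = event.get("detail", {})
    severity = detail.get("severity", 0.0)
    instance_id = (
        detail.get("resource", {})
              .get("instanceDetails", {})
              .get("instanceId")
    )
    if instance_id and severity >= severity_threshold:
        # Containment: stop the suspect instance immediately.
        ec2_client.stop_instances(InstanceIds=[instance_id])
        return {"action": "stopped", "instance": instance_id}
    # Below threshold or no instance in the finding: take no action.
    return {"action": "none"}
```

Injecting the client keeps the severity logic unit-testable with a stub, which matters when the action being tested is shutting down production machines.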

As these autonomous systems evolve, we expect them to handle more complex decisions. However, a careful balance is needed: Fully autonomous responses carry risk (false positives could disrupt operations), so many implementations allow an AI agent to take specific low-regret actions immediately (such as isolating a machine that’s 99 percent confirmed to be infected) while higher-impact actions are left for human approval or review. The trajectory, nonetheless, is toward increasing autonomy, where AI agents become trusted co-defenders operating at a speed and scale unreachable by manual efforts alone.

AI Agents Automating Attack Surface Management

Attack surface management (ASM) is the continuous process of discovering, inventorying, and monitoring an organization’s IT assets (both on-premises and in the cloud) to identify potential attack vectors before attackers do.

In practice, ASM involves mapping all external-facing assets (such as websites, servers, cloud services) that could be infiltrated, classifying them by risk, prioritizing the most critical exposures, and remediating vulnerabilities promptly. This process provides organizations with real-time visibility into their digital footprint, helping to limit security gaps that cybercriminals might exploit. ASM is critically important because modern enterprises constantly expand their digital presence—through cloud adoption, IoT devices, remote workforce tools, and so on—which broadens the potential attack surface.

If these new assets or changes are not tracked and secured, they can become hidden entry points (“shadow IT”) for attackers. The challenge, however, is that many enterprises struggle with manual ASM. Security teams often lack visibility into all assets (in fact, many organizations only know about a fraction of the IT assets they actually own). Manually maintaining an up-to-date inventory and risk profile is labor-intensive and error-prone. Point-in-time audits or periodic scans can quickly become outdated, as new systems come online or configurations drift. Additionally, cyber threats evolve rapidly—attacks occur around the clock and tactics change constantly—making purely manual, reactive surface management inadequate. These challenges create a pressing need for more automated, intelligent ASM solutions in enterprise security.

AI agents are fundamentally transforming attack surface management by automating the entire lifecycle of discovery, monitoring, and mitigation at a scale that human teams simply cannot match. In this context, an AI agent is not just a script or a static tool; it is an autonomous, intelligent entity capable of perceiving the environment, reasoning about risk, and taking action toward the goal of reducing exposure. AI-driven ASM continuously scans networks, cloud environments, and Internet-facing assets, eliminating the lag and blind spots associated with periodic manual audits. By pulling data from diverse sources (for example, DNS records, IP ranges, cloud APIs, Shodan results, network scanners), these agents build and maintain a real-time, comprehensive asset inventory that includes ephemeral cloud instances, shadow IT, and newly onboarded infrastructure the moment they come online.
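
Building that inventory from diverse feeds is mostly a merge-and-deduplicate problem. The sketch below assumes a simple record shape (a `host` key plus arbitrary attributes), which is an illustration rather than any tool’s actual data model.

```python
def merge_inventory(*sources):
    """Merge asset records from multiple feeds (DNS, cloud APIs,
    network scans) into one deduplicated inventory keyed by host.
    Each source is a (feed_name, [asset_dict, ...]) pair."""
    inventory = {}
    for feed_name, assets in sources:
        for asset in assets:
            key = asset["host"]
            entry = inventory.setdefault(key, {"host": key, "seen_in": set()})
            entry["seen_in"].add(feed_name)
            # Later feeds can enrich, but never overwrite, earlier fields.
            for k, v in asset.items():
                entry.setdefault(k, v)
    return inventory
```

Running this on every feed refresh keeps the inventory current, and the `seen_in` provenance helps flag shadow IT: an asset visible to Internet scans but absent from the cloud API is worth investigating.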

Beyond discovery, AI agents revolutionize vulnerability management by applying machine learning to detect weaknesses and misconfigurations in near real time. Rather than relying solely on static signatures or waiting for traditional CVE-based scanning cycles, they can infer patterns of risky configurations, identify anomalies, and flag emerging exposures before they become incidents. For example, an AI agent might recognize that a misconfigured S3 bucket or a recently deployed web server with outdated software is exposed to the Internet, automatically classify the risk, and trigger remediation workflows, all within minutes of the asset appearing. This proactive, adaptive approach allows security teams to focus on the highest-priority issues, shortens time to remediation, and drastically reduces the window of opportunity for attackers.
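
The risk classification step for a finding like the exposed S3 bucket can be sketched as a scoring function. The attribute names, weights, and tier cutoffs below are illustrative assumptions, not a standard scoring scheme.

```python
def classify_finding(finding):
    """Score a discovered exposure and map it to a risk tier.
    Rules and weights are illustrative, not a real product's logic."""
    score = 0
    if finding.get("internet_facing"):
        score += 3
    if finding.get("public_storage"):          # e.g., an open S3 bucket
        score += 4
    if finding.get("outdated_software"):
        score += 2
    if finding.get("known_exploit_active"):    # threat intel correlation
        score += 5
    if score >= 7:
        return "critical"
    if score >= 4:
        return "high"
    if score >= 2:
        return "medium"
    return "low"
```

A "critical" result is what would trigger the automated remediation workflow; in practice an ML model would replace or augment these hand-set weights.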

Ultimately, AI agents shift ASM from a reactive, periodic process to a living, continuously updated risk map, where threats are detected and mitigated dynamically. This evolution not only strengthens the organization’s security posture but also enables leaner teams to manage sprawling, cloud-native attack surfaces with far greater speed, accuracy, and confidence.

Real-Time Threat Monitoring and Response

AI agents don’t just catalog assets; they can also monitor them and respond to threats in real time. By analyzing network traffic, user behavior, and system logs, AI-driven ASM systems can identify suspicious activities or indicators of compromise as they happen.

If an anomaly or attack attempt is detected on an asset, the AI can automatically trigger defense measures much faster than a human could. For instance, AI agents could isolate an affected server, block a malicious IP address, or escalate an alert with recommended actions within seconds of detecting a threat. This real-time responsiveness is critical given that attackers often exploit vulnerabilities within hours or days of discovery. AI agents essentially act as 24/7 security sentinels—continuously watching the attack surface and initiating containment or mitigation workflows the moment something risky is found. This tactic reduces the window of exposure and frees up human analysts to focus on higher-level strategy rather than constant firefighting.

Overall, AI agents bring speed, scale, and intelligence to ASM. They tirelessly enumerate assets, evaluate risk, and take action, providing a force multiplier for security teams. As one industry perspective notes, AI-enhanced ASM offers benefits like automation (offloading repetitive tasks), scalability to handle large attack surfaces, and improved accuracy in identifying threats. In short, AI-driven ASM can maintain an up-to-date map of the enterprise’s attack surface and defend it in a more continuous and adaptive manner than manual methods ever could.

Sample Use Case: AI-Driven ASM with LangGraph

To illustrate the power of AI agents in attack surface management, consider a use case where an organization deploys a multi-agent ASM system built using LangGraph. LangGraph is a framework for creating structured AI workflows, allowing multiple AI agents to work together in a coordinated “graph” of tasks and decisions. It is built by the team behind the popular LangChain framework and integrates with the LangChain ecosystem.

In a LangGraph-powered ASM solution, each agent can be specialized (one for asset discovery, one for vulnerability analysis, and so on), and LangGraph orchestrates their interactions and decision-making flow. This structured approach ensures the system operates autonomously yet in a controlled, transparent manner—essentially encoding the security team’s logic and processes into an AI-driven workflow.

Scenario: An enterprise seeks an automated system that continuously maps its external-facing assets, checks them for vulnerabilities, and triggers remediation steps if high-risk vulnerabilities are identified. Using LangGraph, the security team designs an ASM workflow with multiple AI agents working in concert:

  • Asset Discovery Agent: This agent’s goal is to maintain a live inventory of all Internet-facing assets. Using LangGraph, it’s configured as a node that runs on a schedule (or is triggered by certain events) to gather asset data. It employs various tools/APIs—for example, querying cloud infrastructure for new hosts, scanning company domains for subdomains and certificates, and searching IP address ranges for responsive services. The AI agent can interpret the results to determine which systems likely belong to the organization (for example, by domain name or metadata) and then update a central asset database. Because it’s AI-driven, the agent can even learn patterns of the organization’s infrastructure to improve discovery (for instance, learning naming conventions or typical cloud deployments). This continuous mapping ensures previously unknown assets (like a forgotten test server or a newly acquired domain) are promptly “seen” and brought under management.

  • Vulnerability Assessment Agent: Once assets are identified, another LangGraph agent automatically evaluates their security posture. This agent might integrate with scanning tools (for ports, services, known vulnerabilities) and also use AI to analyze configuration data. For each asset (such as server, application, API endpoint), it checks for common weaknesses—open ports, outdated software versions, misconfigurations, and known CVEs. Beyond traditional scanners, the AI can correlate information (for example, combining scan results with external threat intelligence about active exploits). The LangGraph workflow ensures that for each newly discovered asset, this assessment agent is invoked. The agent then produces a risk report or alert if a serious vulnerability or misconfiguration is found. For example, if it finds an S3 bucket that is publicly accessible or a web server running a version with a critical flaw, it flags this as an issue requiring action.

  • Threat Monitoring Agent: In parallel, the system includes an agent focused on monitoring the attack surface for signs of active threats. This could involve watching traffic logs for suspicious patterns targeting the company’s assets or scanning dark web and open-source intelligence for mentions of the company’s domains or leaked credentials. If the asset discovery agent added a new IP or domain, the monitoring agent immediately starts tracking it for any inbound attacks or unusual activity. For instance, if a newly launched cloud server suddenly sees a burst of inbound connections on an unexpected port, the monitoring agent (leveraging an AI anomaly detection model) would catch that and classify it as potentially malicious scanning. This agent ensures that the moment an attacker starts probing or exploiting any part of the attack surface, it’s detected in real time. It works closely with the vulnerability agent’s output as well: if a high-risk vulnerability is found on an asset, the monitoring agent may intensify scrutiny on that asset for any exploitation attempts.

  • Decision and Orchestration Logic: LangGraph links these agents in a logical flow. Each agent is a node in the graph, and their outputs flow into decision nodes that determine the next steps. For example, after the vulnerability assessment agent runs, a decision node evaluates the severity of any findings. If no significant issues are found, the workflow might loop back into continuous discovery and monitoring (maintaining a watchful normal state). But if a critical vulnerability or an active threat is detected, LangGraph routes the workflow to the remediation phase. This graph-based orchestration is powerful; it can incorporate conditional branches and even parallel paths. It’s essentially implementing an expert decision tree crafted by the security team (and easy to adjust) but executed automatically by AI agents. LangGraph ensures the right agent gets triggered at the right time based on the evolving context (asset changes, new vulnerabilities, threat alerts), acting as the “brain” coordinating the multi-agent system.

  • Remediation Agent and Workflows: When a serious risk is identified, the ASM system doesn’t stop at detection; it also takes action. A remediation agent kicks in as the next node in the LangGraph. This agent is configured to initiate appropriate response workflows. In some cases, it might directly execute automated fixes. For instance, if the vulnerability agent found a critical patch missing on a server, the remediation agent could prompt a patch management tool to apply the update or quarantine that system from the network until it’s fixed. For cloud misconfigurations (like an open storage bucket), the agent could call the cloud’s API to adjust the access settings immediately. In other scenarios, the remediation agent might create a detailed ticket or alert for the DevOps and security teams, enriched with all the context the previous agents gathered (asset details, vulnerability evidence, recommended fix). This method ensures humans are brought into the loop for oversight on particularly sensitive actions. The LangGraph orchestration can also require human approval at certain junctions if desired (for example, maybe auto-patching is allowed for low-risk systems, but for a production server, the agent generates a change request for a team to review).
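
Setting LangGraph’s specific API aside, the discovery → assessment → decision → remediation flow the agents above implement can be sketched framework-free. Each function below is a stand-in for one specialized agent node, and the node names, state fields, and sample data are assumptions for illustration; in a real deployment each node would wrap a LangGraph agent.

```python
def discover(state):
    # Stand-in for the asset discovery agent's output.
    state["assets"] = ["web01.example.com", "s3://public-bucket"]
    return "assess"

def assess(state):
    # Stand-in for the vulnerability assessment agent: flag exposures.
    state["findings"] = [a for a in state["assets"] if "public" in a]
    return "decide"

def decide(state):
    # Decision node: route to remediation only on serious findings.
    return "remediate" if state["findings"] else "monitor"

def remediate(state):
    # Stand-in for the remediation agent: open enriched tickets.
    state["tickets"] = [f"fix:{f}" for f in state["findings"]]
    return "monitor"

def monitor(state):
    return None  # terminal for this single pass of the workflow

NODES = {"discover": discover, "assess": assess, "decide": decide,
         "remediate": remediate, "monitor": monitor}

def run_workflow(start="discover"):
    """Walk the graph until a node returns no successor."""
    state, node = {}, start
    while node is not None:
        node = NODES[node](state)
    return state
```

The point of the sketch is the shape: nodes do the work, edges (return values) encode the security team’s decision tree, and state accumulates as it flows through the graph, which is exactly what LangGraph formalizes with its `StateGraph` abstraction.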

Throughout this process, LangGraph provides the structured backbone that ties everything together. The graph of agents and decision nodes defines the workflow clearly: how data flows from discovery to detection to response. Each agent focuses on its task (thanks to LangGraph’s design, which encourages specialized, modular agents), and the framework handles passing the necessary data along (for example, the list of assets discovered flows to the vulnerability agent; the findings from that flow to the decision node; and so on). This modular, multi-agent design is beneficial because each component can be improved or swapped independently without breaking the whole system.

For instance, the team could upgrade the vulnerability agent’s AI model or plug in a new scanning tool, and as long as its outputs remain compatible, the LangGraph workflow continues smoothly. Similarly, adding a new type of check (say, a cloud compliance agent) is as simple as adding a new node and hooking it into the graph at the right point.

In summary, using LangGraph to orchestrate AI agents, the enterprise ends up with an autonomous ASM system that

  • Continuously maps its attack surface

  • Intelligently assesses and monitors for risks

  • Triggers timely mitigation actions—all with minimal human intervention

The security team gains a constantly up-to-date view of their exposure and can trust that immediate steps will be taken the moment something dangerous pops up. This example showcases how AI agents, coordinated via a framework like LangGraph, can achieve a level of speed and breadth in attack surface management that would be impossible to replicate with manual efforts alone.

Benefits and Challenges of AI-Driven ASM

Adopting AI-driven attack surface management offers several key advantages for enterprise security. AI agents dramatically reduce the manual workload on security teams by automating repetitive discovery and analysis tasks. Instead of engineers spending time continually scanning or combing through logs, AI can handle these tasks at machine speed. This automation frees up human analysts to focus on strategic security improvements while routine monitoring is handled autonomously.

Automation also minimizes human error in asset tracking and analysis. In terms of cost and effort, an AI-driven ASM system can operate 24/7 without fatigue—something human teams cannot match. An AI-based ASM provides continuous, real-time visibility into an organization’s assets and its security state. Unlike periodic audits, the system is always watching. This means emerging threats are caught as soon as they occur: If an attacker starts exploiting a new vulnerability or probing the network, the AI will notice the anomalous pattern immediately.

Furthermore, AI agents can respond in real time by generating instant alerts or even taking direct action to contain threats. Faster detection and response greatly reduce the window attackers have, thereby limiting potential damage. In short, AI-driven ASM turns security into a “live” operation rather than a series of after-the-fact reactions.

AI agents excel at handling large volumes of data and can scale as the enterprise grows. Whether an organization has 100 assets or 900,000, an AI-driven solution can continuously cover the entire attack surface without a linear increase in manpower. This scalability is vital as modern enterprises have complex, distributed infrastructures. AI’s ability to integrate data from cloud services, on-prem networks, and external sources means it can provide a unified, organization-wide view of risk. Moreover, AI’s speed and pattern recognition capabilities allow it to maintain accuracy even at scale.

While AI-driven ASM is powerful, enterprises should be mindful of several challenges and limitations when implementing it. AI isn’t infallible. Especially when first introduced, it may flag benign activities as malicious, generating false positives. For example, an unusual but legitimate IT configuration might be misclassified as a threat. These incorrect alerts still require human investigation and can overwhelm security teams if too frequent.

The goal is to calibrate the system so that alerts are reliable, striking the right balance between sensitivity and specificity in threat detection. AI can play a major role in this calibration, dynamically tuning detection thresholds to minimize false positives without missing true threats. Security teams should regularly test the AI’s performance and retrain models with fresh data to keep pace with emerging attack techniques. Just as importantly, because AI agents themselves expand the attack surface, they must be secured like any other critical system. This means applying robust identity, access control, and monitoring to the AI agent to ensure it cannot be hijacked, manipulated, or used as a pivot point by attackers. In other words, the AI must not only defend the enterprise but also be hardened as part of the enterprise’s own security posture.
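
Threshold calibration of the kind described above is often done by sweeping candidate thresholds over held-out validation scores and keeping the one that best balances precision and recall. The sketch below uses F1 as that balance; the metric choice is an assumption, and a security team might instead weight recall more heavily for high-severity threat classes.

```python
def calibrate_threshold(scores, labels, candidates=None):
    """Pick the alert threshold that maximizes F1 on validation data,
    trading off sensitivity (recall) against alert precision."""
    if candidates is None:
        candidates = sorted(set(scores))
    best_t, best_f1 = None, -1.0
    for t in candidates:
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y == 1)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1
```

Re-running this sweep whenever the model is retrained keeps the alert threshold aligned with the current score distribution instead of a value chosen once at deployment.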

AI agents often require broad access to monitor user activities, network traffic, and system configurations to be effective. This raises privacy concerns—both internally (monitoring employee or customer data) and with respect to regulations. Enterprises must ensure that the data fed into AI-driven ASM (which could include sensitive information) is handled in accordance with privacy laws and company policies.

For instance, if an AI agent analyzes user login patterns to detect anomalies, the organization must consider how that data is stored, who can access it, and how it is used. Additionally, when using third-party or cloud-based AI services, concerns arise about sharing sensitive asset data with these providers. Strong data governance, anonymization where possible, and transparency are needed to address these issues. Companies should also be prepared to explain and document how their AI is making decisions, especially in regulated industries. This is part of the broader challenge of AI explainability. Equally important, AI infrastructure itself is complex and resource-intensive, often spanning both cloud and on-premises environments. Ensuring that these environments are properly secured (from data pipelines to model hosting to inference endpoints) requires coordinated investment in cloud security controls, on-prem security monitoring, and continuous configuration management. Without this, the AI system can become a high-value target for attackers.

Despite these challenges, none are insurmountable. Many can be mitigated with proper planning: tuning algorithms to reduce false positives, establishing procedures to regularly update models, ensuring tight access controls and encryption on sensitive data, and integrating AI in a phased, well-tested manner. It’s also worth noting that attackers are increasingly leveraging AI for offense, so defenders adopting AI is a necessary evolution. The key is doing so thoughtfully.
