- Chapter Objectives
- Technical Aspects of AI in Threat Intelligence
- Case Study: Using CNNs for Malware Classification
- Case Study: Detecting and Analyzing Phishing Campaigns
- Leveraging AI to Automate STIX Document Creation for Threat Intelligence
- Case Study: Automating Threat Intelligence for a Financial Institution
- Autonomous AI Agents for Cyber Defense
- Case Study: Using MegaVul to Build an AI-Powered Vulnerability Detector
- AI Coding Agents
- Summary
- Multiple-Choice Questions
- Answers to Multiple-Choice Questions
- Exercises
Technical Aspects of AI in Threat Intelligence
AI-driven threat intelligence leverages a range of AI models, neural network architectures, and advanced techniques that enable systems to learn from data and continually improve over time. The following sections describe the main technical components and some historical examples.
Traditional Predictive AI Models: Supervised and Unsupervised Learning
Traditional machine learning models remain relevant. It is a common misconception that generative AI models, such as the o-series models from OpenAI, Claude, and open-weight models like DeepSeek, are the only option for cybersecurity. Supervised models trained on labeled datasets can learn to classify behavior as malicious or benign. Decision trees, support vector machines, and neural networks can identify malware or phishing by learning from known examples. For instance, a classifier trained on features of malicious versus clean files or emails can accurately flag malware and phishing attempts based on past labeled data, and it can then generate threat intelligence automatically from observed behavior.
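To make this concrete, the following sketch trains a decision tree on a handful of labeled file samples. The feature set (entropy, import count, signature flag) and all values are hypothetical, chosen only to illustrate the supervised workflow of learning from labeled examples and then classifying unseen data.

```python
# A minimal sketch of supervised threat classification with scikit-learn.
# Features and training data are hypothetical, for illustration only.
from sklearn.tree import DecisionTreeClassifier

# Each row: [file entropy, number of imported functions, has valid signature]
X_train = [
    [7.8, 3, 0],    # high entropy, few imports, unsigned  -> malware
    [7.5, 5, 0],
    [4.2, 120, 1],  # normal entropy, many imports, signed -> benign
    [3.9, 85, 1],
]
y_train = ["malware", "malware", "benign", "benign"]

clf = DecisionTreeClassifier(random_state=0)
clf.fit(X_train, y_train)

# Classify a new, unseen sample based on the learned decision boundaries.
prediction = clf.predict([[7.6, 4, 0]])[0]
print(prediction)
```

In practice the features would come from static or dynamic analysis of real samples, and the training set would contain thousands of labeled examples rather than four.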
Unsupervised models can detect anomalies without requiring labeled attack data, which makes them especially valuable for uncovering new or stealthy threats. Clustering algorithms (for example, K-means and DBSCAN) group similar behavior and flag outliers in network traffic that may indicate a cyberattack. This anomaly detection capability helps identify unknown threats, such as novel intrusion patterns or insider misuse, that deviate from normal baselines.
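As a sketch of this idea, the example below applies DBSCAN from scikit-learn to hypothetical per-host traffic features; points that fall in no dense cluster receive the label -1 and are treated as anomalies. The feature values, eps, and min_samples are illustrative assumptions, not tuned parameters.

```python
# Unsupervised anomaly detection with DBSCAN: no labeled attack data needed.
import numpy as np
from sklearn.cluster import DBSCAN

# Hypothetical per-host features: [avg packets/sec, distinct ports contacted]
traffic = np.array([
    [10.0, 3], [11.2, 4], [9.8, 3],   # normal baseline behavior
    [10.5, 5], [10.9, 4], [9.5, 2],
    [95.0, 250],                      # outlier: possible port scan
])

# DBSCAN assigns dense groups a cluster id; points fitting no cluster get -1.
labels = DBSCAN(eps=3.0, min_samples=3).fit_predict(traffic)
outliers = traffic[labels == -1]
print(outliers)
```

Because the normal hosts form one dense cluster, only the scanning host is flagged; a real deployment would normalize the features and tune eps against a known-clean baseline.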
Deep Learning and Neural Networks
Deep neural networks (such as multilayer perceptrons, CNNs, and RNNs) automatically learn complex patterns from large datasets. In cybersecurity, deep learning has demonstrated success in areas such as malware analysis and fraud detection. For example, convolutional neural networks (CNNs) and recurrent neural networks (RNNs) can capture intricate sequences or structures in data: CNNs have been used to analyze binary executables or network traffic, while RNNs handle sequences of system calls or user actions. Deep learning can recognize subtle, high-dimensional patterns and detect zero-day malware and complex fraud schemes that signature-based methods miss.
Figure 4-1 illustrates a CNN architecture used for analyzing binary executables (such as .exe, .dll, and .bin files) for malware classification.
In Figure 4-1, the model inspects binary executable files that have been transformed into an image-like representation (for example, by mapping each byte to a grayscale pixel value). Each file type (.exe, .dll) is processed as structured data, and the convolutional layers then perform feature extraction. The first convolutional layer applies convolution operations with the Rectified Linear Unit (ReLU) activation function, which introduces nonlinearity.
In this example, the model extracts patterns from binary files, similar to how CNNs extract edges from images.
The pooling layer reduces dimensionality by selecting key features, which improves computational efficiency and reduces noise.
The second convolutional layer further refines features, detecting more complex patterns indicative of malware or benign behaviors. The second pooling layer further compresses data while preserving key features.
The flatten layer converts the extracted feature maps from the convolutional layers into a one-dimensional vector for input into the fully connected layer. The fully connected layer connects all neurons, capturing relationships among features to make the final prediction. The output layer uses a softmax activation function, which produces a probability distribution over classifications such as
- Benign software
- Malware (for example, Trojans, ransomware, spyware)
- Potentially unwanted programs (PUPs)
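The pipeline described above can be sketched end to end in plain NumPy. The weights below are random and untrained, so the predicted probabilities are meaningless; the point is only to show the data flow: bytes reshaped into an image, two convolution/ReLU/pooling stages, a flatten step, and a fully connected layer with softmax over three classes.

```python
import numpy as np

rng = np.random.default_rng(0)

# 1. Represent a binary file as an image: each byte becomes a grayscale pixel.
raw_bytes = rng.integers(0, 256, size=16 * 16, dtype=np.uint8)  # stand-in for an .exe
image = raw_bytes.reshape(16, 16).astype(np.float32) / 255.0    # normalize to [0, 1]

def conv2d(x, kernel):
    """Valid 2-D convolution of a single-channel input with one kernel."""
    kh, kw = kernel.shape
    h, w = x.shape
    out = np.zeros((h - kh + 1, w - kw + 1), dtype=np.float32)
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    return np.maximum(x, 0.0)   # nonlinearity applied after each convolution

def max_pool(x, size=2):
    h, w = x.shape
    h, w = h - h % size, w - w % size   # trim so dimensions divide evenly
    return x[:h, :w].reshape(h // size, size, w // size, size).max(axis=(1, 3))

def softmax(z):
    e = np.exp(z - z.max())     # subtract max for numerical stability
    return e / e.sum()

# 2. Two convolution + ReLU + pooling stages (random, untrained kernels).
k1, k2 = rng.standard_normal((3, 3)), rng.standard_normal((3, 3))
features = max_pool(relu(conv2d(image, k1)))     # 16x16 -> 14x14 -> 7x7
features = max_pool(relu(conv2d(features, k2)))  # 7x7  -> 5x5  -> 2x2

# 3. Flatten, fully connected layer, softmax over the three classes.
flat = features.reshape(-1)
W, b = rng.standard_normal((3, flat.size)), np.zeros(3)
classes = ["benign", "malware", "PUP"]
probs = softmax(W @ flat + b)
print(dict(zip(classes, np.round(probs, 3))))
```

A production model would be built in a deep learning framework, trained on a large labeled corpus of byte images, and would use many kernels per layer rather than one; this sketch captures only the layer-by-layer structure of Figure 4-1.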

