- Chapter Objectives
- Technical Aspects of AI in Threat Intelligence
- Case Study: Using CNNs for Malware Classification
- Case Study: Detecting and Analyzing Phishing Campaigns
- Leveraging AI to Automate STIX Document Creation for Threat Intelligence
- Case Study: Automating Threat Intelligence for a Financial Institution
- Autonomous AI Agents for Cyber Defense
- Case Study: Using MegaVul to Build an AI-Powered Vulnerability Detector
- AI Coding Agents
- Summary
- Multiple-Choice Questions
- Answers to Multiple-Choice Questions
- Exercises
Technical Aspects of AI in Threat Intelligence
AI-driven threat intelligence leverages a range of AI models, neural network architectures, and advanced techniques that enable systems to learn from data and continually improve over time. The following sections describe the main technical components and some historical examples.
Traditional Predictive AI Models: Supervised and Unsupervised Learning
Traditional machine learning models remain relevant. It is a common misconception that generative AI models, such as the o-series models from OpenAI, Claude, and open-weight models like DeepSeek, are the only option for cybersecurity. Supervised models trained on labeled datasets can learn to classify behavior as malicious or benign. Decision trees, support vector machines, and neural networks can identify malware or phishing by learning from known examples. For instance, a classifier trained on features of malicious versus clean files or emails can accurately flag malware and phishing attempts based on past labeled data, and it can then generate threat intelligence automatically from observed behavior.
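To make this concrete, the following sketch trains a decision tree on a handful of labeled file samples. The feature set (entropy, import count, signature flag) and all values are hypothetical, chosen only to illustrate the supervised workflow of learning from labeled examples and then classifying unseen data.

```python
# A minimal sketch of supervised threat classification with scikit-learn.
# Features and training data are hypothetical, for illustration only.
from sklearn.tree import DecisionTreeClassifier

# Each row: [file entropy, number of imported functions, has valid signature]
X_train = [
    [7.8, 3, 0],    # high entropy, few imports, unsigned  -> malware
    [7.5, 5, 0],
    [4.2, 120, 1],  # normal entropy, many imports, signed -> benign
    [3.9, 85, 1],
]
y_train = ["malware", "malware", "benign", "benign"]

clf = DecisionTreeClassifier(random_state=0)
clf.fit(X_train, y_train)

# Classify a new, unseen sample based on the learned decision boundaries.
prediction = clf.predict([[7.6, 4, 0]])[0]
print(prediction)
```

In practice the features would come from static or dynamic analysis of real samples, and the training set would contain thousands of labeled examples rather than four.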
Unsupervised models can detect anomalies without requiring labeled attack data, which makes them especially valuable for uncovering new or stealthy threats. Clustering algorithms (for example, K-means and DBSCAN) group similar behavior and flag outliers in network traffic that may indicate a cyberattack. This anomaly detection capability helps identify unknown threats, such as novel intrusion patterns or insider misuse, that deviate from normal baselines.
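As a sketch of this idea, the example below applies DBSCAN from scikit-learn to hypothetical per-host traffic features; points that fall in no dense cluster receive the label -1 and are treated as anomalies. The feature values, eps, and min_samples are illustrative assumptions, not tuned parameters.

```python
# Unsupervised anomaly detection with DBSCAN: no labeled attack data needed.
import numpy as np
from sklearn.cluster import DBSCAN

# Hypothetical per-host features: [avg packets/sec, distinct ports contacted]
traffic = np.array([
    [10.0, 3], [11.2, 4], [9.8, 3],   # normal baseline behavior
    [10.5, 5], [10.9, 4], [9.5, 2],
    [95.0, 250],                      # outlier: possible port scan
])

# DBSCAN assigns dense groups a cluster id; points fitting no cluster get -1.
labels = DBSCAN(eps=3.0, min_samples=3).fit_predict(traffic)
outliers = traffic[labels == -1]
print(outliers)
```

Because the normal hosts form one dense cluster, only the scanning host is flagged; a real deployment would normalize the features and tune eps against a known-clean baseline.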
Deep Learning and Neural Networks
Deep neural networks (such as multilayer perceptrons, CNNs, and RNNs) automatically learn complex patterns from large datasets. In cybersecurity, deep learning has demonstrated success in areas such as malware analysis and fraud detection. For example, convolutional neural networks (CNNs) and recurrent neural networks (RNNs) can capture intricate sequences or structures in data: CNNs have been used to analyze binary executables or network traffic, while RNNs handle sequences of system calls or user actions. Deep learning can recognize subtle, high-dimensional patterns and detect zero-day malware and complex fraud schemes that signature-based methods miss.
Figure 4-1 illustrates a CNN architecture used for analyzing binary executables (such as .exe, .dll, and .bin files) for malware classification.
In Figure 4-1, the model inspects binary executable files that have been transformed into an image-like representation (for example, by mapping each byte to a grayscale pixel value). Each file type (.exe, .dll) is processed as structured data, and the convolutional layers then perform feature extraction. The first convolutional layer applies convolution operations with the Rectified Linear Unit (ReLU) activation function, which introduces nonlinearity.
In this example, the model extracts patterns from binary files, similar to how CNNs extract edges from images.
The pooling layer reduces dimensionality by selecting key features, which improves computational efficiency and reduces noise.
The second convolutional layer further refines features, detecting more complex patterns indicative of malware or benign behaviors. The second pooling layer further compresses data while preserving key features.
The flatten layer converts the extracted feature maps from the convolutional layers into a one-dimensional vector for input into the fully connected layer. The fully connected layer connects all neurons, capturing relationships among features to make the final prediction. The output layer uses a softmax activation function, which produces a probability distribution over classifications such as
- Benign software
- Malware (for example, Trojans, ransomware, spyware)
- Potentially unwanted programs (PUPs)
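The pipeline described above can be sketched end to end in plain NumPy. The weights below are random and untrained, so the predicted probabilities are meaningless; the point is only to show the data flow: bytes reshaped into an image, two convolution/ReLU/pooling stages, a flatten step, and a fully connected layer with softmax over three classes.

```python
import numpy as np

rng = np.random.default_rng(0)

# 1. Represent a binary file as an image: each byte becomes a grayscale pixel.
raw_bytes = rng.integers(0, 256, size=16 * 16, dtype=np.uint8)  # stand-in for an .exe
image = raw_bytes.reshape(16, 16).astype(np.float32) / 255.0    # normalize to [0, 1]

def conv2d(x, kernel):
    """Valid 2-D convolution of a single-channel input with one kernel."""
    kh, kw = kernel.shape
    h, w = x.shape
    out = np.zeros((h - kh + 1, w - kw + 1), dtype=np.float32)
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    return np.maximum(x, 0.0)   # nonlinearity applied after each convolution

def max_pool(x, size=2):
    h, w = x.shape
    h, w = h - h % size, w - w % size   # trim so dimensions divide evenly
    return x[:h, :w].reshape(h // size, size, w // size, size).max(axis=(1, 3))

def softmax(z):
    e = np.exp(z - z.max())     # subtract max for numerical stability
    return e / e.sum()

# 2. Two convolution + ReLU + pooling stages (random, untrained kernels).
k1, k2 = rng.standard_normal((3, 3)), rng.standard_normal((3, 3))
features = max_pool(relu(conv2d(image, k1)))     # 16x16 -> 14x14 -> 7x7
features = max_pool(relu(conv2d(features, k2)))  # 7x7  -> 5x5  -> 2x2

# 3. Flatten, fully connected layer, softmax over the three classes.
flat = features.reshape(-1)
W, b = rng.standard_normal((3, flat.size)), np.zeros(3)
classes = ["benign", "malware", "PUP"]
probs = softmax(W @ flat + b)
print(dict(zip(classes, np.round(probs, 3))))
```

A production model would be built in a deep learning framework, trained on a large labeled corpus of byte images, and would use many kernels per layer rather than one; this sketch captures only the layer-by-layer structure of Figure 4-1.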

