- Chapter Objectives
- Technical Aspects of AI in Threat Intelligence
- Case Study: Using CNNs for Malware Classification
- Case Study: Detecting and Analyzing Phishing Campaigns
- Leveraging AI to Automate STIX Document Creation for Threat Intelligence
- Case Study: Automating Threat Intelligence for a Financial Institution
- Autonomous AI Agents for Cyber Defense
- Case Study: Using MegaVul to Build an AI-Powered Vulnerability Detector
- AI Coding Agents
- Summary
- Multiple-Choice Questions
- Answers to Multiple-Choice Questions
- Exercises
Case Study: Using CNNs for Malware Classification
Traditional signature-based and heuristic malware detection methods struggle to detect zero-day malware and obfuscated malicious code. In this case study, we examine a CNN-based malware detection system that automatically learns patterns from binary executable files and classifies them as either benign or malicious.
Training a CNN starts with a dataset of labeled executable files (for example, from VirusTotal, malware repositories, or enterprise threat intelligence feeds). Each binary file is transformed into a structured matrix, much like a grayscale image. Convolutional and pooling layers then learn distinctive malware features from these matrices.
Fully connected layers followed by a softmax activation classify each file. You train the model on the labeled malware and benign datasets, apply cross-validation, and evaluate performance with metrics such as accuracy, precision, recall, and F1-score.
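The binary-to-matrix transformation described above can be sketched as follows. This is a minimal illustration using NumPy; the matrix width of 256 and the truncation length are arbitrary assumptions for demonstration, not values prescribed by the case study.

```python
import numpy as np

def bytes_to_matrix(data: bytes, width: int = 256, max_len: int = 65536) -> np.ndarray:
    """Convert raw executable bytes into a fixed-size 2D matrix.

    Each byte becomes one grayscale pixel (0-255); the byte stream is
    truncated or zero-padded to max_len and reshaped to a 2D array that
    a CNN can consume as an image-like input.
    """
    buf = np.frombuffer(data[:max_len], dtype=np.uint8)
    padded = np.zeros(max_len, dtype=np.uint8)
    padded[:len(buf)] = buf
    return padded.reshape(max_len // width, width)

# Example with a small synthetic "binary" (MZ header bytes repeated)
matrix = bytes_to_matrix(b"\x4d\x5a\x90\x00" * 1000)
print(matrix.shape)  # (256, 256)
```

Fixing the matrix dimensions in this way ensures every sample, regardless of original file size, presents the same input shape to the convolutional layers.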
Although accuracy, precision, recall, and F1-score are widely used and useful metrics, they may not be sufficient for evaluating your AI model’s performance, especially if you’re planning to deploy it in production. Additional model evaluation metrics include mean squared error (MSE), which is useful for regression problems. MSE measures the difference between predicted and actual values.
Mean absolute error (MAE) is similar to MSE but uses the absolute difference instead of squaring it. Root mean squared percentage error (RMSPE) is a percentage-based variant of root mean squared error, suitable for problems whose target values span a wide range. Area under the receiver operating characteristic curve (AUC-ROC) measures the model’s ability to distinguish between positive and negative classes. Area under the precision-recall curve (AUC-PR) can be used to evaluate the model’s performance in terms of precision and recall at different thresholds.
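The core classification metrics can be computed directly from a confusion matrix, as the plain-Python sketch below shows. The sample label arrays are made up for illustration.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for binary labels (1 = malicious)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # false negatives
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return accuracy, precision, recall, f1

# Hypothetical evaluation labels: 1 = malicious, 0 = benign
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
acc, prec, rec, f1 = classification_metrics(y_true, y_pred)
print(acc, prec, rec, f1)  # 0.75 0.75 0.75 0.75
```

In practice, a library such as scikit-learn provides these metrics (along with AUC-ROC and AUC-PR), but computing them by hand clarifies what each one measures.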
Additional model deployment metrics can measure latency (the time taken by the model to make predictions on new inputs), memory usage, and robustness to adversarial examples (assessing the model’s ability to withstand intentionally crafted input examples designed to mislead it).
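Prediction latency, for instance, can be measured with a simple timing harness. The sketch below uses only the standard library; the `predict` callable is a trivial stand-in for a real model, an assumption for demonstration only.

```python
import time
import statistics

def measure_latency(predict, inputs, runs: int = 100):
    """Return mean and 95th-percentile prediction latency in milliseconds."""
    timings = []
    for _ in range(runs):
        for x in inputs:
            start = time.perf_counter()
            predict(x)  # stand-in for model inference
            timings.append((time.perf_counter() - start) * 1000.0)
    timings.sort()
    mean_ms = statistics.mean(timings)
    p95_ms = timings[int(len(timings) * 0.95) - 1]
    return mean_ms, p95_ms

# Stand-in "model": a trivial threshold classifier (hypothetical)
mean_ms, p95_ms = measure_latency(lambda x: x > 0.5, [0.1, 0.9], runs=50)
print(f"mean={mean_ms:.4f} ms, p95={p95_ms:.4f} ms")
```

Reporting a tail percentile alongside the mean matters in production, because a model that is fast on average can still stall individual detections.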
When evaluating your AI model before deployment, consider using a combination of these metrics to gain a well-rounded understanding of its performance and limitations. This approach will help you identify areas for improvement and fine-tune your model for optimal results in production. In this book, we will not address the additional technical details of model evaluation. However, for further details, visit the GitHub repository at https://hackerrepo.org.
Once you have trained and evaluated the model, you can deploy it as a cloud-based API or embed it within endpoint detection and response (EDR) solutions.
Natural Language Processing (NLP)
Natural language processing techniques allow threat intelligence systems to interpret and analyze unstructured text data, which is abundant in cybersecurity (for example, logs, security reports, email content, dark web forums). By text mining threat reports and parsing phishing emails, NLP models can extract indicators of compromise; identify attacker tactics, techniques, and procedures (TTPs); and even infer attacker intent. For example, an NLP-driven system might scan social media or underground forums for threat chatter or analyze an email’s language and entities to determine whether it’s phishing.
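As a simplified illustration of extracting indicators of compromise from unstructured text, the regex-based sketch below pulls IP addresses, domains, and file hashes from a report snippet. Production NLP pipelines add tokenization, entity recognition, and context modeling on top of pattern matching; the patterns and sample text here are assumptions for demonstration.

```python
import re

# Minimal IOC patterns; real extractors handle defanged forms like "203[.]0[.]113[.]45"
IOC_PATTERNS = {
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "sha256": re.compile(r"\b[a-fA-F0-9]{64}\b"),
    "domain": re.compile(r"\b[a-z0-9-]+\.(?:com|net|org|io)\b"),
}

def extract_iocs(text: str) -> dict:
    """Return a dict mapping IOC type to the unique values found in the text."""
    return {name: sorted(set(pat.findall(text))) for name, pat in IOC_PATTERNS.items()}

# Hypothetical threat-report snippet
report = (
    "The phishing kit beaconed to 203.0.113.45 and hosted payloads on "
    "evil-updates.net with SHA-256 "
    "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855."
)
iocs = extract_iocs(report)
print(iocs["ipv4"])    # ['203.0.113.45']
print(iocs["domain"])  # ['evil-updates.net']
```

Extracted indicators like these typically feed downstream enrichment and correlation, where NLP models add context such as the associated TTPs or campaign.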
