AI Model Training, Fine-Tuning, and RAG for Bug Bounties

In Chapter 8, you learned about AI model training, fine-tuning, and RAG. As a security researcher participating in bug bounties, you can fine-tune and use RAG to enhance your ability to identify and report vulnerabilities.

Deploying AI Models

You can easily deploy AI models in cloud platforms such as Azure AI Studio, AWS Bedrock, and Google Cloud Vertex AI. Figure 11-9 shows a model deployment screen within Azure AI Studio, specifically for deploying the Meta-Llama model. The project resource is named omar-demo-project-23, which is the current environment in which the model will be deployed. The custom name for the endpoint is omar-ai-abc123. After deployment, the system will automatically generate an endpoint URL. The deployment is named meta-llama-3-1-8b-1 because it is based on the Meta-Llama 3.1 model using the 8-billion parameter version. Inferencing data collection is disabled, meaning the deployment will not collect data during inference runs.
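Once a serverless deployment like the one in Figure 11-9 is live, you call it over HTTPS with the generated endpoint URL and an API key. The following sketch assembles such a request; the endpoint URL, key, and a chat-completions route are placeholders for illustration — substitute the values Azure AI Studio generates for your own deployment.

```python
import json

# Hypothetical values for illustration; use the endpoint URL and key
# shown in your own deployment's details page.
ENDPOINT = "https://omar-ai-abc123.eastus2.inference.ai.azure.com"
API_KEY = "<your-api-key>"

def build_chat_request(endpoint: str, api_key: str, prompt: str):
    """Assemble the URL, headers, and JSON body for a chat-completions
    call against a serverless Meta-Llama deployment."""
    url = f"{endpoint}/v1/chat/completions"
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
    body = json.dumps({
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
        "temperature": 0.2,  # low temperature for more deterministic triage output
    }).encode("utf-8")
    return url, headers, body

# To actually send the request you could use the standard library:
#   import urllib.request
#   req = urllib.request.Request(url, data=body, headers=headers)
#   with urllib.request.urlopen(req) as resp:
#       print(json.load(resp))
```

Keeping the request builder separate from the network call makes it easy to log or replay prompts while hunting, without touching live infrastructure.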

Figure 11-9

Azure AI Studio Model Deployment Screen

Fine-Tuning AI Models

Fine-tuning involves adjusting a pretrained model using a specific dataset to enhance its performance for particular tasks. This process is invaluable for bug hunters because it allows you to tailor AI models to recognize specific types of vulnerabilities or security issues unique to certain software or systems. By fine-tuning models with data from known vulnerabilities, you can improve the accuracy and relevance of your automated tools and potentially AI agents.
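A fine-tuning run starts with a dataset. The sketch below shows one plausible way to format vulnerability-labeling examples as instruction-tuning records in JSON Lines, a format most fine-tuning pipelines accept; the records themselves are illustrative, not from the book.

```python
import json

# Hypothetical training records: each pairs a code snippet with the
# vulnerability class a tuned model should learn to flag.
examples = [
    {
        "instruction": "Classify the vulnerability in this code.",
        "input": 'query = "SELECT * FROM users WHERE id = " + user_id',
        "output": "SQL injection (CWE-89): user input is concatenated into the query.",
    },
    {
        "instruction": "Classify the vulnerability in this code.",
        "input": "document.innerHTML = location.hash.slice(1)",
        "output": "DOM-based XSS (CWE-79): an untrusted URL fragment is written to innerHTML.",
    },
]

def to_jsonl(records):
    """Serialize records as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(r) for r in records)

jsonl = to_jsonl(examples)
```

Collecting examples from disclosed reports and your own findings in this shape lets the same file feed different training frameworks with minimal rework.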

Figure 11-10 shows the interface of Google Cloud’s Vertex AI and its Colab Enterprise environment. This notebook is used for fine-tuning the Meta-Llama 3.1 8B-Instruct model.

Figure 11-10

Fine-Tuning AI Models in Google Vertex AI

In Figure 11-10, the model is being run on an NVIDIA A100 GPU, as specified in the accelerator_type parameter.

The following are additional details about the fine-tuning parameters shown:

  • Batch Size: The per_device_train_batch_size is set to 10, meaning each GPU processes 10 training examples per batch.

  • Gradient Accumulation Steps: This parameter is set to 8, meaning gradients will accumulate over 8 batches before updating the model weights, helping manage memory constraints.

  • Max Sequence Length: The model will process sequences of up to 4096 tokens at a time.

  • Max Steps: This parameter is set to −1, which disables the step limit; the training duration is instead governed by the number of epochs.

  • Epochs: Training will run for 1.0 epoch, meaning the entire dataset will be passed through the model once.

  • Precision: Fine-tuning is performed using 4-bit precision (fine-tuning_precision_mode), which reduces memory usage while preserving most of the model’s performance.

  • Learning Rate: The learning rate is set to 5e-5 (0.00005), which controls how much the model adjusts weights with each training step.

  • Learning Rate Scheduler: The learning rate will follow a “cosine” decay schedule, gradually reducing as training progresses.
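The hyperparameters above can be collected in one configuration, as sketched below. The field names mirror common Hugging Face TrainingArguments conventions, which the Vertex AI notebook may spell differently; the arithmetic shows how batch size and gradient accumulation combine.

```python
# Hyperparameters from Figure 11-10 in one place. Names follow common
# Hugging Face TrainingArguments fields; the notebook's exact variable
# names may differ.
train_config = {
    "per_device_train_batch_size": 10,
    "gradient_accumulation_steps": 8,
    "max_seq_length": 4096,
    "max_steps": -1,          # -1 disables the step cap; epochs govern length
    "num_train_epochs": 1.0,
    "learning_rate": 5e-5,
    "lr_scheduler_type": "cosine",
}

# Effective batch size per optimizer update = per-device batch size
# x accumulation steps (x number of GPUs; a single A100 here).
effective_batch = (train_config["per_device_train_batch_size"]
                   * train_config["gradient_accumulation_steps"])
# effective_batch == 80
```

Gradient accumulation is what makes this workable on one GPU: each weight update sees 80 examples, but only 10 are resident in memory at a time.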

The following are the LoRA parameters used in the example shown in Figure 11-10:

  • LoRA Rank: Set to 16, this refers to the rank in the low-rank adaptation technique, commonly used to fine-tune large models efficiently by focusing only on specific layers.

  • LoRA Alpha: Set to 32, this controls the scaling factor for LoRA, which influences the weight adjustment during fine-tuning.

  • LoRA Dropout: Set to 0.05, meaning 5 percent of the neurons are randomly dropped during training to prevent overfitting.
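These three LoRA settings can be expressed as a small configuration, sketched below using the field names of the common PEFT LoraConfig convention (the notebook may spell them differently). The scaling arithmetic follows the standard LoRA formulation, where the learned update is multiplied by alpha divided by rank.

```python
# LoRA settings from Figure 11-10. Field names follow the common PEFT
# LoraConfig convention; the notebook may use different spellings.
lora_config = {
    "r": 16,            # rank of the low-rank update matrices
    "lora_alpha": 32,   # scaling numerator
    "lora_dropout": 0.05,
}

# LoRA scales the learned low-rank update by alpha / r, so a rank-16
# adapter with alpha 32 applies its update with a factor of 2.0.
scaling = lora_config["lora_alpha"] / lora_config["r"]
```

Doubling alpha (or halving the rank) strengthens the adapter's influence on the base weights without retraining, which is why the two values are usually tuned together.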

Many other fine-tuning settings exist; the following additional options are displayed in Figure 11-10:

  • Gradient Checkpointing: Enabled (True), which saves memory by recomputing intermediate activations during the backward pass instead of storing them all.

  • Attention Implementation: Uses flash_attention_2, a more memory-efficient attention mechanism that allows for faster processing of large sequences.

Using RAG and AI Agents for Bug Bounty Hunting

RAG combines the capabilities of language models with data retrieval from vector databases and other sources, providing contextually relevant information that grounds model inference and reduces the likelihood of hallucinations or confabulations. For bug bounty hunters, RAG can be used to access up-to-date information about potential vulnerabilities or exploits as they emerge.
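The retrieval step at the heart of RAG can be sketched in a few lines. The toy example below uses bag-of-words cosine similarity over an in-memory corpus as a stand-in for learned embeddings in a vector database; the documents and query are illustrative only.

```python
import math
from collections import Counter

# A toy corpus standing in for a vulnerability knowledge base; in a real
# deployment these would be embeddings stored in a vector database.
documents = [
    "CVE-2021-44228: Log4j JNDI lookup allows remote code execution.",
    "CSRF tokens prevent cross-site request forgery in web forms.",
    "Prepared statements mitigate SQL injection in database queries.",
]

def vectorize(text: str) -> Counter:
    """Bag-of-words term counts; a crude stand-in for an embedding model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, k: int = 1) -> list:
    """Return the k documents most similar to the query."""
    qv = vectorize(query)
    ranked = sorted(documents, key=lambda d: cosine(qv, vectorize(d)), reverse=True)
    return ranked[:k]

# The retrieved context is prepended to the prompt before inference,
# grounding the model's answer in the stored content.
context = retrieve("how do I mitigate sql injection")[0]
prompt = f"Context: {context}\n\nQuestion: How do I mitigate SQL injection?"
```

Production systems swap the term-count vectors for dense embeddings and approximate nearest-neighbor search, but the retrieve-then-prompt flow is the same.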

Many different tools such as LlamaIndex, LangChain, LangGraph, and cloud services such as Google Vertex AI can help you build RAG implementations and AI agents. Figure 11-11 shows the Google Vertex AI Agent Console, which is used for building and managing AI agents. A custom AI agent named Omar’s Bug Bounty Hunter Agent is being configured.

Figure 11-11

Creating AI Agents in Google Vertex AI

The configured “Goal” provides a description of the agent’s purpose, which is to assist bug bounty hunters in identifying and exploiting vulnerabilities while promoting ethical hacking practices. The “Instructions” section provides guidelines for the agent’s behavior. These instructions include offering explanations for bug hunting tactics, assisting with tools and frameworks, providing insights on various vulnerability types, and guiding responsible disclosure.

You can also configure data stores (backed by the built-in vector database) to ground the agent in your own data, effectively creating a RAG deployment. These Vertex AI agents feature special state handlers known as data store handlers, which allow the data store agent to engage in conversations with end users about the stored content.

Tool Calling

You can also link your agent to external tools (tool calling) and enable it to perform tasks more effectively. Figure 11-12 shows an example of tool calling in Google Vertex AI.
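The runtime side of tool calling is a dispatch loop: the model emits a structured call naming a registered tool, the runtime executes it, and the result is fed back for the next model turn. The sketch below shows that dispatch with a hypothetical CWE-lookup tool; the names and call format are illustrative, not Vertex AI's actual API.

```python
import json

# A hypothetical tool an agent could call: look up a CWE title.
# Real agents register tools with a schema the model can see.
def lookup_cwe(cwe_id: str) -> str:
    catalog = {
        "CWE-79": "Improper Neutralization of Input During Web Page Generation (XSS)",
        "CWE-89": "SQL Injection",
    }
    return catalog.get(cwe_id, "unknown")

# Registry mapping tool names to implementations.
TOOLS = {"lookup_cwe": lookup_cwe}

def dispatch(tool_call_json: str) -> str:
    """Execute a model-emitted tool call of the form
    {"name": ..., "arguments": {...}} and return the result,
    which would be sent back to the model as the next turn."""
    call = json.loads(tool_call_json)
    return TOOLS[call["name"]](**call["arguments"])

result = dispatch('{"name": "lookup_cwe", "arguments": {"cwe_id": "CWE-89"}}')
# result == "SQL Injection"
```

Keeping tools behind an explicit registry also gives you a natural choke point to log, rate-limit, or sandbox whatever the agent is allowed to execute — important when the tools touch live targets.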

Figure 11-12

Tool Calling in Google Vertex AI
