This chapter is from the book

AI Coding Agents

An “AI coding agent” can be defined as an AI system that automates and assists across the Software Development Lifecycle (SDLC), capable of understanding high-level objectives expressed in natural language and executing a custom series of tasks to achieve them. This goes far beyond simple code completion; it involves generating, optimizing, debugging, and even deploying code with remarkable speed and accuracy. The key differentiator is this ability to perform complex, multistep actions in pursuit of a goal, marking a shift from reactive assistance to proactive, goal-oriented execution.

The journey to the modern AI coding agent is built on a long history of innovations in developer tooling. In the nascent days of computing in the 1960s, coding was a laborious process involving punch cards and primitive line editors like TECO, which operated on text one command at a time. The 1970s brought the advent of interactive, full-screen editors with the creation of the legendary vi and Emacs, tools so foundational that they sparked the decades-long “editor wars.” A significant leap in developer workflow occurred in the 1980s with the emergence of the first integrated development environments (IDEs). Borland’s Turbo Pascal, released in 1983, was a breakthrough product that combined a code editor, compiler, and runtime into a single, cohesive program, inventing the modern IDE concept and drastically improving efficiency.

The path toward AI-powered assistance began with early static code analysis tools, which focused on identifying bugs and optimizing performance based on predefined rules. The 2000s saw the introduction of statistical models that improved code completion, but these systems lacked a true understanding of programming context and intent. The genuine revolution arrived with the development of transformer-based large language models (LLMs) trained specifically on vast repositories of source code. These models demonstrated an unprecedented ability to comprehend programming concepts across multiple languages and frameworks. The release of OpenAI Codex changed things forever. Codex became the engine for the first generation of modern AI coding agents, most notably powering GitHub Copilot, and set the stage for the explosion of innovation that followed.

The following are some of the most popular AI coding tools, although the list grows daily. For a list of tools and other resources, check out my GitHub repository at https://hackerrepo.org.

  • GitHub Copilot

  • Cursor

  • Windsurf

  • Claude Code

  • Codex

  • Cline

  • Sourcegraph Cody and Amp

  • Warp

  • Lovable

  • Replit

The Modern IDE

An integrated development environment is a software application that combines all the tools programmers need into a single, comprehensive workspace. Traditionally, an IDE includes a code editor with features like syntax highlighting and code completion, along with tools for building, debugging, and managing code, making the software development process more efficient. Popular examples include Visual Studio Code, Eclipse, and even vim. However, let’s look at the anatomy of a “modern IDE.” Figure 4-7 shows Cursor, which is an AI-powered code editor built on the open-source codebase of Visual Studio Code (VS Code).

FIGURE 4.7

The Cursor IDE

As you can see in Figure 4-7, Cursor inherits VS Code’s underlying architecture, user interface, and extensibility via extensions, making it familiar to anyone who has used VS Code. Cursor’s AI capabilities, however, run deeper, because it offers AI features beyond what is possible with a simple VS Code extension.

The Cursor IDE interface is divided into four main sections. On the far left is the Explorer panel, which displays a project’s folder structure, allowing you to quickly open files, browse directories, and manage your project. Next is the main code editor, where you write and edit your code. It supports multiple tabs, syntax highlighting, and other features, just like VS Code. The third column is the Claude Code plug-in panel. You can use it to run commands, create agents, or ask Claude for help with code, documentation, or reviews.

On the far right is the Cursor AI agent, which in this case displays detailed AI-generated responses and code explanations relevant to what you’re working on. At the bottom of the interface is a terminal and debug area, where you can run shell commands, view errors and warnings, or debug your application without leaving the IDE.

Putting it all together, here’s how the parts of the IDE shown in Figure 4-7 can help in a typical workflow. You keep your project files open, browse through folders, edit files, view code, and so on. This is where you’re writing the core logic, modules, workflows, and the like. When you need higher-level AI assistance (say you want to refactor across files, find/fix bugs, generate tests, commit changes, or run linting/test suites), you issue commands via Claude Code in the terminal or create subagents to fully automate tasks.

Whenever you are unsure what code does (legacy code, complex logic, API usage, and so on), you can invoke the AI assistant panel to ask: “Explain this code snippet,” “What does this function do?” or “Document this code.”

Many users report that Claude Code and other agentic coding tools like the Codex agent do a better job maintaining “intent” across a complex multistep instruction, particularly when the tasks span multiple modules or files. These tools can remember prior context better, and there’s less back-and-forth. Claude Code supports features like agent manifests, hooks, and operational rules that let you define how you want it to work (for example, how permissions, file access, and scope are handled). This capability is helpful when you want consistency or want to embed best practices and guardrails.

If your workflow involves delegating parts of development (say, prototyping, routine tasks, test generation, or refactoring), Claude Code can be trusted to do more on its own, without your needing to oversee every prompt. This capability can lower friction. Although Claude Code has many strengths, it isn’t always strictly “better” in every scenario. There are trade-offs and contexts where Cursor, or simply the model integration in Cursor, shines. When you’re making small changes, live coding, exploring, or tweaking, Cursor (with its editor integration) is more immediate. It provides auto-completion, inline suggestions, and minimal context switching. If you’re working on small modules or making incremental changes, you might not need the heavy machinery that Claude Code provides.

Cursor’s UI-based AI tools integrated in the editor tend to have a less steep learning curve compared to setting up agents or manifests, or defining command-line workflows. For things like debugging in real time, live code reviews inside the editor, or quick fixes, having suggestions come directly in the editor is more fluid. Claude Code tends to be more batch or goal-oriented, so for rapid iteration, Cursor often has the edge. If you have a large or interconnected codebase, needing to understand cross-dependencies, perform big refactors, or maintain consistency, then Claude Code’s deeper context, tooling, and automation pay off. If you’re doing multistep tasks (for example, “generate tests + run them + update failing ones + commit”) rather than just “fix this one bug / write this one function,” Claude Code can reduce your manual overhead.

Now let’s put all that into a single narrative using the example of an IDE integration like the one in Figure 4-7. Say that you’re editing main.py in the editor, adding a new feature. You realize you need some unit tests. You switch to the integrated terminal and run “claude generate tests for feature X” (or a similar natural-language instruction). Claude Code analyzes your codebase, finds relevant modules/function definitions, generates test skeletons, and even populates assertions.

It outputs code differentials (diffs), which you can inspect either via the Claude sidebar (if it shows changes) or via the IDE’s diff tools. Maybe you accept them or tweak them manually in the editor. Then you run the updated test suite (via the terminal) to verify that the tests pass. If there are failures, you go back and edit in the editor with guidance from the AI explanation side panel (“Why is this failing?”).

When ready, you commit via Claude Code or via your usual git workflow; maybe you ask Claude to commit with the message “Add tests for feature X,” or you just use git manually. Claude Code supports those operations too.

In the right-side panel, you might ask, “Walk me through the changes” or “Why is this function calling that other module?” to stay oriented.

Core Technological Pillars of an AI Coding Tool

The capabilities of modern AI coding agents are supported by a confluence of several key technologies. This modular architecture is not only powerful but also generalizable, with the coding domain serving as an ideal proving ground due to its structured nature, clear success metrics (for example, code that compiles and passes tests), and a rich ecosystem of existing tools like compilers and linters.

The architectural patterns being perfected in today’s coding agents are likely to become the blueprint for autonomous agents in numerous other professional domains, from cybersecurity to financial analysis. Figure 4-8 shows some of the high-level core technological pillars of a typical AI coding tool.

FIGURE 4.8

Core Technological Pillars of a Typical AI Coding Tool

As illustrated in Figure 4-8, at the heart of every agent is an AI model, which acts as its foundational “brain.” Trained on immense datasets of text and code, LLMs provide the core ability to understand natural language, reason about problems, and generate human-like text and code.

To move from simple generation to autonomous action, agents employ architectural patterns or loops. A prominent example is the Reason and Act (ReAct) framework, utilized by tools such as Cursor, Claude Code, Codex, Google’s Gemini CLI, and many others. This is an iterative process where the agent

  • Reasons about a given task to form a plan

  • Chooses an action, such as using a tool (for example, running a command in the terminal)

  • Observes the result of that action and then repeats the cycle, using the new observation to refine its reasoning for the next step

This loop enables agents to tackle complex, multistep problems that require interaction with an external environment.
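The reason–act–observe cycle above can be sketched as a minimal loop. Everything here is illustrative: the canned `llm` function and the toy `TOOLS` table are stand-ins for a real model call and real terminal or file-system tools.

```python
import re

# Hypothetical toolbox; a real agent would run shell commands, edit files, etc.
TOOLS = {
    "run_tests": lambda arg: "2 passed, 1 failed: test_parse",
    "read_file": lambda arg: f"<contents of {arg}>",
}

def llm(prompt: str) -> str:
    """Stand-in for a real LLM call; returns a canned 'Action: ...' reply."""
    if "1 failed" in prompt:
        return "Action: read_file parser.py"
    return "Action: run_tests ."

def react_loop(task: str, max_steps: int = 3) -> list[str]:
    transcript = [f"Task: {task}"]
    for _ in range(max_steps):
        reply = llm("\n".join(transcript))           # 1. Reason: model picks the next action
        transcript.append(reply)
        m = re.match(r"Action: (\w+)\s*(.*)", reply)
        if not m or m.group(1) not in TOOLS:
            break
        observation = TOOLS[m.group(1)](m.group(2))  # 2. Act: invoke the chosen tool
        transcript.append(f"Observation: {observation}")  # 3. Observe, then repeat
    return transcript

print(react_loop("fix the failing parser test"))
```

Each iteration feeds the growing transcript back to the model, so the observation from one step refines the reasoning of the next.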

Context is king! An agent’s effectiveness is directly proportional to its understanding of the specific project it is working on. A model’s built-in context window is often insufficient to contain an entire codebase. To overcome this issue, many advanced agents use retrieval-augmented generation (RAG). This technique involves creating a searchable index (often a vector database) of the entire codebase. When a developer makes a request, the RAG system first retrieves the most relevant code snippets, API definitions, or documentation from this index. This retrieved information is then injected into the prompt sent to the LLM, providing it with deep, project-specific context. This allows the agent to generate far more accurate and idiomatic code that respects the project’s existing patterns and conventions.
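A toy sketch of that retrieve-then-inject flow follows. The keyword-overlap `retrieve` function stands in for real embedding search over a vector database, and `SNIPPETS` is a hypothetical index of the codebase.

```python
# Toy codebase index; a production agent would store embeddings in a vector DB.
SNIPPETS = {
    "auth/login.py": "def login(user, pwd): ...  # validates via AuthService",
    "billing/invoice.py": "def create_invoice(order): ...  # uses Money type",
    "auth/tokens.py": "def issue_token(user): ...  # JWT, 15-minute expiry",
}

def retrieve(query: str, k: int = 2) -> list[str]:
    """Rank snippets by crude word overlap with the query (stand-in for vector search)."""
    words = set(query.lower().split())
    scored = sorted(
        SNIPPETS.items(),
        key=lambda kv: len(words & set(kv[1].lower().replace("(", " ").split())),
        reverse=True,
    )
    return [f"# {path}\n{code}" for path, code in scored[:k]]

def build_prompt(request: str) -> str:
    """Inject the retrieved context into the prompt sent to the LLM."""
    context = "\n\n".join(retrieve(request))
    return f"Relevant project code:\n{context}\n\nTask: {request}"

print(build_prompt("add a refresh endpoint that reuses issue_token"))
```

Because the retrieved snippets ride along in the prompt, the model sees the project’s actual names and conventions instead of guessing at them.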

To extend their capabilities beyond code manipulation, agents are beginning to adopt standards for tool use. The Model Context Protocol (MCP) is an emerging framework that allows agents to connect to and utilize a wide array of external tools and services. An agent with MCP support can be configured to interact with platforms like Figma to understand design specifications, Webex or Slack to send notifications, Stripe to process payments, or Jira to manage tickets, effectively bridging the gap between the coding environment and the broader development ecosystem.

AI Coding Tools and Digital Cyber Resilience

AI coding tools are a double-edged sword for digital cyber resiliency. They can significantly enhance an organization’s defense capabilities through automation and advanced analysis, but they also introduce new attack vectors and vulnerabilities. To achieve true cyber resilience, organizations must adopt a balanced strategy that uses AI for defense while vigilantly managing the risks it creates.

AI tools automate and enhance security tasks at a scale and speed that is not possible for humans, strengthening an organization’s ability to withstand, respond to, and recover from cyber attacks.

AI can automate the discovery and monitoring of vulnerabilities, providing real-time updates on an organization’s risk posture. By analyzing historical data, AI can predict where new vulnerabilities might emerge and help prioritize critical patches.

AI-powered tools can simulate sophisticated, real-world attack scenarios to test and stress-test an organization’s defenses. This capability helps security teams proactively identify weaknesses and improve their resilience against emerging threats.

Security Risks Associated with AI Coding Tools

The widespread adoption of AI coding assistants also creates a larger attack surface and introduces a new set of risks to the Software Development Lifecycle.

AI assistants could generate code with security flaws, including common vulnerabilities like SQL injection and cross-site scripting (XSS). This happens because the models are trained on large public codebases that contain vulnerable code, which the AI can then replicate in new applications.
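A minimal illustration of the replicated-vulnerability pattern, using Python’s built-in sqlite3: string-built SQL lets a crafted input rewrite the query, while a parameterized query treats the same input as plain data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'admin')")

user_input = "alice' OR '1'='1"  # a classic injection payload

# Vulnerable pattern an AI model may reproduce from its training data:
injected = conn.execute(
    f"SELECT role FROM users WHERE name = '{user_input}'"
).fetchall()

# Safe pattern: the placeholder keeps the input out of the SQL grammar.
safe = conn.execute(
    "SELECT role FROM users WHERE name = ?", (user_input,)
).fetchall()

print(injected)  # [('admin',)] -- the OR clause matched every row
print(safe)      # [] -- no user is literally named "alice' OR '1'='1"
```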

By accelerating the speed and volume of code production, AI tools can outpace an organization’s traditional security controls, leading to a net increase in vulnerabilities and a larger attack surface. The AI models themselves can be vulnerable to attack and manipulation.

In some cases, AI tools can invent or “hallucinate” nonexistent software packages. Malicious actors can then register those package names to distribute malware. Software developers may inadvertently expose sensitive or proprietary code by feeding it into an AI coding assistant. Additionally, AI model inversion attacks can potentially reveal a model’s sensitive training data.

Best Practices for Secure AI Coding

To manage the risks and maximize the benefits of AI coding tools, organizations can implement the following best practices:

  • Integrate Security into AI Workflows: Shift to a DevSecOps model where security testing is automated and embedded directly into the CI/CD pipeline. This includes automated security scans (SAST, DAST, SCA) on all AI-generated code.

  • Educate Developers on Secure Prompting: Train developers on how to write clear, specific prompts that include security requirements and constraints. This guides the AI to produce safer code and prevents it from taking insecure shortcuts.

  • Enforce Human Oversight and Review: Never blindly trust AI-generated code. Maintain mandatory, human-led security review gates and continuous code auditing to catch logical flaws and vulnerabilities that automated tools might miss.

  • Implement a “Zero-Trust” Approach: Treat AI assistants as untrusted services. This means systematically stripping sensitive data and secrets from all prompts before they are sent to the AI.

  • Manage AI Tool Usage and Vendors: Establish clear policies for what AI tools developers can use and what data can be shared with them. Prioritize vendors who explicitly promise not to use customer code for model training.

  • Develop an Incident Response Plan for AI: Create clear protocols for addressing AI-related security incidents, including auditing code exposure, revoking compromised API keys, and blocking unauthorized AI endpoints.
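As one illustration of the “zero-trust” point above, a prompt scrubber might strip obvious credential patterns before anything is sent to an external AI service. The patterns below are deliberately minimal stand-ins for a real secret scanner.

```python
import re

# Illustrative patterns only; production secret scanners cover far more formats.
SECRET_PATTERNS = [
    (re.compile(r"(?i)(api[_-]?key|token|password)\s*[=:]\s*\S+"), r"\1=<REDACTED>"),
    (re.compile(r"AKIA[0-9A-Z]{16}"), "<REDACTED_AWS_KEY>"),
]

def scrub(prompt: str) -> str:
    """Strip obvious secrets from a prompt before it leaves the organization."""
    for pattern, replacement in SECRET_PATTERNS:
        prompt = pattern.sub(replacement, prompt)
    return prompt

print(scrub("Fix this config: api_key = sk-12345 and password: hunter2"))
# Fix this config: api_key=<REDACTED> and password=<REDACTED>
```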

Modern AI coding tools allow developers to define rules, or guardrails, that shape and constrain the code the AI generates. By creating and enforcing security-focused rules, development teams can train AI to prioritize secure coding practices, reduce the risk of common vulnerabilities, and ensure compliance with organizational policies.

You can create contextual guidance rules. These rules provide security-focused instructions that help the AI understand and integrate best practices specific to your project, technology stack, and security standards.
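For illustration, contextual guidance rules are typically written as plain natural-language instructions in a project-level rules file (the exact filename and format depend on the tool); a hypothetical example:

```
# Security rules for AI-generated code in this project (illustrative)
- Never hardcode secrets; read credentials from the environment or a vault.
- All SQL must use parameterized queries; string-built SQL is forbidden.
- HTML-escape all user-supplied output to prevent XSS.
- Follow the OWASP Top 10 and the relevant OWASP cheat sheets.
```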

Mandate that the AI never includes secrets like API keys, passwords, or credentials directly in the code. For example, “use a secure vault for sensitive credentials. Never hardcode secrets.”
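A minimal Python sketch of what such a rule steers the AI toward: the credential is read from the environment (or a vault client) at runtime instead of being hardcoded. `DB_PASSWORD` is an assumed variable name for illustration.

```python
import os

def get_db_password() -> str:
    """Fetch the credential at runtime rather than embedding it in source."""
    # BAD (what the rule forbids): password = "s3cr3t-hardcoded"
    password = os.environ.get("DB_PASSWORD")
    if password is None:
        raise RuntimeError("DB_PASSWORD not set; fetch it from your secret store")
    return password
```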

Adhere to OWASP standards. Explicitly instruct the AI to follow guidelines from the OWASP Top 10 list of web application security risks and related guidance such as the OWASP Cheat Sheet Series on vulnerability prevention.

For cryptographic operations, direct the AI to use modern, secure algorithms and libraries instead of older, potentially insecure methods. Enforce secure output encoding. Create rules for proper encoding to prevent XSS attacks.
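For the output-encoding rule, a small Python example using the standard library’s html.escape, which neutralizes markup in user input before it is interpolated into a page:

```python
import html

def render_comment(user_text: str) -> str:
    """Encode user input before interpolating it into HTML."""
    return f"<p>{html.escape(user_text)}</p>"

print(render_comment('<script>alert("xss")</script>'))
# <p>&lt;script&gt;alert(&quot;xss&quot;)&lt;/script&gt;</p>
```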

Train developers on how to use secure prompting techniques. By explicitly including security requirements in their prompts, developers can guide the AI to generate safer code from the start. Integrate rules into your CI/CD pipeline and other parts of the SDLC. This includes using Static Application Security Testing (SAST) tools that can flag rule violations in AI-generated code before it’s deployed.

Use pre-commit hooks that automatically scan AI-generated code for rule violations before it is committed to a repository, preventing insecure code from entering the codebase.
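A sketch of such a hook in Python; the regex is deliberately minimal, and wiring the script into .git/hooks/pre-commit (or a hook framework) is left to the reader.

```python
import re

# Deliberately minimal pattern; real secret scanners cover many more formats.
SUSPICIOUS = re.compile(r"(?i)(api[_-]?key|password|secret)\s*=\s*['\"][^'\"]+['\"]")

def scan(paths: list[str]) -> int:
    """Return a nonzero exit code if any staged file contains a likely secret."""
    failures = 0
    for path in paths:
        with open(path, encoding="utf-8", errors="ignore") as f:
            for lineno, line in enumerate(f, start=1):
                if SUSPICIOUS.search(line):
                    print(f"{path}:{lineno}: possible hardcoded secret")
                    failures += 1
    return 1 if failures else 0  # a nonzero exit blocks the commit

# Wired into a hook, this would be called with the list of staged file paths.
```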

The Need for a Comprehensive AI Usage Policy

A comprehensive AI usage policy is an essential strategic document that provides clear and consistent guidelines for how AI tools can be used within an organization. In the rapidly evolving landscape of AI-driven development, the proliferation of new tools (from generative AI for code completion to automated testing assistants) introduces both great productivity potential and significant risk. Without clear guardrails, development teams may unintentionally expose sensitive data, infringe on intellectual property rights, or introduce security vulnerabilities, all of which can severely harm the business.

The policy must provide a precise and unambiguous list of AI tools approved for use in development. This sanctioned list helps prevent “shadow AI,” where employees use unauthorized tools that may not meet the company’s security and data handling standards. For unapproved tools, a clear process should be established for how developers can formally request their review and potential authorization. The policy should also define what makes an AI tool “reputable” and safe to use, such as its data privacy practices and security credentials.

An important part of the policy is defining how different types of data are handled when interacting with AI tools. Rules must explicitly detail what data can be used with which tools, with special consideration for sensitive information like intellectual property, customer data (e.g., regulated by the European AI Act, GDPR, or HIPAA), and proprietary business logic. Strong security procedures should mandate the use of secure environments for AI interaction and prohibit hard-coding credentials into AI-generated outputs. The policy should outline security practices such as data encryption, access controls, and regular audits of AI systems to ensure compliance.

It is also important to mandate that all AI-generated content or code is thoroughly reviewed and tested by a human developer before being deployed to production. The policy must establish a clear governance framework, assigning roles and responsibilities for the oversight, management, and review of AI systems. This ensures that humans are ultimately accountable for decisions and actions taken with AI assistance.

Simply documenting a policy is not enough; it must be communicated effectively to the entire organization, not just circulated as a memo. Communication should use multiple channels, such as company-wide town halls, engaging screensaver messages, and dedicated intranet pages to capture employees’ attention. The tone should be helpful and educational, explaining the “why” behind the policy to encourage buy-in, rather than simply dictating rules.

Since technology is continuously evolving, the policy must be a living document that is regularly reviewed and updated. It is important to establish feedback channels, like a dedicated Slack channel or regular surveys, where developers can report on the practical challenges and needs of using AI tools. This two-way communication builds trust and ensures the policy remains relevant. To increase awareness and compliance, the policy should be paired with ongoing training sessions that provide practical, real-world examples of safe and unsafe AI usage.

For a policy to be effective, its enforcement mechanisms and consequences for violations must be clearly outlined and applied consistently across all employees, regardless of seniority. This builds trust and ensures fairness. The policy should detail how noncompliance will be handled, from verbal warnings to more severe disciplinary actions. Technology can also assist in enforcement by using monitoring tools to detect and log potential violations related to data or internet usage.
