Opinions
More code, more risk: What AppSec needs in the age of AI development


Writer and editor
Updated
Feb 26, 2026
9 min
More people are writing software than ever before, and the pace is accelerating. The rise of AI coding assistants and vibe coding means that building applications is no longer restricted to professional developers; anyone who can describe what they want can instruct an agent to produce it. The consequence is straightforward: there will be much more software, it will change much faster, and it will reach production far sooner. More code means more risk. And the security implications of that equation deserve serious attention.
This is the context in which Anthropic launched Claude Code Security last week as a limited research preview for Enterprise and Team customers. It uses AI reasoning to scan codebases, trace data flows across files, and suggest patches for the vulnerabilities it finds. Using Claude Opus 4.6, Anthropic's team reportedly found over 500 high-severity vulnerabilities in production open-source software that had gone undetected for decades despite expert review.
AI-powered reasoning about code is a genuine capability improvement over rule-based static analysis. The ability to understand how components interact, to follow data across functions and files, and to catch context-dependent weaknesses that rigid pattern matching misses represents real progress in how the industry approaches the code layer of security. At Fluid Attacks, we have been investing in exactly this kind of capability through our own AI SAST, which uses large language models to reason about code semantically rather than merely matching signatures. When a company as technically formidable as Anthropic enters this space, it confirms that the direction of travel is right: AI belongs in application security, and the tools that rely on it will keep getting better.
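To make the contrast concrete, here is a minimal, hypothetical sketch (all names are invented for illustration) of the kind of context-dependent flaw that line-level pattern matching tends to miss but cross-function reasoning can catch:

```python
# Illustrative only: a flaw that spans two functions, so no single
# line looks suspicious to a signature-based rule.

def sanitize_for_display(value: str) -> str:
    # Escapes for HTML output -- NOT for SQL. A reviewer skimming this
    # file alone may assume "sanitized" means safe everywhere.
    return value.replace("<", "&lt;").replace(">", "&gt;")

def build_user_query(username: str) -> str:
    # The tainted value arrives from another function; on this line
    # there is no obvious "user input" for a pattern to match on.
    safe_looking = sanitize_for_display(username)
    return f"SELECT * FROM users WHERE name = '{safe_looking}'"

# The HTML escaping leaves the classic injection payload intact.
query = build_user_query("alice' OR '1'='1")
assert "' OR '1'='1" in query
```

Catching this requires following the value across both functions and understanding that the sanitization applied is the wrong kind for the sink, which is exactly the semantic reasoning that pattern matching lacks.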
We should note, however, that Claude Code Security is still in its early stages. It is a research preview, not a generally available product; the range of weakness types it can detect at scale has not yet been publicly announced; and questions remain about how consistently it performs across different languages, frameworks, and codebase sizes. These are not criticisms but acknowledgments of the natural growing pains of a promising technology.
More code means more vulnerabilities, even with better tools
There is a tempting assumption behind every advance in automated security: that better detection will eventually close the gap. But the math does not work that way. As AI accelerates development, the sheer volume of code being produced grows faster than any improvement in detection can offset.
This means the security challenge is not shrinking; it is compounding. And it is compounding in an environment where most teams start thinking about security after production, not before. The old habit of building first and assessing risk later returns, which means vulnerabilities reach production before anyone has evaluated them. When you combine that habit with the unprecedented speed at which code now moves from an agent's output to a live deployment, testing in production becomes essential.
There is also a dangerous assumption forming around AI-assisted remediation. Because AI agents can now detect a vulnerability and propose a fix in the same workflow, some will conclude that the loop is closed: the agent writes the code, finds the bugs, and fixes them, so the code must be secure. It will not necessarily be. An agent remediating its own output does not guarantee a secure result; it guarantees a faster cycle that still requires independent validation.
AI changes how software is built, not who is responsible for securing it
At Fluid Attacks, we frame the current shift this way: "AI builds. Humans command. We secure." The emphasis on "command" is deliberate. AI agents are transforming every stage of software development, from writing code to testing it to proposing fixes, but the responsibility for security does not transfer to the agent. Humans remain in control. They define what gets built, they set the risk tolerance, and they are accountable when something goes wrong.
If the agent can code, scan, and remediate, why involve humans at all? The answer is that security is a judgment call that depends on context an agent does not fully possess: the business logic of the application, the threat model of the organization, the regulatory requirements of the industry, and the organization's risk tolerance. These are human decisions.
Then there is the case of false negatives. False positives, for their part, will lose some of their sting in a world where development speed makes remediation cheaper and faster, even for false alarms. But a false negative, a real vulnerability that goes undetected, always matters: it represents unmanaged risk and an open vector for attack. This is why layered testing, combining AI reasoning, deterministic source code analysis, dynamic testing, and human review, is not a luxury: it is the only approach that minimizes the blind spots any single method inevitably has.
Our pentesters have already used AI tools to amplify their work, and we summarize the synergy this way: humans think strategically; agents execute massively. The agent scales execution: running parallel tests, analyzing entire repositories, generating vulnerability candidates. The human provides strategic thinking, the creativity to find paths no model anticipates, and the judgment to confirm whether a vulnerability is exploitable in the context of the real, running system.
A changing landscape demands a broader view
While it is tempting to focus on what Claude Code Security does and does not do, the more important conversation is about the broader changes reshaping software development and, therefore, application security.
The developer's role is evolving from writing code to instructing agents, defining requirements, and validating outputs. We believe the tools are shifting too: IDEs may well give way to command-line interfaces with agents, and eventually to environments where LLMs function as a brain with access to the developer's desktop. Developer machines themselves may become the sandboxes where applications are tested. These are not distant scenarios but the direction in which the early adopters are already heading.
At the same time, application architectures are becoming more distributed. APIs proliferate. MCP servers emerge as new components in how systems interconnect. LLMs and AI models become actors in the software supply chain, with a more critical role in both development and production. New user interfaces, such as chat and voice, introduce new categories of weaknesses that did not exist a few years ago.
All of this creates attack surfaces that are more dynamic and harder to manage. And a tool that reads source code, no matter how intelligently, addresses only one layer of this picture. Code with perfect syntax can still be vulnerable. An application whose source code passes every static analysis check can still be exploitable if its runtime environment is misconfigured, if its infrastructure has gaps, or if the interactions between its components introduce flaws that only manifest when the system is actually running.
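A minimal, hypothetical sketch (invented names and configuration keys) shows how code that passes static review can still be exploitable purely because of its runtime configuration:

```python
# Illustrative only: the source looks safe; the deployment decides
# whether it actually is.

def cors_allowed_origins(env: dict) -> list:
    # In code review this looks like a sensible, configurable allow-list.
    raw = env.get("ALLOWED_ORIGINS", "*")
    return [origin.strip() for origin in raw.split(",")]

def is_origin_allowed(origin: str, env: dict) -> bool:
    allowed = cors_allowed_origins(env)
    return "*" in allowed or origin in allowed

# In a deployment where the variable was never set, the default
# silently admits every origin -- a gap no code reader would flag
# without knowing the runtime environment.
assert is_origin_allowed("https://attacker.example", env={})
assert not is_origin_allowed(
    "https://attacker.example", env={"ALLOWED_ORIGINS": "https://app.example"}
)
```

No static analyzer reading this source can tell you whether the environment variable is set in production; only testing the running system can.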
Testing the system as a whole
This is what we believe the conversation needs to center around, and it is the principle that guides our work at Fluid Attacks: security requires testing the coherent system as a whole, across the code layer, the runtime environment, and the infrastructure.
Static analysis, even AI-powered static analysis, examines code without executing it. Many of the vulnerabilities that end up in incident reports are not problems that can be found by reading source code more carefully. They are behaviors that emerge when an application runs in its actual environment, requests pass through the API stack, authentication middleware chains together, and components interact across services. These vulnerabilities require actually running the application and testing it under conditions that resemble how an attacker would probe it.
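The middleware point is worth illustrating. In this hypothetical sketch (all components invented), each piece is individually correct, and the vulnerability exists only in how they compose at runtime:

```python
# Illustrative only: an ordering flaw invisible to per-component review.

def require_auth(handler):
    # Individually correct: rejects unauthenticated requests.
    def wrapped(request):
        if not request.get("user"):
            return {"status": 401}
        return handler(request)
    return wrapped

def cache_responses(handler):
    # Individually correct: memoizes responses by path.
    cache = {}
    def wrapped(request):
        key = request["path"]
        if key not in cache:
            cache[key] = handler(request)
        return cache[key]
    return wrapped

def secrets_handler(request):
    return {"status": 200, "body": "sensitive data"}

# Composing the cache OUTSIDE the auth check replays the first
# authorized response to every later caller, authenticated or not.
app = cache_responses(require_auth(secrets_handler))
app({"path": "/secrets", "user": "alice"})  # primes the cache
anonymous = app({"path": "/secrets"})       # no user at all
assert anonymous["status"] == 200
```

Reading `require_auth`, `cache_responses`, or `secrets_handler` in isolation reveals nothing; the flaw manifests only when requests actually flow through the assembled chain.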
Claude Code Security does not do this. Nor does any static analysis tool, traditional or AI-powered. This is not a flaw in Claude Code Security's design, but simply reflects the scope of what it set out to accomplish. The question for security teams is whether their programs cover the layers that code scanning, however sophisticated, cannot reach.
The same logic applies to vulnerability exploitation. Confirming that a finding is genuinely exploitable in a specific environment requires more than reading code and reasoning about it; it requires testing the live system. This is the domain of penetration testing, where human expertise remains indispensable. At Fluid Attacks, we combine automated tools, including our AI SAST, with expert manual pentesting precisely because the two approaches find different things. An AI model can reason about a potential access-control flaw in source code; a skilled pentester can confirm whether that flaw is actually exploitable through the application's authentication chain in its deployed environment.
Third-party risk is expanding beyond packages
The conversation about supply-chain security has historically centered on open-source dependencies: scanning packages for known CVEs, generating SBOMs, checking licenses. That remains important, but the supply chain of a modern application now includes components that traditional SCA was never designed to evaluate.
When an application relies on an LLM for decision-making, that model is a third-party dependency. When it connects to external services through MCP servers, those servers are part of the attack surface. When AI agents use skills and plugins to extend their capabilities, each skill is a component that can introduce risk. Testing third-party components is no longer just SCA over packages; it must extend to models, skills, and MCPs. The tooling and methodology for this kind of testing is still emerging, and it represents one of the most important frontiers in application security.
What enterprise AppSec programs actually manage
Enterprise-level application security programs do not manage a single repository or a single team; they juggle different tech stacks, release cadences, and risk tolerances. The challenges they face go well beyond what any scanner, however intelligent, is designed to address.
Consider what an enterprise security team must coordinate on a daily basis. Remediation does not happen in a vacuum: it requires assigning findings to the right developers, tracking whether fixes were applied, and verifying that those fixes actually resolved the vulnerability without introducing new ones. With hundreds of open findings across multiple projects, the question is not just "did we detect the bug?" but "who is responsible for fixing it, by when, and how do we confirm it was fixed?" Answering those questions requires a structured workflow with clear goals.
Then there is the question of policy. Enterprises need to define and enforce risk thresholds that determine whether a build can proceed to production. They need to manage risk acceptance decisions with proper governance: when a team decides to accept a known vulnerability temporarily, that decision must be documented, time-bound, and auditable. And they need criteria to break the build that are consistent across the organization, not left to the discretion of individual developers or teams.
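As one illustration of that kind of policy enforcement, a minimal sketch of an organization-wide severity gate might look like the following; the findings schema and field names are hypothetical, not any specific scanner's output:

```python
# Sketch of a break-the-build policy gate, assuming a findings list
# like [{"id": ..., "severity": ..., "risk_accepted": ...}].
SEVERITY_RANK = {"low": 1, "medium": 2, "high": 3, "critical": 4}

def should_break_build(findings, threshold="high"):
    """Return True if any unaccepted finding meets the org-wide threshold."""
    limit = SEVERITY_RANK[threshold]
    return any(
        SEVERITY_RANK[f["severity"]] >= limit
        for f in findings
        # Documented, time-bound risk acceptances are exempt from the gate.
        if not f.get("risk_accepted", False)
    )

findings = [
    {"id": "F-1", "severity": "medium"},
    {"id": "F-2", "severity": "critical", "risk_accepted": True},
    {"id": "F-3", "severity": "high"},
]
assert should_break_build(findings) is True  # F-3 trips the gate
```

The point of encoding the threshold in one place is exactly the consistency argument above: the gate applies identically across every pipeline, rather than being re-decided by each team.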
Compliance adds another layer. Various frameworks require organizations to demonstrate not only that they test their software, but that they can trace the lifecycle of a vulnerability from detection through remediation, with evidence that stands up to audit. A scanning tool that produces findings without feeding them into a governed, traceable process does not satisfy these requirements; what matters is the system around the scanner.
The system of record endures
The tools developers use to write code are changing rapidly; some of them, like traditional IDEs, may not survive the shift to agent-driven development at all. Systems of creation are transient by nature; they evolve, they get replaced, they follow the tooling preferences of the moment.
An ASPM platform, by contrast, is set to become the authoritative source of truth about the security posture of an organization's software portfolio. It is where every finding from every testing method converges into a single, governed, traceable dataset. It is where remediation workflows live, where policy enforcement happens, where compliance evidence is generated, and where development and security teams come together around shared, complete information.
While the systems of creation keep changing, the system of record endures. And the organizations that manage risk effectively will be those that anchor their security programs not to whichever scanning tool is newest, but to a platform that persists, integrates, and governs across every layer of the application.
Our platform is built on this conviction. It is the single source of truth where results from all testing methods converge, where the organizational workflows around those results (assignment, remediation, verification, policy enforcement, compliance reporting) are part of the same system, and where the right people across teams have the information they need to make security decisions together.
Moving forward together
We welcome Claude Code Security as a sign that the industry is converging on a truth we have held for a long time: AI reasoning belongs in application security, and it meaningfully improves what static code analysis can accomplish. The technology will mature, its scope will expand, and the tools built on it will become part of how every organization thinks about securing its code.
But we also believe that the future of application security is not about any single scanning capability, however powerful. It is about testing systems integrally, across code, environment, and infrastructure; about bringing development and security teams together around shared, complete information; and about anchoring security programs to a system of record that endures while the tools around it evolve.
AI changes how software is built, not who is responsible for securing it. That is the principle we build on at Fluid Attacks, and Claude Code Security's arrival reinforces our conviction that it is the right one.
Get started with Fluid Attacks' application security solution right now