Índice

Título
Índice
Índice
Título

Filosofia

Inside Fluid Attacks' AI SAST workflow for finding CVE candidates

cover-ai-sast-cve-candidates-workflow (https://unsplash.com/photos/a-close-up-of-a-microscope-O1Trld5iYho)
Felipe Ruiz

Redator e editor de conteúdo

9 min

A finding generated by an AI SAST scanner is only the beginning of a vulnerability report. Before it can help a maintainer fix code, support a CVE request or become a public advisory, researchers have to answer hard questions: Is the reported path reachable? Is the input attacker-controlled? Are there effective sanitization, escaping, validation, or authorization in place? Does the behavior affect a published and supported product? Can the impact be reproduced and explained clearly?

Those questions define the workflow behind recent vulnerabilities identified with Fluid Attacks' AI SAST scanner. Our scanner helps surface candidate findings in open-source repositories, but the public security outcomes depend on selecting projects with real ecosystem impact, analyzing code context, validating exploitability, discarding false positives, deduplicating repeated findings and coordinating disclosure through human analysts.

This blog post focuses on the open-source CVE research we are currently conducting with AI SAST. That research is an active experimentation effort on public repositories, not a claim that AI can replace vulnerability researchers. At the same time, AI SAST is already part of Fluid Attacks' offering to clients, where automated analysis, AI-assisted reasoning and expert support work together.

The distinction matters. In a highly scrutinized ecosystem such as open-source software, maintainers need more than an AI-generated alert. They need evidence, affected versions, a technical explanation, reproduction steps and a clear statement of impact. They also need reports that respect their scope, their threat model and the rules of coordinated vulnerability disclosure.

That is why Fluid Attacks treats AI SAST as part of a hybrid security workflow. The scanner is designed to reason about source code with more context than a traditional rule-based tool can usually handle. It can look beyond isolated patterns and help surface candidates involving source-to-sink flows, template rendering, data persistence, authorization checks and interactions across files. But security analysts still decide whether a finding is real, whether it qualifies as a CVE candidate and how it should be reported.

How the scanner moves from code to candidates

At a high level, the scanner does not ask a model to read an entire repository and guess where the vulnerabilities are. It first reduces the codebase into smaller units that can be analyzed with more precision. The first stage parses the repository and divides the code into functions, which are the units where many security-relevant behaviors can be observed: inputs are received, values are transformed, queries are built, templates are rendered and authorization checks are applied.

Those functions are then evaluated by a proprietary machine learning model trained on vulnerability knowledge accumulated through Fluid Attacks' security research and penetration testing. The model acts as a candidate generator. Instead of declaring a vulnerability on its own, it produces statements such as: "This function may contain this weakness, with this score." That first pass narrows the search space for deeper analysis.

The third stage is more contextual. The list of candidates is passed to specialized AI agents, meaning task-focused model workflows designed to reason about specific categories of weakness. These agents use tool calls, which are controlled ways for the model to inspect related code, follow function calls and navigate the repository. That matters because many vulnerabilities are not visible on a single line. A stored XSS (cross-site scripting) issue, for example, may involve user input in one file, persistence in another and unsafe rendering in a template or browser sink somewhere else.

This architecture helps explain the difference between candidate detection and vulnerability confirmation. The machine learning stage can flag a function as suspicious because it resembles known vulnerable patterns. The agent stage can then inspect the surrounding context to ask better questions: Is the value attacker-controlled? Is it sanitized or escaped before reaching a sink? Is that operation actually reachable? Is there a permission check? Does the path depend on the default configuration, or on privileged customization?

For researchers, the AI SAST output is not a final advisory. It is a starting point for triage. The scanner can produce a SARIF file (a standard format for static analysis results), so findings can be reviewed in an IDE and linked back to files and lines. From there, a researcher clones the repository, studies the reported path, checks the surrounding code and decides whether the candidate is a real vulnerability, a false positive, a duplicate or a finding that is real but not suitable for a CVE ID.

How repositories are selected

When the goal is to identify CVE candidates, not every open-source repository is equally useful to review. Fluid Attacks' security analysts prioritize projects where a confirmed vulnerability would have a real security impact on the broader software ecosystem.

The first filter is maturity: A target should not be an academic exercise, a proof of concept or an abandoned repository with little practical use. The project should be a real application, framework or component that users can install, deploy and depend on. Signals such as GitHub stars, installation counts, package distribution, release activity and community adoption help estimate whether the software has enough usage to justify deeper review.

Scope also matters: Researchers avoid projects that are already covered by a specific bug bounty program or clearly fall under another CNA's scope. A CNA (CVE Numbering Authority) is an organization authorized to assign CVE IDs for vulnerabilities within a defined scope. If a product belongs to another CNA, the issue may still be valid, but the assignment should follow that CNA's channel rather than Fluid Attacks'.

This is especially important for AI-assisted discovery because the scanner can generate many candidates, while professionals' time is finite. Running the workflow against projects with low adoption would produce activity without necessarily generating value for users, maintainers or the ecosystem. By contrast, focusing on mature open-source applications and frameworks increases the chance that a confirmed vulnerability will help multiple teams reduce risk exposure.

Fluid Attacks is also working on an internal model to rank software popularity before starting vulnerability research. The goal is not to replace researcher judgment, but to make prioritization more systematic. At the time of writing, that model is not integrated into the AI SAST pipeline, but it points to the same principle: discovery should be guided by potential impact.

From candidate finding to CVE candidate

Once our AI SAST scanner produces a candidate finding, the issue is no longer "did the scanner find a suspicious pattern?" but "does this behavior represent a vulnerability that should enter a public disclosure?"

Researchers start by reviewing the source-to-sink flow identified by the scanner. A source is a place where data enters the application, such as a request parameter, stored user input or imported content. A sink is a place where that data can cause harm, such as an HTML renderer, a SQL query builder or an authorization-sensitive operation. The researcher checks whether the data is attacker-controlled, whether it reaches the sink in a dangerous context and whether sanitization, escaping, validation or permission checks prevent exploitation.

If the issue still looks valid, the researcher reproduces it locally using default configurations. A CVE candidate should normally reflect a vulnerability in the product or component as shipped, not a risky customization that only appears after privileged changes. Reproduction also helps separate a plausible code path from a demonstrable security impact.

Only then does the finding move toward CVE evaluation: Fluid Attacks follows the CNA Operational Rules and the CVE Program's application guidance for CNAs. In practice, that means analysts confirm that the issue affects a published and supported product, has a demonstrable impact on confidentiality, integrity or availability, applies to identifiable affected versions and can be explained with enough technical evidence for the vendor and the CVE ID assignment.

Some valid findings do not become CVE candidates because a vendor may consider the behavior outside its threat model, the impact may not be demonstrable enough, or the issue may depend on privileged configuration, unsupported deployment conditions or code that is not part of the product's normal release. In other cases, the product falls within the scope of another CNA, so Fluid Attacks should not be the organization assigning the ID.

For findings that do qualify, our researchers collect the material needed for escalation: installation details, affected context, a technical explanation of the root cause, proof-of-concept steps, images showing exploitation and video evidence when useful.

Why human triage remains essential

Fluid Attacks' AI SAST scanner is intentionally used as a discovery tool, not as an autonomous disclosure system. This differentiation is worth noting in open-source software research, where maintainers often receive low-quality automated reports and may explicitly reject submissions that appear to have been generated entirely by AI. In our case, a report that reaches a vendor should therefore represent Fluid Attacks' security judgment, not only a model's prediction.

Human triage begins by determining whether the candidate is exploitable. Security analysts review the reported flow, inspect related files and check whether the apparent vulnerability survives the application's real controls. Many candidates are discarded because the data is already sanitized, escaped, validated or parameterized before reaching the risky operation. Others are rejected because the supposed attacker-controlled input is actually internal, constant, privileged or unreachable in a realistic flow.

There are also cases where the sink is not dangerous in context. For example, a value may be stored but never rendered as executable HTML or a suspicious authorization path may be protected by backend checks that are not obvious from the first reported function. The scanner helps researchers decide where to look, while researchers decide what the evidence proves.

Deduplication is frequently another human task. The same underlying issue can appear across several files, sinks, call paths or variants of a component. Without consolidation, the assessment would overstate the number of vulnerabilities and create unnecessary noise for maintainers. Analysts group repeated findings, identify the clearest root cause and decide which report best represents the real issue.

The human role becomes even more important when a finding is valid. Researchers still have to determine impact, affected versions, severity, reproducibility and whether the behavior belongs to the vendor's threat model. They also prepare the proof of concept, document the vulnerable flow and coordinate disclosure in a way that maintainers can act on.

What the current results show

The current research dataset gives a useful view of the funnel from AI-generated findings to public advisories. The figures below are based on the execution records available at the time of writing this post and should be understood as a snapshot, not as a final benchmark.

Our analysts have reviewed 1,006 findings generated by the AI SAST scanner. After manual triage, 163 were marked as true positives, meaning the researchers considered the reported behaviors as real vulnerability candidates, which, after a consolidation, became 118 unique true positives.

Of those findings, 28 became CVE candidates. That means roughly one in four confirmed unique vulnerabilities met the additional criteria for CVE evaluation. The rest could still be real security issues, but not necessarily issues that belong in the CVE workflow. Some may lack enough demonstrable impact, depend on conditions outside the product's default threat model, fall under another CNA's scope or require handling through a different channel.

Our public output, which began in late April 2026, so far includes 12 unique advisories associated with AI SAST: nine XSS issues, two SQL injection issues and one authorization issue. XSS occurs when attacker-controlled content reaches a browser execution context without sufficient escaping or sanitization. SQL injection happens when attacker-controlled input alters the structure or behavior of a database query. Authorization issues appear when the application fails to enforce who is allowed to perform an operation or access an object.

The distribution reflects the weakness categories currently being explored. Our scanner is not yet intended to find every vulnerability type. In this research workflow, it is being applied to a limited set of typologies where context is central: whether data is controlled by an attacker, whether it crosses file or function boundaries, whether it is stored and later rendered, whether a database abstraction builds unsafe SQL, or whether an authorization check is missing in the actual backend path.

Fluid Attacks gives advisories code names. One example from the AI SAST workflow is "Motley," the advisory for a SQL injection in Corteza's MSSQL backend. That report is useful because the vulnerable behavior involved JSON metadata parsing, missing key validation, incorrect T-SQL escaping and generated SQL. Other AI SAST advisories, such as "Billie" for improper authorization in Camaleon CMS, and "Pink," "Weeknd," and "Bizkit" for stored XSS, show the same pattern in different weakness classes: the issue is confirmed by following context, not by trusting a single suspicious line.

Why AI SAST and traditional SAST are complementary

Traditional SAST remains valuable because many weaknesses have stable patterns. If a request parameter is echoed in a response without validation, or if a known dangerous function receives untrusted data, a rule-based or dataflow-oriented scanner can often efficiently flag the risk. The point of AI SAST is not to replace that capability, but to cover cases where the pattern is less stable and the surrounding context carries more of the security meaning.

A useful contrast, for instance, is "Skims-8," a Fluid Attacks advisory detected by our traditional SAST scanner. The report describes a reflected XSS issue in Ad Inserter, where web content was dynamically generated without validating a potentially untrusted value. The vulnerable line shown in the advisory is direct enough for a conventional scanner to surface: a POST value is echoed back into the page.

Motley, although a different kind of weakness, shows that the Corteza issue was not just "input reaches SQL." The relevant path involved an HTTP meta parameter, JSON-object parsing, skipped key validation, MSSQL-specific escaping behavior, generated JSON-path SQL and the permissions required to reach the affected endpoint. Detecting that kind of issue requires following how data changes across functions and files, then asking whether the controls in that path are actually effective.

Each approach has a better fit: traditional SAST is strong for stable patterns with low contextual burden, while AI SAST is promising for weaknesses whose exploitability depends on broader code context.

For clients and research teams, the best option is not to choose one scanner over another. It is combining them with human review so that simple patterns are caught efficiently, complex paths receive deeper analysis and final reports are backed by reproducible evidence.

What this work is teaching us

The open-source CVE work described here is an active research and experimentation effort. We are using public repositories to evaluate how far AI SAST can go in vulnerability discovery, where it performs well and where expert validation remains necessary. The purpose is not to present AI as a finished replacement for security research, but to learn where it can make that research faster, broader and more precise.

At the same time, AI SAST is not only a research idea. Fluid Attacks offers AI-powered static application security testing as part of its broader security approach, where automated techniques, AI-assisted analysis and expert support work together. The open-source findings give us a transparent way to show part of that capability in action: candidate generation, contextual code analysis, manual triage, CVE evaluation and responsible disclosure.

The AI SAST public advisories published so far show that this technique can help identify vulnerabilities in real software projects, including cases that require more context than a single-line pattern. They also make clear why the process must remain careful: findings have to be reproduced, deduplicated, scoped, documented and disclosed through the right channels.

That is the standard we are continuing to pursue. In open-source research, it means better target prioritization, more systematic candidate review and more evidence-backed disclosures. For organizations building and maintaining software, it means a practical way to expand static analysis beyond obvious patterns. If your team wants to see how AI SAST produces value for maintainers and users, not just more alerts, and can strengthen your AppSec, contact us.

Get started with Fluid Attacks' AppSec solution

Tags:

cibersegurança

software

codigo

teste-de-seguranca

machine-learning

Assine nossa newsletter

Mantenha-se atualizado sobre nossos próximos eventos e os últimos posts do blog, advisories e outros recursos interessantes.

Comece seu teste gratuito de 21 dias

Descubra os benefícios de nossa solução de Hacking Contínuo, da qual empresas de todos os tamanhos já desfrutam.

Comece seu teste gratuito de 21 dias

Descubra os benefícios de nossa solução de Hacking Contínuo, da qual empresas de todos os tamanhos já desfrutam.

Comece seu teste gratuito de 21 dias

Descubra os benefícios de nossa solução de Hacking Contínuo, da qual empresas de todos os tamanhos já desfrutam.

As soluções da Fluid Attacks permitem que as organizações identifiquem, priorizem e corrijam vulnerabilidades em seus softwares ao longo do SDLC. Com o apoio de IA, ferramentas automatizadas e pentesters, a Fluid Attacks acelera a mitigação da exposição ao risco das empresas e fortalece sua postura de cibersegurança.

Consulta IA sobre Fluid Attacks

Assine nossa newsletter

Mantenha-se atualizado sobre nossos próximos eventos e os últimos posts do blog, advisories e outros recursos interessantes.

As soluções da Fluid Attacks permitem que as organizações identifiquem, priorizem e corrijam vulnerabilidades em seus softwares ao longo do SDLC. Com o apoio de IA, ferramentas automatizadas e pentesters, a Fluid Attacks acelera a mitigação da exposição ao risco das empresas e fortalece sua postura de cibersegurança.

Assine nossa newsletter

Mantenha-se atualizado sobre nossos próximos eventos e os últimos posts do blog, advisories e outros recursos interessantes.

Mantenha-se atualizado sobre nossos próximos eventos e os últimos posts do blog, advisories e outros recursos interessantes.

As soluções da Fluid Attacks permitem que as organizações identifiquem, priorizem e corrijam vulnerabilidades em seus softwares ao longo do SDLC. Com o apoio de IA, ferramentas automatizadas e pentesters, a Fluid Attacks acelera a mitigação da exposição ao risco das empresas e fortalece sua postura de cibersegurança.

Assine nossa newsletter

Mantenha-se atualizado sobre nossos próximos eventos e os últimos posts do blog, advisories e outros recursos interessantes.

Mantenha-se atualizado sobre nossos próximos eventos e os últimos posts do blog, advisories e outros recursos interessantes.