Frontier AI and the next phase of software vulnerability defence

As advanced AI lowers the cost of discovering and exploiting software vulnerabilities, Europe must treat open source security and rapid patch deployment as critical resilience infrastructure

Context. Frontier AI systems have crossed an important threshold for cybersecurity and software resilience. They are no longer limited to code completion, triage, or report writing; the most capable models can now assist with vulnerability discovery, exploitability analysis, and, increasingly, patch generation. Anthropic’s Project Glasswing, launched on April 7, 2026, put this shift in the public eye by giving selected critical software operators and maintainers early access to Claude Mythos Preview for defensive security work. Anthropic described Glasswing as an initiative to secure critical software with early access to frontier AI, involving major infrastructure, cloud, financial, and open source actors, and extending access to more than 40 organisations that build or maintain critical software infrastructure. Through our longstanding partnership with the Alpha Omega Project, the Eclipse Foundation has been part of Project Glasswing since its inception, giving us direct experience with this emerging model of AI-assisted vulnerability discovery. To our knowledge, we are currently the only EU-domiciled organisation participating in the initiative, which gives us a unique vantage point on how frontier AI capabilities are beginning to reshape software security and resilience.

Why this matters now. The capability jump does not appear to be limited to one model, one vendor, or one ecosystem. The UK AI Security Institute found that Claude Mythos Preview represented a step-change in cyber performance, including autonomous progress on multi-step attack simulations and high success on expert-level cyber tasks. Less than three weeks later, AISI reported that OpenAI GPT-5.5 reached a similar level of performance on its cyber evaluations and was the second model to complete one of AISI’s multi-step cyber-attack simulations end-to-end. Open-weight models are also narrowing the gap: CAISI’s May 2026 evaluation described DeepSeek V4 Pro as the most capable Chinese model it had evaluated across cyber, software engineering, science, reasoning, and mathematics, while still lagging the leading U.S. frontier by roughly eight months. The implication is that capabilities currently concentrated among a small number of frontier AI providers will quickly become cheaper, more widely available, and harder to govern. This makes early institutional experience with frontier defensive AI workflows strategically important.

The immediate risk: a vulnerability and patching imbalance. Critical infrastructure operators, including energy, transportation, water, health, public services, and financial services, are especially exposed because many run complex, legacy, opaque software estates where patching is slow, cautious, and operationally disruptive. This is not a hypothetical concern: the US Government Accountability Office’s (GAO) 2025 review of critical federal legacy systems found outdated languages, unsupported hardware or software, and known cybersecurity vulnerabilities in systems supporting missions such as critical infrastructure, tax processing, and national security. CISA similarly warns that outdated software is a gateway for threat actors in critical infrastructure contexts, including public-facing routers, VPNs, and firewalls used to reach operational systems. NCSC now expects a “vulnerability patch wave”: a rush of software updates across open source and commercial software stacks as AI accelerates discovery of long-standing technical debt.

The systemic threat. AI-assisted vulnerability discovery changes the economics of offence. It lowers the time, expertise, and cost needed to find flaws, validate exploitability, and turn known-but-unpatched vulnerabilities into working attacks. This is particularly dangerous where many institutions depend on the same software, libraries, cloud providers, identity systems, network appliances, or open source software components. For decades, many IT and operational technology environments have treated patching as an operational disruption to be minimised, especially when systems appear to be running correctly. In some cases, this caution is understandable: outages can have safety, economic, regulatory, and reputational consequences. But the balance of risk is changing. When AI can accelerate vulnerability discovery and exploitation, “stable but unpatched” systems can quickly become systematically exposed. Changing this culture and making rapid, well-tested patch deployment a core resilience function may be one of the hardest short-term challenges. The most serious concern is the possibility that threat actors will use AI to discover and exploit zero-day vulnerabilities before patches are available. The IMF warned on May 7, 2026 that advanced AI models can reduce the time and cost of identifying and exploiting vulnerabilities, increasing the likelihood of correlated failures across widely used systems; it specifically identified financial intermediation, payments, and confidence as systemic risk channels. The same concern applies to cross-sector dependencies among finance, energy, telecommunications, public services, and digital infrastructure.

The Eclipse Foundation’s key finding. Our most important finding from the one-month experiment with Mythos was that operational workflows, validation pipelines, and human oversight matter at least as much as the model. Glasswing’s real significance is not simply that one frontier model found more vulnerabilities. It is that a community of security researchers can iteratively improve prompts, agentic harnesses, target selection methods, reproduction pipelines, and triage workflows, and then apply those methods across multiple market-available models, including cheaper and more widely accessible ones. Anthropic’s own technical write-up reinforces this point: its vulnerability work used agentic scaffolds, containers, target ranking, repeated runs, and validation methods rather than a bare chatbot prompt. NCSC likewise stresses that practical AI cyber capability comes from AI systems—models combined with tools, workflows, and human oversight—not from raw model capability alone.

The opportunity: shared defence capacity at machine speed. The same tools that raise offensive risk can strengthen defence if deployed first and responsibly. AI can continuously scan code and dependencies, generate fuzzing harnesses, prioritise findings by exploitability and exposure, propose patches, produce regression tests, summarise impact for maintainers, and accelerate coordinated vulnerability disclosure. NCSC identifies three high-value defensive uses: reducing attack surface through AI-enabled testing and hardening, improving detection and investigation, and automating mitigation and response where carefully governed. The strategic objective should be to convert AI-driven discovery into AI-assisted remediation faster than adversaries can convert discovery into exploitation. Organisations that are not using AI to strengthen threat detection, prevention, and mitigation risk being outpaced by AI-enabled attackers.
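
As a rough illustration of what prioritising findings by exploitability and exposure could look like in practice, the following minimal Python sketch ranks confirmed findings by a simple product of two scores. The Finding schema, the 0-to-1 scales, and the scoring rule are all hypothetical assumptions made for this example; they are not drawn from Glasswing, NCSC guidance, or any particular tool, and a real workflow would likely rely on richer signals and human review.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """A candidate vulnerability from an AI-assisted scan (hypothetical schema)."""
    identifier: str
    exploitability: float  # assumed scale: 0.0 (theoretical) to 1.0 (exploit reproduced)
    exposure: float        # assumed scale: 0.0 (isolated) to 1.0 (widely deployed, reachable)
    reproduced: bool       # True once a pipeline or human reviewer has confirmed it


def priority(finding: Finding) -> float:
    """Illustrative score: exploitability times exposure, zero if unconfirmed."""
    if not finding.reproduced:
        return 0.0
    return finding.exploitability * finding.exposure


def triage(findings: list[Finding]) -> list[Finding]:
    """Drop unconfirmed findings and return the rest, highest priority first."""
    confirmed = [f for f in findings if f.reproduced]
    return sorted(confirmed, key=priority, reverse=True)


if __name__ == "__main__":
    queue = triage([
        Finding("parser-overflow", exploitability=0.9, exposure=0.8, reproduced=True),
        Finding("config-leak", exploitability=0.4, exposure=0.9, reproduced=True),
        Finding("unverified-report", exploitability=1.0, exposure=1.0, reproduced=False),
    ])
    for item in queue:
        print(item.identifier, round(priority(item), 2))
```

Filtering out unreproduced findings before ranking reflects the point that validation and human oversight matter as much as raw discovery: an alarming but unconfirmed report should not displace a confirmed, widely exposed flaw in a maintainer's queue.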

What Europe should do now. Europe should treat AI-enabled open source security as shared digital resilience infrastructure. That means investing in trusted vulnerability discovery, coordinated disclosure, maintainer support, patch validation, and deployment readiness across the software components that underpin critical services. No single vendor, foundation, institution, or member state can solve this alone; it requires an ecosystem response. Europe should also ensure that trusted public institutions and open source ecosystems remain directly involved in frontier AI cybersecurity evaluation and remediation efforts, rather than relying on commercial actors outside Europe.

Open source is not the problem; it is the solution. The risk does not come from open source. It comes from the fact that many organisations depend on software they do not fully understand, cannot fully inventory, and cannot patch quickly enough. Open source makes that dependency visible, auditable, and repairable. Open source governance structures may become increasingly important in AI-enabled remediation ecosystems because their transparency, global maintainer communities, reproducible builds, public issue tracking, coordinated disclosure practices, and shared security tooling make them uniquely capable of operating at ecosystem scale. CISA’s Open Source Software Security Roadmap explicitly recognises OSS as a public good supported by diverse communities and calls for supporting critical OSS components relied on by government and critical infrastructure. Furthermore, AI is enabling increasingly sophisticated reverse engineering tools that can generate source code from proprietary binaries, making “security through obscurity” an implausible strategy.

Conclusion. This is a global software resilience issue, with direct implications for Europe’s security, digital sovereignty, and strategic autonomy. Critical infrastructure and financial services everywhere rely on globally developed open source components, shared platforms, and common supply chains. Local champions will matter, but isolated national responses will not be sufficient. The priority is coordinated action: shared AI-enabled vulnerability discovery, trusted disclosure channels, maintainer support, rapid patch production, dependency intelligence, and deployment capacity across public and private sectors. Ultimately, resilience will depend not only on AI capability itself, but on trusted ecosystems capable of coordinating remediation rapidly across shared infrastructure. The Eclipse Foundation will work with public institutions, industry, and open source communities to help strengthen these shared resilience capabilities across the European software ecosystem. Open source should be treated not as the weak link, but as the coordination layer through which Europe and its global partners can find, fix, validate, and deploy security improvements at the speed modern resilience now requires.