Prompt Attacks: When Curiosity Becomes a Security Flaw

Nov 11, 2025

Prompt Attacks: When Curiosity Becomes a Security Flaw

Prompt attacks herald a new era: where security depends on the psychology of models.

Each week, Cleo Insight offers 5 minutes to turn the latest research papers into actionable insights.

The recent revelations by Tenable on seven critical vulnerabilities of ChatGPT remind us of a simple truth: AI isn't attacked like software, but manipulated like a mind. These flaws don't rely on malicious code but rather on hijacked instructions capable of leading a model to betray its internal logic.

Two new research studies show that this threat, known as prompt injection, is now the weak link in any generative AI infrastructure, even in the most secure professional environments.

Prompt Injection Attacks in Large Language Models and AI Agent Systems (Research Collective, 2025 — 8 min)

Prompt injections are identified as one of the most serious threats to language models (LLMs) deployed in production.
A meta-analysis of 120 studies shows that 5 well-crafted documents are enough to redirect a model's output in 9 out of 10 cases through poisoning the RAG (Retrieval-Augmented Generation) process.
Incidents like the CVE-2025-53773 vulnerability on GitHub Copilot (CVSS score: 9.6) show that the threat is no longer theoretical: it's operational.

Key takeaway: Prompt attacks are a structural risk that calls for deep defense rather than a quick fix.

Prompt Injection as an Emerging Threat: Evaluating the Resilience of Large Language Models (Ganiuly & Smaiyl, 2025 — 12 min)

This study assesses the actual resilience of several major models against these attacks. Four models (GPT-4, GPT-4o, LLaMA-3 8B Instruct, and Flan-T5-Large) were tested according to three robustness metrics. The results confirm that model size does not guarantee security: it's ethical alignment and the tuning of safeguards that determine resilience.
GPT-4 shows the most robustness (RDR = 9.8%, SCR = 96.4%), but no architecture is completely secure.

Key takeaway: Even the most advanced models remain vulnerable to indirect and substitution attacks.

Key takeaways for you – summary of the two studies

1. The vulnerabilities revealed by Tenable concretely illustrate the vulnerabilities theorized by research: prompt attacks are a systemic threat.

2. A multi-layered security strategy is essential (monitoring, filtering, reinforcing system prompts).

3. Ethical alignment and AI governance become management priorities, not just technical challenges.

At Cleo, we help companies turn intelligence gathering into a collective reflex. Our goal is not to give you just another tool, but to help you integrate a true decision-making partner into your teams.

Newsletter

Weekly

Weekly Tech News

Explore the latest trends in AI, tech, and beyond!

Cleo Insight

Cleo Learn

Ressources

Contact

Select Language