The definitive guide to understanding security risks in Large Language Model applications. Learn about each vulnerability with real-world examples and actionable prevention strategies.
87% of companies were hit by AI attacks in 2024
34M+ messages were leaked in the OmniGPT breach
11% of data pasted into ChatGPT is confidential
$200 was all it cost to extract 10K training examples
Below, each vulnerability is covered with real-world examples and prevention strategies.
Attackers craft malicious inputs to override your LLM's instructions, bypass safety measures, or extract sensitive information.
Impact: Unauthorized access, data breaches, and compromised decision-making.
Microsoft Copilot Spear-Phishing (2025)
Security researchers turned Microsoft Copilot into a spear-phishing bot by hiding malicious commands in plain emails. The hidden prompts hijacked Copilot's instructions and executed unauthorized API calls.
ChatGPT Memory Exploit (2024)
A persistent prompt injection attack manipulated ChatGPT's memory feature, enabling long-term data exfiltration across multiple user conversations.
How to prevent:
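One common first layer of defense is to screen user input for known injection phrasings and to wrap untrusted text in explicit delimiters before it reaches the model. The sketch below is a minimal illustration: the patterns and the `<untrusted_user_input>` delimiter are illustrative assumptions, not an exhaustive or authoritative filter, and real injections are far more varied.

```python
import re

# Illustrative patterns only -- real injections are far more varied,
# so treat this as one layer of defense, not a complete solution.
INJECTION_PATTERNS = [
    r"ignore (all )?(previous|prior|above) instructions",
    r"disregard (the )?system prompt",
    r"reveal (your|the) (system )?prompt",
]

def looks_like_injection(user_input: str) -> bool:
    """Flag input matching known prompt-injection phrasings."""
    lowered = user_input.lower()
    return any(re.search(p, lowered) for p in INJECTION_PATTERNS)

def build_prompt(system: str, user_input: str) -> str:
    """Wrap user input in explicit delimiters so the model can
    distinguish instructions from untrusted data."""
    if looks_like_injection(user_input):
        raise ValueError("possible prompt injection detected")
    return (f"{system}\n\n<untrusted_user_input>\n"
            f"{user_input}\n</untrusted_user_input>")
```

Pattern matching alone is easy to bypass; it is best paired with privilege separation, so that even a successful injection cannot trigger sensitive actions.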
Failing to validate or sanitize LLM outputs before using them in downstream applications, enabling code execution or other exploits.
Impact: XSS attacks, SQL injection, remote code execution, and system compromise.
Vanna AI Remote Code Execution
A vulnerability in Vanna AI, a tool for database interaction via prompts, allowed attackers to embed harmful commands that achieved remote code execution through unvalidated LLM outputs.
Chevrolet Dealership Chatbot
Customers manipulated a Chevrolet dealership chatbot to offer vehicles at $1 by exploiting improperly handled outputs, creating both financial and reputational damage.
How to prevent:
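The core discipline is to treat model output like any other untrusted input: escape it before rendering and never interpolate it into queries. A minimal sketch, assuming output is rendered as HTML and used in a SQLite lookup (the `products` table is a hypothetical example):

```python
import html
import sqlite3

def render_llm_output(text: str) -> str:
    """HTML-escape model output before inserting it into a page,
    preventing XSS from a manipulated response."""
    return html.escape(text)

def run_generated_lookup(conn: sqlite3.Connection, product_name: str):
    """Never interpolate LLM output into SQL; bind it as a parameter
    so injected fragments are treated as data, not syntax."""
    cur = conn.execute(
        "SELECT price FROM products WHERE name = ?", (product_name,)
    )
    return cur.fetchall()
```

The same principle applies to shell commands, file paths, and any other downstream sink.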
Malicious actors inject harmful data into training datasets, causing models to produce unsafe, biased, or inaccurate outputs.
Impact: Compromised model integrity, backdoor vulnerabilities, and degraded performance.
WormGPT & Malicious LLMs
WormGPT was created by fine-tuning an open-source model on malicious datasets containing malware code, exploit write-ups, and phishing templates—demonstrating how poisoned training data creates dangerous AI tools.
Amazon CodeWhisperer Data Concerns
Amazon warned employees not to share confidential information with AI tools after noticing responses closely resembled sensitive company data that had been used in training.
How to prevent:
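One practical control is data provenance: only train on shards whose digests appear in a vetted manifest. The sketch below is a simplified assumption of that workflow; in practice the manifest would be signed and distributed out of band, not hard-coded.

```python
import hashlib

def shard_digest(data: bytes) -> str:
    """SHA-256 digest used as the identity of a training shard."""
    return hashlib.sha256(data).hexdigest()

def filter_trusted(shards: list[bytes], trusted: set[str]) -> list[bytes]:
    """Keep only shards whose digest appears in the vetted manifest,
    dropping anything injected or tampered with after review."""
    return [s for s in shards if shard_digest(s) in trusted]
```

Provenance checks catch tampering after review; they do not replace content-level vetting of the sources themselves.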
Attackers send resource-intensive requests to overwhelm LLM systems, causing service disruptions and inflated operational costs.
Impact: Service unavailability, degraded user experience, and unexpected cloud bills.
Recursive Query Attacks
Attackers discovered that asking LLMs to perform infinitely recursive tasks or generate extremely long outputs could consume massive compute resources, effectively creating a denial of service.
Token Exhaustion Exploits
Malicious users exploited pay-per-token APIs by crafting prompts that maximized token consumption, resulting in massive unexpected bills for application owners.
How to prevent:
Third-party components, pre-trained models, or external datasets may contain vulnerabilities that compromise your entire application.
Impact: System compromise, data breaches, and inherited security flaws.
Exposed LLM Servers
Security researchers found 30+ vector database servers online without authentication, exposing private emails, customer PII, financial data, and medical records from companies using third-party LLM infrastructure.
GPT Store System Prompt Leaks (2024)
Many custom OpenAI GPTs were vulnerable to prompt injection, causing them to disclose proprietary system instructions and API keys embedded by their creators.
How to prevent:
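One concrete supply-chain control is to pin and verify the checksum of every model or dataset artifact before loading it, just as package managers do for dependencies. A minimal sketch, assuming the expected digest comes from your own dependency manifest:

```python
import hashlib
from pathlib import Path

def verify_artifact(path: Path, expected_sha256: str) -> None:
    """Refuse to load a model or dataset whose digest does not match
    the pinned value from the dependency manifest."""
    digest = hashlib.sha256(path.read_bytes()).hexdigest()
    if digest != expected_sha256:
        raise RuntimeError(f"checksum mismatch for {path.name}")
```

Checksums detect tampering in transit; vetting the vendor and requiring authentication on hosted infrastructure address the rest of the risk described above.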
Models inadvertently leak confidential data, proprietary information, or personal details through their responses.
Impact: Privacy violations, competitive disadvantage, regulatory penalties, and legal liability.
Samsung Source Code Leak (2023)
Samsung engineers pasted proprietary semiconductor source code into ChatGPT for review. The confidential code risked becoming part of the training data and surfacing to other users. Samsung subsequently banned all generative AI tools.
Google Training Data Extraction
Google researchers extracted 10,000+ verbatim training examples from ChatGPT using $200 worth of queries. They obtained names, phone numbers, and addresses by forcing the model to malfunction with repetitive commands.
How to prevent:
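On the input side, one mitigation is to scrub obvious PII from prompts before they leave your network. The regexes below are illustrative assumptions; production redaction typically uses a dedicated PII-detection service rather than hand-rolled patterns.

```python
import re

# Illustrative patterns for emails and US-style phone numbers.
REDACTIONS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "[EMAIL]"),
    (re.compile(r"\b\(?\d{3}\)?[\s-]?\d{3}[\s-]?\d{4}\b"), "[PHONE]"),
]

def redact(text: str) -> str:
    """Scrub obvious PII from a prompt before sending it to an
    external LLM API."""
    for pattern, label in REDACTIONS:
        text = pattern.sub(label, text)
    return text
```

On the output side, the complementary control is filtering model responses for secrets and PII before they reach users or logs.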
LLM plugins that process untrusted inputs without proper access control become attack vectors for code execution and data theft.
Impact: Remote code execution, unauthorized data access, and privilege escalation.
McKinsey Chatbot Hack
Security firm Zenity quickly compromised a chatbot built by McKinsey. The chatbot was persuaded to send customer information from an integrated Salesforce system through its plugin architecture.
Google Bard Document Exfiltration
Attackers embedded malicious prompts in a Google Docs file. When Bard accessed the document through its plugin, it was tricked into exfiltrating sensitive information to external endpoints.
How to prevent:
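A common defense is to dispatch LLM-requested tool calls only through a registry that enforces an allowlist and validates every argument. The sketch below assumes a hypothetical `fetch_order` plugin; the registry shape is illustrative, not a specific framework's API.

```python
from typing import Any

def fetch_order(order_id: str) -> str:
    """Hypothetical plugin: look up an order by numeric ID."""
    return f"order {order_id}"

# Each registered plugin declares its parameters and a validator,
# so LLM-supplied arguments are checked before any code runs.
PLUGINS: dict[str, dict[str, Any]] = {
    "fetch_order": {
        "func": fetch_order,
        "params": {"order_id": lambda v: isinstance(v, str) and v.isdigit()},
    },
}

def call_plugin(name: str, args: dict[str, Any]) -> Any:
    """Dispatch a tool call only if the plugin is registered and
    every argument passes its validator."""
    spec = PLUGINS.get(name)
    if spec is None:
        raise PermissionError(f"unknown plugin: {name}")
    params = spec["params"]
    if set(args) != set(params):
        raise ValueError("unexpected or missing arguments")
    for key, value in args.items():
        if not params[key](value):
            raise ValueError(f"invalid value for {key!r}")
    return spec["func"](**args)
```

Running each plugin with the least privilege it needs (for example, read-only database credentials) limits the blast radius when validation is bypassed.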
LLMs with unchecked autonomy to perform actions can execute harmful operations beyond their intended scope.
Impact: Unauthorized actions, data modification, and unintended system changes.
AI Cyber Espionage Campaign
Anthropic reported the first AI-orchestrated cyber espionage campaign where hackers used Claude to automate attacks. The AI was given excessive permissions that allowed it to execute complex attack chains autonomously.
Autonomous Agent Exploits
Security researchers demonstrated that AI agents with tool access could be manipulated to perform unauthorized file operations, send emails, or make API calls without user consent.
How to prevent:
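The usual control is a human-in-the-loop gate: let the agent run low-risk tools autonomously, but require explicit approval for anything else. A minimal sketch under that assumption (the risk tiers and callback shape are illustrative):

```python
from typing import Callable

# Hypothetical risk tiers: only these tools may run without approval.
LOW_RISK = {"search_docs", "summarize"}

def execute_action(name: str, action: Callable[[], str],
                   approve: Callable[[str], bool]) -> str:
    """Run an agent-requested action, gating high-risk ones behind a
    human approval callback."""
    if name not in LOW_RISK and not approve(name):
        return f"blocked: {name} requires human approval"
    return action()
```

Scoping the agent's credentials so it simply cannot perform actions outside its task is the stronger complement to approval gates.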
Treating LLM outputs as infallible without verification leads to poor decisions based on hallucinations or inaccurate information.
Impact: Flawed decision-making, legal liability, and reputational damage from AI errors.
Legal Brief Hallucinations
A lawyer used ChatGPT to research case law and submitted a brief citing six completely fabricated cases. The court imposed sanctions, and the incident became a cautionary tale about AI overreliance in professional settings.
Medical AI Misdiagnosis
Healthcare providers relying solely on AI diagnostic tools without verification have faced malpractice concerns when models provided incorrect or hallucinated medical recommendations.
How to prevent:
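For cases like the fabricated legal citations above, one mitigation is a verification step: accept an LLM draft only after every citation is confirmed against a trusted index. The sketch below mocks that index as an in-memory set; a real system would query an authoritative database.

```python
# Mock trusted index; a real system would query a legal database.
KNOWN_CASES = {"Smith v. Jones (1999)", "Doe v. Acme Corp. (2010)"}

def unverified_citations(citations: list[str], index: set[str]) -> list[str]:
    """Return citations not found in the trusted index; a non-empty
    result means the draft needs human review before use."""
    return [c for c in citations if c not in index]
```

The same pattern generalizes: route any AI output that drives a consequential decision through an automated check, a human review, or both.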
Attackers steal proprietary models, weights, or architectural details to replicate capabilities or extract training data.
Impact: Loss of competitive advantage, intellectual property theft, and exposed training data.
OmniGPT Breach
OmniGPT, an AI chatbot aggregator, suffered a breach exposing 34 million user messages and thousands of API keys. Attackers sold the data dump for just $100 on the dark web, including proprietary model configurations.
Model Extraction Attacks
Researchers demonstrated that with enough API queries, attackers can reconstruct functional copies of proprietary models through systematic probing, stealing millions of dollars in R&D investment.
How to prevent:
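Extraction attacks like the one described above require large volumes of systematic queries, so monitoring per-key query volume is one practical detection signal. The threshold-based heuristic below is a deliberately simple illustration; real monitoring would also look at query patterns, not just counts.

```python
from collections import Counter

def flag_suspected_extraction(query_counts: Counter, threshold: int) -> list[str]:
    """Return API keys whose query volume exceeds the threshold,
    flagging them for human review as possible extraction probes."""
    return sorted(k for k, n in query_counts.items() if n > threshold)
```

Combined with strict access controls on model weights and rate limits on the API, this raises the cost of reconstructing a proprietary model well above the value of doing so.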