The definitive guide to understanding security risks in Large Language Model applications. Learn about each vulnerability with real-world examples and actionable prevention strategies.
87% of companies were hit by AI attacks in 2024
34M+ messages were leaked in the OmniGPT breach
11% of data pasted into ChatGPT is confidential
$200 was all it cost to extract 10K training examples
Below, each vulnerability is covered with real-world examples and prevention strategies.
Attackers craft malicious inputs to override your LLM's instructions, bypass safety measures, or extract sensitive information.
Impact: Unauthorized access, data breaches, and compromised decision-making.
Microsoft Copilot Spear-Phishing (2025)
Security researchers turned Microsoft Copilot into a spear-phishing bot by hiding malicious commands in plain emails. The hidden prompts hijacked Copilot's instructions and executed unauthorized API calls.
ChatGPT Memory Exploit (2024)
A persistent prompt injection attack manipulated ChatGPT's memory feature, enabling long-term data exfiltration across multiple user conversations.
How to prevent:
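One common first layer of defense is to screen user input for known injection phrasings and to wrap untrusted text in explicit delimiters before it reaches the model. The sketch below is a minimal illustration: the patterns and the `<untrusted_user_input>` delimiter are illustrative assumptions, not an exhaustive or authoritative filter, and real injections are far more varied.

```python
import re

# Illustrative patterns only -- real injections are far more varied,
# so treat this as one layer of defense, not a complete solution.
INJECTION_PATTERNS = [
    r"ignore (all )?(previous|prior|above) instructions",
    r"disregard (the )?system prompt",
    r"reveal (your|the) (system )?prompt",
]

def looks_like_injection(user_input: str) -> bool:
    """Flag input matching known prompt-injection phrasings."""
    lowered = user_input.lower()
    return any(re.search(p, lowered) for p in INJECTION_PATTERNS)

def build_prompt(system: str, user_input: str) -> str:
    """Wrap user input in explicit delimiters so the model can
    distinguish instructions from untrusted data."""
    if looks_like_injection(user_input):
        raise ValueError("possible prompt injection detected")
    return (f"{system}\n\n<untrusted_user_input>\n"
            f"{user_input}\n</untrusted_user_input>")
```

Pattern matching alone is easy to bypass; it is best paired with privilege separation, so that even a successful injection cannot trigger sensitive actions.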
Failing to validate or sanitize LLM outputs before using them in downstream applications, enabling code execution or other exploits.
Impact: XSS attacks, SQL injection, remote code execution, and system compromise.
Vanna AI Remote Code Execution
A vulnerability in Vanna AI, a tool for database interaction via prompts, allowed attackers to embed harmful commands that achieved remote code execution through unvalidated LLM outputs.
Chevrolet Dealership Chatbot
Customers manipulated a Chevrolet dealership chatbot to offer vehicles at $1 by exploiting improperly handled outputs, creating both financial and reputational damage.
How to prevent:
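The core discipline is to treat model output like any other untrusted input: escape it before rendering and never interpolate it into queries. A minimal sketch, assuming output is rendered as HTML and used in a SQLite lookup (the `products` table is a hypothetical example):

```python
import html
import sqlite3

def render_llm_output(text: str) -> str:
    """HTML-escape model output before inserting it into a page,
    preventing XSS from a manipulated response."""
    return html.escape(text)

def run_generated_lookup(conn: sqlite3.Connection, product_name: str):
    """Never interpolate LLM output into SQL; bind it as a parameter
    so injected fragments are treated as data, not syntax."""
    cur = conn.execute(
        "SELECT price FROM products WHERE name = ?", (product_name,)
    )
    return cur.fetchall()
```

The same principle applies to shell commands, file paths, and any other downstream sink.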
Malicious actors inject harmful data into training datasets, causing models to produce unsafe, biased, or inaccurate outputs.
Impact: Compromised model integrity, backdoor vulnerabilities, and degraded performance.
WormGPT & Malicious LLMs
WormGPT was created by fine-tuning an open-source model on malicious datasets containing malware code, exploit write-ups, and phishing templates—demonstrating how poisoned training data creates dangerous AI tools.
Amazon CodeWhisperer Data Concerns
Amazon warned employees not to share confidential information with AI tools after noticing responses closely resembled sensitive company data that had been used in training.
How to prevent:
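One practical control is data provenance: only train on shards whose digests appear in a vetted manifest. The sketch below is a simplified assumption of that workflow; in practice the manifest would be signed and distributed out of band, not hard-coded.

```python
import hashlib

def shard_digest(data: bytes) -> str:
    """SHA-256 digest used as the identity of a training shard."""
    return hashlib.sha256(data).hexdigest()

def filter_trusted(shards: list[bytes], trusted: set[str]) -> list[bytes]:
    """Keep only shards whose digest appears in the vetted manifest,
    dropping anything injected or tampered with after review."""
    return [s for s in shards if shard_digest(s) in trusted]
```

Provenance checks catch tampering after review; they do not replace content-level vetting of the sources themselves.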
Attackers send resource-intensive requests to overwhelm LLM systems, causing service disruptions and inflated operational costs.
Impact: Service unavailability, degraded user experience, and unexpected cloud bills.
Recursive Query Attacks
Attackers discovered that asking LLMs to perform infinitely recursive tasks or generate extremely long outputs could consume massive compute resources, effectively creating a denial of service.
Token Exhaustion Exploits
Malicious users exploited pay-per-token APIs by crafting prompts that maximized token consumption, resulting in massive unexpected bills for application owners.
How to prevent:
Third-party components, pre-trained models, or external datasets may contain vulnerabilities that compromise your entire application.
Impact: System compromise, data breaches, and inherited security flaws.
Exposed LLM Servers
Security researchers found 30+ vector database servers online without authentication, exposing private emails, customer PII, financial data, and medical records from companies using third-party LLM infrastructure.
GPT Store System Prompt Leaks (2024)
Many custom OpenAI GPTs were vulnerable to prompt injection, causing them to disclose proprietary system instructions and API keys embedded by their creators.
How to prevent:
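One concrete supply-chain control is to pin and verify the checksum of every model or dataset artifact before loading it, just as package managers do for dependencies. A minimal sketch, assuming the expected digest comes from your own dependency manifest:

```python
import hashlib
from pathlib import Path

def verify_artifact(path: Path, expected_sha256: str) -> None:
    """Refuse to load a model or dataset whose digest does not match
    the pinned value from the dependency manifest."""
    digest = hashlib.sha256(path.read_bytes()).hexdigest()
    if digest != expected_sha256:
        raise RuntimeError(f"checksum mismatch for {path.name}")
```

Checksums detect tampering in transit; vetting the vendor and requiring authentication on hosted infrastructure address the rest of the risk described above.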
Models inadvertently leak confidential data, proprietary information, or personal details through their responses.
Impact: Privacy violations, competitive disadvantage, regulatory penalties, and legal liability.
Samsung Source Code Leak (2023)
Samsung engineers pasted proprietary semiconductor source code into ChatGPT for review. The confidential code risked becoming part of the training data and surfacing to other users. Samsung subsequently banned all generative AI tools.
Google Training Data Extraction
Google researchers extracted 10,000+ verbatim training examples from ChatGPT using $200 worth of queries. They obtained names, phone numbers, and addresses by forcing the model to malfunction with repetitive commands.
How to prevent:
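On the input side, one mitigation is to scrub obvious PII from prompts before they leave your network. The regexes below are illustrative assumptions; production redaction typically uses a dedicated PII-detection service rather than hand-rolled patterns.

```python
import re

# Illustrative patterns for emails and US-style phone numbers.
REDACTIONS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "[EMAIL]"),
    (re.compile(r"\b\(?\d{3}\)?[\s-]?\d{3}[\s-]?\d{4}\b"), "[PHONE]"),
]

def redact(text: str) -> str:
    """Scrub obvious PII from a prompt before sending it to an
    external LLM API."""
    for pattern, label in REDACTIONS:
        text = pattern.sub(label, text)
    return text
```

On the output side, the complementary control is filtering model responses for secrets and PII before they reach users or logs.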
LLM plugins that process untrusted inputs without proper access control become attack vectors for code execution and data theft.
Impact: Remote code execution, unauthorized data access, and privilege escalation.
McKinsey Chatbot Hack
Security firm Zenity quickly compromised a chatbot built by McKinsey. The chatbot was persuaded to send customer information from an integrated Salesforce system through its plugin architecture.
Google Bard Document Exfiltration
Attackers embedded malicious prompts in a Google Docs file. When Bard accessed the document through its plugin, it was tricked into exfiltrating sensitive information to external endpoints.
How to prevent:
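A common defense is to dispatch LLM-requested tool calls only through a registry that enforces an allowlist and validates every argument. The sketch below assumes a hypothetical `fetch_order` plugin; the registry shape is illustrative, not a specific framework's API.

```python
from typing import Any

def fetch_order(order_id: str) -> str:
    """Hypothetical plugin: look up an order by numeric ID."""
    return f"order {order_id}"

# Each registered plugin declares its parameters and a validator,
# so LLM-supplied arguments are checked before any code runs.
PLUGINS: dict[str, dict[str, Any]] = {
    "fetch_order": {
        "func": fetch_order,
        "params": {"order_id": lambda v: isinstance(v, str) and v.isdigit()},
    },
}

def call_plugin(name: str, args: dict[str, Any]) -> Any:
    """Dispatch a tool call only if the plugin is registered and
    every argument passes its validator."""
    spec = PLUGINS.get(name)
    if spec is None:
        raise PermissionError(f"unknown plugin: {name}")
    params = spec["params"]
    if set(args) != set(params):
        raise ValueError("unexpected or missing arguments")
    for key, value in args.items():
        if not params[key](value):
            raise ValueError(f"invalid value for {key!r}")
    return spec["func"](**args)
```

Running each plugin with the least privilege it needs (for example, read-only database credentials) limits the blast radius when validation is bypassed.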
LLMs with unchecked autonomy to perform actions can execute harmful operations beyond their intended scope.
Impact: Unauthorized actions, data modification, and unintended system changes.
AI Cyber Espionage Campaign
Anthropic reported the first AI-orchestrated cyber espionage campaign where hackers used Claude to automate attacks. The AI was given excessive permissions that allowed it to execute complex attack chains autonomously.
Autonomous Agent Exploits
Security researchers demonstrated that AI agents with tool access could be manipulated to perform unauthorized file operations, send emails, or make API calls without user consent.
How to prevent:
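The usual control is a human-in-the-loop gate: let the agent run low-risk tools autonomously, but require explicit approval for anything else. A minimal sketch under that assumption (the risk tiers and callback shape are illustrative):

```python
from typing import Callable

# Hypothetical risk tiers: only these tools may run without approval.
LOW_RISK = {"search_docs", "summarize"}

def execute_action(name: str, action: Callable[[], str],
                   approve: Callable[[str], bool]) -> str:
    """Run an agent-requested action, gating high-risk ones behind a
    human approval callback."""
    if name not in LOW_RISK and not approve(name):
        return f"blocked: {name} requires human approval"
    return action()
```

Scoping the agent's credentials so it simply cannot perform actions outside its task is the stronger complement to approval gates.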
Treating LLM outputs as infallible without verification leads to poor decisions based on hallucinations or inaccurate information.
Impact: Flawed decision-making, legal liability, and reputational damage from AI errors.
Legal Brief Hallucinations
A lawyer used ChatGPT to research case law and submitted a brief citing six completely fabricated cases. The court imposed sanctions, and the incident became a cautionary tale about AI overreliance in professional settings.
Medical AI Misdiagnosis
Healthcare providers relying solely on AI diagnostic tools without verification have faced malpractice concerns when models provided incorrect or hallucinated medical recommendations.
How to prevent:
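For cases like the fabricated legal citations above, one mitigation is a verification step: accept an LLM draft only after every citation is confirmed against a trusted index. The sketch below mocks that index as an in-memory set; a real system would query an authoritative database.

```python
# Mock trusted index; a real system would query a legal database.
KNOWN_CASES = {"Smith v. Jones (1999)", "Doe v. Acme Corp. (2010)"}

def unverified_citations(citations: list[str], index: set[str]) -> list[str]:
    """Return citations not found in the trusted index; a non-empty
    result means the draft needs human review before use."""
    return [c for c in citations if c not in index]
```

The same pattern generalizes: route any AI output that drives a consequential decision through an automated check, a human review, or both.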
Attackers steal proprietary models, weights, or architectural details to replicate capabilities or extract training data.
Impact: Loss of competitive advantage, intellectual property theft, and exposed training data.
OmniGPT Breach
OmniGPT, an AI chatbot aggregator, suffered a breach exposing 34 million user messages and thousands of API keys. Attackers sold the data dump for just $100 on the dark web, including proprietary model configurations.
Model Extraction Attacks
Researchers demonstrated that with enough API queries, attackers can reconstruct functional copies of proprietary models through systematic probing, stealing millions of dollars in R&D investment.
How to prevent:
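Extraction attacks like the one described above require large volumes of systematic queries, so monitoring per-key query volume is one practical detection signal. The threshold-based heuristic below is a deliberately simple illustration; real monitoring would also look at query patterns, not just counts.

```python
from collections import Counter

def flag_suspected_extraction(query_counts: Counter, threshold: int) -> list[str]:
    """Return API keys whose query volume exceeds the threshold,
    flagging them for human review as possible extraction probes."""
    return sorted(k for k, n in query_counts.items() if n > threshold)
```

Combined with strict access controls on model weights and rate limits on the API, this raises the cost of reconstructing a proprietary model well above the value of doing so.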