OWASP AI Security Project: Top 10 LLM Vulnerabilities Guide
Artificial intelligence (AI) is kind of a big deal. And when things are a big deal, they're ripe to be exploited. Fortunately, mounting concerns about AI security and privacy are met by plenty of guidance on best practices from the good folks in the open source world.
The OWASP AI Security Project has emerged as a crucial initiative, offering developers clear, actionable guidance on designing, creating, testing, and procuring secure and privacy-preserving AI systems. This comprehensive guide, focusing on Large Language Models (LLMs), addresses the critical vulnerabilities these systems face.
OWASP Top 10 for LLM Applications
The explosive interest in Large Language Models, triggered by the launch of mass-market pre-trained chatbots in late 2022, has been nothing short of revolutionary. Businesses are swiftly incorporating LLMs into their operations and customer-facing products. However, this rapid adoption has often outpaced the development of robust security protocols, leaving many applications exposed to significant risks.
Recognizing the urgent need for a consolidated resource to address these security challenges, OWASP stepped in. Their mission aligns perfectly with promoting safer AI technology adoption, especially for developers grappling with the unique risks associated with LLMs.
The creation of the OWASP Top 10 for LLM Applications list was an ambitious project, leveraging the collective wisdom of a global team comprising nearly 500 experts, including over 125 active contributors. This diverse group brought together insights from AI companies, security firms, ISVs, cloud hyperscalers, hardware providers, and academia.
The OWASP Top 10 for LLM Applications
- Prompt Injection
- Insecure Output Handling
- Training Data Poisoning
- Model Denial of Service
- Supply Chain Vulnerabilities
- Sensitive Information Disclosure
- Insecure Plugin Design
- Excessive Agency
- Overreliance
- Model Theft
Let's delve into each of these vulnerabilities.
LLM01: Prompt Injection
Prompt Injection occurs when an attacker manipulates a large language model through crafted inputs, causing the LLM to take unintended actions. Direct injections overwrite system prompts, while indirect ones manipulate inputs drawn from external sources. This can lead to data exfiltration, social engineering, and other issues.
Prevention strategies include enforcing privilege control on LLM access to backend systems and adding a human in the loop for extended functionality. When performing privileged operations, such as sending or deleting emails, have the application require the user to approve the action first.
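As a minimal sketch of that human-in-the-loop pattern, the Python example below gates privileged actions behind an explicit approval callback. The action names and the console-based confirmation are hypothetical stand-ins for whatever privileged operations and approval flow your application actually exposes.

```python
# Minimal sketch: require explicit user approval before an LLM-proposed
# privileged action runs. Action names and handlers here are hypothetical.

PRIVILEGED_ACTIONS = {"send_email", "delete_email"}

def execute_action(action: str, payload: dict, user_approves) -> str:
    """Run an LLM-proposed action, gating privileged ones behind approval."""
    if action in PRIVILEGED_ACTIONS:
        # Human in the loop: the user must confirm the exact action and payload.
        if not user_approves(action, payload):
            return "Action rejected by user."
    return f"Executed {action} with {payload}"

def console_approval(action: str, payload: dict) -> bool:
    """Simple console-based approval callback for demonstration."""
    answer = input(f"Allow '{action}' with {payload}? [y/N] ")
    return answer.strip().lower() == "y"

if __name__ == "__main__":
    print(execute_action("send_email", {"to": "a@example.com"}, console_approval))
```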
LLM02: Insecure Output Handling
This vulnerability occurs when an LLM output is accepted without scrutiny, exposing backend systems. Misuse may lead to severe consequences like XSS, CSRF, SSRF, privilege escalation, or remote code execution.
To mitigate this risk, treat the model as you would any other user, adopting a zero-trust approach, and apply proper input validation to responses coming from the model before they reach backend functions. Follow the OWASP ASVS (Application Security Verification Standard) guidelines to ensure effective input validation and sanitization.
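For illustration, here is a minimal sketch of treating model output as untrusted: escaping it before rendering and allowing only strictly validated values to reach backend functions. It uses only the Python standard library, and the allowlisted command names are made up for the example.

```python
# Sketch: treat LLM output as untrusted input before it reaches the browser
# or a backend function. The allowed command names are hypothetical.

import html

ALLOWED_COMMANDS = {"summarize", "translate", "classify"}

def sanitize_for_html(llm_output: str) -> str:
    """Escape model output before rendering it in a web page (mitigates XSS)."""
    return html.escape(llm_output)

def validate_command(llm_output: str) -> str:
    """Only pass strictly validated values on to backend functions."""
    candidate = llm_output.strip().lower()
    if candidate not in ALLOWED_COMMANDS:
        raise ValueError(f"Rejected unexpected model output: {candidate!r}")
    return candidate

if __name__ == "__main__":
    print(sanitize_for_html('<script>alert("xss")</script>'))
    print(validate_command("Summarize"))
```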
LLM03: Training Data Poisoning
This occurs when LLM training data is tampered with, introducing vulnerabilities or biases that compromise security, effectiveness, or ethical behavior. Commonly targeted training data sources include Common Crawl, WebText, OpenWebText, and books.
Prevention strategies include verifying the supply chain of the training data, especially when sourced externally, and maintaining attestations via the "ML-BOM" (Machine Learning Bill of Materials) methodology. Use strict vetting or input filters for specific training data or categories of data sources to control the volume of falsified data.
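A heavily simplified sketch of that supply-chain check might verify each training-data file against a manifest of expected digests before it enters the pipeline. A real ML-BOM records far richer provenance; the manifest format and filenames below are hypothetical.

```python
# Simplified sketch: check training-data content against a manifest of
# expected SHA-256 digests before it enters the training pipeline.

import hashlib

TRUSTED_MANIFEST = {
    # filename -> expected SHA-256 hex digest (example entry)
    "corpus_part1.txt": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
}

def verify_dataset(name: str, content: bytes) -> bool:
    """Accept a dataset only if its digest matches the trusted manifest."""
    expected = TRUSTED_MANIFEST.get(name)
    if expected is None:
        return False  # unknown source: reject by default
    return hashlib.sha256(content).hexdigest() == expected

if __name__ == "__main__":
    # The example digest above is the SHA-256 of empty content.
    print("accepted" if verify_dataset("corpus_part1.txt", b"") else "rejected")
```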
LLM04: Model Denial of Service
Attackers cause resource-heavy operations on LLMs, leading to service degradation or high costs. The vulnerability is magnified due to the resource-intensive nature of LLMs and the unpredictability of user inputs.
To prevent this, implement input validation and sanitization, cap resource use per request or step, and enforce API rate limits. Continuously monitor the resource utilization of the LLM to identify abnormal spikes or patterns that may indicate a DoS attack.
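As a rough sketch, the snippet below caps prompt size and applies a simple per-user, per-minute request limit before a call ever reaches the model. The specific limits are illustrative, not recommendations.

```python
# Sketch: cap prompt size and enforce a simple per-user rate limit before a
# request ever reaches the model. The limits here are illustrative only.

import time
from collections import defaultdict, deque

MAX_PROMPT_CHARS = 4_000        # reject oversized inputs outright
MAX_REQUESTS_PER_MINUTE = 20    # per-user request ceiling

_request_log: dict[str, deque] = defaultdict(deque)

def admit_request(user_id: str, prompt: str) -> bool:
    """Return True only if the request is within size and rate limits."""
    if len(prompt) > MAX_PROMPT_CHARS:
        return False
    now = time.monotonic()
    window = _request_log[user_id]
    # Drop entries older than 60 seconds, then check the remaining count.
    while window and now - window[0] > 60:
        window.popleft()
    if len(window) >= MAX_REQUESTS_PER_MINUTE:
        return False
    window.append(now)
    return True

if __name__ == "__main__":
    print(admit_request("alice", "Summarize this document."))
```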
LLM05: Supply Chain Vulnerabilities
The LLM application lifecycle can be compromised by vulnerable components or services, leading to security attacks. Using third-party datasets, pre-trained models, and plugins can add vulnerabilities.
Mitigation strategies include carefully vetting data sources and suppliers, including their terms and conditions and privacy policies, and only using trusted suppliers. Implement sufficient monitoring to cover scanning for component and environment vulnerabilities, use of unauthorized plugins, and out-of-date components.
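One small piece of that monitoring could be auditing the installed Python packages against a vetted allowlist, as sketched below. The allowlist contents are hypothetical; in practice the pins would be generated from a reviewed SBOM rather than hard-coded.

```python
# Sketch: compare installed Python packages against a vetted allowlist so
# unexpected or out-of-date components are flagged.

from importlib import metadata

VETTED_PACKAGES = {
    "requests": "2.31.0",   # example pins; generate these from a reviewed SBOM
    "numpy": "1.26.4",
}

def audit_environment() -> list[str]:
    """Flag installed packages that are missing from, or drift from, the allowlist."""
    findings = []
    for dist in metadata.distributions():
        name = (dist.metadata["Name"] or "").lower()
        pinned = VETTED_PACKAGES.get(name)
        if pinned is None:
            findings.append(f"unvetted package: {name} {dist.version}")
        elif dist.version != pinned:
            findings.append(f"version drift: {name} {dist.version} (expected {pinned})")
    return findings

if __name__ == "__main__":
    for finding in audit_environment():
        print(finding)
```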
LLM06: Sensitive Information Disclosure
LLMs may inadvertently reveal confidential data in their responses, leading to unauthorized data access, privacy violations, and security breaches. It's crucial to implement data sanitization and strict user policies to mitigate this.
Prevention strategies include integrating adequate data sanitization and scrubbing techniques to prevent user data from entering the training model data. Implement robust input validation and sanitization methods to identify and filter out potential malicious inputs to prevent the model from being poisoned.
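As a minimal illustration of data scrubbing, the sketch below redacts a few obvious PII patterns before text is logged or reused for training. Regexes this simple will miss a great deal; they stand in for a dedicated PII-detection service.

```python
# Minimal sketch: scrub obvious PII patterns from user text before it is
# logged or fed back into training data. A production system would use a
# dedicated PII-detection service rather than these simple regexes.

import re

PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def scrub(text: str) -> str:
    """Replace matched PII spans with a redaction placeholder."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[REDACTED {label.upper()}]", text)
    return text

if __name__ == "__main__":
    print(scrub("Contact jane.doe@example.com, SSN 123-45-6789."))
```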
LLM07: Insecure Plugin Design
LLM plugins can have insecure inputs and insufficient access control. This lack of application control makes them easier to exploit and can result in consequences like remote code execution.
To mitigate this, plugins should enforce strict parameterized input wherever possible and include type and range checks on inputs. Plugins should be designed to minimize the impact of any insecure input parameter exploitation following the OWASP ASVS Access Control Guidelines.
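To make the idea of strict parameterized input concrete, here is a sketch of a hypothetical weather-lookup plugin that accepts only typed, range-checked arguments instead of free-form strings from the model.

```python
# Sketch: a plugin that accepts only strictly parameterized, type- and
# range-checked input. The plugin itself (a weather lookup) is hypothetical.

from dataclasses import dataclass

@dataclass(frozen=True)
class WeatherQuery:
    latitude: float
    longitude: float
    days: int

def parse_query(raw: dict) -> WeatherQuery:
    """Validate raw arguments supplied by the LLM before the plugin runs."""
    lat = float(raw["latitude"])
    lon = float(raw["longitude"])
    days = int(raw["days"])
    if not -90.0 <= lat <= 90.0 or not -180.0 <= lon <= 180.0:
        raise ValueError("coordinates out of range")
    if not 1 <= days <= 7:
        raise ValueError("forecast window must be 1-7 days")
    return WeatherQuery(lat, lon, days)

if __name__ == "__main__":
    print(parse_query({"latitude": "48.85", "longitude": "2.35", "days": "3"}))
```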
LLM08: Excessive Agency
LLM-based systems may undertake actions that lead to unintended consequences. The issue stems from excessive functionality, permissions, or autonomy being granted to the LLM-based system.
Prevention strategies include limiting the plugins/tools that LLM agents are allowed to call to only the minimum functions necessary. Limit the permissions that LLM plugins/tools are granted to other systems to the minimum necessary in order to limit the scope of undesirable actions.
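A small sketch of that principle: register only the tools the agent genuinely needs and refuse everything else at dispatch time. The tool names and handlers below are hypothetical.

```python
# Sketch: restrict which tools an LLM agent may invoke to a small, explicitly
# registered set, so the model cannot reach arbitrary functions.

from typing import Callable

ALLOWED_TOOLS: dict[str, Callable[[str], str]] = {}

def register_tool(name: str):
    """Decorator that makes a function callable by the agent."""
    def wrapper(fn: Callable[[str], str]) -> Callable[[str], str]:
        ALLOWED_TOOLS[name] = fn
        return fn
    return wrapper

@register_tool("read_calendar")
def read_calendar(arg: str) -> str:
    return f"(read-only calendar lookup for {arg})"

def dispatch(tool_name: str, argument: str) -> str:
    """Only allow-listed tools can be reached; everything else is refused."""
    tool = ALLOWED_TOOLS.get(tool_name)
    if tool is None:
        return f"Tool '{tool_name}' is not permitted."
    return tool(argument)

if __name__ == "__main__":
    print(dispatch("read_calendar", "today"))
    print(dispatch("delete_files", "/tmp"))   # refused: never registered
```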
LLM09: Overreliance
Systems or people overly depending on LLMs without oversight may face misinformation, miscommunication, legal issues, and security vulnerabilities due to incorrect or inappropriate content generated by LLMs.
To mitigate this, regularly monitor and review the LLM outputs. Use self-consistency or voting techniques to filter out inconsistent text. Cross-check the LLM output with trusted external sources. This additional layer of validation can help ensure the information provided by the model is accurate and reliable.
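The self-consistency idea can be sketched as sampling several answers and keeping the majority, flagging disagreement for human review. The ask_model function below is a placeholder for a real LLM client call.

```python
# Sketch: a simple self-consistency check that samples several answers and
# keeps the majority result, flagging disagreement for human review.

import random
from collections import Counter

def ask_model(question: str) -> str:
    # Placeholder for an actual LLM API call; returns varying answers here.
    return random.choice(["42", "42", "41"])

def self_consistent_answer(question: str, samples: int = 5) -> tuple[str, bool]:
    """Return (majority answer, whether the vote was unanimous)."""
    answers = [ask_model(question) for _ in range(samples)]
    top, count = Counter(answers).most_common(1)[0]
    return top, count == samples

if __name__ == "__main__":
    answer, unanimous = self_consistent_answer("What is 6 * 7?")
    if not unanimous:
        print(f"Low confidence, flag for review: {answer}")
    else:
        print(f"Consistent answer: {answer}")
```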
LLM10: Model Theft
This involves unauthorized access, copying, or exfiltration of proprietary LLM models. The impact includes economic losses, compromised competitive advantage, and potential access to sensitive information.
Prevention strategies include implementing strong access controls (e.g., RBAC and rule of least privilege) and strong authentication mechanisms to limit unauthorized access to LLM model repositories and training environments. Regularly monitor and audit access logs and activities related to LLM model repositories to detect and respond to any suspicious or unauthorized behavior promptly.
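As an illustrative sketch, the snippet below puts a role-based access check and an audit log in front of a model artifact store. The roles, users, and model identifiers are hypothetical.

```python
# Sketch: role-based access checks plus audit logging in front of a model
# artifact store. Roles, users, and model identifiers are hypothetical.

import logging

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")
audit_log = logging.getLogger("model_repo.audit")

ROLE_PERMISSIONS = {
    "ml-engineer": {"read"},
    "ml-admin": {"read", "write", "export"},
}
USER_ROLES = {"alice": "ml-admin", "bob": "ml-engineer"}

def access_model(user: str, action: str, model_id: str) -> bool:
    """Allow an action only if the user's role grants it; audit every attempt."""
    role = USER_ROLES.get(user)
    allowed = action in ROLE_PERMISSIONS.get(role, set())
    audit_log.info("user=%s role=%s action=%s model=%s allowed=%s",
                   user, role, action, model_id, allowed)
    return allowed

if __name__ == "__main__":
    access_model("bob", "export", "prod-llm-v3")    # denied and logged
    access_model("alice", "export", "prod-llm-v3")  # allowed and logged
```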
In conclusion, as AI and LLMs continue to evolve and become more integrated into our daily lives and business operations, understanding and addressing these top 10 vulnerabilities is crucial for developers. By implementing robust security measures and staying informed about emerging threats and best practices, developers can create AI systems that are not only powerful and efficient but also secure and trustworthy.
OWASP AI and LLM FAQs
Q: What is the OWASP AI Security Project?
A: The OWASP AI Security Project is a comprehensive guide that provides clear and actionable insights on designing, creating, testing, and procuring secure and privacy-preserving AI systems. It focuses on Large Language Models (LLMs) and addresses key vulnerabilities and security concerns in AI applications.
Q: How was the OWASP Top 10 for LLM Applications list created?
A: The list was created by an international team of nearly 500 experts with over 125 active contributors from diverse backgrounds including AI companies, security companies, and academia. They brainstormed for a month, proposed 43 distinct threats, and through multiple rounds of voting, refined these to the ten most critical vulnerabilities. Each vulnerability was then scrutinized by dedicated sub-teams and subjected to public review.
Q: What is Prompt Injection and how can it be prevented?
A: Prompt Injection occurs when an attacker manipulates an LLM through crafted inputs, causing unintended actions. It can lead to data exfiltration and social engineering. Prevention strategies include enforcing privilege control on LLM access to backend systems and adding human oversight for extended functionality, especially for privileged operations.
Q: What is Training Data Poisoning and how can it be mitigated?
A: Training Data Poisoning occurs when LLM training data is tampered with, introducing vulnerabilities or biases that compromise security, effectiveness, or ethical behavior. To mitigate this, developers should verify the supply chain of training data, maintain attestations via the "ML-BOM" methodology, and use strict vetting or input filters for specific training data or categories of data sources.
Q: What is Model Theft and how can it be prevented?
A: Model Theft involves unauthorized access, copying, or exfiltration of proprietary LLM models, leading to economic losses and compromised competitive advantage. Prevention strategies include implementing strong access controls and authentication mechanisms, regularly monitoring and auditing access logs, and responding promptly to any suspicious or unauthorized behavior related to LLM model repositories.
Q: How can developers address the issue of Overreliance on LLMs?
A: Overreliance occurs when systems or people depend too heavily on LLMs without proper oversight, potentially leading to misinformation and security vulnerabilities. To mitigate this, developers should regularly monitor and review LLM outputs, use self-consistency or voting techniques to filter out inconsistent text, and cross-check LLM output with trusted external sources for validation.