Security Audit

OWASP Large Language Model (LLM) Top 10 Explained

Updated on: July 16, 2024

OWASP Large Language Model (LLM) Top 10 Explained

Since their inception, LLMs or large language models have rapidly integrated into various fields over the past couple of years, giving rise to a new set of security challenges in the field. LLMs like ChatGPT or GitHub’s Co-Pilot are also prone to cyber-attacks, where exploiting a single vulnerability can disrupt thousands of organizations that rely on them for their day-to-day activities.

To address this security gap, OWASP has created a list that helps identify and mitigate the most severe vulnerabilities that could deeply impact LLMs.

The OWASP Large Language Model (LLM) Top 10 List

  1. Prompt Injection
  2. Insecure Output Handling
  3. Training Data Poisoning
  4. Model Denial of Service
  5. Supply Chain Vulnerabilities
  6. Sensitive Information Disclosure
  7. Insecure Plugin Design
  8. Excessive Agency
  9. Overreliance
  10. Model Theft

What is the OWASP LLM Top 10?

The OWASP Large Language Model (LLM) Top 10 lists the most frequent and significant security risks in large language model applications. It aims to educate developers, designers, and organizations about potential security risks that may arise from the deployment of large language models.

OWASP Large Language Model (LLM) Top 10 Explained with Examples

OWASP LLM Top 10 list

LLM01: Prompt Injection

Prompt Injection is an attack in which the attacker uses crafted input prompts to manipulate the LLM to execute unintended actions or extract sensitive information.

It can be performed in two ways:

  1. Direct Prompt Injections, or jailbreaking, is an attack in which the attackers modify or reveal the system prompts, allowing them to interact with the underlying system by exploiting the insecure functions.
  2. Indirect Prompt Injection is a method in which LLMs accept external input, and attackers send specially crafted input prompts that allow them to manipulate the users or the systems associated with the LLM.

Mitigation Suggestions

  • Establish Trust boundaries and treat LLM as an external user
  • Set up proper access control mechanisms to limit access to the backend
  • Separate the external user prompts from the predefined prompts

Example Attack Scenarios for Prompt Injections

An attacker inputs specially crafted prompts to trick the LLM intro into revealing confidential information about the application, such as API keys, or modifying the outputs to perform unintended actions.

Prompt Injection Attack

LLM02: Insecure Output Handling

Insecure Output Handling is a vulnerability that occurs due to insufficient validation and sanitization of input and output and improper handling of the output generated by the LLMs before it is passed on to the applications.

This security weakness can lead to vulnerabilities like XSS, SSRF, CSRF or even remote code execution.

Mitigation Suggestions:

  • Setup Input Validation and Sanitization mechanisms.
  • Implement Output Encoding before directing the output to end users.
  • Implement proper access controls to avoid the processing of sensitive commands or prompts.

Example Attack Scenarios for Insecure Output Handling

An LLM allows the crafting of SQL queries for a database, and the attackers request a query to update the users’ table or delete all the tables. If this query is not validated and sanitized properly, it could delete all the databases associated with the LLMs.

LLM03: Training Data Poisoning

Training Data Poisoning is a weakness that occurs when attackers modify or manipulate the training data with harmful data. This causes the LLM to learn from incorrect or biased data and produce skewed predictions.

This can cause the poisoned data to be served to users or lead to issues like software exploitation, which can harm the brand’s reputation.

Mitigation Suggestions

  • Verify the source of the training data
  • Use strict input filters to allow specific data for training.
  • Monitor and Verify the data for anomalies.

Example Attack Scenarios for Training Data Poisoning

A competitor or attackers create documents with biased information to feed to the targeted LLMs training data, which causes it to unintentionally generate a more biased output, which benefits the attackers.

Training Data Poisoning

LLM04: Model Denial of Service

Model Denial of Service (DoS) is a type of attack in which attackers cause resource-heavy operations to disrupt the availability of the LLM, slowing it down or making it unavailable to users or associated applications.

This could also lead to the model learning from this barrage of inputs and allowing the attacker to manipulate the context window(Input length) set by the LLM.

Mitigation Suggestions

  • Implement proper request throttling or rate-limiting mechanisms.
  • Set a strict input length limit for the context window.
  • Continuously monitor the utilization of resources and restrict excessive utilization.

Example Attack Scenarios for Model Denial of Service

Attackers can flood the LLM with a high volume of long inputs that can reach the limit of the defined context window, causing strain and increased resource usage, making it unresponsive.

LLM05: Supply Chain Vulnerabilities

Supply chain vulnerabilities generally occur in LLM applications when the third-party resources or libraries used in the development introduce external security risks to the applications.

Such a vulnerability can result in the application becoming un-operational and even lead to data breaches.

Mitigation Suggestions

  • Use Verified and secure third-party dependencies
  • Use Model and Code Signing
  • Regularly update third-party components

Example Attack Scenarios for Supply Chain Vulnerabilities

Attackers can manipulate and inject publicly available datasets with malicious inputs to generate a backdoor when it is used to retrain a model. This allows the attackers to perform unauthorized actions or even access sensitive data in that environment.

Supply Chain Vulnerabilities

LLM06: Sensitive Information Disclosure

Sensitive Data Exposure is a vulnerability that can occur when the LLM reveals sensitive information about the system or the algorithms being used in its output. Attackers can access such information to gain unauthorized access to the system.

It can lead to data breaches and loss of Intellectual Property and cause legal penalties for non-compliance and privacy violations.

Mitigation Suggestions

  • Implement data sanitization mechanisms.
  • Implement strict data access policies.
  • Monitor model outputs for information leaks.

Example Attack Scenarios for Sensitive Information Disclosure

Attackers can craft malicious input prompts that exploit the absence or ineffectiveness of the input validation mechanisms deployed in the LLMs to reveal the PII of other application users.

LLM07: Insecure Plugin Design

Insecure Plugin Design occurs when the LLM plugins introduce vulnerabilities into the system. These vulnerabilities can cause the LLMs to accept insecure input prompts or have improper access control mechanisms, making them easier targets for attackers to exploit.

Due to the lack of proper authentication and authorization, the plugins trust the data input via other plugins, causing data leakages, privilege escalation, or remote code execution, which leads to system failure.

Mitigation Suggestions

  • Follow secure coding practices for development.
  • Restrict plugin access to data and functions.
  • Plugins should use proper authentication systems to maintain access controls.

Example Attack Scenarios for Insecure Plugin Design

An insecure plugin can be targeted by attackers to input content generated by other insecure LLMs and perform any unauthorized actions as the plugin assumes the data is being input by an end user.

Insecure Plugin Design

LLM08: Excessive Agency

Excessive Agency is a vulnerability that occurs when the LLMs are given control over crucial functions or are given excessive permissions. These actions can cause unintended damage to the data or the associated applications.

It can deeply impact the confidentiality, integrity, and availability of the associated applications and introduce various vulnerabilities.

Mitigation Suggestions

  • Limit the autonomy and permissions of LLMs
  • Implement human-in-loop controls for critical operations
  • Implement authorization in downstream applications

Example Attack Scenarios for Excessive Agency

Attackers can exploit a plugin developed for executing specific shell commands by feeding it instructions to perform undefined and higher privilege commands to gain unauthorized access or perform unintended functions.

LLM09: Overreliance

Overreliance in LLMs can occur when the output generated by the models is trusted by the users and the applications without any validation or confirmation on whether the generated outputs are accurate.

This overreliance on the LLMs’ outputs can lead to miscommunication or introduce security vulnerabilities due to incorrect outputs.

Mitigation Suggestions

  • Use LLM outputs as recommendations.
  • Review and validate LLM outputs for accuracy.
  • Enhance the model by fine-tuning output quality.

Example Attack Scenarios for Overreliance

A healthcare system designed to provide diagnosis and prescribing medicines, due to overreliance, can lead to incorrect and harmful treatments or procedures and can be fatal for the patients.

LLM10: Model Theft

Model Theft refers to unauthorized access to the LLM, which allows attackers to gain sensitive information about the model, such as its parameters, which can be used to replicate the model and use it for themselves.

This vulnerability leads to financial and reputational loss to organizations and creates mistrust amongst the users of the LLMs.

Mitigation Suggestions

  • Secure LLM with strong encryption suites.
  • Implement robust access control mechanisms.
  • Regularly monitor and audit the access logs.

Example Attack Scenarios for Model Theft

An attacker can query the model repeatedly with selected inputs, collecting sensitive information from the outputs to replicate the model and use it for themselves without access to the original one.

Model Theft

Final Thoughts

The OWASP LLM Top 10 lists critical security risks associated with LLMs. Identifying and mitigating these vulnerabilities helps developers and organizations ensure secure and robust models. 

Practices like role-based access controls, strong cipher suites, input validation, and regular monitoring of the LLMs and their output can help protect them from various cyberattacks.

Nivedita James Palatty

Nivedita is a technical writer with Astra who has a deep love for knowledge and all things curious in nature. An avid reader at heart she found her calling writing about SEO, robotics, and currently cybersecurity.
Subscribe
Notify of
guest

This site uses Akismet to reduce spam. Learn how your comment data is processed.

0 Comments
Inline Feedbacks
View all comments

Psst! Hi there. We’re Astra.

We make security simple and hassle-free for thousands
of websites and businesses worldwide.

Our suite of security products include a vulnerability scanner, firewall, malware scanner and pentests to protect your site from the evil forces on the internet, even when you sleep.

earth spiders cards bugs spiders

Made with ❤️ in USA France India Germany