As GenAI and large language model (LLM) usage has evolved, so has the guidance from industry-leading organizations such as the Open Web Application Security Project (OWASP). OWASP has provided key resources, among them the Top Ten for LLM (Large Language Model) Applications project. Recently, the project released the “LLM AI Cybersecurity & Governance Checklist,” which I’ll be looking at in this analysis.
Checklist Overview
The LLM AI Cybersecurity & Governance Checklist aims to enable leaders to quickly identify key risks associated with GenAI and LLMs and equip them with associated mitigations. OWASP stresses the checklist isn’t exhaustive and will shift as the GenAI and LLM usage progresses and the tools themselves develop and mature.
The checklist starts off with term clarification. Generative AI is defined as a type of machine learning (ML) that focuses on creating new data, while LLMs are a type of AI model used to process and generate human-like text.
There are various threat categories, captured in the image below:
The checklist also lays out a six-step practical approach for organizations to develop their LLM strategy:
There are also various LLM deployment types, each with unique considerations. The types range from public application programming (AP) access and licensed models, all the way to custom models:
With those key considerations out of the way, let’s walk through the checklist areas and offer key takeaways from each.
Adversarial Risk
This area involves competitors and attackers and is focused not only on the attack landscape, but also on the business landscape. This includes understanding how competitors are using AI to drive business outcomes, as well as updating internal processes and policies such as incident response plans (IRPs) to account for GenAI attacks and incidents.
Threat Modeling
Threat modeling is a security technique advocated for by CISA and others that continues to gain traction in the broader push for secure-by-design systems. Threat modeling involves thinking through how attackers can use LLMs and GenAI to accelerate exploitation and considering the business’s ability to detect malicious LLM use. Threat modeling also examines if the organization can safeguard connections to LLM and GenAI platforms from internal systems and environments.
AI Asset Inventory
The adage “you can’t protect what you don’t know you have” applies to GenAI and LLMs. This checklist area involves having an AI asset inventory for both internally developed offerings and external tools and platforms. It’s crucial to understand not only the tools and services being used by the organization, but also the “ownership” in terms of who will be accountable for their use.
There’s also recommendations to include AI components in software bills of material (SBOMs) and catalog AI data sources and their respective sensitivity as well. In addition to having an inventory of existing tools in use, there also should be a process to onboard and off-board future tools and services from the organizational inventory securely.
One of the leading SBOM formats is CycloneDX by OWASP, and in 2023, it announced in its support for “ML-BOMs” (machine learning bills of materials.)
AI Security and Privacy Training
An organization can properly integrate AI security and privacy training into its GenAI and LLM adoption journey. Doing so involves helping staff understand existing GenAI/LLM initiatives, as well as the broader technology, how it functions, and key security considerations, such as data leakage.
Further, it’s essential to establish a culture of trust and transparency so staff feel comfortable sharing what GenAI and LLM tools and services are being used and how. Building this trust and transparency within the organization is crucial to avoiding “shadow AI” usage. Otherwise, people will continue to use these platforms without informing IT and security teams for fear of consequences or punishment.
Establish Business Cases
This one may be surprising, but much like with cloud before it, most organizations don’t establish coherent strategic business cases for using new innovative technologies, including GenAI and LLM. It is easy for businesses to get caught in the hype and feel they need to join the race or get left behind. But without a sound business case, the organization risks poor outcomes, increased risks, and opaque goals.
Ask AI Ecosystem Copilot about this analysis
Governance
Without governance, accountability and clear objectives are nearly impossible. This area of the checklist involves establishing an AI RACI (responsible, accountable, consulted, and informed) chart for the organization’s AI efforts, documenting and assigning who will be responsible for risks and governance and establishing organizational-wide AI policies and processes.
Legal
This area involves an extensive list of activities, such as product warranties involving AI, AI EULAs (end user license agreements), ownership rights for code developed with AI tools, IP risks, and contract indemnification provisions. Businesses should be sure to engage their legal team or experts to determine the various legal-focused activities the organization should be undertaking as part of their adoption and use of GenAI and LLMs.
Regulatory
Regulations such as the EU’s AI Act are rapidly advancing, with others undoubtedly soon to follow. Organizations should be determining their country, state, and government AI compliance requirements, consent around AI use for specific purposes such as employee monitoring and clearly understanding how their AI vendors store and delete data as well as regulate its use.
Using or Implementing LLM Tools
Using LLM tools requires specific risk considerations and controls. The checklist calls out items such as access control, training pipeline security, mapping data workflows, and understanding existing or potential vulnerabilities in the LLM models and supply chains. Additionally, there is a need to request third-party audits, penetration testing, and even code reviews for suppliers, both initially and on an ongoing basis.
Testing, Evaluation, Verification, and Validation (TEVV)
Another key consideration is having sufficient testing, evaluation, and verification throughout the AI model lifecycle. This is why the OWASP AI checklist recommends the NIST TEVV process. It involves establishing continuous testing, evaluation, verification, and validation throughout AI Model lifecycles as well as providing executive metrics on AI model functionality, security, and reliability.
Model Cards and Risk Cards
To ethically deploy LLMs, the checklist calls for the use of model and risk cards, which can be used to let users understand and trust the AI systems as well to openly address potentially negative consequences such as biases and privacy. These cards can include items such as model details, architecture, training data methodologies and performance metrics. There is also emphasis on accounting for responsible AI considerations and concerns around fairness and transparency.
RAG: LLM Optimizations
Retrieval-augmented generation (RAG) is a way to optimize LLM capabilities when it comes to retrieving relevant data from specific sources. It is a part of optimizing pre-trained models or re-training existing models on new data to improve performance. The checklist recommended implementing RAG to maximize the value and effectiveness of LLMs for organizational purposes.
AI Red Teaming
Lastly, the checklist calls out the use of AI red teaming, which is emulating adversarial attacks of AI systems to identify vulnerabilities and validate existing controls and defenses. It does emphasize that red teaming alone isn’t a comprehensive answer or approach to securing GenAI and LLMs but should be part of a comprehensive approach to secure GenAI and LLM adoption.
That said, it is worth noting that organizations need to clearly understand the requirements and ability to red team services and systems of external GenAI and LLM vendors to avoid violating policies or finding themselves in legal trouble.
AI red teaming and penetration testing is also called for in other sources such as those by NIST discussed above, as well as in the EU AI guidance
Conclusion
While not exhausting all potential GenAI and LLMs threats and risk considerations, the OWASP LLM AI Cybersecurity & Governance Checklist represents a concise resource for organizations and security leaders. It can aid practitioners in identifying key threats and ensuring the organization has fundamental security controls in place to help secure and enable the business as it matures in its approach of adopting GenAI and LLM tools, services, and products.