Starter Kit Cookbooks
The Starter Kit for LLM-based App Testing (Starter Kit) is a set of voluntary guidelines developed by IMDA that consolidates rapidly emerging best practices and methodologies for LLM App testing. It covers four key risks commonly encountered in LLM Apps today – hallucination, undesirable content, data disclosure and vulnerability to adversarial prompts.
The Starter Kit includes two parts:
- Testing guidance: Guidance on testing methodologies and on selecting/designing meaningful tests. It also incorporates practical learnings from the Global AI Assurance Pilot, industry workshops, and inputs from the Cyber Security Agency of Singapore (CSA) and the Government Technology Agency of Singapore (GovTech), both of which have developed and conducted AI tests for government agencies and industry.
- Recommended tests: An evolving list of benchmark tests, which are being incorporated into Moonshot iteratively. Please refer to the Starter Kit document for more details. The next few sections include guidance on running some of the publicly available tests.
We recommend using the Starter Kit cookbooks in Moonshot in conjunction with the Starter Kit document for an effective understanding of considerations like test applicability, how to interpret the results, and so on.
This section provides further details on the recommended tests in the Starter Kit that are currently available as Moonshot cookbooks. Upcoming additions are listed in the respective risk sections below.
Currently, each risk area has a cookbook associated with it:
- Hallucination Cookbook
- Undesirable Content Cookbook
- Data Disclosure Cookbook
- Adversarial Prompts Cookbook
The Starter Kit cookbooks follow the same structure as other cookbooks. For a refresher on cookbooks, recipes, datasets and metrics, please visit the Components of Moonshot section here.
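The simplest way to browse and run these cookbooks is through the Moonshot web UI. The snippet below is a minimal sketch for launching it from Python, assuming aiverify-moonshot and moonshot-data are already installed as described in the getting started guide and that `python -m moonshot web` remains the documented entry point; the helper function name is illustrative only.

```python
import subprocess
import sys


def launch_moonshot_web() -> None:
    """Start the Moonshot web API and UI (equivalent to running
    `python -m moonshot web` in a terminal). Once the UI is up, the four
    Starter Kit cookbooks can be selected when setting up a benchmark run."""
    # Assumption: aiverify-moonshot and moonshot-data are installed in the
    # current Python environment; see the getting started guide otherwise.
    subprocess.run([sys.executable, "-m", "moonshot", "web"], check=True)


if __name__ == "__main__":
    launch_moonshot_web()
```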
Hallucination Cookbook
A set of tests that assess factual accuracy, namely general knowledge across different domains (e.g. math, law, sociology) and local facts about Singapore, such as iconic places and political history. Answers are expected to be multiple-choice or short-form. Currently, this cookbook includes the following tests:
| Test | Description | Expected format for the test (prompt/response) | Number of prompts |
|---|---|---|---|
| MMLU | Measures ability to answer accurately across 57 tasks, including elementary mathematics, US history, computer science, law, and more. The higher the grade, the better the system performs. | MCQ with categorical response | 17,487 |
| Facts About Singapore | Tests knowledge related to Singapore across 7 topics, including political history, public housing, transport, etc. Curated in-house. | True/False and MCQ format with categorical response | 276 |
Coming soon
Upcoming tests to be included in this cookbook include SimpleQA, CNN/DailyMail and others.
Undesirable Content Cookbook
A set of tests that assess the tendency to generate undesirable content, meaning content that inflicts harm on individuals, communities, or the public interest, such as socially harmful or legally prohibited content. It includes tests relating to toxicity, information on committing crimes, etc.
| Test | Description | Expected format for the test (prompt/response) | Number of prompts |
|---|---|---|---|
| MLCommons AILuminate - Violent Crimes | Measures ability to respond without enabling, encouraging, or endorsing the commission of violent crimes. A high grade indicates that the system under test handles the response well. | Conversational questions/instructions with open-ended responses, assessed using LLM-as-a-judge. The current cookbook implementation uses GPT-4o as the judge, so an OpenAI API key is required to run the recipe/cookbook (see the sketch after this table and the FAQ section for more details). | 100 |
| MLCommons AILuminate - Non Violent Crimes | Measures ability to respond without enabling, encouraging, or endorsing the commission of non-violent crimes. A high grade indicates that the system under test handles the response well. | Conversational questions/instructions with open-ended responses, assessed using LLM-as-a-judge. The current cookbook implementation uses GPT-4o as the judge, so an OpenAI API key is required to run the recipe/cookbook (see the sketch after this table and the FAQ section for more details). | 100 |
| Singapore Safety Questions | Tests ability to respond safely to prompts related to culturally and legally sensitive situations in Singapore. | Open-ended, assessed using llamaguardannotator | 59 |
| Real Toxicity Prompts | Measures ability to identify toxic sentences. The higher the grade, the less toxic the system is. | Open-ended, assessed using a toxicity classifier | 198,884 |
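As noted in the table above, the MLCommons AILuminate recipes use GPT-4o as the LLM-as-a-judge, so an OpenAI API key must be available before the run. The FAQ section describes exactly where the key should be supplied; the minimal sketch below simply assumes it is read from the OPENAI_API_KEY environment variable (a common convention that may not match your Moonshot configuration) and fails fast if it is missing. The helper name is illustrative.

```python
import os


def ensure_openai_key() -> None:
    """Fail fast if no OpenAI API key is available before starting a run of
    the LLM-as-a-judge recipes. Assumption: the GPT-4o judge reads
    OPENAI_API_KEY from the environment; check the FAQ for the exact
    mechanism (e.g. endpoint configuration) used by your Moonshot version."""
    if not os.environ.get("OPENAI_API_KEY"):
        raise RuntimeError(
            "OPENAI_API_KEY is not set; the AILuminate recipes need an "
            "OpenAI API key for the GPT-4o judge."
        )


if __name__ == "__main__":
    ensure_openai_key()
```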
Coming soon
Upcoming tests to be included in this cookbook include MLCommons AILuminate Hate, XSTest and others.
Data Disclosure Cookbook
A set of tests that assess the tendency to disclose information that is commonly considered sensitive, or to aid requests to obtain such information in an unauthorised manner. It covers a variety of sensitive information types in a realistic conversational style across various sentence formats.
| Test | Description | Expected format for the test (prompt/response) | Number of prompts |
|---|---|---|---|
| MLCommons AILuminate - Privacy | Same approach as the Violent Crimes test above, applied to privacy-related hazards. | Conversational questions/instructions with open-ended responses, assessed using LLM-as-a-judge. The current cookbook implementation uses GPT-4o as the judge, so an OpenAI API key is required to run the recipe/cookbook (please refer to the FAQ section for more details). | 100 |
Coming soon
Upcoming tests to be included in this cookbook include XSTest, SorryBench and others.
Adversarial Prompts Cookbook
A set of tests that assess susceptibility to producing unsafe outputs (which may include incorrect content, undesirable content and/or sensitive information) when presented with intentional prompt attacks. It covers a range of prompt attack techniques across different risk categories.
| Test | Description | Expected format for the test (prompt/response) | Number of prompts |
|---|---|---|---|
| CyberSecEval - Prompt Injections 3 | Measures the model's susceptibility to prompt injections; adapted from the Purple Llama CyberSecEval benchmark. NOTE: this test will soon be upgraded to match CyberSecEval Prompt Injections v4 from PurpleLlama. | Conversational questions/instructions with open-ended responses, assessed using LLM-as-a-judge. The current cookbook implementation uses GPT-4o as the judge, so an OpenAI API key is required to run the recipe/cookbook (please refer to the FAQ section for more details). | 251 |
Coming soon
Upcoming tests to be included in this cookbook include CyberSecEval 4 (version upgrade), Microsoft BIPIA and others.
Refer to the Getting Started guide here for how to run these tests against your model/application.
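For those working in a terminal, the getting started guide also covers Moonshot's command-line interface. The sketch below launches the interactive CLI from Python, assuming `python -m moonshot cli interactive` remains the documented entry point; the command names mentioned in the comments (e.g. list_cookbooks, run_cookbook) should be verified against the CLI reference for your installed version.

```python
import subprocess
import sys


def launch_moonshot_cli() -> None:
    """Start Moonshot's interactive CLI (equivalent to running
    `python -m moonshot cli interactive` in a terminal). From the CLI prompt,
    commands such as list_cookbooks and run_cookbook can be used to find and
    execute the Starter Kit cookbooks; their exact arguments may differ
    between versions, so consult the CLI reference for your installation."""
    subprocess.run(
        [sys.executable, "-m", "moonshot", "cli", "interactive"], check=True
    )


if __name__ == "__main__":
    launch_moonshot_cli()
```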