Run Moonshot using Command Line (CLI)
(CLI) How to Create Custom Benchmark Tests
In this section, you will learn how to run a benchmark in Moonshot. Benchmarks are sets of "exam questions" that help evaluate the capabilities and safety of an AI system.
- Change directory to the root directory of Moonshot.
- Enter the following command to enter the CLI interactive mode:
python -m moonshot cli interactive
- Choose a benchmark type to run and view help:
Warning
Important information before running your benchmark:
Certain benchmarks may require metrics that connect to a particular model (e.g. MLCommons cookbooks and recipes such as mlc-cae use the metric llamaguardannotator, which requires the API token of the together-llama-guard-7b-assistant endpoint).
Refer to this list for the requirements.
- Recipe:
To find out more about the required fields to run a recipe:
run_recipe -h
To run the help example, enter:
run_recipe "my new recipe runner" "['bbq','mmlu']" "['openai-gpt35-turbo']" -n 1 -r 1 -s "You are an intelligent AI"
- Cookbook:
To find out more about the required fields to run a cookbook:
run_cookbook -h
To run the help example, enter:
run_cookbook "my new cookbook runner" "['chinese-safety-cookbook']" "['openai-gpt35-turbo']" -n 1 -r 1 -s "You are an intelligent AI"
- View the results:
  - Recipe:
  - Cookbook:
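The quoting in the run_recipe help example above can be easier to follow when the command is taken apart. The sketch below is an illustration only: it assumes the interactive shell splits input in a shell-like way and reads the bracketed arguments as Python list literals, which this guide does not state explicitly.

```python
import ast
import shlex

# The help example from this guide, as typed at the interactive prompt.
command = (
    'run_recipe "my new recipe runner" "[\'bbq\',\'mmlu\']" '
    '"[\'openai-gpt35-turbo\']" -n 1 -r 1 -s "You are an intelligent AI"'
)

# Shell-style splitting keeps each double-quoted argument as one token.
tokens = shlex.split(command)

runner_name = tokens[1]
recipes = ast.literal_eval(tokens[2])    # quoted Python list of recipe IDs
endpoints = ast.literal_eval(tokens[3])  # quoted Python list of endpoint IDs
options = tokens[4:]                     # the -n, -r and -s flags with values

print(runner_name)
print(recipes)
print(endpoints)
print(options)
```

Without the outer double quotes, the list and the runner name would each be split into several tokens, which is why every multi-word or bracketed argument is quoted as a whole.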
CLI modes
Two modes are available on the Moonshot CLI: Command-Based Mode and Interactive Mode.
Full list of commands in Moonshot
Initialisation
======================================================================================================
interactive Run the interactive shell.
list_connect_types Get a list of available Language Model (LLM) connection types.
list_endpoints Get a list of available Language Model (LLM) endpoints.
version Get the version of the application.
Moonshot Benchmarking
======================================================================================================
add_cookbook Add a new cookbook.
add_endpoint Add a new endpoint.
add_recipe Add a new recipe.
list_cookbooks Get a list of available cookbooks.
list_recipes Get a list of available recipes.
list_results Get a list of available results.
list_runs Get a list of available runs.
resume_run Resume an interrupted run.
run_cookbook Run a cookbook.
run_recipe Run a recipe.
view_cookbook View a cookbook.
view_results View a results file.
Moonshot RedTeaming
=======================================================================================================
end_session End the current session.
list_prompt_templates List all prompt templates available.
list_sessions List all available sessions.
new_session Add a new red teaming session.
use_context_strategy Use a context strategy.
use_prompt_template Use a prompt template.
use_session Use an existing red teaming session.
Uncategorized
======================================================================================================
alias Manage aliases
edit Run a text editor and optionally open a file with it
help List available commands or provide detailed help for a specific command
history View, run, edit, save, or clear previously entered commands
macro Manage macros
quit Exit this application
run_pyscript Run a Python script file inside the console
run_script Run commands in script file that is encoded as either ASCII or UTF-8 text
set Set a settable parameter or show current settings of parameters
shell Execute a command as if at the OS prompt
shortcuts List available shortcuts
Command-based Mode
In command-based mode, run a command by prepending python -m moonshot cli to it.
For example,
- To list all the available commands:
python -m moonshot cli help
- To list the connector types available:
python -m moonshot cli list_connect_types
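Command-based mode also lends itself to scripting. The following is a minimal sketch, assuming Moonshot is installed in the same Python environment as the script; the helper names moonshot_cli_args and run_moonshot_cli are hypothetical and not part of Moonshot.

```python
import subprocess
import sys


def moonshot_cli_args(*command: str) -> list[str]:
    """Return the argument list for one command-based invocation."""
    return [sys.executable, "-m", "moonshot", "cli", *command]


def run_moonshot_cli(*command: str) -> str:
    """Run one CLI command and return its standard output as text."""
    completed = subprocess.run(
        moonshot_cli_args(*command),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout


# This only builds the argument list; call run_moonshot_cli("help") to execute.
print(moonshot_cli_args("list_connect_types")[1:])
```

Passing the arguments as a list avoids a second layer of shell quoting, which matters for commands that take quoted list arguments.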
Interactive Mode
We recommend the interactive mode for a more efficient experience, especially if you are using Moonshot to red-team.
To enter interactive mode, run python -m moonshot cli interactive. You should see the command prompt change to moonshot >.
For example,
- To list all the available commands:
help
- To list the connector types available:
list_connect_types
Add Your Own Benchmark Tests
In this section, we will go through the steps required to add a new test using the CLI.
You will learn how to:
- Add a new dataset into Moonshot
- Add a new recipe to run a benchmark
- Add a new cookbook to run a set of benchmarks
Launch Moonshot CLI
You can launch Moonshot CLI by running the following command:
python -m moonshot cli interactive
Create a New Cookbook
We can also create a new cookbook with our new recipe. A cookbook in Moonshot is a curated collection of recipes. It is useful when you want to group tests of a certain type into a single execution.
Add Cookbook
To add a new cookbook, use the add_cookbook command.
The fields are as follows for this example:
- Name (A unique name for the cookbook):
My new cookbook
- Description (A detailed explanation of the cookbook's purpose and the types of recipes it contains):
I am cookbook description
- Recipes (A list of recipe names that are included in the cookbook. Each recipe represents a specific test or benchmark):
['my-new-recipe','auto-categorisation']
You can also view the description of this command using the following command:
add_cookbook -h
Use the following command to create a new cookbook with your newly created recipe:
add_cookbook 'My new cookbook' 'I am cookbook description' "['my-new-recipe','auto-categorisation']"
View Cookbook
Enter the following command to view your newly created cookbook:
view_cookbook my-new-cookbook
Create a New Recipe
To run the new Moonshot-compatible dataset that you have created in moonshot-data/datasets, we must first create a new recipe.
Note
A recipe contains all the details required to run a benchmark. A recipe guides Moonshot on what data to use, and how to evaluate the model's responses.
Add Recipe
In Moonshot CLI, the user can use add_recipe to add a new recipe in Moonshot. The parameters of the command are shown below:
- Name (A unique name for the recipe):
My new recipe
- Description (An explanation of what the recipe does and what it's for):
I am recipe description
- Categories (Broader classifications that help organize recipes into collections):
['category1','category2']
- Datasets (The data that will be used when running the recipe. This could be a set of prompts, questions, or any input that the model will respond to):
['bbq-lite-age-ambiguous']
- Metrics (Criteria or measurements used to evaluate the model's responses, such as accuracy, fluency, or adherence to a prompt):
['bertscore','bleuscore']
- Prompt Templates (Optional pre-prompt or post-prompt):
['analogical-similarity','mmlu']
- Tags (Optional keywords that categorize the recipe, making it easier to find and group with similar recipes):
['tag1','tag2']
- Attack Strategies (Optional components that introduce adversarial testing scenarios to probe the model's robustness):
['charswap_attack']
- Grading Scale (Optional set of thresholds or criteria used to grade or score the model's performance):
{'A':[80,100],'B':[60,79],'C':[40,59],'D':[20,39],'E':[0,19]}
You can also view the description of this command using the following command:
add_recipe -h
Add a new recipe using the dataset that you have created in the previous section using the following command:
add_recipe 'My new recipe' 'I am recipe description' "['category1','category2']" "['bbq-lite-age-ambiguous']" "['bertscore','bleuscore']" -p "['analogical-similarity','mmlu']" -t "['tag1','tag2']" -g "{'A':[80,100],'B':[60,79],'C':[40,59],'D':[20,39],'E':[0,19]}"
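To make the grading scale concrete, here is a minimal sketch of how a numeric score could be mapped to a letter grade. It assumes each pair is an inclusive [low, high] range; how Moonshot itself treats scores that fall between two ranges is not described in this guide, so the helper grade_for is purely illustrative.

```python
from typing import Optional

# The grading scale passed with -g in the example above.
GRADING_SCALE = {
    "A": [80, 100],
    "B": [60, 79],
    "C": [40, 59],
    "D": [20, 39],
    "E": [0, 19],
}


def grade_for(score: float, scale: dict) -> Optional[str]:
    """Return the first grade whose inclusive [low, high] range holds score."""
    for grade, (low, high) in scale.items():
        if low <= score <= high:
            return grade
    return None  # no range matched, e.g. 79.5 lies between 'B' and 'A'


print(grade_for(85, GRADING_SCALE))
print(grade_for(19, GRADING_SCALE))
print(grade_for(79.5, GRADING_SCALE))
```

The last call shows a side effect of integer boundaries: a fractional score can fall into a gap, so you may want contiguous ranges if your metrics return non-integer values.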
View Recipe
Once created, you can view your recipe using view_recipe.
Note
The ID of the recipe is created by slugifying the name. In this case, the ID of this recipe is my-new-recipe.
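As a rough illustration of what slugifying means here, the sketch below lower-cases a name and joins its alphanumeric runs with hyphens. This is an approximation written for this guide, not Moonshot's own implementation, which may handle Unicode or other edge cases differently.

```python
import re


def slugify(name: str) -> str:
    """Lower-case the name and join its alphanumeric runs with hyphens."""
    words = re.findall(r"[a-z0-9]+", name.lower())
    return "-".join(words)


print(slugify("My new recipe"))
print(slugify("My new cookbook"))
print(slugify("OpenAI GPT3.5 Turbo 1106"))
```

The results match the IDs used elsewhere in this guide, such as my-new-recipe and openai-gpt3-5-turbo-1106.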
Create a New Dataset
You can convert your raw dataset into a Moonshot-compatible dataset using the following schema. Once you have created the new dataset, save the file in moonshot-data/datasets so that Moonshot can access it.
Use your favourite text editor to save the following JSON data in moonshot-data/datasets/example-dataset.json.
{
  "name": "Fruits Dataset",
  "description": "Measures whether the model knows what is a fruit",
  "license": "MIT license",
  "reference": "",
  "examples": [
    {
      "input": "Is Lemon a Fruit? Answer Yes or No.",
      "target": "Yes."
    },
    {
      "input": "Is Apple a Fruit? Answer Yes or No.",
      "target": "Yes."
    },
    {
      "input": "Is Bak Choy a Fruit? Answer Yes or No.",
      "target": "No."
    },
    {
      "input": "Is Bak Kwa a Fruit? Answer Yes or No.",
      "target": "No."
    },
    {
      "input": "Is Dragonfruit a Fruit? Answer Yes or No.",
      "target": "Yes."
    },
    {
      "input": "Is Orange a Fruit? Answer Yes or No.",
      "target": "Yes."
    },
    {
      "input": "Is Coke Zero a Fruit? Answer Yes or No.",
      "target": "No."
    }
  ]
}
The name of the dataset is the unique identifier for the dataset. This will be used in the recipes.
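Before saving a dataset file, it can help to check it against the schema shown above. The sketch below is a hypothetical pre-flight check: the required keys are simply those that appear in the example, and dataset_problems is not a Moonshot function. Moonshot's own loader may accept or require more than this.

```python
import json

# Keys taken from the example dataset in this guide.
REQUIRED_KEYS = ("name", "description", "license", "reference", "examples")


def dataset_problems(dataset: dict) -> list[str]:
    """Return a list of human-readable problems; empty means none found."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in dataset]
    for index, example in enumerate(dataset.get("examples", [])):
        for field in ("input", "target"):
            if not isinstance(example.get(field), str):
                problems.append(f"examples[{index}]: '{field}' must be a string")
    return problems


raw = """
{
  "name": "Fruits Dataset",
  "description": "Measures whether the model knows what is a fruit",
  "license": "MIT license",
  "reference": "",
  "examples": [
    {"input": "Is Lemon a Fruit? Answer Yes or No.", "target": "Yes."},
    {"input": "Is Bak Choy a Fruit? Answer Yes or No.", "target": "No."}
  ]
}
"""

good = dataset_problems(json.loads(raw))
bad = dataset_problems({"name": "Broken", "examples": [{"input": "Hi"}]})

print(good)
print(bad)
```

Parsing the file with json.loads first also catches syntax slips such as a trailing comma, which are easy to introduce when editing JSON by hand.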
Note
You can also refer to this Jupyter notebook example for more details on how a dataset can be created.
Connecting Endpoints
In this section, we will be going through the steps required to create a connector endpoint.
Before we jump into executing tests and performing red teaming on LLMs, we have to first create a connector endpoint. This connector endpoint will help us to connect to a specific LLM.
The following steps are done in the CLI's interactive mode. To activate interactive mode, enter:
python -m moonshot cli interactive
Using an Existing Connector Endpoint
- To view the available connector endpoints, enter:
list_endpoints
You will see a list of available connector endpoints that we have created beforehand:
- If there is no suitable connector endpoint in this list, you can create your own (see Creating a Connector Endpoint). Otherwise, enter the following command to learn how to modify the connector endpoint you want to use (e.g., adding your own API key):
update_endpoint -h
You should see a help example:
update_endpoint openai-gpt4 "[('name', 'my-special-openai-endpoint'), ('uri', 'my-uri-loc'), ('token', 'my-token-here'), ('params', {'hello': 'world'})]"
Here, we are updating a connector endpoint with the ID openai-gpt4. The keys and values to be updated are tuples in a list (e.g. update the key name with the value my-special-openai-endpoint).
- After you have used the update_endpoint command to update your connector endpoint, enter the following command to view your updated connector endpoint:
view_endpoint openai-gpt4
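The update argument in the help example is a quoted list of (key, value) tuples. The sketch below shows how such a string could be read and merged into a record. The endpoint dictionary is a hypothetical stand-in for a stored connector endpoint, and reading the argument with ast.literal_eval is an assumption about how it is interpreted.

```python
import ast

# A hypothetical endpoint record; the field names follow the help example.
endpoint = {
    "id": "openai-gpt4",
    "name": "OpenAI GPT4",
    "uri": "",
    "token": "",
    "params": {},
}

# The quoted argument from the help example: a list of (key, value) tuples.
update_arg = (
    "[('name', 'my-special-openai-endpoint'), ('uri', 'my-uri-loc'), "
    "('token', 'my-token-here'), ('params', {'hello': 'world'})]"
)

updates = ast.literal_eval(update_arg)

# dict() accepts a list of pairs, so the updates can be merged in one step.
updated = {**endpoint, **dict(updates)}

print(updated["name"])
print(updated["params"])
print(updated["id"])
```

Only the keys named in the list change; in this sketch the ID stays openai-gpt4, which is consistent with the view_endpoint openai-gpt4 command used after the update.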
Creating a Connector Endpoint
- Enter the following command to learn more about how to create a connector endpoint:
add_endpoint -h
You should see a help example:
add_endpoint openai-connector 'OpenAI GPT3.5 Turbo 1106' MY_URI ADD_YOUR_TOKEN_HERE 1 2 'gpt-3.5-turbo-1106' "{'temperature': 0.5}"
In this example, we are creating a connector endpoint for the openai-connector connector type:
- Name of the Connector you want to use:
openai-connector
- Name of your new Connector Endpoint:
OpenAI GPT3.5 Turbo 1106
- Uri:
MY_URI
- API token:
ADD_YOUR_TOKEN_HERE
- Max number of calls made to the endpoint per second:
1
- Max concurrency of the endpoint:
2
- Model of the endpoint you want to connect to:
gpt-3.5-turbo-1106
- Other parameters that this endpoint may need:
  - Temperature: 0.5
To view the list of connector types, enter list_connector_types.
- After you have used the add_endpoint command to create your endpoint, enter the following command to view your newly created connector endpoint:
view_endpoint openai-gpt3-5-turbo-1106
NOTE: The ID (openai-gpt3-5-turbo-1106) of the connector endpoint is created by slugifying the name.
Run Red Teaming Sessions
In this section, we will be going through the steps required to run red teaming sessions.
To run a test, you will need:
- Connector Endpoint - a configuration file to connect to your desired LLM endpoint
- Session - a session allows users to perform manual and automated red teaming on the LLMs, and stores the prompts sent and the responses received.
- Prompt - a prompt that you send to the LLMs in manual red teaming, or a starting prompt that you pass to an attack module before it is sent to the LLMs
The following steps are done in the CLI's interactive mode. To activate interactive mode, enter:
python -m moonshot cli interactive
Create a Connector Endpoint
If you have not already created a connector endpoint, see the Connecting Endpoints section above.
Create a Session
Once your connector endpoint is created, we can start creating our session for red teaming.
Every session must reside in a runner. Before we create a session, enter the following command to view a list of the runners currently available:
list_runners
There are two options to create a session: you can either use an existing runner, or create a new runner with a session. To better understand its usage, enter the following command:
new_session -h
- Use an existing runner.
Example:
new_session my-test-mrt -c add_previous_prompt -p mmlu
  - Runner ID:
  my-test-mrt
  - Context strategy:
  add_previous_prompt
  - Prompt template:
  mmlu
NOTE: Context strategy and prompt template are optional and can be set later, so you can omit the -c and -p flags if you do not need them.
- Create a new runner.
Example:
new_session my-new-runner-test-mrt -e "['openai-gpt35-turbo','openai-gpt4']" -p phrase-relatedness
  - Runner ID:
  my-new-runner-test-mrt
  - Endpoint:
  ['openai-gpt35-turbo','openai-gpt4']
  - Prompt template:
  phrase-relatedness
Once you have a session created and activated, we can proceed with red teaming. There are two ways to perform red teaming: manual red teaming and using attack modules to perform automated attacks.
Manual Red Teaming
From the previous section, you should have a session created and activated. For manual red teaming, you can start by typing something in the session and that prompt will be sent to all the LLMs in that session.
NOTE: Anything entered in a session that is not a command will be considered a prompt and sent to the LLMs in that session!
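The rule in the note above can be pictured as a simple check on the first word of each line. The sketch below is a simplified model for illustration, not Moonshot's actual dispatch logic; the command set lists only a few session commands from this guide.

```python
# A few session commands from this guide; the real CLI knows many more.
SESSION_COMMANDS = {
    "end_session",
    "list_sessions",
    "use_context_strategy",
    "use_prompt_template",
    "run_attack_module",
}


def classify_input(line: str) -> str:
    """Return 'command' if the first word is a known command, else 'prompt'."""
    words = line.split()
    if words and words[0] in SESSION_COMMANDS:
        return "command"
    return "prompt"


print(classify_input("use_prompt_template mmlu"))
print(classify_input("What is the capital of France?"))
print(classify_input("end_sesion"))
```

Under this reading, a mistyped command such as end_sesion is not recognised as a command and would be treated as a prompt, so it is worth checking your spelling before pressing Enter inside a session.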
Automated Red Teaming (Run Attack Modules)
We will use the same session from manual red teaming in this example. Enter the following command to find out more about running attack modules:
run_attack_module -h
You should see a help example:
run_attack_module charswap_attack "The grandpa and his son went to does not know how to use a phone?\na)Grandpa\nb)Son."
The fields are as follows for this example:
- Attack module ID:
charswap_attack
- Prompt:
The grandpa and his son went to does not know how to use a phone?\na)Grandpa\nb)Son.
Refer to the More About Automated Red Teaming section for more information on automated red teaming.
Ending and Resuming a Session
End session - Once you are done with red teaming, you can close the session by entering:
end_session
View sessions - You can view your sessions by entering:
list_sessions
Resume session - You can resume a session by entering:
use_session <desired session id>
where <desired session id> is an ID in list_sessions. When you resume a session, the state of your previous red teaming attempts will be restored.
Configurations in a Session
These are the configurations you can set in a session:
- Context strategy: a Python module that adds context to the current prompt (e.g. the previous five prompts sent).
To use a context strategy:
use_context_strategy <desired context strategy id>
You can use the following command to view the list of context strategies available:
list_context_strategies
The <desired context strategy id> should correspond to an Id in list_context_strategies.
  - It is also possible to set the number of previous prompts to use with a context strategy. For example, to add 8 previous prompts as context using add_previous_prompt, use the command:
  use_context_strategy add_previous_prompt -n 8
To clear a context strategy in a session, use:
clear_context_strategy
- Prompt template: a JSON file containing static text that is added to every prompt before it is sent to the LLMs.
To use a prompt template:
use_prompt_template <desired prompt template id>
You can use the following command to view the list of prompt templates available:
list_prompt_templates
The <desired prompt template id> should correspond to an Id in list_prompt_templates.
To clear a prompt template in a session, use:
clear_prompt_template
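To picture how the two configurations shape the text that reaches the LLM, here is a minimal sketch. Everything in it is an assumption made for illustration: the function names, the {prompt} placeholder, and the order in which the context and the template are applied are not taken from Moonshot.

```python
def add_previous_prompts(history: list[str], prompt: str, n: int = 5) -> str:
    """Prefix the current prompt with the last n prompts from the session."""
    previous = history[-n:] if n > 0 else []
    return "\n".join([*previous, prompt])


def apply_template(template: str, prompt: str) -> str:
    """Insert the prompt into a template holding a {prompt} placeholder."""
    return template.replace("{prompt}", prompt)


history = ["What is a fruit?", "Is a tomato a fruit?", "Is a lemon a fruit?"]

# Comparable in spirit to: use_context_strategy add_previous_prompt -n 2
with_context = add_previous_prompts(history, "Is an apple a fruit?", n=2)

# A made-up template that adds an instruction before the prompt.
final_prompt = apply_template("Answer Yes or No.\n{prompt}", with_context)

print(final_prompt)
```

A larger n gives the model more conversational context at the cost of longer prompts, which is the trade-off the -n option controls.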
More About Automated Red Teaming
Currently, automated red teaming heavily relies on the attack module being used. We have created a class, AttackModule, which serves as the base class for creating custom attack modules within the Moonshot framework. This class provides a structure that red teamers can extend to implement their own adversarial attack strategies.
In the AttackModule class, we have simplified the process for red teamers by providing easy access to necessary components for red teaming, such as connector endpoints and a function to automatically wrap the prompt template and context strategy contents around the provided prompt.
The design is very free-form, thus it is entirely up to the attack module developers whether they want to use the functions we have prepared. For instance, they may choose not to use the context strategy and prompt template at all in the attack module, even though these may be set in the session.
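As a flavour of the kind of perturbation an attack module might apply, the sketch below swaps neighbouring characters inside longer words. It is a standalone illustration only: it does not use the AttackModule base class, whose interface is not shown in this guide, and it is not the implementation of charswap_attack.

```python
import random


def swap_adjacent_chars(prompt: str, swaps_per_word: int = 1, seed: int = 0) -> str:
    """Swap neighbouring characters inside each word longer than three letters."""
    rng = random.Random(seed)  # seeded so the perturbation is reproducible
    words = []
    for word in prompt.split(" "):
        chars = list(word)
        if len(chars) > 3:
            for _ in range(swaps_per_word):
                # Pick an inner position so the first and last characters stay.
                i = rng.randrange(1, len(chars) - 2)
                chars[i], chars[i + 1] = chars[i + 1], chars[i]
        words.append("".join(chars))
    return " ".join(words)


original = "The grandpa does not know how to use a phone"
perturbed = swap_adjacent_chars(original)

print(perturbed)
```

In a real attack module, each perturbed prompt would then be sent to the connector endpoints in the session, optionally wrapped with the session's prompt template and context strategy.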
List of CLI Commands
Command | Example | Description | Parameters
---|---|---|---
add_cookbook | add_cookbook 'My new cookbook' 'I am cookbook description' "['analogical-similarity','auto-categorisation']" | Add a new cookbook. The 'name' argument will be slugified to create a unique identifier. |
add_recipe | add_recipe 'My new recipe' 'I am recipe description' "['category1','category2']" "['bbq-lite-age-ambiguous']" "['bertscore','bleuscore']" -p "['analogical-similarity','mmlu']" -t "['tag1','tag2']" -g "{'A':[80,100],'B':[60,79],'C':[40,59],'D':[20,39],'E':[0,19]}" | Add a new recipe. The 'name' argument will be slugified to create a unique identifier. |
delete_cookbook | delete_cookbook my-new-cookbook | Delete a cookbook. |
delete_dataset | delete_dataset bbq-lite-age-ambiguous | Delete a dataset. |
delete_metric | delete_metric my-new-metric | Delete a metric. |
delete_recipe | delete_recipe my-new-recipe | Delete a recipe. |
delete_result | delete_result my-new-cookbook-runner | Delete a result. |
delete_runner | delete_runner my-new-cookbook-runner | Delete a runner. |
list_cookbooks | list_cookbooks -f "risk" | List all cookbooks. |
list_datasets | list_datasets -f "bbq" | List all datasets. |
list_metrics | list_metrics -f "exact" | List all metrics. |
list_recipes | list_recipes -f "mmlu" | List all recipes. |
list_results | list_results -f "my-runner" | List all results. |
list_runners | list_runners | List all runners. | -
list_runs | list_runs -f "my-run" | List all runs. |
run_cookbook | run_cookbook "my new cookbook runner" "['chinese-safety-cookbook']" "['openai-gpt35-turbo']" -n 1 -r 1 -s "You are an intelligent AI" | Run a cookbook. |
run_recipe | run_recipe "my new recipe runner" "['bbq','mmlu']" "['openai-gpt35-turbo']" -n 1 -r 1 -s "You are an intelligent AI" | Run a recipe. |
update_cookbook | update_cookbook my-new-cookbook "[('name', 'Updated Cookbook Name'), ('description', 'Updated description'), ('recipes', ['analogical-similarity'])]" | Update a cookbook. |
update_recipe | update_recipe my-new-recipe "[('name', 'Updated Recipe Name'), ('tags', ['fairness', 'bbq'])]" | Update a recipe. |
view_cookbook | view_cookbook my-new-cookbook | View a cookbook. |
view_dataset | view_dataset bbq-lite-age-ambiguous | View a dataset file. |
view_metric | view_metric my-new-metric | View a metric file. |
view_recipe | view_recipe my-new-recipe | View a recipe. |
view_result | view_result my-new-cookbook-runner | View a result file. |
view_run | view_run my-new-cookbook-runner | View a runner's runs. |
view_runner | view_runner my-new-cookbook-runner | View a runner. |
add_endpoint | add_endpoint openai-connector 'OpenAI GPT3.5 Turbo 1106' MY_URI ADD_YOUR_TOKEN_HERE 1 1 "{'temperature': 0.5, 'model': 'gpt-3.5-turbo-1106'}" | Add a new endpoint. The 'name' argument will be slugified to create a unique identifier. |
convert_dataset | convert_dataset 'dataset-name' 'A brief description' 'http://reference.com' 'MIT' '/path/to/your/file.csv' | Convert your dataset. The 'name' argument will be slugified to create a unique identifier. |
delete_endpoint | delete_endpoint openai-gpt4 | Delete an endpoint. |
delete_prompt_template | delete_prompt_template squad-shifts | Delete a prompt template. |
download_dataset | download_dataset 'dataset-name' 'A brief description' 'http://reference.com' 'MIT' "{'dataset_name': 'cais/mmlu', 'dataset_config': 'college_biology', 'split': 'dev', 'input_col': ['question','choices'], 'target_col': 'answer'}" | Download dataset from Hugging Face. The 'name' argument will be slugified to create a unique ID. |
list_connector_types | list_connector_types -f "openai" | List all connector types. |
list_endpoints | list_endpoints -f "gpt" | List all endpoints. |
list_prompt_templates | list_prompt_templates -f "toxicity" | List all prompt templates. |
update_endpoint | update_endpoint openai-gpt4 "[('name', 'my-special-openai-endpoint'), ('uri', 'my-uri-loc'), ('token', 'my-token-here')]" | Update an endpoint. |
view_endpoint | view_endpoint openai-gpt4 | View an endpoint. |
add_bookmark | add_bookmark openai-connector 2 my-bookmarked-prompt | Bookmark a prompt. |
clear_context_strategy | clear_context_strategy | Clear the context strategy set in a session. | -
clear_prompt_template | clear_prompt_template | Clear the prompt template set in a session. | -
delete_attack_module | delete_attack_module sample_attack_module | Delete an attack module. |
delete_bookmark | delete_bookmark my_bookmarked_prompt | Delete a bookmark. |
delete_context_strategy | delete_context_strategy add_previous_prompt | Delete a context strategy. |
delete_session | delete_session my-test-runner | Delete a session. |
end_session | end_session | End the current red teaming session. | -
export_bookmarks | export_bookmarks "my_list_of_exported_bookmarks" | Export bookmarks as a JSON file. |
list_attack_modules | list_attack_modules -f "text" | List all attack modules. |
list_bookmarks | list_bookmarks -f my_bookmark | List all bookmarks. |
list_context_strategies | list_context_strategies -f "previous_prompt" | List all context strategies. |
list_sessions | list_sessions -f "my-sessions" | List all sessions. |
new_session | new_session my-runner -e "['openai-gpt4']" -c add_previous_prompt -p mmlu | Create a new red teaming session. |
run_attack_module | run_attack_module sample_attack_module "this is my prompt" -s "test system prompt" -m bleuscore | Run automated red teaming in the current session. |
show_prompts | show_prompts | Show the prompts in the session. | -
use_bookmark | use_bookmark my_bookmark | Use a bookmarked prompt. |
use_context_strategy | use_context_strategy my_strategy_one | Use a context strategy. |
use_prompt_template | use_prompt_template 'analogical-similarity' | Use a prompt template. |
use_session | use_session 'my-runner' | Use an existing red teaming session by specifying the runner ID. |
view_bookmark | view_bookmark my_bookmarked_prompt | View a bookmark. |