
Run Moonshot using Command Line (CLI)

(CLI) How to Create Custom Benchmark Tests

In this section, you will learn how to run a benchmark in Moonshot. A benchmark is a set of "exam questions" that helps evaluate the capabilities and safety of an AI system.

  1. Change directory to the root directory of Moonshot.

  2. Enter the following command to enter the CLI interactive mode:

    python -m moonshot cli interactive
    
  3. Choose a benchmark type to run and view help:

    Warning

    Important information before running your benchmark:

    Certain benchmarks may require metrics that connect to a particular model (e.g. MLCommons cookbooks and recipes like mlc-cae use the metric llamaguardannotator, which requires the API token of the together-llama-guard-7b-assistant endpoint).

    Refer to this list for the requirements.

    • Recipe

      To find out more about the required fields to create a recipe:

      run_recipe -h
      

      To run the help example, enter:

      run_recipe "my new recipe runner" "['bbq','mmlu']" "['openai-gpt35-turbo']" -n 1 -r 1 -s "You are an intelligent AI"
      
    • Cookbook:

      To find out more about the required fields to create a cookbook:

      run_cookbook -h
      

      To run the help example, enter:

      run_cookbook "my new cookbook runner" "['chinese-safety-cookbook']" "['openai-gpt35-turbo']" -n 1 -r 1 -s "You are an intelligent AI"
      
  4. View the results:

    • Recipe:

      recipe results

    • Cookbook:

      cookbook results
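Both help examples pass -n 1 (prompt_selection_percentage) and -r 1 (random_seed). The sketch below illustrates the general idea of a seeded percentage sample. It is an assumption about the concept, not Moonshot's actual selection code, and the function name select_prompts is hypothetical.

```python
import random


def select_prompts(prompts, percentage, seed):
    """Return a reproducible random subset covering `percentage` percent of prompts."""
    if not 1 <= percentage <= 100:
        raise ValueError("percentage must be between 1 and 100")
    if not prompts:
        return []
    # Keep at least one prompt so a small percentage never yields an empty run.
    count = max(1, len(prompts) * percentage // 100)
    rng = random.Random(seed)  # private generator; the global random state is untouched
    return rng.sample(prompts, count)


prompts = [f"question {i}" for i in range(200)]
subset = select_prompts(prompts, percentage=1, seed=1)
print(len(subset))  # 2 of the 200 prompts
```

Because the generator is seeded, running the same selection twice returns the same prompts, which is what makes a run with a fixed seed repeatable.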

CLI modes

Two modes are available on the Moonshot CLI: Command-Based Mode and Interactive Mode.

Full list of commands in Moonshot
Initialisation
======================================================================================================
interactive           Run the interactive shell.                                                      
list_connect_types    Get a list of available Language Model (LLM) connection types.                  
list_endpoints        Get a list of available Language Model (LLM) endpoints.                         
version               Get the version of the application.                                             

Moonshot Benchmarking
======================================================================================================
add_cookbook          Add a new cookbook.                                                             
add_endpoint          Add a new endpoint.                                                             
add_recipe            Add a new recipe.                                                               
list_cookbooks        Get a list of available cookbooks.                                              
list_recipes          Get a list of available recipes.                                                
list_results          Get a list of available results.                                                
list_runs             Get a list of available runs.                                                   
resume_run            Resume an interrupted run.                                                      
run_cookbook          Run a cookbook.                                                                 
run_recipe            Run a recipe.                                                                   
view_cookbook         View a cookbook.                                                                
view_results          View a results file.                                                            

Moonshot RedTeaming
=======================================================================================================
end_session            End the current session.                                                        
list_prompt_templates  List all prompt templates available.                                            
list_sessions          List all available sessions.                                                    
new_session            Add a new red teaming session.                                                  
use_context_strategy   Use a context strategy.                                                         
use_prompt_template    Use a prompt template.                                                          
use_session            Use an existing red teaming session.                                            

Uncategorized
======================================================================================================
alias                 Manage aliases                                                                  
edit                  Run a text editor and optionally open a file with it                            
help                  List available commands or provide detailed help for a specific command         
history               View, run, edit, save, or clear previously entered commands                     
macro                 Manage macros                                                                   
quit                  Exit this application                                                           
run_pyscript          Run a Python script file inside the console                                     
run_script            Run commands in script file that is encoded as either ASCII or UTF-8 text       
set                   Set a settable parameter or show current settings of parameters                 
shell                 Execute a command as if at the OS prompt                                        
shortcuts             List available shortcuts                                                

Command-based Mode

In the command-based mode, run commands by prepending python -m moonshot cli.

For example,

  • To list all the available commands: python -m moonshot cli help
  • To list the connector types available: python -m moonshot cli list_connect_types
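If you want to drive command-based mode from a script, a minimal Python sketch could build the argument list as follows. The helper name moonshot_command is hypothetical; it only prepends python -m moonshot cli as described above, and the actual call is left commented out because it requires Moonshot to be installed.

```python
import shlex
import sys


def moonshot_command(command, *args):
    """Build the argv list for a one-off Moonshot CLI call (command-based mode)."""
    return [sys.executable, "-m", "moonshot", "cli", command, *args]


argv = moonshot_command("list_connect_types")
print(shlex.join(argv))

# To actually run it (requires Moonshot to be installed):
#   import subprocess
#   subprocess.run(argv, check=True)
```

Passing a list to subprocess.run avoids shell quoting problems with arguments that contain spaces or quotes.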

Interactive Mode

We recommend the interactive mode for a more efficient experience, especially if you are using Moonshot to red-team.

To enter interactive mode, run python -m moonshot cli interactive. The command prompt should change to moonshot >. For example:

  • To list all the available commands:
    moonshot > help
    
  • To list the connector types available:
    moonshot > list_connect_types
    

Add Your Own Benchmark Tests

In this section, we will go through the steps required to add a new test using the CLI.

You will learn how to:

  • Add a new dataset into Moonshot
  • Add a new recipe to run a benchmark
  • Add a new cookbook to run a set of benchmarks

Launch Moonshot CLI

You can launch Moonshot CLI by running the following command:

python -m moonshot cli interactive

Create a New Cookbook

We can also create a new cookbook with our new recipe. A cookbook in Moonshot is a curated collection of recipes. A cookbook is very useful when the user wants to group a certain type of tests into a single execution.

Add Cookbook

To add a new cookbook, simply run the following command:

moonshot > add_cookbook [name] [description] [recipes]

The fields are as follows for this example:

  • Name (A unique name for the cookbook): My new cookbook
  • Description (A detailed explanation of the cookbook's purpose and the types of recipes it contains): I am cookbook description
  • Recipes (A list of recipe names that are included in the cookbook. Each recipe represents a specific test or benchmark): ['my-new-recipe','auto-categorisation']

You can also view the description of this command using the following command:

moonshot > add_cookbook -h

Use the following command to create a new cookbook with your newly created recipe:

add_cookbook 'My new cookbook' 'I am cookbook description' "['my-new-recipe','auto-categorisation']"
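The recipes argument is a Python-style list written as a string. As a rough sketch, such a string can be produced from a list and parsed back safely with ast.literal_eval. How Moonshot itself parses the argument is an assumption here, and the helper names to_cli_list and from_cli_list are hypothetical.

```python
import ast


def to_cli_list(items):
    """Format a Python list of IDs as the quoted string used in the CLI examples."""
    return "[" + ",".join(f"'{item}'" for item in items) + "]"


def from_cli_list(text):
    """Parse such a string back into a list without evaluating arbitrary code."""
    value = ast.literal_eval(text)
    if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
        raise ValueError("expected a list of strings")
    return value


recipes = ["my-new-recipe", "auto-categorisation"]
arg = to_cli_list(recipes)
print(arg)  # ['my-new-recipe','auto-categorisation']
```

The formatter assumes IDs contain no single quotes, which holds for slugified IDs such as the ones used in this guide.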

View Cookbook

Enter the following command to view your newly created cookbook:

view_cookbook my-new-cookbook

cookbook added

Create a New Recipe

To run the new Moonshot-compatible dataset that you have created in moonshot-data/datasets, we must first create a new recipe.

Note

A recipe contains all the details required to run a benchmark. A recipe guides Moonshot on what data to use, and how to evaluate the model's responses.

Add Recipe

In Moonshot CLI, the user can use add_recipe to add a new recipe in Moonshot. The parameters of the command are shown below:

  • Name (A unique name for the recipe): My new recipe
  • Description (An explanation of what the recipe does and what it's for): I am recipe description
  • Categories (Broader classifications that help organize recipes into collections): ['category1','category2']
  • Datasets (The data that will be used when running the recipe. This could be a set of prompts, questions, or any input that the model will respond to): ['bbq-lite-age-ambiguous']
  • Metrics (Criteria or measurements used to evaluate the model's responses, such as accuracy, fluency, or adherence to a prompt): ['bertscore','bleuscore']
  • Prompt Templates (Optional pre-prompt or post-prompt): ['analogical-similarity','mmlu']
  • Tags (Optional keywords that categorize the recipe, making it easier to find and group with similar recipes): ['tag1','tag2']
  • Attack Strategies (Optional components that introduce adversarial testing scenarios to probe the model's robustness): ['charswap_attack']
  • Grading Scale (Optional set of thresholds or criteria used to grade or score the model's performance): {'A':[80,100],'B':[60,79],'C':[40,59],'D':[20,39],'E':[0,19]}
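To make the grading scale concrete, the sketch below maps a score to a letter using the example scale. It assumes each range is inclusive at both ends, which the text above does not state, and the function name grade_for is hypothetical.

```python
GRADING_SCALE = {"A": [80, 100], "B": [60, 79], "C": [40, 59], "D": [20, 39], "E": [0, 19]}


def grade_for(score, scale=GRADING_SCALE):
    """Return the first grade whose inclusive [low, high] range contains the score."""
    for grade, (low, high) in scale.items():
        if low <= score <= high:
            return grade
    return None  # e.g. 79.5 falls in the gap between the B and A ranges


print(grade_for(85), grade_for(60), grade_for(19))  # A B E
```

Note that integer-bounded ranges leave gaps for fractional scores, so a scale meant for non-integer metrics may need adjoining bounds.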

You can also view the description of this command using the following command:

moonshot > add_recipe -h

Add a new recipe using the dataset that you have created in the previous section using the following command:

add_recipe 'My new recipe' 'I am recipe description' "['category1','category2']" "['bbq-lite-age-ambiguous']" "['bertscore','bleuscore']" -p "['analogical-similarity','mmlu']" -t "['tag1','tag2']" -g "{'A':[80,100],'B':[60,79],'C':[40,59],'D':[20,39],'E':[0,19]}"

View Recipe

Once created, you can view your recipe using view_recipe.

moonshot > view_recipe my-new-recipe

recipe added

Note

The ID of the recipe is created by slugifying the name. In this case, the ID of this recipe is my-new-recipe.
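As an illustration of slugifying, the minimal sketch below lowercases the name and replaces runs of other characters with hyphens. Moonshot's own slugify implementation may differ in edge cases; this approximation only reproduces the IDs shown in this guide.

```python
import re


def slugify(name):
    """Lowercase the name and collapse runs of non-alphanumerics into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


print(slugify("My new recipe"))             # my-new-recipe
print(slugify("OpenAI GPT3.5 Turbo 1106"))  # openai-gpt3-5-turbo-1106
```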

Create a New Dataset

You can convert your raw dataset into a Moonshot-compatible dataset using the following schema. Once you have created the new dataset, save the file in moonshot-data/datasets so that Moonshot can access it.

Use your favourite text editor and save the following JSON data in moonshot-data/datasets/example-dataset.json.

{
    "name": "Fruits Dataset",
    "description":"Measures whether the model knows what is a fruit",
    "license": "MIT license",
    "reference": "",
    "examples": [
        {
            "input": "Is Lemon a Fruit? Answer Yes or No.",
            "target": "Yes."
        },
        {
            "input": "Is Apple a Fruit? Answer Yes or No.",
            "target": "Yes."
        },
        {
            "input": "Is Bak Choy a Fruit? Answer Yes or No.",
            "target": "No."
        },
        {
            "input": "Is Bak Kwa a Fruit? Answer Yes or No.",
            "target": "No."
        },
        {
            "input": "Is Dragonfruit a Fruit? Answer Yes or No.",
            "target": "Yes."
        },
        {
            "input": "Is Orange a Fruit? Answer Yes or No.",
            "target": "Yes."
        },
        {
            "input": "Is Coke Zero a Fruit? Answer Yes or No.",
            "target": "No."
        }
    ]
}

The name of the dataset is the unique identifier for the dataset. This will be used in the recipes.

Note

You can also refer to this Jupyter notebook example for more details on how a dataset can be created.
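Before saving a dataset file, it can help to check its shape. The sketch below validates a shortened copy of the example against the keys shown in the schema above. Treating all five top-level keys as required is an assumption, and validate_dataset is a hypothetical helper, not part of Moonshot.

```python
import json

REQUIRED_KEYS = {"name", "description", "license", "reference", "examples"}


def validate_dataset(data):
    """Return a list of problems found; an empty list means the shape looks right."""
    problems = [f"missing key: {key}" for key in sorted(REQUIRED_KEYS - set(data))]
    for index, example in enumerate(data.get("examples", [])):
        for field in ("input", "target"):
            if not isinstance(example, dict) or not isinstance(example.get(field), str):
                problems.append(f"examples[{index}] needs a string '{field}'")
    return problems


dataset = json.loads("""
{
    "name": "Fruits Dataset",
    "description": "Measures whether the model knows what is a fruit",
    "license": "MIT license",
    "reference": "",
    "examples": [
        {"input": "Is Lemon a Fruit? Answer Yes or No.", "target": "Yes."},
        {"input": "Is Bak Choy a Fruit? Answer Yes or No.", "target": "No."}
    ]
}
""")
print(validate_dataset(dataset))  # []
```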

Connecting Endpoints

In this section, we will be going through the steps required to create a connector endpoint.

Before we jump into executing tests and performing red teaming on LLMs, we have to first create a connector endpoint. This connector endpoint will help us to connect to a specific LLM.

The following steps are done in the CLI's interactive mode. To activate interactive mode, enter:

python -m moonshot cli interactive

Using an Existing Connector Endpoint

  1. To view the connector endpoint available, enter:
    list_endpoints
    

You will see a list of available connector endpoints that we have created beforehand.

  2. If there is no suitable connector endpoint in the list, you can create your own (see Creating a Connector Endpoint below). Otherwise, enter the following command to modify the connector endpoint you want to use (e.g., adding your own API key):

    update_endpoint -h
    

    You should see a help example:

    update_endpoint openai-gpt4 "[('name', 'my-special-openai-endpoint'), ('uri', 'my-uri-loc'), ('token', 'my-token-here'), ('params', {'hello': 'world'})]"
    

    Here, we are updating a connector endpoint with the ID openai-gpt4. The keys and values to be updated are tuples in a list (e.g. update the key name with the value my-special-openai-endpoint).

  3. After you have used the update_endpoint command to update your connector endpoint, enter the following command to view it:

    view_endpoint openai-gpt4
    

    endpoint updated
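The update argument is a list of (key, value) tuples written as a string. The sketch below shows one way such a string could be parsed and applied to a record. The endpoint dictionary and the helper apply_updates are hypothetical illustrations, not Moonshot's internal representation.

```python
import ast


def apply_updates(endpoint, update_arg):
    """Return a copy of `endpoint` with the (key, value) pairs from `update_arg` applied."""
    pairs = ast.literal_eval(update_arg)  # parses literals only, never runs code
    updated = dict(endpoint)
    for key, value in pairs:
        updated[key] = value
    return updated


# Hypothetical endpoint record, for illustration only.
endpoint = {"id": "openai-gpt4", "name": "OpenAI GPT4", "uri": "", "token": "", "params": {}}
arg = (
    "[('name', 'my-special-openai-endpoint'), ('uri', 'my-uri-loc'), "
    "('token', 'my-token-here'), ('params', {'hello': 'world'})]"
)
updated = apply_updates(endpoint, arg)
print(updated["name"])  # my-special-openai-endpoint
```

The ID is left unchanged in this sketch because only the listed keys are overwritten.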

Creating a Connector Endpoint

  1. Enter the following command to learn more about how to create a connector endpoint:

    add_endpoint -h
    

    You should see a help example:

    add_endpoint openai-connector 'OpenAI GPT3.5 Turbo 1106' MY_URI ADD_YOUR_TOKEN_HERE 1 2 'gpt-3.5-turbo-1106' "{'temperature': 0.5}"
    

    In this example, we are creating a connector endpoint for the openai-connector connector type:

    • Name of the connector type you want to use: openai-connector
    • Name of your new Connector Endpoint: OpenAI GPT3.5 Turbo 1106
    • URI: MY_URI
    • API token: ADD_YOUR_TOKEN_HERE
    • Max number of calls made to the endpoint per second: 1
    • Max concurrency of the endpoint: 2
    • Model of the endpoint you want to connect to: gpt-3.5-turbo-1106
    • Other parameters that this endpoint may need:

      • Temperature: 0.5

      To view the list of connector types, enter list_connector_types.

  2. After you have used the add_endpoint command to create your endpoint, enter the following command to view your newly created connector endpoint:

    view_endpoint openai-gpt3-5-turbo-1106
    

    NOTE: The ID (openai-gpt3-5-turbo-1106) of the connector endpoint is created by slugifying the name.

    endpoint connected

Run Red Teaming Sessions

In this section, we will be going through the steps required to run red teaming sessions.

To run a test, you will need:

  • Connector Endpoint - a configuration file to connect to your desired LLM endpoint
  • Session - a session lets users perform manual and automated red teaming on the LLMs, and stores the prompts sent and the responses received
  • Prompt - a prompt to send to the LLMs in manual red teaming, or a starting prompt to feed into attack modules before it is sent to the LLMs

The following steps are done in the CLI's interactive mode. To activate interactive mode, enter:

python -m moonshot cli interactive

Create a Connector Endpoint

If you have not already created a connector endpoint, check out the guide here.

Create a Session

Once your connector endpoint is created, we can start creating our session for red teaming.

Every session must reside in a runner. Before creating a session, enter the following command to view the list of runners currently available:

list_runners

list of runners

There are two options to create a session: you can either use an existing runner, or create a new runner with a session. To better understand its usage, enter the following command:

new_session -h
  1. Use existing runner.

    • Example:

      new_session my-test-mrt -c add_previous_prompt -p mmlu
      
      • Runner ID: my-test-mrt
      • Context strategy: add_previous_prompt
      • Prompt template: mmlu

    create session with existing runner

    NOTE: Context strategy and prompt template are optional and can be set later, so you can omit the -c and -p flags if you do not need them.

  2. Create new runner.

    • Example:

      new_session my-new-runner-test-mrt -e "['openai-gpt35-turbo','openai-gpt4']" -p phrase-relatedness
      
      • Runner ID: my-new-runner-test-mrt
      • Endpoint: ['openai-gpt35-turbo','openai-gpt4']
      • Prompt template: phrase-relatedness

    create session with new runner

Once you have a session created and activated, we can proceed with red teaming. There are two ways to perform red teaming: manual red teaming and using attack modules to perform automated attacks.

Manual Red Teaming

From the previous section, you should have a session created and activated. For manual red teaming, type a prompt in the session and it will be sent to all the LLMs in that session.

NOTE: Anything entered in a session that is not a command will be considered a prompt and sent to the LLMs in that session!

Automated Red Teaming (Run Attack Modules)

We will use the same session from manual red teaming in this example. Enter the following command to find out more about running attack modules:

run_attack_module -h

You should see a help example:

run_attack_module charswap_attack "The grandpa and his son went to does not know how to use a phone?\na)Grandpa\nb)Son."

The fields are as follows for this example:

  • Attack module ID: charswap_attack
  • Prompt: The grandpa and his son went to does not know how to use a phone?\na)Grandpa\nb)Son.

automated red teaming

Refer to this section for more information on automated red teaming.

Ending and Resuming a Session

End session - Once you are done with red teaming, you can close the session by entering:

end_session

View sessions - You can view your sessions by entering:

list_sessions

Resume session - You can resume a session by entering:

use_session <desired session id>

where <desired session id> is an id in list_sessions. When you resume a session, the state of your previous red teaming attempts will be restored.

Configurations in a Session

  • These are the configurations you can set in a session:

    • Context strategy: a Python module that helps to add context to the current prompt (e.g. add in the previous five prompts sent).

      To use a context strategy:

      use_context_strategy <desired context strategy id>
      

      You can use the following command to view the list of context strategies available:

      list_context_strategies
      

      The <desired context strategy id> should correspond to an Id in list_context_strategies.

      • It is also possible to set the number of previous prompts to use with a context strategy. For example, to add 8 previous prompts as context using the add_previous_prompt, use the command:
        use_context_strategy add_previous_prompt -n 8
        

      To clear a context strategy in a session, use:

      clear_context_strategy
      
    • Prompt template: a JSON file containing static text that is appended to every prompt before it is sent to the LLMs.

      To use a prompt template:

      use_prompt_template <desired prompt template id>
      

      You can use the following command to view the list of prompt templates available:

      list_prompt_templates
      

      The <desired prompt template id> should correspond to an Id in list_prompt_templates.

      To clear a prompt template in a session, use:

      clear_prompt_template
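To illustrate how the two session configurations could combine, the sketch below wraps a prompt first with previous prompts and then with a template. Both functions and the {prompt} placeholder are hypothetical; they are not Moonshot's context strategy or prompt template implementations.

```python
def add_previous_prompts(history, prompt, n=5):
    """Prefix the prompt with the last `n` prompts from the session history."""
    context = history[-n:] if n > 0 else []
    return "\n".join([*context, prompt])


def apply_template(template, prompt):
    """Insert the prompt into a template containing a {prompt} placeholder."""
    return template.replace("{prompt}", prompt)


history = ["What is a fruit?", "Is a lemon a fruit?"]
wrapped = apply_template(
    "Answer briefly.\n{prompt}",
    add_previous_prompts(history, "Is bak choy a fruit?", n=1),
)
print(wrapped)
```

With n=1, only the most recent earlier prompt is added as context, mirroring the role of the -n option of use_context_strategy.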
      

More About Automated Red Teaming

Currently, automated red teaming heavily relies on the attack module being used. We have created a class, AttackModule, which serves as the base class for creating custom attack modules within the Moonshot framework. This class provides a structure that red teamers can extend to implement their own adversarial attack strategies.

In the AttackModule class, we have simplified the process for red teamers by providing easy access to necessary components for red teaming, such as connector endpoints and a function to automatically wrap the prompt template and context strategy contents around the provided prompt.

The design is very free-form, thus it is entirely up to the attack module developers whether they want to use the functions we have prepared. For instance, they may choose not to use the context strategy and prompt template at all in the attack module, even though these may be set in the session.
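As a rough illustration of what an attack module does, the sketch below perturbs a prompt by swapping adjacent characters and sends several variants. The class CharSwapSketch, its execute method, and the send callable are hypothetical stand-ins; they do not reproduce the AttackModule interface or Moonshot's charswap_attack.

```python
import random


def charswap(prompt, swaps=2, seed=0):
    """Return a copy of `prompt` with up to `swaps` adjacent character pairs exchanged."""
    rng = random.Random(seed)
    chars = list(prompt)
    for _ in range(swaps):
        if len(chars) < 2:
            break
        i = rng.randrange(len(chars) - 1)
        chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


class CharSwapSketch:
    """Hypothetical stand-in for an attack module; not Moonshot's AttackModule API."""

    def __init__(self, send):
        self.send = send  # callable that delivers a prompt to an endpoint

    def execute(self, prompt, variants=3):
        # Each variant uses a different seed, so the perturbations are reproducible.
        return [self.send(charswap(prompt, seed=s)) for s in range(variants)]


responses = CharSwapSketch(send=lambda p: f"echo: {p}").execute("Is a lemon a fruit?")
print(len(responses))  # 3
```

In this sketch the module ignores any context strategy or prompt template, which matches the point above that using them is left to the module developer.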

List of CLI Commands

Command Description Parameters
add_cookbook
add_cookbook 'My new cookbook' 'I am cookbook description' "['analogical-similarity','auto-categorisation']"
Add a new cookbook. The 'name' argument will be slugified to create a unique identifier.
name (str)Name of the new cookbook
example: 'My new cookbook'
description (str)Description of the new cookbook
example: 'I am cookbook description'
recipes (str)List of recipes to be included in the new cookbook
example: "['analogical-similarity','auto-categorisation']"
add_recipe
add_recipe 'My new recipe' 'I am recipe description' "['category1','category2']" "['bbq-lite-age-ambiguous']" "['bertscore','bleuscore']" -p "['analogical-similarity','mmlu']" -t "['tag1','tag2']" -g "{'A':[80,100],'B':[60,79],'C':[40,59],'D':[20,39],'E':[0,19]}"
Add a new recipe. The 'name' argument will be slugified to create a unique identifier.
name (str)Name of the new recipe
example: 'My new recipe'
description (str)Description of the new recipe
example: 'I am recipe description'
categories (str) List of categories to be included in the new recipe (currently in string format). It will be converted into a list in the backend
example: "['category1','category2']"
-t, --tags (str)List of tags to be included in the new recipe
example: "['tag1','tag2']"
datasets (str)The dataset to be used
example: "['bbq-lite-age-ambiguous']"
-p, --prompt_templates (str)List of prompt templates to be included in the new recipe
example: "['analogical-similarity','mmlu']"
metrics (str)List of metrics to be included in the new recipe
example: "['bertscore','bleuscore']"
-g, --grading_scale (str)Dict of grading scale for the metric to be included in the new recipe
example: "{'A':[80,100],'B':[60,79],'C':[40,59],'D':[20,39],'E':[0,19]}"
delete_cookbook
delete_cookbook my-new-cookbook
Delete a cookbook.
cookbook (str)Id of the cookbook
example: my-new-cookbook
delete_dataset
delete_dataset bbq-lite-age-ambiguous
Delete a dataset.
dataset (str)Name of the dataset
example: bbq-lite-age-ambiguous
delete_metric
delete_metric my-new-metric
Delete a metric.
metric (str)Name of the metric
example: my-new-metric
delete_recipe
delete_recipe my-new-recipe
Delete a recipe.
recipe (str)Id of the recipe
example: my-new-recipe
delete_result
delete_result my-new-cookbook-runner
Delete a result.
result (str)Name of the result
example: my-new-cookbook-runner
delete_runner
delete_runner my-new-cookbook-runner
Delete a runner.
runner (str)Name of the runner
example: my-new-cookbook-runner
list_cookbooks
list_cookbooks -f "risk"
List all cookbooks.
-f, --find (str)Optional field to find cookbook(s) with keyword
example: risk
-p, --pagination (str)Optional tuple to paginate cookbook(s). E.g. (2,10) returns 2nd page with 10 items in each page.
example: (2,10)
list_datasets
list_datasets -f "bbq"
List all datasets.
-f, --find (str)Optional field to find dataset(s) with keyword
example: bbq
-p, --pagination (str)Optional tuple to paginate dataset(s). E.g. (2,10) returns 2nd page with 10 items in each page.
example: (2,10)
list_metrics
list_metrics -f "exact"
List all metrics.
-f, --find (str)Optional field to find metric(s) with keyword
example: "exact"
-p, --pagination (str)Optional tuple to paginate metric(s). E.g. (2,10) returns 2nd page with 10 items in each page.
example: (2,10)
list_recipes
list_recipes -f "mmlu"
List all recipes.
-f, --find (str)Optional field to find recipe(s) with keyword
example: mmlu
-p, --pagination (str)Optional tuple to paginate recipe(s). E.g. (2,10) returns 2nd page with 10 items in each page.
example: (2,10)
list_results
list_results -f "my-runner"
List all results.
-f, --find (str)Optional field to find result(s) with keyword
example: my-runner
-p, --pagination (str)Optional tuple to paginate result(s). E.g. (2,10) returns 2nd page with 10 items in each page.
example: (2,10)
list_runners
list_runners
List all runners. -
list_runs
list_runs -f "my-run"
List all runs.
-f, --find (str)Optional field to find run(s) with keyword
example: my-run
-p, --pagination (str)Optional tuple to paginate run(s). E.g. (2,10) returns 2nd page with 10 items in each page.
example: (2,10)
run_cookbook
run_cookbook "my new cookbook runner" "['chinese-safety-cookbook']" "['openai-gpt35-turbo']" -n 1 -r 1 -s "You are an intelligent AI"
Run a cookbook.
name (str)Name of cookbook runner
example: "my new cookbook runner"
cookbooks (str)List of cookbooks to run
example: "['chinese-safety-cookbook']"
endpoints (str)List of endpoints to run
example: "['openai-gpt35-turbo']"
-n, --prompt_selection_percentage (int)Percentage of prompts to run
example: 1
-r, --random_seed (int)Random seed number
example: 1
-s, --system_prompt (str)System Prompt to use
example: "You are an intelligent AI"
-l, --runner_proc_module (str)Runner processing module to use. Defaults to use the benchmarking module
-o, --result_proc_module (str)Result processing module to use. Defaults to use the benchmarking-result module
run_recipe
run_recipe "my new recipe runner" "['bbq','mmlu']" "['openai-gpt35-turbo']" -n 1 -r 1 -s "You are an intelligent AI"
Run a recipe.
name (str)Name of recipe runner
example: "my new recipe runner"
recipes (str)List of recipes to run
example: "['bbq','mmlu']"
endpoints (str)List of endpoints to run
example: "['openai-gpt35-turbo']"
-n, --prompt_selection_percentage (int)Percentage of prompts to run
example: 1
-r, --random_seed (int)Random seed number
example: 1
-s, --system_prompt (str)System Prompt to use
example: "You are an intelligent AI"
-l, --runner_proc_module (str)Runner processing module to use. Defaults to use the benchmarking module
-o, --result_proc_module (str)Result processing module to use. Defaults to use the benchmarking-result module
update_cookbook
update_cookbook my-new-cookbook "[('name', 'Updated Cookbook Name'), ('description', 'Updated description'), ('recipes', ['analogical-similarity'])]"
Update a cookbook.
cookbook (str)Id of the cookbook
example: my-new-cookbook
update_values (str)Update cookbook key/value
example: "[('name', 'Updated Cookbook Name'), ('description', 'Updated description'), ('recipes', ['analogical-similarity'])]"
update_recipe
update_recipe my-new-recipe "[('name', 'Updated Recipe Name'), ('tags', ['fairness', 'bbq'])]"
Update a recipe.
recipe (str)Id of the recipe
example: my-new-recipe
update_values (str)Update recipe key/value
example: "[('name', 'Updated Recipe Name'), ('tags', ['fairness', 'bbq'])]"
view_cookbook
view_cookbook my-new-cookbook
View a cookbook.
cookbook (str)Id of the cookbook
example: my-new-cookbook
view_dataset
view_dataset bbq-lite-age-ambiguous
View a dataset file.
dataset_filename (str)Name of the dataset file
example: bbq-lite-age-ambiguous
view_metric
view_metric my-new-metric
View a metric file.
metric_filename (str)Name of the metric file
example: my-new-metric
view_recipe
view_recipe my-new-recipe
View a recipe.
recipe (str)Id of the recipe
example: my-new-recipe
view_result
view_result my-new-cookbook-runner
View a result file.
result_filename (str)Name of the result file
example: my-new-cookbook-runner
view_run
view_run my-new-cookbook-runner
View a runner's runs.
runner_id (str)Name of the runner
example: my-new-cookbook-runner
view_runner
view_runner my-new-cookbook-runner
View a runner.
runner (str)Name of the runner
example: my-new-cookbook-runner
add_endpoint
add_endpoint openai-connector 'OpenAI GPT3.5 Turbo 1106' MY_URI ADD_YOUR_TOKEN_HERE 1 1 "{'temperature': 0.5, 'model': 'gpt-3.5-turbo-1106'}"
Add a new endpoint. The 'name' argument will be slugified to create a unique identifier.
connector_type (str)Type of connection for the endpoint
example: openai-connector
name (str)Name of the new endpoint
example: 'OpenAI GPT3.5 Turbo 1106'
uri (str)URI of the new endpoint
example: MY_URI
token (str)Token of the new endpoint
example: ADD_YOUR_TOKEN_HERE
max_calls_per_second (int)Max calls per second of the new endpoint
example: 1
max_concurrency (int)Max concurrency of the new endpoint
example: 1
params (str)Params of the new endpoint
example: "{'temperature': 0.5, 'model': 'gpt-3.5-turbo-1106'}"
convert_dataset
convert_dataset 'dataset-name' 'A brief description' 'http://reference.com' 'MIT' '/path/to/your/file.csv'
Convert your dataset. The 'name' argument will be slugified to create a unique identifier.
name (str)Name of the new dataset
description (str)Description of the new dataset
reference (str)Reference of the new dataset
license (str)License of the new dataset
csv_file_path (str)Path to your existing dataset
delete_endpoint
delete_endpoint openai-gpt4
Delete an endpoint.
endpoint (str)ID of the endpoint
example: openai-gpt4
delete_prompt_template
delete_prompt_template squad-shifts
Delete a prompt template.
prompt_template (str)The ID of the prompt template to delete
example: squad-shifts
download_dataset
download_dataset 'dataset-name' 'A brief description' 'http://reference.com' 'MIT' "{'dataset_name': 'cais/mmlu', 'dataset_config': 'college_biology', 'split': 'dev', 'input_col': ['question','choices'], 'target_col': 'answer'}"
Download dataset from Hugging Face. The 'name' argument will be slugified to create a unique ID.
name (str)Name of the new dataset
description (str)Description of the new dataset
reference (str)Reference of the new dataset
license (str)License of the new dataset
params (str)Params of the new dataset in dictionary format. For example: {'dataset_name': 'cais/mmlu', 'dataset_config': 'college_biology', 'split': 'test', 'input_col': ['question','choices'], 'target_col': 'answer'}
list_connector_types
list_connector_types -f "openai"
List all connector types.
-f, --find (str)Optional field to find connector type(s) with keyword
example: openai
-p, --pagination (str)Optional tuple to paginate connector type(s). E.g. (2,10) returns 2nd page with 10 items in each page.
example: (2,10)
list_endpoints
list_endpoints -f "gpt"
List all endpoints.
-f, --find (str)Optional field to find endpoint(s) with keyword
example: gpt
-p, --pagination (str)Optional tuple to paginate endpoint(s). E.g. (2,10) returns 2nd page with 10 items in each page.
example: (2,10)
list_prompt_templates
list_prompt_templates -f "toxicity"
List all prompt templates.
-f, --find (str)Optional field to find prompt template(s) with keyword
example: toxicity
-p, --pagination (str)Optional tuple to paginate prompt template(s). E.g. (2,10) returns 2nd page with 10 items in each page.
example: (2,10)
update_endpoint
update_endpoint openai-gpt4 "[('name', 'my-special-openai-endpoint'), ('uri', 'my-uri-loc'), ('token', 'my-token-here')]"
Update an endpoint.
endpoint (str)ID of the endpoint. This field is not editable via CLI after creation.
example: openai-gpt4
update_kwargs (str)Update endpoint key/value
example: "[('name', 'my-special-openai-endpoint'), ('uri', 'my-uri-loc'), ('token', 'my-token-here')]"
view_endpoint
view_endpoint openai-gpt4
View an endpoint.
endpoint (str)ID of the endpoint
example: openai-gpt4
add_bookmark
add_bookmark openai-connector 2 my-bookmarked-prompt
Bookmark a prompt.
endpoint (str)Endpoint which the prompt was sent to.
example: openai-connector
prompt_id (int)ID of the prompt (the leftmost column)
example: 2
bookmark_name (str)Name of the bookmark
example: my-bookmarked-prompt
clear_context_strategy
clear_context_strategy
Clear the context strategy set in a session.
clear_prompt_template
clear_prompt_template
Clear the prompt template set in a session.
delete_attack_module
delete_attack_module sample_attack_module
Delete an attack module.
attack_module (str)The ID of the attack module to delete
example: sample_attack_module
delete_bookmark
delete_bookmark my_bookmarked_prompt
Delete a bookmark.
bookmark_name (str)Name of the bookmark
example: my_bookmarked_prompt
delete_context_strategy
delete_context_strategy add_previous_prompt
Delete a context strategy.
context_strategy (str)The ID of the context strategy to delete
example: add_previous_prompt
delete_session
delete_session my-test-runner
Delete a session.
session (str)The runner ID of the session to delete
example: my-test-runner
end_session
end_session
End the current red teaming session.
export_bookmarks
export_bookmarks "my_list_of_exported_bookmarks"
Export bookmarks as a JSON file.
bookmark_list_name (str)Name of the exported bookmarks JSON file you want to save as (without the .json extension)
example: my_list_of_exported_bookmarks
list_attack_modules
list_attack_modules -f "text"
List all attack modules.
-f, --find (str)Optional field to find attack module(s) with keyword
example: text
-p, --pagination (str)Optional tuple to paginate attack module(s). E.g. (2,10) returns 2nd page with 10 items in each page.
example: (2,10)
list_bookmarks
list_bookmarks -f my_bookmark
List all bookmarks.
-f, --find (str)Optional field to find bookmark(s) with keyword
example: my_bookmark
-p, --pagination (str)Optional tuple to paginate bookmark(s). E.g. (2,10) returns 2nd page with 10 items in each page.
example: (2,10)
list_context_strategies
list_context_strategies -f "previous_prompt"
List all context strategies.
-f, --find (str)Optional field to find context strategies with keyword
example: previous_prompt
-p, --pagination (str)Optional tuple to paginate context strategies. E.g. (2,10) returns 2nd page with 10 items in each page.
example: (2,10)
list_sessions
list_sessions -f "my-sessions"
List all sessions.
-f, --find (str)Optional field to find session(s) with keyword
example: my-sessions
-p, --pagination (str)Optional tuple to paginate session(s). E.g. (2,10) returns 2nd page with 10 items in each page.
example: (2,10)
new_session
new_session my-runner -e "['openai-gpt4']" -c add_previous_prompt -p mmlu
Create a new red teaming session.
runner_id (str)ID of the runner. Creates a new runner if runner does not exist.
example: my-runner
-e, --endpoints (str)List of endpoint(s) for the runner. Required only when creating a new runner.
example: "['openai-gpt4']"
-c, --context_strategy (str)Name of the context strategy to be used. Specify it here if you wish to use it with the selected attack.
example: add_previous_prompt
-p, --prompt_template (str)Name of the prompt template to be used. Specify it here if you wish to use it with the selected attack.
example: mmlu
run_attack_module
run_attack_module sample_attack_module "this is my prompt" -s "test system prompt" -m bleuscore
Run automated red teaming in the current session.
attack_module_id (str)ID of the attack module.
example: sample_attack_module
prompt (str)Prompt to be used for the attack.
example: "this is my prompt"
-s, --system_prompt (str)System prompt to be used for the attack. If not specified, the default system prompt will be used.
example: "test system prompt"
-c, --context_strategy (str)Name of the context strategy module to be used. If this is set, it will overwrite the context strategy set in the session while running this attack module.
example: add_previous_prompt
-n, --cs_num_of_prev_prompts (str)The number of previous prompts to use with the context strategy. If this is set, it will overwrite the number of previous prompts set in the session while running this attack module.
example: 5
-p, --prompt_template (str)Name of the prompt template to be used. If this is set, it will overwrite the prompt template set in the session while running this attack module.
example: mmlu
-m, --metric (str)Name of the metric module to be used.
example: bleuscore
-o, --optional_args (str)Optional parameters to input into the red teaming module.
example: "{'my_attack_module_custom_field': 1.0}"
show_prompts
show_prompts
Show the prompts in the session.
use_bookmark
use_bookmark my_bookmark
Use a bookmarked prompt.
bookmark_name (str)Name of the bookmark
example: my_bookmark
use_context_strategy
use_context_strategy my_strategy_one
Use a context strategy.
context_strategy (str)The ID of the context strategy to use
example: my_strategy_one
-n, --num_of_prev_prompts (int)The number of previous prompts to use with the context strategy
example: 6
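To make the `-n, --num_of_prev_prompts` option more concrete, here is a minimal sketch of how a context strategy might combine earlier prompts with a new one. It assumes a strategy that simply prepends the most recent prompts; the real behaviour of any given Moonshot context strategy may differ, and `build_context` is a hypothetical helper.

```python
def build_context(previous_prompts, num_of_prev_prompts):
    """Join the most recent prompts into one context string."""
    if num_of_prev_prompts <= 0:
        return ""
    # Slicing past the start of the list just returns everything available.
    recent = previous_prompts[-num_of_prev_prompts:]
    return "\n".join(recent)


history = ["prompt 1", "prompt 2", "prompt 3"]  # made-up session history
context = build_context(history, 2)
new_prompt = context + "\n" + "prompt 4"  # assumption: context is prepended
print(new_prompt)
```

In this sketch, asking for more previous prompts than the session holds returns the whole history instead of raising an error.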
use_prompt_template
use_prompt_template 'analogical-similarity'
Use a prompt template.
prompt_template (str)Name of the prompt template
example: analogical-similarity
use_session
use_session 'my-runner'
Use an existing red teaming session by specifying the runner ID.
runner_id (str)The ID of the runner which contains the session you want to use.
example: my-runner
view_bookmark
view_bookmark my_bookmarked_prompt
View a bookmark.
bookmark_name (str)Name of the bookmark you want to view
example: my_bookmarked_prompt