Choosing Relevant Tests
- Click on ‘Get Started’
This page lists the cookbooks that Moonshot provides. Each cookbook contains tests that share a common theme. Select the areas that are relevant to your use case. This selection is not final: you will be able to further curate the scope and scale of the tests in the following steps.
View full list of cookbook details
Note
Some of these cookbooks contain scoring metrics that require a connection to specific models:
- MLCommons AI Safety Benchmarks v0.5 (requires an API key for accessing Llama Guard via Together AI)
- Facts about Singapore (requires an API key for accessing Llama Guard via Together AI)
To provide the Together AI API key, edit the “Together Llama Guard 7B Assistant” endpoint. (Note that you don’t need to select this endpoint in the benchmarking session.) If you would like to use an alternative Llama Guard 7B assistant, see the FAQ on how to do so.
When done, click on the next button.
The total number of prompts in the cookbooks selected is displayed. Later on, you can specify the number of prompts per dataset that will be executed. Click on ‘these cookbooks’ to see in greater detail what tests will be run.
This page shows you the cookbooks available in Moonshot, categorised according to Capability, Trust & Safety, Quality and Others (for cookbooks without any categories).
You can click on ‘About’ for each cookbook to see what recipes it contains.
Check the ‘Run this cookbook’ checkbox if you wish to run any of the cookbooks. Click on ‘X’ to close the pop-up.
You can also deselect cookbooks that you do not wish to run.
Click on ‘OK’ once you are satisfied with the cookbooks to be run. The total number of prompts to be sent will then be updated. (A later step in the workflow lets you run a smaller number of prompts.)
Click on the next button.
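As a rough illustration of how the displayed prompt total relates to the per-dataset limit mentioned above, the sketch below sums prompt counts over a set of selected cookbooks and then repeats the sum with a cap on each dataset. It is not Moonshot's own code or API: the cookbook names, dataset names, prompt counts, and the `total_prompts` helper are all hypothetical, and the real total may depend on other factors (such as prompt templates) that are ignored here.

```python
# Hypothetical data: cookbook name -> {dataset name -> number of prompts}.
# The names and counts are made up for illustration; they are not Moonshot values.
COOKBOOKS = {
    "cookbook-a": {"dataset-1": 120, "dataset-2": 45},
    "cookbook-b": {"dataset-3": 300},
}


def total_prompts(selected, per_dataset_limit=None):
    """Sum the prompts in the selected cookbooks.

    If per_dataset_limit is given, each dataset contributes at most
    that many prompts (an assumed model of running "a smaller number
    of prompts" per dataset).
    """
    total = 0
    for cookbook_name in selected:
        for count in COOKBOOKS[cookbook_name].values():
            if per_dataset_limit is None:
                total += count
            else:
                total += min(count, per_dataset_limit)
    return total


selected = ["cookbook-a", "cookbook-b"]
full_total = total_prompts(selected)        # 120 + 45 + 300
capped_total = total_prompts(selected, 50)  # 50 + 45 + 50
print(full_total, capped_total)
```

Under these assumptions, deselecting a cookbook lowers the total by the sum of its datasets, while a per-dataset cap only affects datasets larger than the cap.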