LLM apps are tested across hundreds of scenarios that are critical to your use case. You specify these scenarios in your test dataset.

Each dataset consists of samples. Each sample has the following fields:

  • id (string; optional): A unique identifier for the sample.
  • inputs (object): Input parameters that define a scenario. Each key is a parameter name, and each value is that parameter's value for the sample.
  • expected (string; optional): The ground truth output for this sample.
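
For reference, a single sample combining all three fields might look like this; the id value is purely illustrative:

{
  "id": "sample-001",
  "inputs": {
    "user_message": "Hi my name is John Doe."
  },
  "expected": "John Doe"
}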

Supported methods

Datasets can be specified directly in the configuration file, or imported from a file or an HTTP endpoint.

Specify dataset directly

In this example, the LLM is asked to extract the user’s name from an incoming message. The message is provided under the user_message parameter in the dataset.

The prompt can reference this value through the {{user_message}} placeholder.
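
As an illustration, a prompt template for this scenario could read as follows (the wording is an example, not a required format):

Extract the user's name from the following message and reply with the name only:

{{user_message}}

The dataset then supplies the messages and the expected names: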

"dataset": {
  "samples": [
    {
      "inputs": {
        "user_message": "Hi my name is John Doe."
      },
      "expected": "John Doe"
    },
    {
      "inputs": {
        "user_message": "This is Alice. Is anybody here?"
      },
      "expected": "Alice"
    }
  ]
}
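
During a test run, each sample's inputs are substituted into the prompt template, and the model's response can then be compared against the sample's expected value.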