If you are maintaining your test dataset in a file, chances are you can import it directly into Empirical.

Import from Google Sheets

Specify the URL of the Google Sheet in the configuration file. If the spreadsheet has multiple sheets, be sure to use the URL that points to the correct sheet (identified by the gid query parameter).

empiricalrc.json
"dataset": {
  "path": "https://docs.google.com/spreadsheets/d/1AsMekKCG74m1PbBZQN_sEJgaW0b9Xarg4ms4mhG3i5k"
}

Refer to our chatbot example, which uses this dataset.
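To check which sheet a URL points to, you can read the gid from the URL before putting it in the configuration. The following is a minimal sketch using only the Python standard library; `sheet_gid` is a hypothetical helper, not part of Empirical, and `SHEET_ID` in the usage line is a placeholder. It also looks in the URL fragment, on the assumption that some Google Sheets links carry the gid there (for example `#gid=0`).

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def sheet_gid(url: str) -> Optional[str]:
    """Return the gid found in a Google Sheets URL, or None if absent.

    Hypothetical helper for illustration; not an Empirical API.
    """
    parts = urlparse(url)
    # Look in the query string first, then in the fragment.
    for component in (parts.query, parts.fragment):
        values = parse_qs(component).get("gid")
        if values:
            return values[0]
    return None


if __name__ == "__main__":
    # SHEET_ID is a placeholder, not a real spreadsheet.
    print(sheet_gid("https://docs.google.com/spreadsheets/d/SHEET_ID/edit?gid=42"))
```

If the helper returns None, the URL does not name a specific sheet.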

The sheet should contain column headers. Each row is converted into a dataset sample whose inputs are named after the column headers. For example:

| name | age |
| ---- | --- |
| John | 25  |

The above table in the sheet gets converted into the following dataset object:

"dataset": {
  "samples": [
    {
      "inputs": {
        "name": "John",
        "age": "25"
      }
    }
  ]
}

The above conversion enables you to create a prompt with placeholders. For example:

empiricalrc.json
{
  "prompt": "Your name is {{name}} and you are a helpful assistant..."
}
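To make the placeholder mechanism concrete, here is a rough sketch of how a `{{name}}` placeholder could be filled from a sample's inputs. It is an illustration only, not Empirical's implementation: `fill_prompt` is a hypothetical function, and leaving unknown placeholders untouched is an assumption about the desired behavior.

```python
import re


def fill_prompt(template: str, inputs: dict) -> str:
    """Replace each {{key}} in the template with the matching input value."""

    def replace(match: re.Match) -> str:
        key = match.group(1)
        # Assumption: placeholders with no matching input are left as-is.
        return str(inputs.get(key, match.group(0)))

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", replace, template)


if __name__ == "__main__":
    sample_inputs = {"name": "John", "age": "25"}
    print(fill_prompt("Your name is {{name}} and you are a helpful assistant...", sample_inputs))
```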

Import from JSONL file

Specify a path to the JSONL file. Each line of the file should be a valid JSON object. On import, the keys of each object become the inputs of the corresponding sample.

If using relative paths, the path is treated relative to the configuration file.

"dataset": {
  "path": "HumanEval.jsonl"
}
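The JSONL rule described above (one JSON object per line, keys become inputs) can be sketched as follows. `samples_from_jsonl` is a hypothetical function written for illustration; skipping blank lines and raising on non-object lines are assumptions, not documented Empirical behavior.

```python
import json


def samples_from_jsonl(text: str) -> list:
    """Turn JSONL text into a list of samples, one per non-blank line."""
    samples = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # Assumption: blank lines are ignored.
        record = json.loads(line)
        if not isinstance(record, dict):
            raise ValueError(f"line {line_number} is not a JSON object")
        # The object's keys become the sample's inputs.
        samples.append({"inputs": record})
    return samples


if __name__ == "__main__":
    print(samples_from_jsonl('{"name": "John", "age": 25}\n'))
```

Running a check like this on a file before importing it can surface malformed lines early.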

Import from JSON

Specify a path to the JSON file. The file should contain an array of objects. On import, the keys of each object become the inputs of the corresponding sample.

If using relative paths, the path is treated relative to the configuration file.

"dataset": {
  "path": "dataset.json"
}

Refer to the tool call example, which uses this dataset.
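As a sketch of the expected JSON shape, the snippet below validates that a file's content is an array of objects and builds the resulting dataset object. `dataset_from_json` is a hypothetical name, and the error handling is an assumption rather than what Empirical does internally.

```python
import json


def dataset_from_json(text: str) -> dict:
    """Build a dataset object from JSON text holding an array of objects."""
    records = json.loads(text)
    is_array_of_objects = isinstance(records, list) and all(
        isinstance(record, dict) for record in records
    )
    if not is_array_of_objects:
        raise ValueError("expected a JSON array of objects")
    # Each object's keys become the inputs of one sample.
    return {"samples": [{"inputs": record} for record in records]}


if __name__ == "__main__":
    print(dataset_from_json('[{"name": "John", "age": 25}]'))
```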

Import from CSV

Specify a path to the CSV file in empiricalrc.json. If using relative paths, the path is treated relative to the configuration file.

"dataset": {
  "path": "foo.csv"
}
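The relative-path rule can be illustrated with a short sketch. `resolve_dataset_path` is a hypothetical helper, and the `/project` directory in the usage line is made up; the code uses POSIX-style paths for predictable output and assumes that URLs and absolute paths are passed through unchanged.

```python
from pathlib import PurePosixPath


def resolve_dataset_path(config_file: str, dataset_path: str) -> str:
    """Resolve a dataset path against the directory of the configuration file."""
    # Assumption: URLs (such as Google Sheets links) are used as-is.
    if dataset_path.startswith(("http://", "https://")):
        return dataset_path
    config_dir = PurePosixPath(config_file).parent
    # Joining with an absolute path yields that absolute path unchanged.
    return str(config_dir / dataset_path)


if __name__ == "__main__":
    print(resolve_dataset_path("/project/empiricalrc.json", "foo.csv"))
```

In other words, the location of the configuration file matters for relative paths, not the directory the command is run from.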

The CSV file should contain headers.

Each line of the file is converted into a dataset sample whose inputs are named after the column headers. For example:

foo.csv
name,age
John,25

The above CSV gets converted into the following dataset object:

"dataset": {
  "samples": [
    {
      "inputs": {
        "name": "John",
        "age": "25"
      }
    }
  ]
}
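The same conversion can be reproduced with Python's standard `csv` module, which helps show why `age` ends up as the string "25" rather than a number: CSV values are read as text. This is a sketch of the mapping, not Empirical's own code, and `dataset_from_csv` is a hypothetical name.

```python
import csv
import io


def dataset_from_csv(text: str) -> dict:
    """Convert CSV text with a header row into a dataset object."""
    # DictReader uses the first row as the column names.
    reader = csv.DictReader(io.StringIO(text))
    # Every value is kept as a string, exactly as it appears in the file.
    return {"samples": [{"inputs": dict(row)} for row in reader]}


if __name__ == "__main__":
    print(dataset_from_csv("name,age\nJohn,25\n"))
```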

The above conversion enables you to create a prompt with placeholders. For example:

{
  "prompt": "Your name is {{name}} and you are a helpful assistant..."
}