Empirical can test how different models and model configurations work for your application. You can define which models and configurations to test in the configuration file.

Empirical supports a few types of model providers:

Hosted models

Popular models hosted by inference platforms (e.g. GPT-4o by OpenAI) can be specified directly in the Empirical run configuration by setting type to model.

"runs": [
  {
    "type": "model",
    "provider": "openai",
    "model": "gpt-4o",
    "prompt": "Hey I'm {{user_name}}"
  }
]
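The {{user_name}} in the prompt above reads like a template placeholder. Assuming it is replaced with a matching value from each test input before the prompt is sent to the model, the substitution could look roughly like the sketch below. The function name render_prompt and the inputs dictionary are hypothetical, used only for illustration; this is not Empirical's own implementation.

```python
import re


def render_prompt(template: str, inputs: dict) -> str:
    """Fill {{name}} placeholders in a prompt template (illustrative helper)."""

    def substitute(match: re.Match) -> str:
        key = match.group(1)
        # Leave the placeholder untouched when no matching input exists.
        return str(inputs.get(key, match.group(0)))

    # Match {{ name }} with optional whitespace inside the braces.
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


print(render_prompt("Hey I'm {{user_name}}", {"user_name": "Ada"}))
# prints: Hey I'm Ada
```

Leaving unknown placeholders in place, rather than raising an error, is a design choice made for this sketch so that a missing input is easy to spot in the rendered prompt.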

Custom scripts

For mature applications, or for those that require pre- or post-processing around the model API call, we recommend writing a custom script provider. That way, you can import parts of your application and share code between your app and your tests.

  • See the Python guide to configure models or apps defined as a Python module, with type py-script
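As a rough illustration of the pre- and post-processing described above, a custom script could wrap the model call as in the sketch below. The entry-point name execute, its two arguments, the returned dictionary shape, and the call_model helper are all assumptions made for this example; the Python guide is the place to confirm the actual contract.

```python
# Hypothetical custom script: pre-process inputs, call the model, post-process output.
# The name `execute`, its arguments, and the return shape are assumptions for illustration.


def call_model(prompt: str) -> str:
    """Stand-in for your application's real model call (hypothetical)."""
    return f"  echo: {prompt}  "


def execute(inputs: dict, parameters: dict) -> dict:
    # Pre-processing: clean the input and build the prompt.
    prompt = f"Hey I'm {inputs['user_name'].strip()}"
    # Model call: in a real app, import this function from your application code.
    raw_output = call_model(prompt)
    # Post-processing: normalize the raw output before it is scored.
    return {"value": raw_output.strip(), "metadata": {"prompt": prompt}}
```

Because call_model stands in for code imported from the application, the same function would run in production and in tests, which is the code sharing this section recommends.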