Evaluating Workflow
Trigger a test run
Once you've configured your workflow, datasets, and evaluators, you can test the workflow against a test dataset with a single click from the Workflows tab.
To execute a test run for the workflow, follow these steps:
- Add your API endpoint.
- Trigger the workflow by pressing the Run button.
- Map the output you want to evaluate (e.g., data.response).
- Click the "Test" button in the top right corner.
- Add a dataset.
- Add evaluators from the evaluator store. Note that you must add evaluators to your workspace before you can use them. Read more here.
- Toggle on the evaluators you want to test the workflow on.
- Finally, click "Trigger Test Run".
- If you have added human raters, you will be prompted to enter the raters' email IDs. Read more about human evaluation here.
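The output mapping in the steps above (e.g., data.response) reads as a dotted path into the JSON body your endpoint returns. The sketch below shows how such a path could be resolved, assuming the response is nested JSON. The `resolve_path` helper and the sample response shape are hypothetical and used only for illustration; they are not part of the product.

```python
from typing import Any


def resolve_path(payload: Any, path: str) -> Any:
    """Walk a dotted path such as "data.response" through nested dicts and lists."""
    current = payload
    for key in path.split("."):
        if isinstance(current, list):
            # Numeric segments index into lists, e.g. "choices.0.text".
            current = current[int(key)]
        else:
            current = current[key]
    return current


# Hypothetical response body returned by an API endpoint.
sample = {"data": {"response": "Hello!", "tokens": 12}}

print(resolve_path(sample, "data.response"))
```

If the path does not exist in the response, this helper raises a `KeyError` or `IndexError`, which is a useful reminder to check the mapping against a real response before starting a test run.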
You will be redirected automatically to the test runs page, where the output for each input in your test dataset is fetched and then evaluated with the evaluators you chose. This can take a while, depending on the number of entries in the dataset, so feel free to grab a coffee. This test run and all other test runs remain available in the Run tab in the left navigation menu.
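Conceptually, a test run is a loop over the dataset: one workflow call per entry, followed by one score per enabled evaluator. The sketch below illustrates that idea and why run time grows with dataset size. All names here (`run_test`, `call_workflow`, the sample evaluator) are hypothetical, and the workflow is a local stub rather than a real endpoint; this is not the platform's actual implementation.

```python
from typing import Callable, Dict, List

# An evaluator takes (input, output) and returns a score.
Evaluator = Callable[[str, str], float]


def run_test(
    dataset: List[str],
    call_workflow: Callable[[str], str],
    evaluators: Dict[str, Evaluator],
) -> List[dict]:
    """Fetch an output for every dataset entry, then score it with each evaluator."""
    results = []
    for entry in dataset:
        output = call_workflow(entry)  # one call per dataset entry
        scores = {name: ev(entry, output) for name, ev in evaluators.items()}
        results.append({"input": entry, "output": output, "scores": scores})
    return results


def stub_workflow(text: str) -> str:
    """Stand-in for the real API endpoint (assumption for this sketch)."""
    return text.upper()


def non_empty(_input: str, output: str) -> float:
    """Toy evaluator: 1.0 when the output contains any non-whitespace text."""
    return 1.0 if output.strip() else 0.0


results = run_test(["hi", "bye"], stub_workflow, {"non_empty": non_empty})
for row in results:
    print(row)
```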