Integrating Test Runs with GitHub Actions

Learn how to integrate test runs with Maxim's GitHub Action.

GitHub Actions enables you to automate your CI/CD pipeline and provides a powerful way to test, build, and deploy your application. Our GitHub Action integrates seamlessly with your existing deployment workflows, letting you verify that your LLM is behaving as you expect.

Quick Start

To add the GitHub Action to your workflow, add a step that uses maximhq/actions/test-runs@v1, as shown in the workflow below.

Please ensure that you have the following set up:

  • In GitHub Actions secrets:
    • MAXIM_API_KEY
  • In GitHub Actions variables:
    • WORKSPACE_ID
    • DATASET_ID
    • WORKFLOW_ID
.github/workflows/test-runs.yml
name: Run Test Runs with Maxim
 
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
 
env:
  TEST_RUN_NAME: "Test Run via GitHub Action"
  CONTEXT_TO_EVALUATE: "context"
  EVALUATORS: "bias, clarity, faithfulness"
 
jobs:
  test_run:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout Repository
        uses: actions/checkout@v2
      - name: Running Test Run
        id: test_run
        uses: maximhq/actions/test-runs@v1
        with:
          api_key: ${{ secrets.MAXIM_API_KEY }}
          workspace_id: ${{ vars.WORKSPACE_ID }}
          test_run_name: ${{ env.TEST_RUN_NAME }}
          dataset_id: ${{ vars.DATASET_ID }}
          workflow_id: ${{ vars.WORKFLOW_ID }}
          context_to_evaluate: ${{ env.CONTEXT_TO_EVALUATE }}
          evaluators: ${{ env.EVALUATORS }}
      - name: Display Test Run Results
        if: success()
        run: |
          printf '%s\n' '${{ steps.test_run.outputs.test_run_result }}'
          printf '%s\n' '${{ steps.test_run.outputs.test_run_failed_indices }}'
          echo 'Test Run Report URL: ${{ steps.test_run.outputs.test_run_report_url }}'

This triggers a test run on the platform and waits for it to complete before proceeding. The test run's progress is displayed in the Running Test Run step of the GitHub Action's logs, as shown below:

GitHub Action Running Test Run Logs
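
Because the action waits for the test run to finish, you can use the test_run job as a gate for later jobs in the same workflow. Below is a minimal sketch of a follow-on deploy job; the deploy job name and its run command are placeholders, not part of Maxim's action:

jobs:
  # ...test_run job as defined above...
  deploy:
    needs: test_run              # run only if the Maxim test run job succeeds
    runs-on: ubuntu-latest
    steps:
      - name: Deploy Application
        run: echo "Deploying after a passing test run"   # placeholder; replace with your real deploy step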

Inputs

The GitHub Action accepts the following inputs:

| Name | Description | Required |
| --- | --- | --- |
| api_key | Maxim API key | Yes |
| workspace_id | Workspace ID to run the test run in | Yes |
| test_run_name | Name of the test run | Yes |
| dataset_id | Dataset ID for the test run | Yes |
| workflow_id | Workflow ID to run for the test run (do not use with prompt_version_id) | Yes (No if prompt_version_id is provided) |
| prompt_version_id | Prompt version ID to run for the test run (do not use with workflow_id) | Yes (No if workflow_id is provided) |
| context_to_evaluate | Variable name to evaluate; can be any variable used in the workflow / prompt, or a column name | No |
| evaluators | Comma-separated list of evaluator names | No |
| human_evaluation_emails | Comma-separated list of emails to send human evaluations to | No (required if evaluators includes a human evaluator) |
| human_evaluation_instructions | Overall instructions for human evaluators | No |
| concurrency | Maximum number of test run entries running concurrently | No (defaults to 10) |
| timeout_in_minutes | Fail the test run if it takes longer than this many minutes overall | No (defaults to 15 minutes) |
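
For example, to run the test against a prompt version instead of a workflow, pass prompt_version_id in place of workflow_id. The sketch below is illustrative; the PROMPT_VERSION_ID variable name and the evaluator, concurrency, and timeout values are assumptions you would adjust for your own setup:

      - name: Running Test Run
        id: test_run
        uses: maximhq/actions/test-runs@v1
        with:
          api_key: ${{ secrets.MAXIM_API_KEY }}
          workspace_id: ${{ vars.WORKSPACE_ID }}
          test_run_name: "Prompt version test run"
          dataset_id: ${{ vars.DATASET_ID }}
          prompt_version_id: ${{ vars.PROMPT_VERSION_ID }}  # use instead of workflow_id
          evaluators: "bias, clarity"
          concurrency: 5            # optional, defaults to 10
          timeout_in_minutes: 30    # optional, defaults to 15 minutes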

Outputs

If the action does not fail, it provides the following outputs:

| Name | Description |
| --- | --- |
| test_run_result | Result of the test run |
| test_run_report_url | URL of the test run report |
| test_run_failed_indices | Indices of failed test run entries |
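
Downstream steps can read these outputs through the step's id (test_run in the workflow above). The sketch below fails the job when any entries failed; it assumes an empty test_run_failed_indices value (or an empty JSON array) means no failures, which you should verify against your own run output:

      - name: Check Failed Entries
        if: success()
        run: |
          failed='${{ steps.test_run.outputs.test_run_failed_indices }}'
          # Assumption: empty string or empty array means every entry passed.
          if [ -n "$failed" ] && [ "$failed" != "[]" ]; then
            echo "Failed entry indices: $failed"
            echo "Report: ${{ steps.test_run.outputs.test_run_report_url }}"
            exit 1
          fi
          echo "All test run entries passed."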
