Custom Evaluator Examples

Custom API evaluator

To demonstrate this feature, we have developed a demo model for sentiment analysis.

The model assigns a score to a given text: a negative score indicates negative sentiment, a positive score indicates positive sentiment, and a score of 0 indicates neutral (balanced) sentiment. For example:

  • "I hate you" has a score of -0.81, indicating a negative sentiment.
  • "I am so happy to see this rainbow" has a score of 0.51, indicating a positive sentiment.

In the following steps, we will integrate this model into Maxim using the API evaluator.

Analyzer
from textblob import TextBlob

# Use TextBlob for sentiment analysis.
def analyze_sentiment(text):
    # polarity is a float in the range [-1, 1].
    blob = TextBlob(text)
    return blob.sentiment.polarity

The code above uses TextBlob, a library built on top of NLTK and Pattern that measures polarity as part of its sentiment analysis capabilities. Polarity indicates the sentiment expressed in a text and ranges from -1 to +1, where:

  • -1 represents a very negative sentiment,
  • 0 represents a neutral sentiment,
  • +1 represents a very positive sentiment.
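
As a small illustration of the scale above, a polarity score can be turned into a label. This is only a sketch: the function name label_from_polarity and the neutral_band parameter are hypothetical and are not part of TextBlob or Maxim.

```python
def label_from_polarity(polarity: float, neutral_band: float = 0.0) -> str:
    """Map a polarity score in [-1, 1] to a coarse sentiment label.

    neutral_band is a hypothetical tolerance: scores whose absolute value
    is at most neutral_band are reported as neutral.
    """
    if not -1.0 <= polarity <= 1.0:
        raise ValueError("polarity must be between -1 and 1")
    if polarity > neutral_band:
        return "positive"
    if polarity < -neutral_band:
        return "negative"
    return "neutral"
```

With the example scores from earlier, -0.81 maps to "negative" and 0.51 maps to "positive".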

TextBlob analyzes the text by breaking it down into individual words and phrases, then evaluates the sentiment based on a predefined lexicon where words are scored for their positive or negative sentiment. The overall polarity score is the average of the individual scores of words and phrases in the text.
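
The averaging idea described above can be sketched with a toy lexicon. This is a deliberate simplification and not TextBlob's implementation: the lexicon entries and scores below are made up for illustration, and the real analyzer is more involved than a plain word lookup.

```python
# Hypothetical lexicon: these words and scores are invented for illustration.
TOY_LEXICON = {"hate": -0.8, "happy": 0.8, "terrible": -1.0, "good": 0.7}


def toy_polarity(text: str, lexicon: dict = TOY_LEXICON) -> float:
    """Average the lexicon scores of the known words in text.

    Words missing from the lexicon are ignored; if no word is known,
    the text is treated as neutral (0.0).
    """
    words = [w.strip(".,!?").lower() for w in text.split()]
    scores = [lexicon[w] for w in words if w in lexicon]
    if not scores:
        return 0.0
    return sum(scores) / len(scores)
```

A sentence containing both a positive and a negative word ends up between the two scores, which is why mixed texts tend toward 0.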

API-based evaluator
from flask import Flask, request
from textblob import TextBlob

app = Flask(__name__)


def analyze_sentiment(text):
    # Return TextBlob's polarity score for the text, in the range [-1, 1].
    return TextBlob(text).sentiment.polarity


@app.route('/model', methods=['POST'])
def sentiment():
    # Expect a JSON body of the form {"query": "<text to score>"}.
    body = request.get_json(silent=True)
    if not isinstance(body, dict) or 'query' not in body:
        return {'error': 'missing "query" field'}, 400
    # Flask serializes a returned dict into a JSON response.
    return {'response': analyze_sentiment(body['query'])}


if __name__ == '__main__':
    # Run the application on Flask's local development server.
    app.run(port=8000)

We wrapped the analyzer in a Flask app and exposed it publicly as an API using ngrok. The endpoint can now be used on the Maxim platform for evaluation through the API evaluator.
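
Before attaching the endpoint to an evaluator, it can help to call it directly. The following is a minimal sketch using only the Python standard library; the helper names build_request and score_text are hypothetical, and the URL you pass in depends on where your app is running (for example the local server on port 8000, or your ngrok URL).

```python
import json
import urllib.request


def build_request(endpoint_url: str, text: str) -> urllib.request.Request:
    """Build a POST request carrying {"query": text} as a JSON body."""
    payload = json.dumps({"query": text}).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=payload,
        # Flask's request.get_json() expects a JSON content type.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def score_text(endpoint_url: str, text: str, timeout: float = 10.0) -> float:
    """Send text to the sentiment endpoint and return its "response" value."""
    request = build_request(endpoint_url, text)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["response"]
```

With the Flask app running locally, calling score_text("http://localhost:8000/model", "I hate you") should return the polarity score for that sentence.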

To create an API-based evaluator, follow these steps:

Create a new evaluator

  • Select "Evaluators" from the left panel.
  • Click on the "+" icon.
  • Choose "API-based".

Attach your API endpoint

  • Enter the API endpoint URL, the same way you would for a workflow. For this demo, that is the public ngrok URL followed by /model.
  • In the Body, map the "query" field to the output of your application.

Configure Grading

  • Map the score to the endpoint output (in this demo, the "response" field of the returned JSON).
  • Optionally, map the reasoning.
  • Select the pass criteria for each query and for every run.
  • Finally, save the evaluator to use it in the workflow.
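
To make the two levels of pass criteria concrete, here is a rough sketch of how a per-query threshold and a per-run pass rate could combine. It is not how Maxim implements grading: the function names, the default threshold of 0.0, and the 80% pass rate are all assumptions chosen for illustration.

```python
from typing import Sequence


def query_passes(score: float, min_score: float = 0.0) -> bool:
    """A single query passes if its score reaches the (assumed) threshold."""
    return score >= min_score


def run_passes(
    scores: Sequence[float],
    min_score: float = 0.0,
    min_pass_rate: float = 0.8,
) -> bool:
    """A run passes if a large enough share of its queries pass.

    An empty run is treated as failing, which is an arbitrary choice here.
    """
    if not scores:
        return False
    passed = sum(query_passes(s, min_score) for s in scores)
    return passed / len(scores) >= min_pass_rate
```

For instance, a run with scores 0.51, 0.2, and -0.81 has two of three queries passing at a threshold of 0.0, so it passes with a required pass rate of 0.6 but fails with 0.8.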

Test run report

As you can see in the report, when each output is sent to the API evaluator, it returns a polarity score. A score greater than 0 indicates positive sentiment, while a score less than 0 indicates negative sentiment.
