Quick start

Distributed tracing using Maxim

Learn how to use Maxim's distributed tracing solution for your GenAI application.

For this section, let's consider that you are building an enterprise search chatbot (similar to Glean).

  • Companies connect their data sources like Google Drive, Dropbox, etc.
  • Company employees can search for all this information using natural language on Slack or a custom search page.

System architecture

The diagram below represents the system architecture.

[System architecture diagram]

This system consists of five microservices:

  1. API gateway, which handles
    • User authentication
    • API routing
  2. Planner
    • Plans execution for the incoming query
  3. Intent detector
    • Detects the intent of the given query
  4. Answer generator
    • Generates the prompt for fetching answers, based on the planner's instructions and the context fetched from the RAG system
  5. RAG
    • The RAG pipeline that fetches relevant chunks of information from a vector database

Setting up the Maxim dashboard

Create a new repository; let's call it 'Chatbot Production'.

Go to Settings -> API Keys, and generate a new API Key. Copy the API key and store it somewhere safe.

Set up the SDK in your repository.

JS/TS
npm install @maximai/maxim-js
Python
pip install maxim-py
Go
go get github.com/maximhq/maxim-go
Java
compileOnly("ai.getmaxim:sdk:0.1.3")

Initialize the Maxim logger using the API key you generated and the id of the repository you created in the previous steps.

import { Maxim } from "@maximai/maxim-js";
 
const maxim = new Maxim({ apiKey: "api-key" });
const logger = await maxim.logger({ id: "log-repository-id" });

We will initialize a logger in each service using the same repository id.

You can place this code in the main file of your application. Once initialized, you can use the logger to log your application's events.
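For example, you can keep the logger in a shared module so each service builds it once. A minimal sketch, assuming your own environment variable names (MAXIM_API_KEY, MAXIM_LOG_REPO_ID) rather than hard-coded credentials:

// logger.ts — shared module, initialized once per service
import { Maxim } from "@maximai/maxim-js";

// MAXIM_API_KEY and MAXIM_LOG_REPO_ID are our own naming convention
const maxim = new Maxim({ apiKey: process.env.MAXIM_API_KEY! });

export const logger = await maxim.logger({
    id: process.env.MAXIM_LOG_REPO_ID!,
});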

In your route handler (API gateway service), let's create a trace. We will use cf-request-id as our trace id.

const trace = logger.trace({
    id: req.headers["cf-request-id"],
    name: "user-query",
    tags: {
        userId: req.body.userId,
        accountId: req.body.accountId
    },
});

Once this trace is created, you can manipulate it in two ways.

  1. Using the logger and the trace id
// Adding a new tag
logger.traceTag(req.headers["cf-request-id"], "newTag", "newValue");
logger.traceEnd(req.headers["cf-request-id"]);
  2. Creating the trace object again with the same trace id
const trace = logger.trace({ id: req.headers["cf-request-id"] });
// Adding a new tag
trace.addTag("newTag", "newValue");
trace.end();

With the second approach you don't have to pass all the parameters again; recreating the trace object with just the id lets you update tags or end it.

You can manipulate every component of the Maxim observability framework (Span, Generation, Retrieval, Event) in the same way.
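For instance, a span could be tagged and ended by id from anywhere in a service, assuming the logger exposes span helpers analogous to traceTag/traceEnd above. The method names below are illustrative, not confirmed; check the SDK reference for the exact API.

// Hypothetical helpers, by analogy with logger.traceTag / logger.traceEnd
logger.spanTag(spanId, "newTag", "newValue");
logger.spanEnd(spanId);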

Now let's create a span for the planner in the planner service. We will again use cf-request-id as our trace id.

// Getting hold of trace using request id / trace id
const trace = logger.trace({id: req.headers["cf-request-id"]});
// Creating a new span
const span = trace.span({
    id: uuid(),
    name: "plan-query",
    tags: {
        userId: req.body.userId,
        accountId: req.body.accountId
    },
});

Now let's add the LLM call, i.e. a Generation, to the planner span.

// Creating a new generation
const generation = span.generation({
    id: uuid(),
    name: "plan-query",
    provider: "openai",
    model: "gpt-3.5-turbo-16k",
    modelParameters: { temperature: 0.7 },
    tags: {
        userId: req.body.userId,
        accountId: req.body.accountId
    },
});

Once you receive a response, you can log it as follows.

generation.result({
    id: uuid(),
    object: "chat.completion",
    created: Date.now(),
    model: "gpt-3.5-turbo-16k",
    choices: [{
        index: 0,
        message: {
            role: "assistant",
            content: "response"
        },
        finish_reason: "stop"
    }],
    usage: {
        prompt_tokens: 100,
        completion_tokens: 50,
        total_tokens: 150
    }
});

Maxim currently supports the OpenAI messaging format, and the SDK includes helper methods to convert other messaging formats to it. Calling .result on a generation also ends the generation.
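Because the result payload above mirrors an OpenAI chat completion response, you can typically pass the provider response straight through. A minimal sketch, assuming the official openai npm package (the prompt content is a placeholder):

import OpenAI from "openai";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

// The actual LLM call that this generation tracks
const completion = await openai.chat.completions.create({
    model: "gpt-3.5-turbo-16k",
    temperature: 0.7,
    messages: [{ role: "user", content: "plan this query..." }],
});

// The response already carries choices and usage in the shape shown
// above, so it can be logged directly as the generation result
generation.result(completion);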

Using the trace object initialized with the same cf-request-id, you can create additional spans across your entire request lifecycle in any service. Each span represents a distinct operation or service call in your application, such as database queries, external API calls, or processing steps.
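For example, the RAG service can log its vector database lookup as a Retrieval on the same trace. A sketch by analogy with the components above; userQuery and retrievedChunks are placeholders, and the retrieval method names should be checked against the SDK reference:

// Getting hold of trace using request id / trace id
const trace = logger.trace({ id: req.headers["cf-request-id"] });
// Logging the vector DB lookup as a retrieval
const retrieval = trace.retrieval({
    id: uuid(),
    name: "vector-db-search",
});
retrieval.input(userQuery);        // query sent to the vector database
retrieval.output(retrievedChunks); // chunks that came back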

When creating spans, consider adding relevant tags that provide context about the operation being performed. These tags help in filtering and analyzing traces later. Remember to end each span once its operation completes to ensure accurate timing measurements.
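A small sketch of that pattern, wrapping the work in try/finally so the span ends even when the operation throws:

const span = trace.span({ id: uuid(), name: "fetch-context" });
try {
    // ... perform the operation this span measures ...
} finally {
    span.end(); // ensures the recorded duration stays accurate
}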

As these operations are executed across your services, the logs will appear within a few seconds on your Maxim dashboard. The dashboard provides a comprehensive view of your entire request trace, including all spans, generations, and retrievals, giving you end-to-end visibility into your application's behavior and performance.
