Concepts

Learn about the key concepts of Maxim's AI Observability

Maxim's observability platform builds upon established distributed tracing principles while extending them for GenAI-specific monitoring. Our architecture leverages proven concepts and enhances them with specialized components for AI applications.

Log Repository

A log repository in Maxim is a dedicated storage and management system for collecting, organizing, and maintaining logs from GenAI services. Each repository functions as a discrete unit within the platform's observability framework.

Single vs. Multiple Repositories

  • Single Repository: Consolidated logging for all system components in a single repository.
  • Multiple Repositories: Separate repositories per service.

We recommend implementing separate log repositories for services managed by distinct teams.
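
For illustration, here is how a service might obtain a logger bound to its own repository using the TypeScript SDK. This is a minimal sketch: the package name, the Maxim constructor options, and the logger({ id }) call are assumptions based on typical SDK setup, and the repository ID is a placeholder.

```typescript
import { Maxim } from "@maximai/maxim-js";

// Sketch only: constructor and method names are illustrative, not an API reference.
const maxim = new Maxim({ apiKey: process.env.MAXIM_API_KEY! });

// Bind a logger to one log repository (placeholder ID). With multiple repositories,
// each service would create its own logger with its own repository ID.
const logger = await maxim.logger({ id: "log-repo-support-bot" });
```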

Overview Tab

The Overview tab provides a comprehensive snapshot of activity within your log repository for your specified timeframe. You can slice and dice these metrics using filters.

Logs

The Logs tab displays detailed records of AI interactions in a tabular format.

Each log entry contains:

  • Timestamp (IST): date and time of the interaction
  • Input: the user's prompt or query
  • Output: the AI model's response, shown in JSON format with intent classification
  • Model: the AI model used (e.g., gpt-4o)
  • Tokens: token count per interaction
  • Cost: cost per interaction in USD
  • Latency: response time in milliseconds
  • User feedback: user feedback recorded for the trace (if available)

The interface includes filtering capabilities and time range selection via the 'Last 1 hour' dropdown and 'Filter' button. Users can toggle between Traces and Sessions views for different analytical perspectives.

Evaluation

The Evaluation tab lets you review online evaluations as they happen. You can configure your log repositories to be evaluated in real time based on a sampling rate, a set of filtering rules, and a set of evaluators. Learn more about it here.

Alerts

Once you start logging, you can set up alerts to monitor specific metrics. Alerts are triggered when a metric exceeds a configured threshold, and they can notify you via Slack, PagerDuty, or Opsgenie.

Components of logs

The following are the components of a log:

Session

A session is a top-level entity that captures all the multi-turn interactions of your system. For example, if you are building a chatbot, a session in the Maxim logger is an entire conversation between a user and your chatbot.

Sessions are long-running entities: you can keep adding traces to a session over time until you explicitly close it.
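
Continuing the logger from the sketch above, a chatbot could map one conversation to a session and append a trace per user turn. The session(), trace(), and end() names and their option shapes are illustrative assumptions rather than a definitive API reference.

```typescript
// Sketch: a long-running session that accumulates one trace per chat turn.
const session = logger.session({ id: conversationId, name: "support-chat" });

// Turn 1
const turn1 = session.trace({ id: `${conversationId}-turn-1`, input: "How do I reset my password?" });
turn1.end();

// ...later, the same session keeps accepting new traces...
const turn2 = session.trace({ id: `${conversationId}-turn-2`, input: "The reset link is broken" });
turn2.end();

// ...until you explicitly close it when the conversation is over.
session.end();
```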

Trace


In distributed tracing, a trace is the complete processing of a request through a distributed system, including all the actions between the request and the response.

  • id: unique identifier for the trace. This can usually be your request ID.
  • name (optional): name of the trace. You can keep it the same as your API call, e.g., chatQuery.
  • tags (optional): key-value pairs you can use for filtering on the dashboard. There is no hard limit on the number of tags or the size of the strings, but fewer and shorter tags keep search on your repository faster.
  • input (optional): typically the user message sent to the API call. We use this input to show it on the logs dashboard. For example, in the example dashboard, trace.input = About Slack connect approve.
  • output (optional): typically the final response your API sends back. We use this output to show it on the logs dashboard; for example, in the example dashboard, trace.output is the final response shown for that trace.
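
Putting these properties together, a trace for a single API request might be created roughly as below, continuing the logger from the earlier sketch. The option names and the input()/output() setters are assumptions for illustration.

```typescript
// Sketch: one trace per incoming request, using the properties described above.
const trace = logger.trace({
  id: requestId,                                 // usually your request ID
  name: "chatQuery",                             // can mirror your API call
  tags: { env: "production", tenant: "acme" },   // free-form key-value pairs for filtering
});

trace.input("About Slack connect approve");      // shown as Input on the logs dashboard
// ...handle the request...
trace.output(finalResponseText);                 // shown as Output on the logs dashboard
trace.end();
```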

Span


Spans are fundamental building blocks of distributed tracing. A single trace in distributed tracing consists of a series of tagged time intervals known as spans. Spans represent a logical unit of work in completing a user request or transaction.

Sub-spans

A span can have other spans as children. You can create as many sub-spans as you want to logically group flows within a span.

  • id: unique identifier for the span. It can be uuid() for each new span, and it has to be unique across all elements in a trace; if ids are duplicated, data gets overwritten.
  • name: name of the span. You can name it after the step it represents in your workflow.
  • tags: span-specific tags (key-value pairs).
  • spanId: parent span id. This is set automatically when you call span.addSpan().
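
As a rough sketch, work inside the trace above could be grouped into a span with a nested sub-span created via span.addSpan(); the id/name/tags option shape is assumed for illustration.

```typescript
import { v4 as uuid } from "uuid";

// Sketch: a span groups a logical unit of work inside the trace.
const ragSpan = trace.span({ id: uuid(), name: "rag-pipeline", tags: { stage: "retrieval" } });

// Sub-spans group flows within the parent span; the parent span id is set automatically.
const rerankSpan = ragSpan.addSpan({ id: uuid(), name: "rerank-chunks" });
rerankSpan.end();

ragSpan.end();
```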

Event


Events mark significant points within a span or a trace, recording instantaneous occurrences that provide additional context for understanding system behavior.

  • id: unique identifier for the event. It has to be unique across all elements in a trace.
  • name: name of the event; name it after the occurrence it marks.
  • tags: event-specific tags.
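
For example, an instantaneous occurrence such as a cache miss could be recorded as an event on the trace (or span) from the earlier sketches; the event() signature shown is an assumption.

```typescript
// Sketch: events are point-in-time markers, so there is nothing to end() afterwards.
trace.event({ id: uuid(), name: "cache-miss", tags: { cacheKey: "user:42:profile" } });
```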

Generation

A Generation represents a single Large Language Model (LLM) call within a trace or span. Multiple generations can exist within a single trace/span.

Structure

  • Maxim SDK uses OpenAI's LLM call structure as the standard format.
  • All incoming LLM calls are automatically converted to match OpenAI's structure via SDK.
  • This standardization ensures consistent handling across different LLM providers.

  • id: unique identifier for the generation. It has to be unique within a trace.
  • name: name of the generation. It can be specific to your workflow, e.g., intent detection or final summarization call.
  • tags: key-value pairs; generation-specific tags.
  • messages: the messages you are sending to the LLM as input.
  • model: the model being used for this LLM call.
  • model_parameters: the model parameters you are setting. This is a key-value map, and you can pass any model parameter.
  • error: the LLM error, if one has occurred. You can filter all logs with LLM errors on the dashboard using filters.
  • result: the result object coming from the LLM.
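
Below is a hedged sketch of logging an LLM call as a generation on the trace from the earlier sketches. The fields mirror the list above (messages, model, model parameters, error, result), but the exact SDK signature, the provider field, and the callLlm() helper are illustrative assumptions.

```typescript
// Sketch: one generation per LLM call; multiple generations can live on the same trace/span.
const generation = trace.generation({
  id: uuid(),
  name: "intent-detection",
  model: "gpt-4o",
  provider: "openai",                                     // assumption: provider field name
  messages: [
    { role: "system", content: "Classify the user's intent." },
    { role: "user", content: "About Slack connect approve" },
  ],
  modelParameters: { temperature: 0, max_tokens: 256 },   // the doc's model_parameters, camelCased here
  tags: { step: "intent" },
});

try {
  const completion = await callLlm();                     // your own LLM call (hypothetical helper)
  generation.result(completion);                          // result object, OpenAI-shaped after SDK conversion
} catch (err) {
  generation.error({ message: String(err) });             // filterable as an LLM error on the dashboard
}
```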

Retrieval

A Retrieval represents a query operation to fetch relevant context or information from a knowledge base or vector database within a trace or span. It is commonly used in RAG (Retrieval Augmented Generation) workflows, where context needs to be fetched before making LLM calls.

  • id: unique identifier for the retrieval. It has to be unique within a trace.
  • name: name of the retrieval. Name it after the step in your workflow, e.g., a knowledge base lookup.
  • tags: key-value pairs; retrieval-specific tags.
  • input: the input used to fetch relevant chunks from your knowledge base.
  • output: the array of chunks returned by the knowledge base.
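
Finally, a knowledge-base query in a RAG flow could be recorded as a retrieval on the same trace. The retrieval(), input(), and output() calls and the vectorStore.search() helper are illustrative assumptions.

```typescript
// Sketch: capture the query and the returned chunks around your own retrieval call.
const retrieval = trace.retrieval({ id: uuid(), name: "kb-lookup", tags: { index: "help-center" } });

retrieval.input("How do I approve a Slack connect request?");
const chunks = await vectorStore.search("How do I approve a Slack connect request?", { topK: 5 }); // hypothetical helper
retrieval.output(chunks.map((c) => c.text));              // array of chunks from the knowledge base
retrieval.end();
```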
