Maxim Logo

Concepts

Learn about the key concepts of Maxim's AI Observability.

Maxim's observability platform builds upon established distributed tracing principles while extending them for GenAI-specific monitoring. Our architecture leverages proven concepts and enhances them with specialized components for AI applications.

Our observability features provide a comprehensive view of your AI applications through monitoring, troubleshooting, and alerting capabilities.

Log Repository

The most central component of Maxim's observability platform is the Log Repository, as it is where logs are ingested. It allows for ease of searching and analyzing.

How should you be structuring log repositories?

Split your logs across multiple repositories in your workspace based on your needs. For example, you might have one repository for production logs and another for development logs; or you could have a single repository for all logs, differentiated by tags indicating the environment.

We recommend implementing separate log repositories for separate applications, but also separate log repositories for separate services that managed by distinct teams. Trying to keep logs that are related to different applications or services in the same repository can lead to difficulties in analyzing and troubleshooting.

A Log Repository contains three main components:

Overview Tab

The Overview tab provides a comprehensive snapshot of activity within your Log Repository for your specified timeframe. The metrics that are displayed in the overview include:

  • Total traces
  • Total usage
  • Average user feedback
  • Traces over time (graph)
  • Total usage over time (graph)
  • Average user feedback over time (graph)
  • Latency over time (graph)
  • Error rate (graph)

![Screenshot of overview metrics]

There is also an overview for Evaluation results, which includes:

  • No of traces evaluated
  • Evaluation summary
  • Mean score over time (graph)
  • Pass rate over time (graph)

![Screenshot of evaluation results overview metrics]

We talk more about evaluation logs in How to evaluate logs section of the documentation.

Logs Tab

Logs Tab is where you'll find all the ingested logs in a tabular format, it is a detailed view where each log is displayed separately with it's own metrics.

![Screenshot of the logs tab]

For each log entry we display the following in the table:

FieldDescription
TimestampThe start time of the log
InputThe user's prompt or query
OutputThe final response
ModelThe AI models seen within the trace (e.g., gpt-4o)
TokensToken count for the whole trace
CostThe cost for the whole trace in USD
LatencyResponse time in milliseconds
User feedbackUser feedback for a trace (if available)
TagsTags associated with the log entry

Apart from the above fields, there are Evaluation score fields per evaluator too, these display any evaluation done on the trace and what score was obtained for that evaluation.

Each trace can be clicked to see an expanded view of the trace in a sheet. It displays detailed information on each component of the trace and also the evaluations done on each component.

Learn more about logging in the How to log your application section of the documentation.

Alerts Tab

Once you start logging, you can set up alerts to monitor specific metrics. Alerts are triggered when the metric exceeds a certain threshold, and they can be configured to notify you via slack, pagerduty or opsgenie.

![Screenshot of the alerts tab]

You can learn more on configuring alerts in the How to set up alerts section of the documentation.

Components of a log

Now that you know how to navigate through a Log Repository, let us discuss what each log contains and what each component of the log and their properties represent.

Session

Session is a top level entity that captures all the multi-turn interactions of your system. For example, if you are building a chatbot, a session in Maxim logger is an entire conversation of a user with your chatbot.

Sessions persist across multiple interactions and remain active until explicitly closed - you can keep on adding different traces to it over the course of time unless you want to explicitly close the session.

Trace

trace

In distributed tracing, a trace is the complete processing of a request through a distributed system, including all the actions between the request and the response.

PropertyDescription
idUnique identifier for the trace; this usually can be your request id.
name (optional)Name of the trace; you can keep it same as your API call. i.e. chatQuery
tags (optional)Key-value pairs which you can use for filtering on the dashboard.

There is no limit on the number of tags or the size of the string, although lower amounts are better and faster for search performance specific to your repo.
input (optional)The user's prompt or query.
output (optional)This is the final response your API sends back.

Span

span

Spans are fundamental building blocks of distributed tracing. A single trace in distributed tracing consists of a series of tagged time intervals known as spans. Spans represent a logical unit of work in completing a user request or transaction.

Sub-spans

A span can have other spans as children. You can create as many subspans as you want to logically group flows within a span.

PropertyDescription
idUnique identifier for the span; it can be uuid() for each new trace. This has to be unique across all elements in a trace. If you these are duplicated, data gets overwritten.
nameName of the span; you can keep it same as your API call. i.e. chatQuery
tagsThese are span specific tags.
spanIdParent span id; these are automatically generated when you call span.addSpan().

Event

event

Events mark significant points within a span or a trace recording instantaneous occurrences that provide additional context for understanding system behavior.

PropertyDescription
idUnique identifier for the event; this has to be unique in a trace across all elements
nameName of the event. you can keep it same as your API call. i.e. chatQuery
tagsThese are event specific tags.

Generation

A Generation represents a single Large Language Model (LLM) call within a trace or span. Multiple generations can exist within a single trace/span.

Structure

  • Maxim SDK uses OpenAI's LLM call structure as the standard format.
  • All incoming LLM calls are automatically converted to match OpenAI's structure via SDK.
  • This standardization ensures consistent handling across different LLM providers.
PropertyDescription
idUnique identifier for the generation; this has to be unique in a trace.
nameName of the generation; it can be specific to your workflow intent detection or final summarization call.
tagsThese are generation specific tags.
messagesThe messages you are sending to LLM as input.
modelModel that is being used for this LLM call.
model_parametersThe model parameters that you are setting up. This is a key-value pair; you can pass any model parameter.
errorYou can pass LLM error if it has occurred. You can filter all logs with LLM error on the dashboard using filters.
resultResult object coming from the LLM.

Retrieval

A Retrieval represents a query operation to fetch relevant context or information from a knowledge base or vector database within a trace or span. It is commonly used in RAG (Retrieval Augmented Generation) workflows, where context needs to be fetched before making LLM calls.

PropertyDescription
idUnique identifier for the retrieval; this has to be unique in a trace.
nameName of the retrieval; it can be specific to your workflow intent detection or final summarization call .
tagsThese are retrieval specific tags.
inputInput used to fetch relevant chunks from your knowledge base
outputArray of chunks returned by the knowledge base

Tool Call

Tool Call represents an external system or service call done based on an LLM response. Each tool call can be logged separately to track its input, output and latency.

PropertyDescription
idUnique identifier for the tool call; this has to be unique in a trace (can be found in the tool call reponse of the LLM).
nameName of the tool call.
descriptionDescription of the tool call.
argsArguments passed to the tool call, these are tool specific arguments.
resultResult returned by the tool call.

On this page