Learn about the key concepts of Maxim's AI Observability
Maxim's observability platform builds upon established distributed tracing principles while extending them for GenAI-specific monitoring. Our architecture leverages proven concepts and enhances them with specialized components for AI applications.
A log repository in Maxim is a dedicated storage and management system for collecting, organizing, and maintaining logs from GenAI services. Each repository functions as a discrete unit within the platform's observability framework.
Single vs. Multiple Repositories
Single Repository: Consolidates logging for all system components into one repository.
Multiple Repositories: Uses a separate repository per service.
We recommend implementing separate log repositories for services managed by distinct teams.
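As a minimal sketch (assuming the TypeScript SDK `@maximai/maxim-js` and an initialization flow along these lines; check the SDK reference for exact signatures), each service would point its logger at its own repository id:

```ts
import { Maxim } from "@maximai/maxim-js";

// One logger per service, each writing to that team's log repository.
const maxim = new Maxim({ apiKey: process.env.MAXIM_API_KEY! });
const logger = await maxim.logger({ id: "<service-a-log-repository-id>" });
```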
The Overview tab provides a comprehensive snapshot of activity within your log repository for your specified timeframe. You can slice and dice these metrics using filters.
The Logs tab displays detailed records of AI interactions in a tabular format.
Each log entry contains:

| Field | Description |
| --- | --- |
| Timestamp (IST) | Date and time of the interaction |
| Input | The user's prompt or query |
| Output | The AI model's response, shown in JSON format with intent classification |
| Model | The AI model used (e.g., gpt-4o) |
| Tokens | Token count for the interaction |
| Cost | Cost of the interaction in USD |
| Latency | Response time in milliseconds |
| User feedback | User feedback recorded for the trace (if available) |
The interface includes filtering capabilities and time range selection via the 'Last 1 hour' dropdown and 'Filter' button. Users can toggle between Traces and Sessions views for different analytical perspectives.
The Evaluation tab lets you follow online evaluations as they happen. You can configure log repositories to be evaluated in real time based on a sampling rate, a set of rules, and a set of evaluators. Learn more about it here.
Once you start logging, you can set up alerts to monitor specific metrics. Alerts are triggered when a metric exceeds a threshold you configure, and they can notify you via Slack, PagerDuty, or Opsgenie.
A session is a top-level entity that captures all the multi-turn interactions of your system. For example, if you are building a chatbot, a session in the Maxim logger is an entire conversation between a user and your chatbot.
Sessions are long-running entities: you can keep adding traces to a session over time until you explicitly close it.
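As a minimal sketch (assuming the SDK exposes `logger.session(...)` and `session.trace(...)` roughly as shown; the ids are illustrative), a chatbot conversation could be logged like this:

```ts
// Create a session for one conversation, using the logger from the earlier sketch.
const session = logger.session({ id: "conversation-123", name: "support-chat" });

// Each user turn becomes a trace attached to the session.
const trace = session.trace({ id: "request-456", name: "chatQuery" });
// ... handle the turn ...
trace.end();

// Keep adding traces for later turns, then close the session explicitly.
session.end();
```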
In distributed tracing, a trace is the complete processing of a request through a distributed system, including all the actions between the request and the response.

| Property | Description |
| --- | --- |
| id | Unique identifier for the trace; this can usually be your request id. |
| name (optional) | Name of the trace; you can keep it the same as your API call, e.g., chatQuery. |
| tags (optional) | Key-value pairs you can use for filtering on the dashboard. There is no limit on the number of tags or on string size, but fewer and smaller tags keep search within your repository faster. |
| input (optional) | Typically the user message sent in the API call. Maxim shows this input on the logs dashboard; in the example dashboard, trace.input = About Slack connect approve. |
| output (optional) | Typically the final response your API sends back. Maxim shows this output on the logs dashboard as trace.output. |
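A minimal sketch of starting a trace per request, assuming the SDK exposes `logger.trace(...)` with `input`/`output` helpers (exact signatures may differ):

```ts
// Start a trace when a request arrives; the request id doubles as the trace id.
const trace = logger.trace({
  id: "request-456",          // unique per request
  name: "chatQuery",          // same as the API call being handled
  tags: { channel: "slack" }, // illustrative key-value tags for filtering
});

trace.input("About Slack connect approve"); // what the user sent
// ... run retrieval, LLM calls, tools ...
trace.output("Here is how to approve a Slack Connect request..."); // final API response
trace.end();
```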
Spans are fundamental building blocks of distributed tracing. A single trace in distributed tracing consists of a series of tagged time intervals known as spans. Spans represent a logical unit of work in completing a user request or transaction.
Sub-spans
A span can have other spans as children. You can create as many sub-spans as you want to logically group flows within a span.

| Property | Description |
| --- | --- |
| id | Unique identifier for the span; it can be a uuid() generated for each new span. It has to be unique across all elements in a trace; if ids are duplicated, data gets overwritten. |
| name | Name of the span; name it after the logical unit of work it represents. |
| tags | Span-specific key-value tags. |
| spanId | Parent span id. This is set automatically when you call span.addSpan(). |
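A minimal sketch of grouping work with spans, assuming `trace.span(...)` exists and that `span.addSpan(...)` (named in the table above) accepts a similar config; both shapes are assumptions to verify against the SDK reference:

```ts
// Group a logical unit of work inside the trace from the earlier sketch.
const span = trace.span({ id: "span-1", name: "generate-answer" });

// Attach a child span; span.addSpan() links the parent span id automatically.
const subSpan = span.addSpan({ id: "span-1-1", name: "rank-context" });

subSpan.end();
span.end();
```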
Events mark significant points within a span or a trace, recording instantaneous occurrences that provide additional context for understanding system behavior.

| Property | Description |
| --- | --- |
| id | Unique identifier for the event; it has to be unique across all elements in a trace. |
| name | Name of the event; use a name that describes the occurrence it marks. |
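A minimal sketch, assuming events can be attached to a trace or span via an `event(...)` call (the argument shape is an assumption):

```ts
// Record instantaneous occurrences on the trace and span from the sketches above.
trace.event({ id: "event-1", name: "cache-miss" });
span.event({ id: "event-2", name: "guardrail-triggered" });
```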
A Retrieval represents a query operation to fetch relevant context or information from a knowledge base or vector database within a trace or span. It is commonly used in RAG (Retrieval Augmented Generation) workflows, where context needs to be fetched before making LLM calls.

| Property | Description |
| --- | --- |
| id | Unique identifier for the retrieval; it has to be unique within a trace. |
| name | Name of the retrieval; it can be specific to your workflow, e.g., an intent detection or final summarization call. |
| tags | Retrieval-specific key-value tags. |
| input | The query used to fetch relevant chunks from your knowledge base. |
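A minimal sketch of logging a RAG lookup, assuming the SDK exposes `trace.retrieval(...)` with `input`/`output` helpers; `searchKnowledgeBase` is a hypothetical stand-in for your own vector-store client:

```ts
// Log the knowledge-base lookup that happens before the LLM call.
const retrieval = trace.retrieval({ id: "retrieval-1", name: "kb-lookup" });

const query = "How do I approve a Slack Connect request?";
retrieval.input(query);

const chunks = await searchKnowledgeBase(query); // hypothetical helper returning text chunks
retrieval.output(chunks); // the context that will be passed to the LLM
```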