Maxim AI Blog

Latest

Meta-Harness: What if we let an agent optimize the code around an LLM?

There's a pattern that anyone who's shipped an LLM-powered product has run into: you pick a model, you wire it up, and then you spend the next three months discovering that the thing around the model matters as much as the model itself. What you

The Receipts Are Real, but So Is the Playbook: Making Sense of Anthropic's Mythos Moment

From Drowning in Logs to Conversing with Your Data: Introducing Maxmallow

Computation Beyond Tool Use: Executing Programs Inside a Transformer

Attention Residuals: What If Your Network Could Choose Which Layer to Listen To?

Speculative Speculative Decoding: How Researchers Are Teaching LLMs to Think Ahead of Themselves

PostTrainBench: How Far Can AI Agents Go in Automating LLM Post-Training?

Research Paper

Meta-Harness: What if we let an agent optimize the code around an LLM?

There's a pattern that anyone who's shipped an LLM-powered product has run into: you pick a model, you wire it up, and then you spend the next three months discovering that the thing around the model matters as much as the model itself. What you

Computation Beyond Tool Use: Executing Programs Inside a Transformer

Attention Residuals: What If Your Network Could Choose Which Layer to Listen To?

Speculative Speculative Decoding: How Researchers Are Teaching LLMs to Think Ahead of Themselves

PostTrainBench: How Far Can AI Agents Go in Automating LLM Post-Training?

Is AI Distillation Theft or Just How Knowledge Evolves?

Post-Training Doesn't Create Your Model's Character. It Inherits One

Maxim Updates

December 2025 - Updates

Logging and observability overhaul, MCP gateway, Evals on file attachments, and more

🎙️ Feature spotlight

🔀 Collaborative conflict resolution for Prompt changes

To help teams collaborate on prompts without accidentally overwriting each other’s work, we’ve introduced session conflict resolution in the prompt playground. Here’s what’s new:

* You’ll now land on your last active session instead of the prompt’s

November 2025 Updates - Maxim AI

✨ Flexible data curation, Cost charts, Reasoning column, and more

Synthetic data generation, Retro evals, Workspace-level RBAC and more

✨ Audit logs, Guardrails, Responses API support, and more

✨ Voice simulation, Flexi evals, Adaptive load balancing, and more

✨ Prompt simulations, File attachments, Claude 4, and more

✨ Bifrost, Voice agent support, CrewAI integration, and more

Guides

xMemory: Why Top-k Retrieval Breaks for Agent Memory

Introduction

LLM agents no longer begin and end in a single context window. We’re now in the era of cross-session, long-running agents. Products like Claude Code, OpenClaw, and other agentic workflows are built to carry context across days of work, not minutes. The bottleneck is not context

What are Online Evaluations and How to Set Them Up for Your AI System Using Maxim AI

Building an AI Product Review Analyzer: Structured Outputs with Together AI and Maxim Observability

Building a Resume Checker with LlamaIndex and Maxim Observability

👀 Observing Tool Calls 🔨 and JSON Mode Responses from Fireworks AI

When Your AI Can't Tell the Difference Between "Fine" and Frustration

When Your AI Transcription Turns "Tasty Burger" Into "Nasty Murder"

LLMs

The Discipline Layer: Harnesses as the Missing Piece in Autonomous Coding

Breaking the Context Window: How Recursive Language Models Handle Infinite Input

Are Small Language Models the Future of Agentic AI?

When Your AI Can't Tell the Difference Between "Fine" and Frustration

When Your AI Transcription Turns "Tasty Burger" Into "Nasty Murder"

Evaluation

Voice Simulation: Testing Voice Agents the Way Users Experience Them

Beyond the SDK: Why AI Teams Love HTTP Endpoint-Based Evals

Building a Customer Support AI Agent with AWS Bedrock and Testing It at Scale

What are Offline Evaluations and How to Set Them Up for Your AI System Using Maxim AI

What are Online Evaluations and How to Set Them Up for Your AI System Using Maxim AI

Observability

From Drowning in Logs to Conversing with Your Data: Introducing Maxmallow

Basics of AI Observability: Sessions, Traces, and Spans

Monitor AI Applications in Real-Time with Maxim's Enterprise-Grade LLM Observability Platform

Choosing the Right AI Agent Framework: A Comprehensive Guide

Building a Simple Second Brain AI Agent with Vercel AI SDK & Maxim AI