Introduction to Sammy
Sammy is an auto-configuring AI agent framework that scans your codebase, understands your APIs and business logic, and generates production-ready AI agents — all without requiring AI expertise.
Most AI agent frameworks require you to manually define agents, write tool schemas, wire up handlers, and decide on orchestration patterns. Sammy inverts this — it treats your codebase as the source of truth.
The pipeline: Scan your codebase to understand what it does, determine the optimal agent architecture, generate all tools and schemas, evaluate and refine in a closed loop, then deploy a production-ready system.
Quick Start
Go from an existing Next.js app to a working, tool-using AI assistant at /chat in about 10 minutes. The CLI scaffolds the wiring for you.
Prerequisites
- A Next.js project (App Router) with some API routes or server functions
- Node.js 18+
- A Sammy API key (create one in your dashboard)
Step 1: Install the SDK
The package is sammy-sdk; it ships the sammy CLI.
npm install sammy-sdk
Step 2: Scan your codebase
Add your API key to .env.local, then run discovery — it writes a reviewable sammy.config.json.
# .env.local
SAMMY_API_KEY=sk_sammy_your_key_here
npx sammy init
Step 3: Generate tools & agents
npx sammy generate # writes .sammy/tools + .sammy/agents
npx sammy eval # optional: self-evaluate and refine
Step 4: Scaffold the wiring
This writes the server helper, the chat and history API routes, and a /chat page, and patches next.config, middleware, and .env.local. See the Scaffolding section for details.
npx sammy scaffold next # add --dry-run to preview first
Step 5: Install the chat UI
The scaffolded /chat page renders SammyChatApp, so install the frontend packages:
npm install sammy-sdk-ui sammy-sdk-react sammy-sdk-client
Step 6: Run it
For local demos without logging in, set a real seeded user id so write tools that persist foreign keys succeed (scaffold adds the placeholder):
# .env.local
SAMMY_DEV_USER_ID=<a-real-user-id-from-your-db>
npm run dev # then open http://localhost:3000/chat
Your assistant is live at /chat with streaming, history, and a New chat button. Before shipping, swap the in-memory store for a real one and wire up real auth: see the Conversation Storage and Deployment sections.
Scaffolding
sammy scaffold next writes the Next.js App Router wiring so you don't have to copy it by hand. It is idempotent and never overwrites your files — anything it can't edit safely is printed as a manual step.
npx sammy scaffold next
npx sammy scaffold next --dry-run # preview without writing
What it creates
lib/sammy-server.ts
A single Sammy instance (via globalThis) plus resolveUserFromSession.
app/api/chat/route.ts
The chat endpoint (POST, JSON + SSE streaming).
app/api/chat/conversations/…
History routes: list, get one, delete.
app/chat/page.tsx
A ready-to-use page rendering SammyChatApp.
What it patches
next.config — adds serverExternalPackages: ["sammy-sdk", "tsx"] (and outputFileTracingRoot for nested monorepo apps).
middleware — if you have a PUBLIC_PATHS allowlist, it adds /chat and /api/chat so the UI isn't redirected to login in dev.
.env.local — adds a SAMMY_DEV_USER_ID= placeholder.
Auto-detection: scaffold detects your src/ layout, your @/ path alias, and your auth provider (NextAuth, Clerk, …) and adapts the generated code accordingly.
CLI Reference
All Sammy CLI commands and their options.
| Command | Description |
|---|---|
| sammy init | Scan your codebase and generate sammy.config.json |
| sammy generate | Generate tools and agents from sammy.config.json |
| sammy scaffold next | Scaffold Next.js App Router wiring (server, routes, chat page) |
| sammy eval | Run the evaluation loop to test and refine agents |
| sammy dev | Start the Sammy dev server with hot-reload |
Discovery Engine
The discovery engine is Sammy's core differentiator. It performs static analysis combined with LLM-powered semantic understanding to build a complete capability map of your application.
What Gets Scanned
Route Handlers
HTTP method, path, params, response shape, auth requirements, side effects
Server Actions
Function signatures, parameters, return types, side effects
Database Models
Prisma, Drizzle, Mongoose schemas — fields, relations, types
External Services
Stripe, SendGrid, Twilio, AWS — detected via env vars and imports
Auth System
Provider (NextAuth, Clerk, etc.), roles, permissions, session shape
Business Logic
Service files, shared types, constants, utility functions
How It Works
1. Framework Detection — Sammy reads your package.json and project structure to identify Next.js, Express, Fastify, etc.
2. Static Scanning — Framework-specific scanners extract routes, models, and services using AST parsing and pattern matching. No LLM calls — this is fast, free, and local.
3. LLM Analysis — The structured capability map is sent to a powerful-tier model which clusters capabilities into business domains, names tools, and recommends an agent architecture.
4. Config Output — Results are written to sammy.config.json — a human-readable, editable file you review before proceeding.
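To make step 1 concrete, here is a minimal sketch of dependency-based framework detection. It is not Sammy's actual implementation: the names `detectFramework`, `PackageJson`, and `FRAMEWORK_MARKERS` are hypothetical, and a real scanner would presumably also inspect the project structure.

```typescript
// Hypothetical sketch: none of these names come from sammy-sdk.
type Framework = "nextjs" | "express" | "fastify" | "unknown";

interface PackageJson {
  dependencies?: Record<string, string>;
  devDependencies?: Record<string, string>;
}

// Checked in order. "next" comes first because a Next.js app may also
// depend on express (for example, for a custom server).
const FRAMEWORK_MARKERS: ReadonlyArray<[string, Framework]> = [
  ["next", "nextjs"],
  ["fastify", "fastify"],
  ["express", "express"],
];

function detectFramework(pkg: PackageJson): Framework {
  // Merge runtime and dev dependencies into a single lookup table.
  const deps: Record<string, string> = {
    ...pkg.dependencies,
    ...pkg.devDependencies,
  };
  for (const [dependency, framework] of FRAMEWORK_MARKERS) {
    if (dependency in deps) return framework;
  }
  return "unknown";
}
```

Because this step only reads package.json, it needs no LLM call, which matches the "fast, free, and local" property described for static scanning.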
Tool Generation
The tool generator reads sammy.config.json and scaffolds executable tool files that agents can invoke at runtime.
What Gets Generated
Zod Schemas — Strict input validation for every tool parameter. Enums, date formats, email validation — all inferred from your existing types.
Handler Wrappers — Each tool wraps your existing function with error boundaries, permission checks, and structured output formatting.
Agent Definitions — Each specialist agent gets a definition file with assigned tools, a system prompt, and a model tier selection.
// .sammy/tools/billing/getInvoices.ts (auto-generated)
import { z } from "zod";
export const getInvoicesSchema = z.object({
  customerId: z.string().describe("The customer ID"),
  status: z.enum(["draft", "open", "paid", "void"]).optional(),
  from: z.string().date().optional(),
  to: z.string().date().optional(),
});
export const getInvoices = {
  name: "getInvoices",
  description: "List invoices for a customer with optional filters",
  schema: getInvoicesSchema,
  permission: "read",
  handler: async (params, context) => {
    const { listInvoices } = await import("@/services/billing/invoices");
    return listInvoices(params);
  },
};
Flagged tools: If Sammy can't auto-wire a handler (e.g., multipart uploads, complex aggregations), it generates a stub and flags it. You fill in ~10-20 lines of glue code.
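The "error boundaries" and "structured output formatting" mentioned for handler wrappers can be pictured with the sketch below. It is an illustration only: `ToolResult` and `withErrorBoundary` are hypothetical names, and the wrapper Sammy actually generates may look different.

```typescript
// Hypothetical sketch: not the wrapper that sammy generate emits.
type ToolResult<T> =
  | { ok: true; data: T }
  | { ok: false; error: string };

// Wraps an async function so that a thrown error becomes a structured
// result the agent can read, instead of an unhandled exception.
function withErrorBoundary<P, T>(
  fn: (params: P) => Promise<T>,
): (params: P) => Promise<ToolResult<T>> {
  return async (params: P): Promise<ToolResult<T>> => {
    try {
      return { ok: true, data: await fn(params) };
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      return { ok: false, error: message };
    }
  };
}
```

A discriminated union on `ok` lets the caller narrow the result type before touching `data` or `error`.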
Runtime
The runtime engine is what serves your AI assistant in production. It handles message routing, agent orchestration, tool execution, and streaming responses.
Request Flow
1. Message In — User sends a message via your /api/chat endpoint.
2. Router — A fast-tier model classifies intent and picks the right specialist agent (or creates a multi-agent plan).
3. Specialist Agent — The selected agent processes the request using a balanced-tier model, decides which tools to call, and extracts parameters.
4. Tool Execution — Tools run locally against your database and APIs. No LLM call needed — just your existing code.
5. Response — The agent formats the tool results into natural language and streams back via SSE.
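The request flow above can be sketched end to end with plain functions standing in for the model calls. Everything here is hypothetical (`Agent`, `Tool`, `routeMessage`, `handleMessage`); in particular, keyword matching is only a placeholder for the fast-tier router model, and always taking the first tool is a placeholder for the agent's tool selection.

```typescript
// Hypothetical sketch: model calls are replaced by simple stand-ins.
interface Tool {
  name: string;
  run: (params: Record<string, string>) => unknown;
}

interface Agent {
  name: string;
  keywords: string[]; // stand-in for the router model's intent classification
  tools: Tool[];
}

// Step 2: pick a specialist agent for the incoming message.
function routeMessage(message: string, agents: Agent[]): Agent | undefined {
  const text = message.toLowerCase();
  return agents.find((agent) =>
    agent.keywords.some((keyword) => text.indexOf(keyword) !== -1),
  );
}

// Steps 3 to 5: choose a tool, run it locally, format the result.
function handleMessage(message: string, agents: Agent[]): string {
  const agent = routeMessage(message, agents);
  if (!agent) return "Sorry, I can't help with that.";
  // A real agent would choose a tool and extract parameters with a model.
  const tool = agent.tools[0];
  if (!tool) return `${agent.name} has no tools.`;
  const result = tool.run({});
  return `${agent.name} ran ${tool.name}: ${JSON.stringify(result)}`;
}
```

The point of the sketch is the split of responsibilities: routing and tool choice are model decisions, while tool execution is ordinary local code.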
Modes
Single-Agent
All tools on one agent. Best for 1-2 domains. Simple, fast, lower latency.
Multi-Agent
Router + specialist agents per domain. Best for 3+ domains. Better accuracy, cross-agent plans.
Evaluation & Refinement
Sammy auto-generates test scenarios, runs them against your agents, scores across 6 dimensions, and self-refines until all thresholds pass.
The 6 Scoring Dimensions
| Dimension | Threshold | What it checks |
|---|---|---|
| Tool Selection | ≥ 90% | Did the agent pick the right tool? |
| Parameter Extraction | ≥ 85% | Were parameters correctly extracted? |
| Router Accuracy | ≥ 95% | Was the message routed to the right agent? |
| Response Quality | ≥ 7.0/10 | Was the response helpful and well-formatted? |
| Safety Compliance | 100% | Were permission boundaries respected? |
| Conversation Coherence | ≥ 80% | Did multi-turn context hold up? |
The Refinement Loop
Generate — 60+ test scenarios based on your actual tools and domains.
Run — Execute every scenario against the agent system and score.
Diagnose — Categorize failures (TOOL_DESCRIPTION, SCHEMA_MISMATCH, PROMPT_CLARITY, CODE_ISSUE).
Refine — Auto-fix what it can (tighten schemas, rewrite descriptions, add prompt rules).
Re-run — Loop until all dimensions pass or a guard triggers (max 5 iterations, diminishing returns).
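As a rough illustration of how the thresholds and the iteration guard fit together, consider the sketch below. The names (`Scores`, `failingDimensions`, `refineUntilPassing`) are hypothetical, the `runAndScore` and `refine` callbacks stand in for the real Run and Refine steps, and the diminishing-returns guard is left out for brevity.

```typescript
// Hypothetical sketch of the pass/fail check and the max-iteration guard.
interface Scores {
  toolSelection: number; // fraction, 0 to 1
  parameterExtraction: number; // fraction, 0 to 1
  routerAccuracy: number; // fraction, 0 to 1
  responseQuality: number; // score out of 10
  safetyCompliance: number; // fraction, 0 to 1
  conversationCoherence: number; // fraction, 0 to 1
}

// Thresholds taken from the six scoring dimensions.
const THRESHOLDS: Scores = {
  toolSelection: 0.9,
  parameterExtraction: 0.85,
  routerAccuracy: 0.95,
  responseQuality: 7.0,
  safetyCompliance: 1.0,
  conversationCoherence: 0.8,
};

const MAX_ITERATIONS = 5;

function failingDimensions(scores: Scores): (keyof Scores)[] {
  return (Object.keys(THRESHOLDS) as (keyof Scores)[]).filter(
    (dimension) => scores[dimension] < THRESHOLDS[dimension],
  );
}

function refineUntilPassing(
  runAndScore: () => Scores,
  refine: (failing: (keyof Scores)[]) => void,
): { passed: boolean; iterations: number } {
  for (let iteration = 1; iteration <= MAX_ITERATIONS; iteration++) {
    const failing = failingDimensions(runAndScore());
    if (failing.length === 0) return { passed: true, iterations: iteration };
    // Only refine if another run will follow.
    if (iteration < MAX_ITERATIONS) refine(failing);
  }
  return { passed: false, iterations: MAX_ITERATIONS };
}
```

The loop stops as soon as every dimension meets its threshold, and otherwise gives up after five runs.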
sammy.config.json
The config file is the contract between discovery and everything downstream. It's auto-generated but designed to be human-readable and editable.
{
  "$schema": "https://unpkg.com/sammy-sdk/schema.json",
  "version": "1.0",
  "project": {
    "framework": "nextjs",
    "frameworkVersion": "15.2",
    "router": "app",
    "orm": "prisma"
  },
  "domains": [
    {
      "name": "billing",
      "description": "Manages subscriptions, invoices, and payments via Stripe",
      "tools": [
        {
          "name": "getInvoices",
          "type": "query",
          "permission": "read",
          "handler": "src/services/billing/invoices.ts:listInvoices"
        }
      ]
    }
  ],
  "architecture": {
    "type": "multi-agent",
    "router": { "model": "fast" },
    "agents": [
      { "name": "billing-agent", "domains": ["billing"], "model": "balanced" }
    ]
  }
}
You can edit anything in this file before running sammy generate: rename domains, remove tools, change model tiers, force a different architecture, or add custom system prompts.
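If you edit handler entries by hand, it can help to check their shape first. The sketch below assumes the handler string follows a `<file path>:<export name>` format, which is inferred from the single example above rather than from a documented spec; `parseHandlerRef` and `HandlerRef` are hypothetical names.

```typescript
// Hypothetical helper: the "<file>:<export>" format is an assumption
// based on the example "src/services/billing/invoices.ts:listInvoices".
interface HandlerRef {
  file: string;
  exportName: string;
}

function parseHandlerRef(ref: string): HandlerRef {
  // Split on the last colon so that paths containing a colon still parse.
  const separator = ref.lastIndexOf(":");
  if (separator <= 0 || separator === ref.length - 1) {
    throw new Error(`Invalid handler reference: ${ref}`);
  }
  return {
    file: ref.slice(0, separator),
    exportName: ref.slice(separator + 1),
  };
}
```

For the example config, this yields the file `src/services/billing/invoices.ts` and the export `listInvoices`.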
Architecture Options
Sammy automatically recommends an architecture based on how many domains are discovered and their complexity.
| Architecture | Domains | Description |
|---|---|---|
| Single-Agent | 1–2 | All tools on one agent. Simple, fast, lowest latency. Best for small projects with few capabilities. |
| Multi-Agent + Router | 3–6 | A fast router classifies intent, then delegates to specialist agents, one per domain. Each agent has its own tools and system prompt. |
| Hierarchical | 7+ | Router delegates to sub-routers, which delegate to specialists. Scales to large orgs with many distinct business areas. |
You can override the recommendation by setting architecture.type in your config to "single", "multi-agent", or "hierarchical".
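The domain-count thresholds can be written down as a small function. This is a simplified sketch: `recommendArchitecture` is a hypothetical name, and it looks only at the number of domains, whereas the real recommendation is described as also weighing complexity.

```typescript
// Hypothetical sketch: uses only the domain count, not complexity.
type ArchitectureType = "single" | "multi-agent" | "hierarchical";

function recommendArchitecture(domainCount: number): ArchitectureType {
  if (!Number.isInteger(domainCount) || domainCount < 1) {
    throw new RangeError("domainCount must be a positive integer");
  }
  if (domainCount <= 2) return "single"; // 1-2 domains
  if (domainCount <= 6) return "multi-agent"; // 3-6 domains
  return "hierarchical"; // 7+ domains
}
```

The return values match the strings accepted by architecture.type, so the result could be compared directly against a config override.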
Model Tiers
Sammy uses abstract model tiers instead of specific model names. This lets us upgrade models without changing your code.
| Tier | Used For | Speed | Quality |
|---|---|---|---|
| fast | Router, simple queries, classification | Fastest | Good |
| balanced | Specialist agents, most tool calls | Fast | Very Good |
| powerful | Discovery analysis, eval judging, complex reasoning | Moderate | Best |
Note: Sammy Cloud maps these tiers to specific models server-side. We can upgrade the underlying models without any changes to your code or configuration.
API Routes
sammy scaffold next generates everything below; this section documents what those files contain so you can customize them or wire them by hand.
One shared Sammy instance
Next.js can load each route in a separate module graph, so keep a single instance on globalThis — that way chat and history share one conversation store.
// lib/sammy-server.ts
import { Sammy } from "sammy-sdk";
import { auth } from "@/lib/auth";
const g = globalThis as typeof globalThis & { __sammy?: Sammy };
export function getSammy() {
  if (!g.__sammy) {
    g.__sammy = new Sammy({
      systemPrompt: "You are a helpful assistant. Use tools for current data; be concise.",
      // conversationStorage: new YourRedisStorage(...), // see Conversation Storage
    });
  }
  return g.__sammy;
}
// Never trust a client-sent user in production; resolve it server-side.
export async function resolveUserFromSession(_req: Request) {
  const session = await auth();
  if (session?.user?.id) {
    return { id: session.user.id, email: session.user.email ?? undefined };
  }
  // Dev only: FK-safe writes when demoing /chat without logging in.
  const devId = process.env.SAMMY_DEV_USER_ID;
  if (devId) return { id: devId };
  return undefined;
}
The chat endpoint
One handler — it handles JSON responses, SSE streaming, body validation, and errors.
// app/api/chat/route.ts
import { createSammyRouteHandler } from "sammy-sdk";
import { getSammy, resolveUserFromSession } from "@/lib/sammy-server";
export const runtime = "nodejs";
export const dynamic = "force-dynamic";
export const POST = createSammyRouteHandler(getSammy(), {
  resolveUser: resolveUserFromSession,
});
Required: next.config
The runtime loads your generated .sammy/*.ts tools via tsx at request time, so they must stay external to the bundle.
// next.config.mjs
const nextConfig = {
  serverExternalPackages: ["sammy-sdk", "tsx"],
};
export default nextConfig;
Using auth middleware?
If middleware protects every route, /chat and /api/chat must be reachable or the UI/API will redirect to login. In dev, allowlist them and let resolveUser fall back to SAMMY_DEV_USER_ID. In production, require login for /chat and resolve the user from the session only.
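The dev-only allowlist described above could be expressed as a small predicate that your middleware calls before redirecting to login. This is a sketch under assumptions: `isPublicPath` and `DEV_PUBLIC_PREFIXES` are hypothetical names, and how you detect production is up to your app, so it is passed in as a parameter.

```typescript
// Hypothetical sketch: a dev-only allowlist for the chat UI and its API.
const DEV_PUBLIC_PREFIXES: readonly string[] = ["/chat", "/api/chat"];

function isPublicPath(pathname: string, isProduction: boolean): boolean {
  // In production nothing is exempt: chat must go through real auth.
  if (isProduction) return false;
  // Match the prefix exactly or as a path segment, so "/chatter" is not exempt.
  return DEV_PUBLIC_PREFIXES.some(
    (prefix) => pathname === prefix || pathname.startsWith(prefix + "/"),
  );
}
```

Matching on whole path segments keeps nested history routes such as /api/chat/conversations reachable in dev without opening unrelated paths that merely share a prefix.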
Conversation History
Three more handlers give you a persistent thread list, resumable conversations, and delete — all scoped to the resolved user, so users never see each other's threads.
// app/api/chat/conversations/route.ts
import { createSammyListConversationsHandler } from "sammy-sdk";
import { getSammy, resolveUserFromSession } from "@/lib/sammy-server";
export const runtime = "nodejs";
export const dynamic = "force-dynamic";
export const GET = createSammyListConversationsHandler(getSammy(), {
  resolveUser: resolveUserFromSession,
});
// app/api/chat/conversations/[id]/route.ts
import {
  createSammyGetConversationHandler,
  createSammyDeleteConversationHandler,
} from "sammy-sdk";
import { getSammy, resolveUserFromSession } from "@/lib/sammy-server";
export const runtime = "nodejs";
export const dynamic = "force-dynamic";
const opts = { resolveUser: resolveUserFromSession };
export const GET = createSammyGetConversationHandler(getSammy(), opts);
export const DELETE = createSammyDeleteConversationHandler(getSammy(), opts);
HTTP contract
| Method | Path | Purpose |
|---|---|---|
| POST | /api/chat | Send a message (JSON or SSE stream) |
| GET | /api/chat/conversations | List threads (?limit, ?before) |
| GET | /api/chat/conversations/:id | Load one thread |
| DELETE | /api/chat/conversations/:id | Delete a thread |
SSE events when streaming: text, tool_call, tool_result, agent_handoff, done, error.
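If you consume the stream without one of the frontend packages, you need to split the response body into events yourself. The sketch below assumes standard text/event-stream framing (`event:` and `data:` lines, with a blank line between events); the payload format of each event is not specified here, so `data` is kept as a raw string. `parseSseFrames`, `fieldValue`, and `isSammyEvent` are hypothetical helpers, not part of any Sammy package.

```typescript
// Hypothetical sketch: assumes standard text/event-stream framing.
const SAMMY_EVENTS = ["text", "tool_call", "tool_result", "agent_handoff", "done", "error"] as const;
type SammyEventName = (typeof SAMMY_EVENTS)[number];

interface SseEvent {
  event: string;
  data: string; // raw payload; its shape is not specified in this document
}

function isSammyEvent(name: string): name is SammyEventName {
  return (SAMMY_EVENTS as readonly string[]).indexOf(name) !== -1;
}

// Returns the text after "field:", minus one optional leading space.
function fieldValue(line: string, field: string): string | undefined {
  if (!line.startsWith(field + ":")) return undefined;
  const value = line.slice(field.length + 1);
  return value.startsWith(" ") ? value.slice(1) : value;
}

// Parses every complete frame in the buffer and returns the unfinished
// tail in `rest`, to be prepended to the next network chunk.
function parseSseFrames(buffer: string): { events: SseEvent[]; rest: string } {
  const frames = buffer.split(/\r?\n\r?\n/);
  const rest = frames.pop() ?? "";
  const events: SseEvent[] = [];
  for (const frame of frames) {
    let event = "message"; // default event name in the SSE format
    const dataLines: string[] = [];
    for (const line of frame.split(/\r?\n/)) {
      const eventValue = fieldValue(line, "event");
      const dataValue = fieldValue(line, "data");
      if (eventValue !== undefined) event = eventValue;
      else if (dataValue !== undefined) dataLines.push(dataValue);
    }
    if (dataLines.length > 0) events.push({ event, data: dataLines.join("\n") });
  }
  return { events, rest };
}
```

Returning the unparsed tail matters because network chunks can end in the middle of an event.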
Frontend Packages
Choose your level of abstraction — from a drop-in widget to a raw client for any framework.
sammy-sdk-ui — Full app (recommended)
SammyChatApp
A complete chat experience: history sidebar, New chat, resumable threads, and auto-scroll. This is what scaffolding drops into /chat.
import { SammyChatApp } from "sammy-sdk-ui";
export default function ChatPage() {
  return <SammyChatApp endpoint="/api/chat" title="My Assistant" height={560} />;
}
sammy-sdk-ui — Inline panel & floating widget
Embed a single conversation in an existing page, or float a support bubble in the corner.
import { SammyChatInline, SammyChat } from "sammy-sdk-ui";
// Inline panel inside a page
<SammyChatInline endpoint="/api/chat" title="Support" />
// Floating widget (corner bubble)
<SammyChat endpoint="/api/chat" position="bottom-right" />
sammy-sdk-react — Headless hooks
Bring your own UI. Hooks give you messages, streaming state, and the conversation list.
import { SammyProvider, useSammyChat, useSammyConversations } from "sammy-sdk-react";
function MyChat() {
  const { conversations, refresh } = useSammyConversations();
  const chat = useSammyChat({ onConversationId: () => refresh() });
  return (
    <div>
      <button onClick={() => chat.startNewConversation()}>New chat</button>
      {conversations.map((c) => (
        <button key={c.id} onClick={() => chat.selectConversation(c.id)}>{c.title}</button>
      ))}
      {/* render chat.messages; an input that calls chat.send(text) */}
    </div>
  );
}
export function App() {
  return (
    <SammyProvider endpoint="/api/chat">
      <MyChat />
    </SammyProvider>
  );
}
sammy-sdk-client — Any framework
A zero-dependency client for Vue, Svelte, vanilla JS, or the server.
import { SammyClient } from "sammy-sdk-client";
const client = new SammyClient({ endpoint: "/api/chat" });
const list = await client.listConversations();
const detail = await client.getConversation(list[0].id);
await client.deleteConversation(list[0].id);
const stream = client.stream("Show me unpaid invoices", { conversationId: detail.id });
stream.on("text", (chunk) => { /* update UI */ }).on("done", (final) => { /* settle */ });
Security & Auth
Sammy has security built in at every layer — from tool permissions to data handling.
Permission Levels
| Level | Description | Example |
|---|---|---|
| read | Query data, no mutations | getInvoices, getUser |
| write | Create or update data | createInvoice, sendEmail |
| admin | Destructive or sensitive operations | refundPayment, updateUserRole |
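One way to reason about these levels is as an ordered scale, where a higher grant covers the lower ones. Note that this ordering is an assumption for illustration: the table does not state that admin implies write or that write implies read, and `canInvoke` is a hypothetical helper rather than a Sammy API.

```typescript
// Hypothetical sketch: assumes the levels are hierarchical
// (admin covers write, write covers read). Verify this against
// the actual permission model before relying on it.
type Permission = "read" | "write" | "admin";

const PERMISSION_RANK: Record<Permission, number> = {
  read: 0,
  write: 1,
  admin: 2,
};

// True when the level granted to the user is at least the level
// the tool requires.
function canInvoke(granted: Permission, required: Permission): boolean {
  return PERMISSION_RANK[granted] >= PERMISSION_RANK[required];
}
```

Under that assumption, a user granted write could call getInvoices and createInvoice but not refundPayment.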
Data Handling
Sammy Cloud never stores prompt or response content. Only metadata (token counts, latency, costs) is logged for billing and analytics.
BYOK mode bypasses Sammy Cloud entirely — your LLM calls go directly to Anthropic/OpenAI from your server. Zero data passes through our infrastructure.
Cached responses are encrypted at rest and auto-expire (default: 1 hour TTL).
Conversation Storage
By default, conversations are kept in an in-memory store. That is perfect for local development, but it is not durable — history is wiped on restart and is not shared across multiple instances. For production, plug in your own store.
Heads up: on serverless or any multi-instance deploy, the default memory store means each worker has its own history and threads disappear between cold starts. Use Redis, Postgres, or a custom store.
The interface
ConversationStorage is a small interface; implement it and pass it to new Sammy(...). Every chat and history route then uses it transparently.
import type { ConversationStorage } from "sammy-sdk";
export class RedisConversationStorage implements ConversationStorage {
  async getOrCreate(id, userId) { /* ... */ }
  async append(id, message) { /* ... */ }
  async get(id, userId) { /* ... */ }
  async list(userId, opts) { /* ... */ }
  async delete(id, userId) { /* ... */ }
  async touch(id, patch) { /* optional */ }
}
new Sammy({ conversationStorage: new RedisConversationStorage(/* ... */) });
What to use where
| Environment | Recommendation |
|---|---|
| Local / sammy dev | Default in-memory store |
| Single server / demos | Memory OK (history resets on restart) |
| Production / serverless / >1 instance | Required: Redis, Postgres, or custom |
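To show what a custom store has to guarantee, here is a self-contained sketch that mirrors the method names of the ConversationStorage interface using a Map. The `Message` and `Conversation` shapes, the `limit` option, and all return types are assumptions made for the example; check the real types exported by sammy-sdk before implementing. A production store would replace the Map with Redis or database calls.

```typescript
// Hypothetical sketch: types and return values are assumptions, and the
// class deliberately does not import or implement the real interface.
interface Message {
  role: "user" | "assistant";
  content: string;
}

interface Conversation {
  id: string;
  userId: string;
  messages: Message[];
  updatedAt: number;
}

class MapConversationStorage {
  private readonly byId = new Map<string, Conversation>();

  async getOrCreate(id: string, userId: string): Promise<Conversation> {
    const existing = this.byId.get(id);
    if (existing) {
      // Never hand one user's thread to another user.
      if (existing.userId !== userId) throw new Error("Conversation belongs to another user");
      return existing;
    }
    const created: Conversation = { id, userId, messages: [], updatedAt: Date.now() };
    this.byId.set(id, created);
    return created;
  }

  async append(id: string, message: Message): Promise<void> {
    const conversation = this.byId.get(id);
    if (!conversation) throw new Error(`Unknown conversation: ${id}`);
    conversation.messages.push(message);
    conversation.updatedAt = Date.now();
  }

  async get(id: string, userId: string): Promise<Conversation | undefined> {
    const conversation = this.byId.get(id);
    return conversation && conversation.userId === userId ? conversation : undefined;
  }

  async list(userId: string, opts: { limit?: number } = {}): Promise<Conversation[]> {
    const mine = Array.from(this.byId.values())
      .filter((conversation) => conversation.userId === userId)
      .sort((a, b) => b.updatedAt - a.updatedAt); // most recent first
    return mine.slice(0, opts.limit ?? mine.length);
  }

  async delete(id: string, userId: string): Promise<boolean> {
    const conversation = await this.get(id, userId);
    return conversation ? this.byId.delete(id) : false;
  }
}
```

Whatever backend you choose, the key property is the same as in this sketch: every read, list, and delete is scoped by userId.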
Deployment
A Sammy-powered app deploys like any other Next.js app (Vercel, your own Node host, containers). Keep these production specifics in mind.
Use a real conversation store. Serverless platforms run many short-lived workers; the in-memory default won't share history. See Conversation Storage.
Ship your .sammy/ directory. The runtime loads generated tools at request time via tsx. Commit .sammy/ or run sammy generate in CI, and keep serverExternalPackages: ["sammy-sdk", "tsx"].
Protect chat in production. Require auth for /chat and /api/chat, resolve the user from the session, and do not rely on SAMMY_DEV_USER_ID.
Keep secrets in a vault. SAMMY_API_KEY belongs in your platform's secret manager, never committed.
Before you ship
- resolveUser wired to your real auth (not SAMMY_DEV_USER_ID)
- ConversationStorage backed by Redis or a database
- serverExternalPackages: ["sammy-sdk", "tsx"] in next.config
- Middleware protects /chat and /api/chat
- SAMMY_API_KEY stored in a secrets manager
- sammy generate runs in CI when sammy.config.json changes
- App-level rate limits / cost controls on LLM usage