
Add Skills to Your AI-SDK Agent in Minutes

Execute Skills in runtime VMs without building infrastructure. Run complex scripts, read Skills on demand, install dependencies, mint download links, and build predictable, specialized agents in minutes.

Ohans Emmanuel

7 min read

How Bluebag eliminates the infrastructure burden of building capable AI agents

TL;DR

You've built an AI agent with Vercel's AI SDK. It streams responses beautifully. But the moment you need it to actually do the things you'd turn to Skills for, like running deterministic scripts or executing complex workflows, you hit a wall. Sandboxing, VMs, dependency management, file storage... suddenly you're building infrastructure instead of building your agent.

Bluebag fixes this with a two-line integration:

const config = await bluebag.enhance({ model, messages });
const result = streamText(config);

Your agent gains instant access to production Agent Skills. No Kubernetes. No Docker orchestration. No infrastructure. Nada. It just works.


The Problem Every AI-SDK Developer Knows

If you've shipped an AI agent beyond a simple chatbot, you've felt this pain.

Your agent needs to process a PDF. Or generate an image. Or run a data analysis script. The LLM can describe how to do these things, but it can't actually execute them. So you start building:

  • A sandboxed execution environment (can't let arbitrary code touch your servers)
  • Dependency management (that Python script needs pandas and pdfplumber)
  • File storage and retrieval (where do outputs go? How do users download them?)
  • Session management (users expect state to persist across messages)

Before you know it, you've spent weeks on infrastructure that has nothing to do with your product's core value. And you still need to maintain it.

This is the gap between "demo agent" and "production agent" that kills so many projects.

What If Skills Just Worked?

Here's the mental model we built Bluebag around: your AI agent should be able to acquire and use Skills the same way a human does.

When you need to process a PDF, you don't build a PDF processing system from scratch. You reach for a tool that already exists. Your agent should work the same way.

With Bluebag, Skills are first-class primitives. They're pre-packaged capabilities, complete with documentation, scripts, and dependencies that your agent can discover, understand, and execute on demand.

import { Bluebag } from "@bluebag/ai-sdk";
import { streamText } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
 
const bluebag = new Bluebag({
  apiKey: process.env.BLUEBAG_API_KEY!,
  activeSkills: ["pdf", "data-analysis", "image-generation"],
});
 
const config = await bluebag.enhance({
  model: anthropic("claude-sonnet-4-20250514"),
  messages: [{ role: "user", content: "Analyze this quarterly report PDF" }],
});
 
const result = streamText(config);

That's it. Your agent now has access to PDF processing, data analysis, and image generation. No infrastructure required.

How It Actually Works

Let's peek under the hood, because "magic" isn't a satisfying answer.

Progressive Skill Discovery

Skills use a three-level loading system designed to minimize token usage:

Level 1 - Metadata in System Prompt: Your agent receives a lightweight index of available skills. Just names and short descriptions. Enough to know what's available without burning context.

Available skills:
- pdf: Extract text, tables, and metadata from PDF files
- data-analysis: Run Python-based statistical analysis on datasets
- image-generation: Create images from text descriptions

Level 2 - Full Documentation on Demand: When a skill is relevant, the agent reads its SKILL.md file: a complete manual with usage examples, parameter specifications, and edge case handling.

Level 3 - Execution: Only when actually needed, the agent runs the skill's scripts with the user's specific inputs.

This progressive approach means your agent isn't stuffed with documentation it doesn't need. It reads the manual when the task calls for it.
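
To make the three levels concrete, here's an illustrative sketch of the kind of data each level carries. These interfaces are not Bluebag's published types; they're just a way to picture what the agent holds in context at each stage.

// Illustrative only: not Bluebag's published types.
// Level 1: the lightweight index injected into the system prompt.
interface SkillMetadata {
  name: string;        // e.g. "pdf"
  description: string; // one line, enough for the model to judge relevance
}
 
// Level 2: the full SKILL.md manual, fetched only when the skill looks relevant.
interface SkillManual {
  name: string;
  markdown: string; // usage examples, parameter specs, edge-case handling
}
 
// Level 3 is execution: the skill's scripts run in the sandbox with the user's
// actual inputs, so neither level needs to carry any runtime payload in context.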

Runtime VMs That You Don't Manage

When your agent executes a skill, it runs in a sandboxed VM. But here's what you don't have to think about:

  • Provisioning: VMs spin up automatically on first tool call
  • Dependencies: Each skill declares what it needs; environments are pre-configured
  • Persistence: Your session's VM persists across messages (30-minute TTL with auto-refresh)
  • Cleanup: VMs auto-terminate when idle; you're not paying for unused compute
  • Recovery: If a VM stops unexpectedly, the next tool call transparently provisions a new one

From your code's perspective, execution just works. The infrastructure complexity is completely abstracted.
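
Concretely, that persistence means a conversation can build on its own earlier work. Here's a hedged sketch, using the stableId option covered later in this post; sessionKey is a stand-in for whatever identifier your app uses for a user or conversation.

const bluebag = new Bluebag({
  apiKey: process.env.BLUEBAG_API_KEY!,
  stableId: sessionKey, // hypothetical: a stable id for this user or conversation
  activeSkills: ["data-analysis"],
});
 
// Turn 1: "Clean this CSV and save it as cleaned.csv"
// Turn 2: "Now chart the totals in cleaned.csv"
// Both turns hit the same sandbox (within the 30-minute TTL), so the second
// turn can read the file the first one created, with no re-provisioning on your side.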

Tools Your Agent Gains

When you call bluebag.enhance(), your AI SDK config gains five tools:

  • bluebag_bash: Execute shell commands in the sandbox
  • bluebag_code_execution: Run Python/JavaScript/TypeScript with full dependency access
  • bluebag_text_editor: View, create, and modify files
  • bluebag_computer_use: GUI automation for visual workflows
  • bluebag_file_download_url: Generate signed download URLs for outputs

These are robust implementations with proper error handling, automatic retries, and seamless integration with the AI SDK's streaming model.
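
You don't call these tools yourself, but it's often useful to watch them fire. A small sketch, assuming the AI SDK's onStepFinish callback, logs each Bluebag tool call as the agent works:

const config = await bluebag.enhance({
  model: anthropic("claude-sonnet-4-20250514"),
  messages,
});
 
const result = streamText({
  ...config,
  // Runs after each step; handy for seeing which tools the agent reached for.
  onStepFinish({ toolCalls }) {
    for (const call of toolCalls) {
      console.log(`tool call: ${call.toolName}`, call.args);
    }
  },
});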

The Download URL Problem (Solved)

Here's a detail that seems small but matters enormously in production: how do users get files your agent creates?

If your agent generates a chart, processes a document, or creates any artifact, that file exists in a sandboxed VM. Getting it to the user typically means:

  1. Copying the file to your server
  2. Storing it somewhere accessible (S3, your filesystem, etc.)
  3. Generating a URL
  4. Managing expiration and cleanup
  5. Handling the security implications of all the above

With Bluebag, your agent calls a single tool:

// Conceptually, the tool call the agent makes after creating output.pdf in the sandbox
const downloadUrl = await bluebag_file_download_url({
  fileId: "file_abc123",
  ttlSeconds: 3600,
});
// Returns a signed URL that expires in 1 hour

The LLM can include this URL directly in its response. Users click and download. You didn't have to build or maintain file infrastructure.

Real-World Pattern: The Specialized Agent

Let's walk through a concrete example. You're building an agent that helps users understand their data.

Without Bluebag, your architecture might look like:

User Request
    ↓
Your Server (API routes, auth, rate limiting)
    ↓
LLM Call (with custom tools pointing to your infrastructure)
    ↓
Your Execution Service (Docker containers? Lambda? K8s pods?)
    ↓
Your Storage Service (S3? GCS? Local disk?)
    ↓
Response Assembly (combine LLM output with file URLs)
    ↓
User Response

Each box represents code you wrote, infrastructure you manage, and failure modes you handle.

With Bluebag:

// app/api/analyze/route.ts
import { Bluebag } from "@bluebag/ai-sdk";
import { streamText } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
 
const bluebag = new Bluebag({
  apiKey: process.env.BLUEBAG_API_KEY!,
  activeSkills: ["data-analysis"],
});
 
export async function POST(req: Request) {
  const { messages } = await req.json();
 
  const config = await bluebag.enhance({
    model: anthropic("claude-sonnet-4-20250514"),
    messages,
    system:
      "You are a data analysis assistant. Help users understand their datasets.",
  });
 
  const result = streamText(config);
  return result.toDataStreamResponse();
}

Your entire "execution infrastructure" is two lines. The agent can:

  • Accept CSV uploads (Bluebag handles file ingestion)
  • Run pandas analysis scripts (in the managed sandbox)
  • Generate matplotlib visualizations (dependencies pre-installed)
  • Return download links for results (signed URLs handled automatically)

You focus on the prompts, the UX, and the business logic. Bluebag handles the infrastructure.
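
On the client, nothing Bluebag-specific is required; the route streams like any other AI SDK endpoint. A minimal sketch, assuming a React front end and the AI SDK's useChat hook pointed at the route above:

"use client";
import { useChat } from "ai/react";
 
export default function AnalysisChat() {
  // Streaming, message state, and form handling come from the AI SDK.
  const { messages, input, handleInputChange, handleSubmit } = useChat({
    api: "/api/analyze",
  });
 
  return (
    <form onSubmit={handleSubmit}>
      {messages.map((m) => (
        <p key={m.id}>
          <strong>{m.role}:</strong> {m.content}
        </p>
      ))}
      <input
        value={input}
        onChange={handleInputChange}
        placeholder="Ask something about your data"
      />
    </form>
  );
}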

Multi-Tenant by Default

If you're building a product (not just a demo), you need tenant isolation. User A's files shouldn't be accessible to User B. User A's session state shouldn't leak to User B.

Bluebag handles this with stable IDs:

const bluebag = new Bluebag({
  apiKey: process.env.BLUEBAG_API_KEY!,
  stableId: user.id, // Isolates this user's sandbox and files
});

Each unique stableId gets its own isolated environment. Files, session state, and execution context are partitioned automatically.
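
In a real API route you'd derive stableId from your auth layer rather than hard-coding it. A hedged sketch, where getSession stands in for whatever auth helper your app already uses:

import { Bluebag } from "@bluebag/ai-sdk";
import { getSession } from "@/lib/auth"; // hypothetical: your own auth helper
 
export async function POST(req: Request) {
  const { user } = await getSession(req);
 
  // One isolated sandbox per authenticated user.
  const bluebag = new Bluebag({
    apiKey: process.env.BLUEBAG_API_KEY!,
    stableId: user.id,
    activeSkills: ["data-analysis"],
  });
 
  // ...enhance and streamText exactly as in the earlier example
}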

Predictable Agents Through Structured Skills

One underrated benefit of the skill system: predictability.

When you give an LLM free-form code execution capabilities, outputs are unpredictable. The model might write a script that works, or it might hallucinate a library that doesn't exist.

Skills flip this dynamic. Each skill has:

  • Tested scripts that are known to work
  • Clear documentation that tells the model exactly how to use them
  • Defined outputs so you know what to expect

Your agent follows proven playbooks. This makes debugging easier, outputs more consistent, and users happier.

Getting Started

Install the SDK:

npm install @bluebag/ai-sdk ai @ai-sdk/anthropic

Initialize and enhance:

import { Bluebag } from "@bluebag/ai-sdk";
import { streamText } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
 
const bluebag = new Bluebag({
  apiKey: process.env.BLUEBAG_API_KEY!,
});
 
const config = await bluebag.enhance({
  model: anthropic("claude-sonnet-4-20250514"),
  messages: [{ role: "user", content: "Hello from Bluebag." }],
});
 
const result = streamText(config);

That's a complete, skill-enabled agent. Runs anywhere your AI SDK code runs: Next.js API routes, Express servers, serverless functions.
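
Outside a web framework, you can consume the same result directly, for example from a plain Node script, using the AI SDK's textStream iterator:

// Print the agent's reply as it streams.
for await (const chunk of result.textStream) {
  process.stdout.write(chunk);
}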

The Shift in Agent Development

We're at an inflection point in AI agent development. The models are capable enough to do real work. The developer tools (like Vercel's AI SDK) are mature enough to build great UX.

The missing piece has been the execution layer, the infrastructure that lets agents actually do the things they describe.

Agent Skills fill this gap.

Bluebag provides the infrastructure to ship Agent Skills in your AI-SDK agents with confidence.

Don't build infrastructure. Start building agents that ship.


Ready to add Skills to your AI-SDK agent? Get your API key and start building in minutes.