AI Agents · LLM Portability · Agent Skills · Architecture · Vendor Lock-in

How to Avoid LLM Vendor Lock-in When Building AI Agents

Building on a single LLM provider creates risk. Here's how to architect agents for portability, cost optimization, and resilience across Claude, GPT, Gemini, and beyond.

Favour Ohanekwu


9 min read

You've built your agent on Claude. It works perfectly. Then Anthropic raises prices by 30%.

Or their API goes down for 6 hours.

Or a competitor ships a model that's 10x faster at half the cost.

You're locked in. Switching would mean rewriting everything.

This happens to teams every month. They optimize for speed of development and end up trapped by a single vendor.

Here's how to build agents that work across any LLM without sacrificing development velocity.

[Image: locked-in vs. portable architecture]

The Lock-In Problem

Vendor lock-in happens gradually. You make small decisions that seem reasonable at the time. Then you realize you can't switch without significant rework.

How Lock-In Happens

Stage 1: You pick a provider

You choose Claude because it's great at reasoning. Or GPT-4 because it's familiar. Or Gemini because of the long context window.

Reasonable choice. Single provider simplifies development.

Stage 2: You use provider-specific features

Claude has Skills. OpenAI has function calling with specific formats. Gemini has native multimodal inputs.

You use these features because they make your agent better.

Stage 3: Your codebase couples to the provider

// Anthropic-specific code
const response = await anthropic.messages.create({
  model: "claude-3-5-sonnet-20241022",
  max_tokens: 4096,
  tools: claudeTools, // Claude-specific format
  messages: claudeMessages, // Claude-specific format
});

Your prompts are optimized for Claude's behavior. Your error handling expects Claude's error codes. Your tool definitions use Claude's schema.

Stage 4: You discover the cost

Switching providers would require:

  • Rewriting API integration code
  • Converting tool definitions
  • Adjusting prompts for different model behavior
  • Testing everything again
  • Updating error handling
  • Migrating Skills or workflows

Estimated time: 2-3 months of engineering work.

You're locked in.

Why Lock-In Matters

Some teams accept vendor lock-in. "We'll deal with it if we need to switch."

Here's why that's risky:

1. Pricing Changes

LLM pricing is volatile. Providers change prices based on compute costs, competition, and business strategy.

Real examples:

  • OpenAI has changed GPT-4 pricing multiple times
  • Anthropic introduced usage tiers with different rates
  • Google subsidizes Gemini to gain market share

If you're locked to one provider and they raise prices, you have two options:

  1. Pay more
  2. Spend months migrating

Neither is good.

2. Availability and Reliability

Every API has downtime. Anthropic, OpenAI, Google. When your provider goes down, your agent stops working.

Without portability:

You wait for the provider to fix the issue. Your users are blocked.

With portability:

You route traffic to a backup provider. Your users don't notice.

3. Model Evolution

New models ship constantly. Each brings different capabilities:

  • GPT-4o: Fast and cheap
  • Claude Opus: Best reasoning
  • Gemini 1.5 Pro: 1M token context
  • Llama 3: Open source, self-hosted

If you're locked to one provider, you can't take advantage of improvements elsewhere.

4. Regulatory and Compliance

Some industries or regions restrict which LLM providers can be used. GDPR, data residency, government contracts.

If your agent only works with one provider, you can't serve customers with different compliance requirements.

The Portability Principles

Here's how to architect agents that work across any LLM.

Principle 1: Separate Business Logic from Provider APIs

Your agent's logic should be independent of which LLM executes it.

Bad: Coupled to provider

async function analyzeDocument(doc: string) {
  const response = await anthropic.messages.create({
    model: "claude-3-5-sonnet-20241022",
    messages: [{ role: "user", content: doc }],
  });
  
  return response.content[0].text;
}

Good: Provider-agnostic

async function analyzeDocument(doc: string) {
  const response = await llm.generate({
    prompt: doc,
    temperature: 0.3,
  });
  
  return response.text;
}

The business logic (analyze document) is separate from the provider API.

Principle 2: Use Abstraction Layers

Don't call provider APIs directly. Use an abstraction that works across providers.

Frameworks that provide this:

  • Vercel AI SDK: Works with OpenAI, Anthropic, Google, and more
  • LangChain: Supports dozens of LLM providers
  • LiteLLM: Unified API for 100+ LLMs

Example with Vercel AI SDK:

import { generateText } from "ai";
import { openai } from "@ai-sdk/openai";
import { anthropic } from "@ai-sdk/anthropic";
 
// Same code, different providers
const result1 = await generateText({
  model: openai("gpt-4o"),
  prompt: "Analyze this data",
});
 
const result2 = await generateText({
  model: anthropic("claude-3-5-sonnet-20241022"),
  prompt: "Analyze this data",
});

The interface is identical. Switching providers is a one-line change.

Principle 3: Standardize Tool Definitions

Tools (function calling) have provider-specific formats. Standardize them.

OpenAI format:

{
  "type": "function",
  "function": {
    "name": "get_weather",
    "description": "Get weather for a city",
    "parameters": {
      "type": "object",
      "properties": {
        "city": { "type": "string" }
      }
    }
  }
}

Anthropic format:

{
  "name": "get_weather",
  "description": "Get weather for a city",
  "input_schema": {
    "type": "object",
    "properties": {
      "city": { "type": "string" }
    }
  }
}

Solution: Use a framework that handles conversion

import { tool } from "ai";
 
const getWeather = tool({
  description: "Get weather for a city",
  parameters: z.object({
    city: z.string(),
  }),
  execute: async ({ city }) => {
    return `Weather in ${city}: Sunny`;
  },
});
 
// Works with any provider
const result = await generateText({
  model: openai("gpt-4o"), // or anthropic(...) or google(...)
  tools: { getWeather },
  prompt: "What's the weather in SF?",
});

The framework converts your tool definition to the provider's format.
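To see what that conversion involves, here is a rough sketch of mapping a neutral tool definition onto the two wire formats shown above. The `GenericTool` type and converter names are illustrative, not part of any SDK; real frameworks handle this for you.

```typescript
// Illustrative only: a neutral tool shape plus converters to the two
// provider formats shown above.
interface GenericTool {
  name: string;
  description: string;
  schema: Record<string, unknown>; // JSON Schema for the parameters
}

// OpenAI wraps the definition in a { type, function } envelope.
function toOpenAI(tool: GenericTool) {
  return {
    type: "function",
    function: {
      name: tool.name,
      description: tool.description,
      parameters: tool.schema,
    },
  };
}

// Anthropic uses a flat object with an input_schema key.
function toAnthropic(tool: GenericTool) {
  return {
    name: tool.name,
    description: tool.description,
    input_schema: tool.schema,
  };
}

const getWeather: GenericTool = {
  name: "get_weather",
  description: "Get weather for a city",
  schema: { type: "object", properties: { city: { type: "string" } } },
};
```

The point is that the differences are purely structural, which is why a thin adapter layer is enough to keep your tool definitions provider-neutral.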

Principle 4: Make Skills Portable

If you're using Agent Skills, ensure they work across providers.

The problem:

Anthropic's Claude Skills (like Claude Desktop Skills) only work with Claude models. If you switch to GPT-4, you lose all your Skills.

The solution:

Use a platform that implements the Agent Skills specification in a provider-agnostic way.

Bluebag does this. Skills are defined once and work with any LLM:

import { Bluebag } from "@bluebag/ai-sdk";
import { openai } from "@ai-sdk/openai";
import { anthropic } from "@ai-sdk/anthropic";
 
const bluebag = new Bluebag({
  apiKey: process.env.BLUEBAG_API_KEY,
});
 
// Same Skills, different models
const config1 = await bluebag.enhance({
  model: openai("gpt-4o"),
  messages,
});
 
const config2 = await bluebag.enhance({
  model: anthropic("claude-3-5-sonnet-20241022"),
  messages,
});

Your Skills are portable. Switch models without rewriting workflows.

Principle 5: Test Across Providers

Don't just test with one model. Validate that your agent works across providers.

const providers = [
  openai("gpt-4o"),
  anthropic("claude-3-5-sonnet-20241022"),
  google("gemini-1.5-pro"),
];
 
for (const model of providers) {
  const result = await generateText({
    model,
    prompt: "Summarize this document",
  });
  
  // Validate output meets requirements
  expect(result.text.length).toBeLessThan(500);
  expect(result.text).toContain("key points");
}

This catches provider-specific issues early.

Architecture Patterns for Portability

[Image: abstraction layer between agent logic and providers]

Here are proven patterns for building portable agents.

Pattern 1: Provider Router

Route requests to different providers based on workload characteristics.

function selectProvider(task: Task) {
  if (task.requiresReasoning) {
    return anthropic("claude-3-opus-20240229"); // Best reasoning
  } else if (task.requiresSpeed) {
    return openai("gpt-4o"); // Fastest
  } else if (task.requiresLongContext) {
    return google("gemini-1.5-pro"); // 1M tokens
  } else {
    return openai("gpt-3.5-turbo"); // Cheapest
  }
}
 
const model = selectProvider(task);
const result = await generateText({ model, prompt: task.prompt });

This optimizes cost and performance while maintaining portability.

Pattern 2: Fallback Chain

If the primary provider fails, fall back to alternatives.

const providers = [
  openai("gpt-4o"),           // Primary
  anthropic("claude-3-5-sonnet-20241022"), // Fallback 1
  google("gemini-1.5-pro"),   // Fallback 2
];
 
async function generateWithFallback(prompt: string) {
  for (const model of providers) {
    try {
      return await generateText({ model, prompt });
    } catch (error) {
      console.warn(`Provider ${model.modelId} failed, trying next`);
    }
  }
  
  throw new Error("All providers failed");
}

This ensures availability even when a provider is down.

Pattern 3: A/B Testing Across Providers

Test different providers with real traffic to find the best fit.

async function generateWithABTest(prompt: string, userId: string) {
  const variant = getUserVariant(userId); // A or B
  
  const model = variant === "A" 
    ? openai("gpt-4o")
    : anthropic("claude-3-5-sonnet-20241022");
  
  const start = Date.now();
  const result = await generateText({ model, prompt });
  
  // Track metrics
  await trackMetrics({
    variant,
    latencyMs: Date.now() - start,
    totalTokens: result.usage.totalTokens,
    userSatisfaction: await getUserFeedback(userId),
  });
  
  return result;
}

This gives you data to make informed provider decisions.

Pattern 4: Multi-Provider Consensus

For critical decisions, query multiple providers and use consensus.

async function criticalDecision(prompt: string) {
  const results = await Promise.all([
    generateText({ model: openai("gpt-4o"), prompt }),
    generateText({ model: anthropic("claude-3-5-sonnet-20241022"), prompt }),
    generateText({ model: google("gemini-1.5-pro"), prompt }),
  ]);
  
  // Use majority vote or weighted average
  return consensus(results);
}

This improves reliability for high-stakes operations.
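The `consensus()` helper above is left undefined. A minimal majority-vote version, purely illustrative and assuming each model's answer can be normalized to a comparable string, might look like:

```typescript
// Illustrative majority vote: return the answer most models agree on.
// Assumes at least one answer and that answers are comparable as
// normalized strings (trimmed, lowercased).
function consensus(answers: string[]): string {
  const counts = new Map<string, number>();
  for (const a of answers) {
    const key = a.trim().toLowerCase();
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }

  let best = answers[0];
  let bestCount = 0;
  for (const [key, count] of counts) {
    if (count > bestCount) {
      best = key;
      bestCount = count;
    }
  }
  return best;
}
```

For free-form text, exact string matching is too strict; in practice you would compare embeddings or have a cheap model judge equivalence, but the voting logic stays the same.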

The Bluebag Approach

Bluebag was designed for portability from day one.

Same Skills, Any LLM

Skills are defined once and work with any LLM. No provider-specific code.

const bluebag = new Bluebag({
  apiKey: process.env.BLUEBAG_API_KEY,
});
 
// Works with any model
const config = await bluebag.enhance({
  model: openai("gpt-4o"), // or anthropic(...), google(...), etc.
  messages,
});
 
const result = streamText(config);

Observability Across Providers

Track execution metrics across all providers in the Bluebag dashboard.

  • Execution duration per Skill
  • Success and failure rates
  • Tool usage patterns
  • Session timelines

This data helps you understand how your agents perform across different models.

Migration Strategy

If you're already locked to a provider, here's how to migrate.

Step 1: Audit Dependencies

Identify everywhere your code couples to the provider.

# Find provider-specific imports
grep -r "anthropic" src/
grep -r "openai" src/
grep -r "google.generativeai" src/

List:

  • Direct API calls
  • Provider-specific tool formats
  • Prompts optimized for specific models
  • Error handling for provider errors

Step 2: Introduce Abstraction

Wrap provider calls in an abstraction layer.

// Before
const response = await anthropic.messages.create({...});
 
// After
const response = await llm.generate({...});

Implement llm.generate() to work with your current provider. Later, add support for others.
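A minimal shape for that abstraction could be the following sketch. The `LLMClient`, `GenerateOptions`, and `makeClient` names are placeholders, not a real library, and `callProvider` stands in for your actual SDK call:

```typescript
// Placeholder abstraction: one interface your business logic depends on,
// with one adapter per provider behind it.
interface GenerateOptions {
  prompt: string;
  temperature?: number;
}

interface GenerateResult {
  text: string;
}

interface LLMClient {
  generate(options: GenerateOptions): Promise<GenerateResult>;
}

// Adapter factory: wraps whatever provider you use today. The
// callProvider argument stands in for the real SDK call
// (e.g. anthropic.messages.create).
function makeClient(
  callProvider: (prompt: string, temperature: number) => Promise<string>
): LLMClient {
  return {
    async generate({ prompt, temperature = 0.7 }) {
      const text = await callProvider(prompt, temperature);
      return { text };
    },
  };
}
```

Adding a second provider later means writing one more adapter, not touching business logic.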

Step 3: Migrate Tools

Convert tools to a provider-agnostic format.

// Use a framework like Vercel AI SDK
import { tool } from "ai";
 
const myTool = tool({
  description: "...",
  parameters: z.object({...}),
  execute: async (params) => {...},
});

Step 4: Test with Alternative Providers

Add a second provider and test critical workflows.

// Test both providers
const providers = [currentProvider, newProvider];
 
for (const provider of providers) {
  await runIntegrationTests(provider);
}

Fix any provider-specific issues.

Step 5: Deploy with Fallback

Deploy with your current provider as primary and the new provider as fallback.

const providers = [currentProvider, newProvider];
 
async function generate(prompt: string) {
  for (const provider of providers) {
    try {
      return await generateText({ model: provider, prompt });
    } catch (error) {
      // Try next provider
    }
  }
  
  throw new Error("All providers failed");
}

Monitor performance and gradually shift traffic.
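One simple way to shift traffic gradually is to bucket users deterministically, so each user consistently hits the same provider at a given rollout percentage. The hashing scheme and `rolloutPercent` knob below are illustrative:

```typescript
// Illustrative gradual rollout: hash each user into a stable 0-99 bucket.
function bucketFor(userId: string): number {
  let hash = 0;
  for (const ch of userId) {
    hash = (hash * 31 + ch.charCodeAt(0)) % 100;
  }
  return hash;
}

// Users whose bucket falls below rolloutPercent get the new provider;
// everyone else stays on the current one.
function pickProvider<T>(
  userId: string,
  current: T,
  candidate: T,
  rolloutPercent: number // 0-100: share of users on the new provider
): T {
  return bucketFor(userId) < rolloutPercent ? candidate : current;
}
```

Start `rolloutPercent` at 5, watch your metrics, and ratchet it up as confidence grows; because bucketing is deterministic, no user flip-flops between providers mid-rollout.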

Step 6: Optimize Provider Selection

Once you have multiple providers working, optimize based on workload.

function selectProvider(task: Task) {
  if (task.type === "reasoning") return anthropic(...);
  if (task.type === "speed") return openai(...);
  if (task.type === "long-context") return google(...);
  return openai("gpt-3.5-turbo"); // Default
}

Cost Comparison

Portability enables cost optimization. Here's how costs vary across providers:

| Task Type         | Best Provider  | Cost per 1M tokens |
| ----------------- | -------------- | ------------------ |
| Simple queries    | GPT-3.5 Turbo  | $0.50              |
| Fast responses    | GPT-4o         | $2.50              |
| Complex reasoning | Claude Opus    | $15.00             |
| Long context      | Gemini 1.5 Pro | $1.25              |

Without portability:

You use one provider for everything. Average cost: $10/1M tokens.

With portability:

You route tasks to optimal providers. Average cost: $3/1M tokens.

Savings: 70%
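The blended figure depends entirely on your traffic mix. As an illustration with a hypothetical mix (50% simple, 25% fast, 15% long-context, 10% reasoning), applying the per-provider rates from the table:

```typescript
// Per-provider rates from the table above ($ per 1M tokens).
const ratePer1M = { simple: 0.5, fast: 2.5, reasoning: 15.0, longContext: 1.25 };

// Hypothetical traffic mix: fractions must sum to 1.
const mix = { simple: 0.5, fast: 0.25, reasoning: 0.1, longContext: 0.15 };

const blended =
  mix.simple * ratePer1M.simple +
  mix.fast * ratePer1M.fast +
  mix.reasoning * ratePer1M.reasoning +
  mix.longContext * ratePer1M.longContext;

// blended = 0.25 + 0.625 + 1.5 + 0.1875 = $2.5625 per 1M tokens,
// versus $15 if every request went to the most capable model.
```

A different mix gives a different blended rate, which is exactly why routing by task type is where portability pays off.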

Conclusion

Vendor lock-in happens gradually. You make small decisions that seem reasonable. Then you're trapped.

The risks:

  • Pricing changes you can't avoid
  • Downtime you can't route around
  • Missing out on better models
  • Compliance issues you can't solve

The solution:

Architect for portability from day one:

  1. Separate business logic from provider APIs
  2. Use abstraction layers (Vercel AI SDK, LangChain)
  3. Standardize tool definitions
  4. Make Skills portable (use Bluebag)
  5. Test across providers

The benefits:

  • Switch providers in minutes, not months
  • Optimize costs by routing to best provider
  • Ensure availability with fallback chains
  • Take advantage of new models as they ship

Build agents that work with any LLM. Your future self will thank you.




Building agents? Start with Bluebag and avoid vendor lock-in from day one.