Mock LLM API Guide
Create free LLM streaming endpoints that mimic OpenAI, Claude, and other AI providers. Build and test AI features without spending thousands on API calls.
Getting Started
Creating a mock LLM streaming endpoint takes less than 30 seconds. Follow these simple steps:
Step 1: Visit the LLM Mock Page
Go to mockapi.dog/llm-mock. A unique 6-character code is automatically generated for your endpoint.
Step 2: Choose LLM Provider Profile
Select which provider's response format to emulate:
- OpenAI - Chat Completions API format (GPT-4, GPT-3.5)
- Anthropic Claude - Claude streaming format
- Generic Stream - Provider-agnostic token stream
- Generic JSON - Simple JSON response (no streaming)
Step 3: Select Content Mode
Choose how response content is generated:
- Generated - Auto-generate LLM-like text (Chat, Technical, or Markdown style)
- Static - Use your provided text exactly as is
- Hybrid - Your text followed by generated continuation
Step 4: Configure Token Generation (Optional)
For Generated or Hybrid modes, set minimum and maximum token counts (100-300 recommended). The generated text length falls randomly between these values. Not needed for Static mode.
Step 5: Complete Verification & Save
Complete the Turnstile verification, then click "Save Mock Endpoint". Your endpoint URL is automatically copied!
https://abc123.mockapi.dog/v1/chat/completions
That's it! Start streaming immediately
Your endpoint is ready to use. Replace your OpenAI/Claude baseURL with your mock endpoint and start testing. No authentication or API keys required.
The Cost Problem
Real LLM APIs are expensive. During development, testing, and prototyping, costs can quickly spiral out of control. Here's what you'd pay with real providers:
OpenAI GPT-4
Expensive. Example: Testing a chatbot with 1,000 conversations (avg. 500 tokens each) = $20+
Anthropic Claude
Costly. CI/CD Pipeline: Running tests 100 times per day = $300+/month
With MockAPI Dog: $0
Free streaming responses for development and testing. Save thousands during the development phase. Switch to real APIs only when you're ready for production.
Why Use LLM Mock API?
Save Money
Avoid spending thousands of dollars during development. Test your UI, streaming logic, and error handling without burning through API credits.
- No API keys or billing setup required
- Free requests during development
- Perfect for indie developers and startups
Instant Testing
Test streaming responses, UI animations, and error states instantly. No waiting for real API calls or dealing with rate limits.
- Configurable response speed and tokens
- Test edge cases and error scenarios
- No dependency on external provider availability
Multiple Providers
Test your app with different LLM providers without managing multiple API keys. Switch between OpenAI, Claude, and generic formats effortlessly.
- OpenAI-compatible endpoints
- Anthropic Claude format support
- Generic SSE streaming format
CI/CD Integration
Run automated tests in your CI/CD pipeline without worrying about API costs or rate limits. Test your AI features on every commit.
- No authentication required
- Consistent, predictable responses
- Fast execution for quick feedback
Supported Providers
MockAPI Dog supports streaming formats for popular LLM providers. Simply set your endpoint as the baseURL in your preferred SDK.
OpenAI Format
Compatible with the official OpenAI SDK. Supports streaming responses in the same format as GPT-4 and GPT-3.5-turbo.
Anthropic Format
Compatible with the Anthropic SDK. Supports streaming responses in the same format as Claude 3 Opus, Sonnet, and Haiku.
Generic SSE Format
Standard Server-Sent Events (SSE) format. Use with any streaming client or build your own custom integration.
- Custom LLM integrations
- Testing EventSource implementations
- Learning streaming protocols
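If you are building a custom client against the generic SSE format, the core task is splitting the raw stream into `data:` lines. The sketch below assumes the stream emits `data: {...}` JSON lines and ends with a `data: [DONE]` sentinel, matching the generic fetch example later in this guide; `parseSseChunk` is an illustrative helper, not part of any SDK.

```javascript
// Parse raw SSE text into an array of JSON data payloads.
// Assumes `data: {...}` lines and a `data: [DONE]` end-of-stream sentinel.
function parseSseChunk(raw) {
  const payloads = [];
  for (const line of raw.split('\n')) {
    if (!line.startsWith('data: ')) continue; // skip blank lines and comments
    const data = line.slice(6).trim();
    if (data === '[DONE]') break; // end-of-stream sentinel
    try {
      payloads.push(JSON.parse(data));
    } catch {
      // Partial JSON split across network chunks; a real client would buffer it
    }
  }
  return payloads;
}
```

In a real client you would call this on each decoded chunk and buffer any trailing partial line until the next chunk arrives.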
Content Modes
Choose how your mock LLM endpoint generates response content. Each mode offers different control over the streamed text.
Generated
Auto-generate LLM-like text in different styles. Choose from Chat (conversational tone), Technical (programming focused), or Markdown (formatted with lists and code blocks).
Static
Use your exact provided text as the response. The text streams exactly as written without any generation or modification.
Hybrid
Combines your provided text with auto-generated continuation. Your text streams first, followed by generated LLM-like content.
Text Styles for Generated Content
When using Generated or Hybrid modes, you can choose between Chat (conversational), Technical (programming-focused), or Markdown (includes formatting, lists, code blocks) styles.
Token Generation Settings
Fine-tune how your mock LLM endpoint generates and streams tokens to match your testing needs.
Token Count
Set the number of tokens (roughly equivalent to words) to generate. Useful for testing different response lengths.
Streaming Speed
Control how fast tokens are streamed. Test your UI with different streaming speeds to ensure smooth animations.
Pro Tip
Test with different speeds to ensure your UI handles both fast and slow streaming gracefully. Real LLM APIs can vary significantly in response time.
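One way to exercise a UI at several speeds, even without the endpoint, is to drive it from a pacing schedule. `paceSchedule` and `playSchedule` below are illustrative helpers (not part of the mock API), and the delay values are assumptions you would tune to your own tests.

```javascript
// Build a pacing schedule: [token, delayMs] pairs with optional random jitter.
// Jitter makes the stream uneven, like a real LLM under load.
function paceSchedule(tokens, baseDelayMs, jitterMs = 0, rand = Math.random) {
  return tokens.map((token) => [token, baseDelayMs + Math.round(rand() * jitterMs)]);
}

// Feed tokens to a UI callback on that schedule.
async function playSchedule(schedule, onToken) {
  for (const [token, delayMs] of schedule) {
    await new Promise((resolve) => setTimeout(resolve, delayMs));
    onToken(token);
  }
}
```

Running the same schedule with `baseDelayMs` of 5 and 500 quickly shows whether your typing animation holds up at both extremes.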
Code Examples
Here's how to use your mock LLM endpoint with popular SDKs and libraries.
OpenAI SDK
Replace the baseURL with your mock endpoint. No API key required!
import OpenAI from 'openai';

const openai = new OpenAI({
  baseURL: 'https://xyz789.mockapi.dog/llm',
  apiKey: 'dummy-api-key', // Mock endpoint doesn't check API keys
});

async function main() {
  const stream = await openai.chat.completions.create({
    model: 'gpt-4',
    messages: [{ role: 'user', content: 'Hello!' }],
    stream: true,
  });

  for await (const chunk of stream) {
    const content = chunk.choices[0]?.delta?.content || '';
    process.stdout.write(content);
  }
}

main();
Anthropic SDK
Use with the Anthropic SDK by setting a custom baseURL.
import Anthropic from '@anthropic-ai/sdk';

const anthropic = new Anthropic({
  baseURL: 'https://xyz789.mockapi.dog/llm',
  apiKey: 'dummy-api-key', // Mock endpoint doesn't check API keys
});

async function main() {
  const stream = await anthropic.messages.stream({
    model: 'claude-3-opus-20240229',
    max_tokens: 1024,
    messages: [{ role: 'user', content: 'Hello!' }],
  });

  for await (const chunk of stream) {
    if (chunk.type === 'content_block_delta' && chunk.delta.type === 'text_delta') {
      process.stdout.write(chunk.delta.text);
    }
  }
}

main();
Generic Fetch (SSE)
Use with vanilla JavaScript/TypeScript for maximum flexibility.
async function streamResponse() {
  const response = await fetch('https://xyz789.mockapi.dog/llm/stream', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ prompt: 'Hello, world!', max_tokens: 500 }),
  });

  const reader = response.body?.getReader();
  if (!reader) return; // No streaming body available
  const decoder = new TextDecoder();

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;

    const chunk = decoder.decode(value);
    const lines = chunk.split('\n');

    for (const line of lines) {
      if (line.startsWith('data: ')) {
        const data = line.slice(6);
        if (data === '[DONE]') return;
        try {
          const json = JSON.parse(data);
          console.log(json.content);
        } catch (e) {
          // Skip invalid JSON
        }
      }
    }
  }
}

streamResponse();
It's that simple!
Just replace the baseURL and you're ready to go. Your existing code will work without modifications.
Real-World Use Cases
Chatbot Development
Build and test chatbot UIs without spending on API calls. Test message threading, streaming animations, and error handling.
- Test streaming message animations
- Verify conversation threading
- Debug UI edge cases
Testing & QA
Run automated tests and manual QA without API costs. Test different response scenarios and edge cases consistently.
- Automated E2E tests in CI/CD
- Consistent test data
- Fast test execution
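For fully deterministic unit tests you can also skip the network entirely and build chunk fixtures in the shape your code consumes. The sketch below follows the public OpenAI Chat Completions streaming delta shape; `buildChunkFixtures` and `joinChunks` are illustrative helpers, and the model name is a placeholder.

```javascript
// Build OpenAI-style streaming chunks from a fixed string, one word per chunk.
function buildChunkFixtures(text, model = 'gpt-4') {
  return text.split(' ').map((word, i, words) => ({
    object: 'chat.completion.chunk',
    model,
    choices: [{
      index: 0,
      delta: { content: i === 0 ? word : ' ' + word },
      finish_reason: i === words.length - 1 ? 'stop' : null,
    }],
  }));
}

// Reassemble the text the way a UI would, to sanity-check the fixtures.
function joinChunks(chunks) {
  return chunks.map((c) => c.choices[0].delta.content || '').join('');
}
```

Fixtures like these make assertions exact: the same input always produces the same chunk sequence, which is what you want in CI.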
Learning & Tutorials
Learn AI integration without spending money. Perfect for tutorials, courses, and educational content.
- No API key setup for students
- Free practice
- Safe learning environment
MVPs & Demos
Build proof-of-concepts and demos without upfront costs. Show investors and stakeholders your vision before investing in production APIs.
- Quick prototyping
- Investor demos
- Validate ideas cheaply
Advanced Features
Custom Headers
Add custom response headers to test CORS, authentication flows, and other header-based logic in your LLM integration.
Configurable Delays
Simulate network latency and slow streaming speeds to test loading states and timeout handling in your application.
Error Simulation
Test error handling by simulating rate limits, authentication errors, and streaming interruptions.
No Authentication
Mock endpoints don't require API keys or authentication. Perfect for CI/CD pipelines and public demos.
Troubleshooting
Streaming not working
Ensure you're using the correct provider format and that your client supports streaming. Check that you're reading the response as a stream, not as a complete response.
// Make sure to set stream: true
const stream = await openai.chat.completions.create({
stream: true, // This is required!
// ...
});
Response too fast/slow
Adjust the streaming speed in your endpoint configuration. Different speeds help test various network conditions and user experiences.
SDK compatibility issues
Make sure you're using a recent version of the SDK. Check the provider format matches your SDK (OpenAI SDK needs OpenAI format, Anthropic SDK needs Anthropic format).
CORS errors in browser
Mock endpoints are configured with permissive CORS headers. If you're still getting CORS errors, check your request headers and ensure you're not sending restricted headers.
Tips & Best Practices
Test with different speeds
Real LLM APIs vary in speed. Test your UI with both fast and slow streaming to ensure smooth user experience in all conditions.
Use environment variables
Store your baseURL in environment variables. Switch between mock and production APIs by changing a single variable.
// .env.development
OPENAI_BASE_URL=https://xyz789.mockapi.dog/llm
// .env.production
OPENAI_BASE_URL=https://api.openai.com/v1
Test error scenarios
Don't just test happy paths. Use error simulation to test rate limits, network failures, and malformed responses.
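When a simulated error comes back, your client needs a policy for it. A small helper like the one below can encode that policy; it is an illustrative sketch, not part of any SDK, and the status-to-action mapping is an assumption modeled on common provider behavior.

```javascript
// Map a simulated HTTP status to a client action.
// 429 and 5xx are usually retryable; auth errors are not.
function classifyError(status) {
  if (status === 429) return { action: 'retry', backoff: true };   // rate limit
  if (status >= 500) return { action: 'retry', backoff: false };   // server error
  if (status === 401 || status === 403) return { action: 'fail' }; // bad credentials
  return { action: 'surface' }; // show other errors to the user
}
```

Driving this function with each status your mock endpoint can simulate gives you one assertion per error scenario.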
LLM Development Workflow
Follow this workflow for efficient AI development:
- Build UI and streaming logic with mock endpoints
- Test thoroughly with different content modes and speeds
- Run automated tests in CI/CD with mock endpoints
- Switch to real API only for final integration testing
- Deploy with production API keys
Validate before production
Before switching to production APIs, validate your implementation with the real provider's API in a staging environment to catch any differences in behavior.
Glossary
LLM (Large Language Model)
AI models like GPT-4 and Claude that generate human-like text responses. Examples: OpenAI's GPT series, Anthropic's Claude, Google's Gemini.
Streaming API
An API that sends data in chunks rather than waiting for the complete response. Allows for real-time display of AI-generated text as it's being created.
Token
The basic unit of text in LLMs. Roughly equivalent to a word or word fragment. LLM pricing is typically based on token count.
SSE (Server-Sent Events)
A technology that allows servers to push data to clients in real-time. Used by LLM APIs to stream responses.
baseURL
The base address for API requests. Replace this with your mock endpoint URL to redirect requests to MockAPI Dog instead of the real provider.
Provider
Companies that offer LLM APIs, such as OpenAI (GPT), Anthropic (Claude), Google (Gemini), etc.
Ready to Start Building?
Create your first mock LLM streaming endpoint in seconds. No signup, no credit card, no hassle. Start building AI features without spending thousands on API calls.