Open beta · Free to use

Cut LLM token costs
by 93%.

Lattice Proxy sits between your app and any LLM API. Long conversations are semantically compressed before they're forwarded — preserving meaning, dropping redundancy. One URL change. Zero code changes.

93%
Token reduction
$703
Saved / day @ 10k req
0ms
Setup time

// how it works

01
Your app sends a request to Lattice Proxy instead of directly to Anthropic or OpenAI. Same format, same SDK, same everything.
02
Lattice checks the token count. Short conversations pass through untouched. Long conversations (8k+ tokens) get compressed — the middle history is summarised using a cheap model, preserving meaning and dropping the bulk.
03
The compressed request is forwarded to the real LLM API with your original API key. The response comes straight back. Your app never knows.
04
Every compression is logged to the admin dashboard with before/after token counts and a quality feedback button — so you can spot-check that meaning was preserved.
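The four steps above boil down to a single decision at request time. A minimal sketch in Python (names are illustrative, and the 4-characters-per-token estimate is a stand-in; the real proxy counts exactly with tiktoken and summarises with a cheap model):

```python
THRESHOLD = 8_000  # compression only triggers above this many tokens

def approx_tokens(messages):
    # Rough estimate: ~4 characters per token. Stand-in for an
    # exact tiktoken count.
    return sum(len(m["content"]) for m in messages) // 4

def compress(messages, summarise):
    """Pass short conversations through untouched. For long ones,
    keep the first message and the most recent turns intact, and
    replace the middle history with a summary."""
    if approx_tokens(messages) <= THRESHOLD:
        return messages  # short conversation: forwarded as-is
    head, middle, tail = messages[:1], messages[1:-4], messages[-4:]
    summary = summarise(middle)  # e.g. one call to a cheap model
    stub = {"role": "user",
            "content": f"[Summary of earlier conversation] {summary}"}
    return head + [stub] + tail
```

The compressed list is what gets forwarded in step 03; the before/after token counts are what step 04 logs.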

// usage

One environment variable. That's it.

# Anthropic SDK
ANTHROPIC_BASE_URL=https://latticeproxy.io

# OpenAI SDK
OPENAI_BASE_URL=https://latticeproxy.io

# Or in code (Python)
import anthropic

client = anthropic.Anthropic(
    base_url="https://latticeproxy.io",
    api_key="your-anthropic-key",
)

Your API key is passed through directly to the provider — Lattice never stores it.

// faq

Does it work with streaming?
Yes. Streaming responses are passed through natively with no buffering overhead.
What if the summary misses something important?
Use the admin dashboard to review compressions. Hit 👎 if a summary dropped critical context — it gets logged for investigation. Compression only triggers above 8,000 tokens, so short conversations are never touched.
Is my data stored?
Token counts and compression ratios are logged for the dashboard. Message content is not stored — it's compressed in memory and forwarded.
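A sketch of what a dashboard entry could look like, assuming hypothetical field names: aggregates only, with no field for message content to ever land in.

```python
def log_record(tokens_before: int, tokens_after: int) -> dict:
    """Hypothetical shape of one dashboard log entry: token counts
    and the derived savings ratio. No message content is included."""
    return {
        "tokens_before": tokens_before,
        "tokens_after": tokens_after,
        "saved_ratio": round(1 - tokens_after / tokens_before, 2),
    }
```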
Is it open source?
Not yet. It's a grassroots build, and the source is available on request. Built with FastAPI, tiktoken, and the Anthropic SDK.