Open-source · C++20 · v0.5.0

Cut your LLM API bill
by 40–70%

A transparent HTTPS proxy that sits between your app and OpenAI, Anthropic, Gemini, Mistral or Groq. Caches identical queries, strips PII, trims chat history — locally, with sub-millisecond overhead. Zero code changes.

Get started — free ★ Star on GitHub

$ curl -fsSL https://raw.githubusercontent.com/anonymmized/FluxGate/main/install.sh | bash

fluxgate — live

$ fluxgate

⚡ FluxGate · 127.0.0.1:8080 · MITM ON

CACHE

Hit rate 93.2%

Hits137 Misses10

SAVINGS / TRAFFIC

Tokens48.2K Cost$0.2410

See what you'd save

Drag the sliders to estimate your monthly savings. Real numbers depend on your traffic — duplicate requests and long chat histories save the most.

Monthly LLM API spend — $5,000 Duplicate / cacheable requests — 35% Long conversations trimmed by history limit — 20%

$2,150

estimated monthly savings

That's $25,800 a year, at a 43% lower bill.

How it works

FluxGate is a local HTTPS proxy. Your app talks to FluxGate; FluxGate optimises the request and forwards it to the real provider over a fresh, verified TLS connection.

💻

Your App

OpenAI SDK · Anthropic SDK
any HTTP client

→

⚡

FluxGate

filter · cache · log

→

🤖

LLM API

OpenAI · Anthropic
Gemini · Mistral · Groq

Three steps to start saving

# 1. Install (downloads the binary, generates & trusts a local CA)
curl -fsSL https://raw.githubusercontent.com/anonymmized/FluxGate/main/install.sh | bash

# 2. Run — an interactive wizard sets everything up (just press Enter)
fluxgate

# 3. Point your app at the proxy. That's it.
export HTTPS_PROXY=http://127.0.0.1:8080

Watch the savings live in the terminal, or open the web dashboard at http://127.0.0.1:9090/.

Everything you need to cut costs

Built in C++20 on Asio. Thousands of concurrent sessions, per-session strand model, sub-millisecond overhead on a cache hit.

🔁

Smart Response Cache

Identical queries return instantly from memory — no API call, no cost. Keys are normalized (sorted JSON, whitespace-insensitive) so reordered requests still hit. LRU + TTL, with an optional Redis backend.

✂️

Chat History Trimming

Keeps only the last N messages in the context window. Cuts input token count on long conversations — often the biggest line on the bill.

🔒

PII & Secret Redaction

Strips emails, phone numbers, credit cards, IP addresses and API-key-shaped secrets before they leave your network — plus your own custom regex rules. Toggle live.

🎯

Provider Routing

MITM only the providers you choose; everything else tunnels through untouched. Defaults cover OpenAI, Anthropic, Gemini, Mistral, Groq.

📊

Live Control Panel

A web control panel at :9090/ and a terminal TUI — live sparklines, and tabs for an Overview, Requests, Providers and Settings. Tune filters, cache TTL and limits without a restart.

🔎

Live Request Inspector

A rolling log of every intercepted request — host, method, path, model, cache hit/miss, status, latency and tokens saved. Never logs body content.

🏷️

Per-Provider Analytics

Cost saved, hit rate, average latency and traffic broken down by provider and model — with provider-aware pricing instead of a flat rate.

🚦

Rate Limiting & Quotas

Per-client token-bucket limits (requests/min + burst) to stop runaway costs, plus a monthly budget reference that surfaces a dashboard alert.

🐳

Deploy Anywhere

Single static binary, Docker image, systemd unit. Runs on Linux and macOS, on-prem or as a sidecar container.

🔍

Structured Logging

NDJSON logs with request ID, host, method, path and upstream latency. Never logs body content. Pipes cleanly into Datadog, Splunk, or grep.

📈

Prometheus Metrics

GET /metrics exposes every counter in Prometheus text format. Drop straight into your existing Grafana dashboards.

⚙️

TOML Config + CLI

All settings in one fluxgate.toml. Generate it with the wizard or --dump-config; override any value from the command line.

Install

Pick your path. The script and wizard handle the CA and config for you; Docker and source builds are one command each.

# Installs to ~/.fluxgate, generates a local CA, trusts it in your OS store
curl -fsSL https://raw.githubusercontent.com/anonymmized/FluxGate/main/install.sh | bash

# then just run it — the wizard does the rest
fluxgate

Trusting the CA system-wide may prompt for sudo. The installer prints the exact command if it can't do it automatically.

# 1. Generate a local CA into ./certs (one time)
docker run --rm -v "$PWD/certs:/certs" --entrypoint /usr/local/bin/fluxgate \
  ghcr.io/anonymmized/fluxgate:latest --generate-ca /certs/ca

# 2. Run the proxy — bind 0.0.0.0 so the published ports are reachable
docker run -d --name fluxgate \
  -v "$PWD/certs:/certs:ro" \
  -p 8080:8080 -p 9090:9090 \
  ghcr.io/anonymmized/fluxgate:latest \
  --listen 0.0.0.0 --admin 0.0.0.0:9090 \
  --mitm --mitm-ca-key /certs/ca.key.pem --mitm-ca-cert /certs/ca.cert.pem

# 3. Test it (no API key needed)
curl -sk -x http://localhost:8080 https://httpbin.org/get

Inside a container the proxy must bind 0.0.0.0 — otherwise Docker's published ports can't reach it. docker compose up -d does all of this for you. To verify upstream certs without -k, mount and trust ./certs/ca.cert.pem in the calling environment.

git clone https://github.com/anonymmized/FluxGate.git
cd FluxGate
cmake -S . -B build && cmake --build build --parallel
./build/FluxGate            # launches the setup wizard

Requires C++20, CMake ≥ 3.15 and OpenSSL. asio, simdjson and toml++ are fetched automatically. Redis backend: add -DFLUXGATE_REDIS=ON.

Pricing

FluxGate is free and open-source. Enterprise support is available for teams that need SLAs, custom filters, or managed deployment.

Open Source

Free forever

Everything you need for production deployments.

Full MITM proxy pipeline
PII redaction + history trimming
In-memory & Redis cache
Docker + signed binary releases
Prometheus metrics + dashboards
Community support (GitHub Issues)

View on GitHub →

Enterprise

Custom pricing

For teams processing >10M tokens/day or with compliance requirements.

Everything in Open Source
SLA-backed support
Custom filter plugins
Managed deployment assistance
Security & compliance review
Dedicated Slack channel

Frequently asked questions

Is it safe to route my API keys through FluxGate?

FluxGate runs locally — on your machine or inside your network. Your traffic never leaves your infrastructure through it. It never stores API keys or response content; structured logs contain only metadata (host, method, path, latency). See SECURITY.md for the full trust model.

Does it work with streaming responses (SSE)?

Yes — streaming responses are relayed byte-for-byte without buffering. FluxGate automatically skips caching for SSE streams (Content-Type: text/event-stream).

Which languages / SDKs are supported?

Any HTTP client that supports HTTP CONNECT proxies: Python (openai, anthropic), Node.js, Go, Ruby, Java, curl, and more. See examples/ for ready-to-run code.

How much traffic can it handle?

Async C++20 (Asio) with a per-session strand model. A single instance handles thousands of concurrent connections on commodity hardware; each cache hit adds <1ms of overhead.

Can I run multiple instances / scale horizontally?

Yes — use Redis as the cache backend (cache.backend = "redis") so all instances share one cache. Each instance is otherwise stateless.

Cut your LLM API billby 40–70%