Open-source · C++20 · v0.5.0

Cut your LLM API bill by 40–70%

A transparent HTTPS proxy that sits between your app and OpenAI, Anthropic, Gemini, Mistral or Groq. It caches identical queries, strips PII and trims chat history locally, adding less than a millisecond on a cache hit. No code changes: you only set HTTPS_PROXY.

$ curl -fsSL https://raw.githubusercontent.com/anonymmized/FluxGate/main/install.sh | bash
fluxgate — live
$ fluxgate
FluxGate · 127.0.0.1:8080 · MITM ON
CACHE: hit rate 93.2% (hits 137, misses 10)
SAVINGS / TRAFFIC: tokens 48.2K, cost $0.2410
  • 40–70% API cost reduction
  • <1ms cache-hit latency
  • 0 code changes
  • 10k+ concurrent sessions

See what you'd save

Real numbers depend on your traffic: duplicate requests and long chat histories save the most. For example, a 43% lower bill on $5,000 a month is $2,150 in estimated monthly savings, or $25,800 a year.
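The arithmetic behind an estimate like this can be sketched in a few lines. The $5,000 monthly bill is an assumption inferred from the quoted figures ($2,150 is 43% of $5,000), and `estimate_savings` is an illustrative name, not part of FluxGate.

```python
def estimate_savings(monthly_bill_usd: float, reduction_pct: float) -> dict:
    """Return monthly and yearly savings for a given bill and reduction percentage."""
    if not 0 <= reduction_pct <= 100:
        raise ValueError("reduction_pct must be between 0 and 100")
    monthly = monthly_bill_usd * reduction_pct / 100
    return {"monthly": monthly, "yearly": monthly * 12}


if __name__ == "__main__":
    # Assumed input: a $5,000/month bill at the 43% reduction quoted above.
    print(estimate_savings(5000, 43))
```

Your own reduction percentage depends on how many of your requests are duplicates and how long your chat histories run.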

How it works

FluxGate is a local HTTPS proxy. Your app talks to FluxGate; FluxGate optimises the request and forwards it to the real provider over a fresh, verified TLS connection.

Your App (OpenAI SDK · Anthropic SDK · any HTTP client) → FluxGate (filter · cache · log) → LLM API (OpenAI · Anthropic · Gemini · Mistral · Groq)

Three steps to start saving

# 1. Install (downloads the binary, generates & trusts a local CA)
curl -fsSL https://raw.githubusercontent.com/anonymmized/FluxGate/main/install.sh | bash

# 2. Run: an interactive wizard sets everything up (just press Enter)
fluxgate

# 3. Point your app at the proxy. That's it.
export HTTPS_PROXY=http://127.0.0.1:8080

Watch the savings live in the terminal, or open the web dashboard at http://127.0.0.1:9090/.
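For a Python client, step 3 might look like the sketch below. It uses only the standard library: `urllib.request` picks up `HTTPS_PROXY` from the environment, or you can pass the proxy explicitly. The helper names are illustrative, and the proxy address is the default shown above.

```python
import os
import urllib.request


def proxy_from_env() -> dict:
    """Return the proxy map urllib derives from the *_PROXY environment variables."""
    return urllib.request.getproxies()


def build_opener_via_proxy(proxy_url: str) -> urllib.request.OpenerDirector:
    """Build an opener that sends HTTPS requests through the given proxy explicitly."""
    handler = urllib.request.ProxyHandler({"https": proxy_url})
    return urllib.request.build_opener(handler)


if __name__ == "__main__":
    os.environ["HTTPS_PROXY"] = "http://127.0.0.1:8080"  # same value as step 3
    print(proxy_from_env().get("https"))
    opener = build_opener_via_proxy("http://127.0.0.1:8080")
    # opener.open("https://api.openai.com/v1/models") would now go through the proxy.
```

Some Python HTTP stacks ship their own CA bundle instead of reading the OS trust store, so they may also need to be pointed at the FluxGate CA certificate.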

Everything you need to cut costs

Built in C++20 on Asio. Thousands of concurrent sessions, per-session strand model, sub-millisecond overhead on a cache hit.

🔁

Smart Response Cache

Identical queries return instantly from memory: no API call, no cost. Keys are normalized (sorted JSON keys, whitespace-insensitive), so requests that differ only in key order or formatting still hit. Eviction is LRU + TTL, with an optional Redis backend.
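To make the idea concrete, here is a minimal Python sketch of key normalization plus an LRU + TTL cache. It is an illustration under assumptions, not FluxGate's C++ implementation: the SHA-256 hash, the `cache_key` and `LruTtlCache` names, and the injectable clock are all hypothetical.

```python
import hashlib
import json
import time
from collections import OrderedDict


def cache_key(body: bytes) -> str:
    """Hash a canonical form of a JSON body: sorted keys, no insignificant whitespace."""
    # Array order (e.g. the message list) is preserved; only object keys are sorted.
    canonical = json.dumps(json.loads(body), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class LruTtlCache:
    """Tiny LRU cache whose entries also expire after ttl seconds."""

    def __init__(self, capacity: int, ttl: float, clock=time.monotonic):
        self.capacity, self.ttl, self.clock = capacity, ttl, clock
        self._items = OrderedDict()  # key -> (expires_at, value)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._items[key]  # expired: treat as a miss
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key, value):
        self._items[key] = (self.clock() + self.ttl, value)
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used entry
```

Two bodies that differ only in key order or whitespace produce the same key, while any change to the model or the messages produces a different one.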

✂️

Chat History Trimming

Keeps only the last N messages in the context window. Cuts input token count on long conversations — often the biggest line on the bill.
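A sketch of the trimming step, assuming OpenAI-style message dictionaries with a `role` field. The `keep_system` option is an assumption added for illustration; the description above only promises "the last N messages".

```python
def trim_history(messages: list, keep_last: int, keep_system: bool = True) -> list:
    """Keep the last keep_last messages, optionally preserving system prompts."""
    if keep_last < 0:
        raise ValueError("keep_last must be non-negative")
    # Assumption: system prompts are kept in front and do not count toward keep_last.
    system = [m for m in messages if m.get("role") == "system"] if keep_system else []
    rest = [m for m in messages if not (keep_system and m.get("role") == "system")]
    tail = rest[-keep_last:] if keep_last else []
    return system + tail
```

With `keep_system=False` the function is a plain "last N" slice over the whole conversation.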

🔒

PII & Secret Redaction

Strips emails, phone numbers, credit cards, IP addresses and API-key-shaped secrets before they leave your network — plus your own custom regex rules. Toggle live.
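Regex-based redaction of this kind could look roughly like the following. The patterns are deliberately simple illustrations and are not FluxGate's actual rules; the `[LABEL]` placeholder format and the `extra_rules` parameter are assumptions standing in for the custom-rule feature.

```python
import re

# Illustrative patterns only; real-world rules need more care (e.g. Luhn checks).
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "IPV4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "CARD": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "SECRET": re.compile(r"\bsk-[A-Za-z0-9_-]{16,}\b"),
}


def redact(text, extra_rules=None):
    """Replace every match with a [LABEL] placeholder; extra_rules maps label -> regex."""
    rules = dict(PATTERNS)
    for label, pattern in (extra_rules or {}).items():
        rules[label] = re.compile(pattern)
    for label, regex in rules.items():
        text = regex.sub(f"[{label}]", text)
    return text
```

A custom rule is just another label and pattern, for example `redact(text, {"TICKET": r"TICKET-\d+"})`.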

🎯

Provider Routing

MITM only the providers you choose; everything else tunnels through untouched. Defaults cover OpenAI, Anthropic, Gemini, Mistral, Groq.
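The routing decision can be sketched as a lookup on the CONNECT target. The host list below is an assumption based on the providers' public API hostnames; FluxGate's real defaults and matching rules may differ.

```python
# Assumed host list for illustration; FluxGate's actual defaults may differ.
INTERCEPT_HOSTS = {
    "api.openai.com",
    "api.anthropic.com",
    "generativelanguage.googleapis.com",
    "api.mistral.ai",
    "api.groq.com",
}


def parse_connect_target(target: str) -> tuple:
    """Split a CONNECT target such as 'api.openai.com:443' into (host, port)."""
    host, _, port = target.rpartition(":")
    return host.lower(), int(port)


def should_intercept(target: str, hosts=INTERCEPT_HOSTS) -> bool:
    """True if the target is a chosen provider; anything else is tunneled untouched."""
    host, _port = parse_connect_target(target)
    return host in hosts
```

A request to `httpbin.org:443`, for instance, would fall through to a plain tunnel under this sketch.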

📊

Live Control Panel

A web control panel at :9090/ and a terminal TUI, both with live sparklines and tabs for Overview, Requests, Providers and Settings. Tune filters, cache TTL and limits without a restart.

🔎

Live Request Inspector

A rolling log of every intercepted request — host, method, path, model, cache hit/miss, status, latency and tokens saved. Never logs body content.

🏷️

Per-Provider Analytics

Cost saved, hit rate, average latency and traffic broken down by provider and model — with provider-aware pricing instead of a flat rate.

🚦

Rate Limiting & Quotas

Per-client token-bucket limits (requests/min + burst) to stop runaway costs, plus a monthly budget reference value that raises a dashboard alert.
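For readers unfamiliar with token buckets, here is a minimal sketch of the requests/min + burst model. The class and function names, the default limits, and the injectable clock are illustrative assumptions, not FluxGate's API.

```python
import time


class TokenBucket:
    """Refills at rate_per_min / 60 tokens per second and holds at most burst tokens."""

    def __init__(self, rate_per_min: float, burst: int, clock=time.monotonic):
        self.rate = rate_per_min / 60.0
        self.capacity = float(burst)
        self.tokens = float(burst)  # start full so an initial burst is allowed
        self.clock = clock
        self.last = clock()

    def allow(self) -> bool:
        """Spend one token if available; return False when the client is over limit."""
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def allow_request(buckets: dict, client_id: str, rate_per_min=60, burst=10) -> bool:
    """Per-client limiting: one bucket per client id, created on first use."""
    bucket = buckets.get(client_id)
    if bucket is None:
        bucket = buckets[client_id] = TokenBucket(rate_per_min, burst)
    return bucket.allow()
```

The burst size caps how many requests a client can fire back to back, while the per-minute rate sets the sustained throughput.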

🐳

Deploy Anywhere

Single static binary, Docker image, systemd unit. Runs on Linux and macOS, on-prem or as a sidecar container.

🔍

Structured Logging

NDJSON logs with request ID, host, method, path and upstream latency. Never logs body content. Pipes cleanly into Datadog, Splunk, or grep.
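Because each log line is a standalone JSON object, simple analysis needs nothing beyond a JSON parser. The sketch below averages upstream latency per host; the field names `host` and `upstream_latency_ms` are assumptions, so check them against the actual log output.

```python
import json
from collections import defaultdict


def mean_latency_by_host(lines, latency_field="upstream_latency_ms"):
    """Average the latency field per host over an iterable of NDJSON lines."""
    totals = defaultdict(lambda: [0.0, 0])  # host -> [sum, count]
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        if "host" not in record or latency_field not in record:
            continue  # skip records that lack the assumed fields
        bucket = totals[record["host"]]
        bucket[0] += float(record[latency_field])
        bucket[1] += 1
    return {host: total / count for host, (total, count) in totals.items()}
```

An open log file can be passed directly, since iterating over a file yields one line at a time.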

📈

Prometheus Metrics

GET /metrics exposes every counter in Prometheus text format. Drop straight into your existing Grafana dashboards.
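If you want to read the counters without a Prometheus server, the text format is easy to parse. This sketch handles plain sample lines only, and the metric names used in the usage note are hypothetical; look at the real /metrics output for the actual names.

```python
import re

# Sample line: metric name, optional {labels}, value, optional integer timestamp.
_SAMPLE = re.compile(r"^([A-Za-z_:][A-Za-z0-9_:]*)(\{.*\})?\s+(\S+)(?:\s+-?\d+)?$")


def parse_metrics(text: str) -> dict:
    """Map 'name{labels}' to a float for each sample line; skip comments and blanks."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # HELP/TYPE comments and blank lines carry no samples
        match = _SAMPLE.match(line)
        if match:
            name, labels, value = match.groups()
            samples[name + (labels or "")] = float(value)
    return samples
```

Given hypothetical counters `fluxgate_cache_hits_total` and `fluxgate_cache_misses_total`, the hit rate is hits divided by hits plus misses.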

⚙️

TOML Config + CLI

All settings in one fluxgate.toml. Generate it with the wizard or --dump-config; override any value from the command line.

Install

Pick your path. The script and wizard handle the CA and config for you; Docker and source builds are one command each.

# Installs to ~/.fluxgate, generates a local CA, trusts it in your OS store
curl -fsSL https://raw.githubusercontent.com/anonymmized/FluxGate/main/install.sh | bash

# then just run it; the wizard does the rest
fluxgate

Trusting the CA system-wide may prompt for sudo. The installer prints the exact command if it can't do it automatically.

# 1. Generate a local CA into ./certs (one time)
docker run --rm -v "$PWD/certs:/certs" --entrypoint /usr/local/bin/fluxgate \
  ghcr.io/anonymmized/fluxgate:latest --generate-ca /certs/ca

# 2. Run the proxy; bind 0.0.0.0 so the published ports are reachable
docker run -d --name fluxgate \
  -v "$PWD/certs:/certs:ro" \
  -p 8080:8080 -p 9090:9090 \
  ghcr.io/anonymmized/fluxgate:latest \
  --listen 0.0.0.0 --admin 0.0.0.0:9090 \
  --mitm --mitm-ca-key /certs/ca.key.pem --mitm-ca-cert /certs/ca.cert.pem

# 3. Test it (no API key needed)
curl -sk -x http://localhost:8080 https://httpbin.org/get

Inside a container the proxy must bind 0.0.0.0; otherwise Docker's published ports can't reach it. docker compose up -d does all of this for you. To drop -k from the test command, mount and trust ./certs/ca.cert.pem in the calling environment so the client can verify the certificates FluxGate presents.

git clone https://github.com/anonymmized/FluxGate.git
cd FluxGate
cmake -S . -B build && cmake --build build --parallel
./build/FluxGate   # launches the setup wizard

Requires a C++20 compiler, CMake ≥ 3.15 and OpenSSL. asio, simdjson and toml++ are fetched automatically. To build with the Redis backend, add -DFLUXGATE_REDIS=ON.

Pricing

FluxGate is free and open-source. Enterprise support is available for teams that need SLAs, custom filters, or managed deployment.

Open Source
Free forever
Everything you need for production deployments.
  • Full MITM proxy pipeline
  • PII redaction + history trimming
  • In-memory & Redis cache
  • Docker + signed binary releases
  • Prometheus metrics + dashboards
  • Community support (GitHub Issues)

Frequently asked questions

 

Is it safe to route my API keys through FluxGate?
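FluxGate runs locally, on your machine or inside your network, so requests go from your infrastructure directly to the provider with no third-party service in between. It never stores API keys, and structured logs contain only metadata (host, method, path, latency), never body content. Cached responses are held in memory, or in your own Redis instance if you enable that backend. See SECURITY.md for the full trust model.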
Does it work with streaming responses (SSE)?
Yes — streaming responses are relayed byte-for-byte without buffering. FluxGate automatically skips caching for SSE streams (Content-Type: text/event-stream).
Which languages / SDKs are supported?
Any HTTP client that supports HTTP CONNECT proxies: Python (openai, anthropic), Node.js, Go, Ruby, Java, curl, and more. See examples/ for ready-to-run code.
How much traffic can it handle?
Async C++20 (Asio) with a per-session strand model. A single instance handles thousands of concurrent connections on commodity hardware; each cache hit adds <1ms of overhead.
Can I run multiple instances / scale horizontally?
Yes — use Redis as the cache backend (cache.backend = "redis") so all instances share one cache. Each instance is otherwise stateless.