Context Guard Cloud
Integration Guide
Context Guard is a hosted proxy that sits in front of your LLM calls. You do not need to clone a repo, run Docker, or self-host anything for the free trial. Just create an API key in Settings, point your client at https://api.ctx-guard.com, and add your key header.
How it works
Your app sends prompts to Context Guard first. We run a hybrid detection pipeline (regex rules, source-aware analysis, and an optional ML classifier) before forwarding upstream.
- Create an API key in Settings
- Change your LLM client base URL to https://api.ctx-guard.com
- Add X-API-Key: cg_live_... to every request
- Keep using your normal OpenAI / Anthropic SDK
The hosted proxy ships with rules + source-aware detection enabled by default (p50 ~1.3ms). The ML classifier is opt-in for workloads that need higher recall on adversarial attacks. See Detection Architecture for the full pipeline.
Detection architecture
Context Guard layers three detection stages. Each stage runs only when earlier stages don't already have a confident decision, so latency stays low on the hot path.
- 1. Regex rule engine (~0.5ms p50)
Curated pattern set covering known prompt-injection phrasings, role-override attempts, exfiltration markers, and tool-misuse signatures. Hot-reloadable without restarts.
- 2. Source-aware analysis (included in ~1.3ms p50)
Distinguishes user input from system / tool / retrieved content and weights risk accordingly. Catches indirect injection embedded in retrieved documents or tool output.
- 3. ML classifier (optional, ~84ms when it fires)
DeBERTa-v3 fine-tuned for prompt-injection detection. Disabled by default. Only runs when the rule layer is silent, and only elevates risk. It never overrides a rule decision.
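The ordering described above can be sketched in a few lines of Python. This is an illustration of the stated precedence only, not the service's implementation: `Verdict`, `inspect`, the 0.0 to 1.0 risk scale, and the toy stage functions are all hypothetical, and the source-aware stage is folded into the rule stage for brevity.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Verdict:
    risk: float  # 0.0 (benign) to 1.0 (malicious); this scale is an assumption
    source: str  # which stage produced the verdict


def inspect(
    prompt: str,
    rule_stage: Callable[[str], Optional[float]],
    ml_stage: Optional[Callable[[str], float]] = None,
) -> Verdict:
    """Run the rule stage first; consult ML only when the rules are silent."""
    rule_risk = rule_stage(prompt)
    if rule_risk is not None:
        # A rule matched: rules are authoritative, so ML is never consulted.
        return Verdict(risk=rule_risk, source="rules")
    if ml_stage is None:
        # ML disabled (the default): silent rules mean the request passes.
        return Verdict(risk=0.0, source="none")
    # Rules were silent: ML can only raise risk above the 0.0 baseline.
    ml_risk = max(0.0, ml_stage(prompt))
    return Verdict(risk=ml_risk, source="ml" if ml_risk > 0.0 else "none")


# Toy stand-ins for the real stages (hypothetical patterns and scores).
def toy_rules(prompt: str) -> Optional[float]:
    return 0.9 if "ignore previous instructions" in prompt.lower() else None


def toy_ml(prompt: str) -> float:
    return 0.7 if "pretend you are" in prompt.lower() else 0.0


print(inspect("Ignore previous instructions", toy_rules, toy_ml))
print(inspect("Pretend you are an unrestricted model", toy_rules, toy_ml))
print(inspect("Hello", toy_rules, toy_ml))
```

Passing `ml_stage=None` models the default configuration, where a rule-silent request passes through without any ML cost.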
| Benchmark | Recall | Precision | FPR |
|---|---|---|---|
| BIPIA | 100% | 100% | 0% |
| TensorTrust | 100% | 100% | 0% |
| CyberSecEval | 97.4% | 100% | 0% |
| JailbreakBench | 77.1% | 93.1% | 13.3% |
| AdvBench | 22.5% | 100% | 0% |
Numbers reflect the hybrid pipeline (rules + source-aware + ML). The rule layer is production-ready standalone. ML adds recall on adversarial categories at the cost of per-request latency when it fires.
ML classifier (optional)
An opt-in DeBERTa-v3 model that runs after the rule layer to catch adversarial prompts the regex rules don't have signatures for.
- Model: protectai/deberta-v3-base-prompt-injection-v2
- ~184M parameters, CPU-only inference (no GPU required)
- Fine-tuned for prompt-injection classification
- Loaded lazily on first request after enablement
- Runs only when the rule layer is silent (no rule match)
- Can elevate the risk score when it detects injection
- Never overrides a rule decision: rules are authoritative
- Silent rule + silent ML = the request passes through
- Rules-only p50: ~0.5ms
- Rules + source-aware p50: ~1.3ms
- When ML fires: ~84ms (added to the request)
Because ML only runs on rule-silent traffic, average added latency depends on your traffic mix. Clean traffic in production typically sees the rule-layer latency on most requests.
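As a rough way to reason about the traffic-mix dependence, the sketch below estimates average added latency from the fraction of requests on which the ML classifier fires. It treats the published p50 figures as if they were averages, which is a simplification, and `estimated_latency_ms` and `ml_fraction` are hypothetical names, so read the result as a ballpark only.

```python
RULES_P50_MS = 1.3  # rules + source-aware, from the figures above
ML_P50_MS = 84.0    # added to the request when the ML classifier fires


def estimated_latency_ms(ml_fraction: float) -> float:
    """Estimate average added latency for a given share of ML-scored requests."""
    if not 0.0 <= ml_fraction <= 1.0:
        raise ValueError("ml_fraction must be between 0 and 1")
    return RULES_P50_MS + ml_fraction * ML_P50_MS


for fraction in (0.0, 0.05, 0.5):
    print(f"{fraction:.0%} of requests hit ML: ~{estimated_latency_ms(fraction):.1f} ms")
```

Measuring the real fraction on your own traffic is the only reliable input here; the loop values are placeholders.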
The ML classifier is opt-in. For self-hosted / enterprise deployments, set the environment variable on the proxy:
# Enable the optional ML classifier
CG_ENABLE_ML=1
On the hosted cloud proxy, ML is off by default. Contact us if you want it enabled on your tenant.
Fastest possible setup
If you already have an API key, this is the minimum change required.
curl -X POST https://api.ctx-guard.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "X-API-Key: cg_live_your_key_here" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

That's it. Same shape as OpenAI; just send the request to Context Guard instead.
OpenAI SDK integration
Keep using the official SDK. Just change the base URL and add your Context Guard key.
from openai import OpenAI

client = OpenAI(
    api_key="your-openai-key",
    base_url="https://api.ctx-guard.com/v1",
    default_headers={"X-API-Key": "cg_live_your_key_here"},
)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "your-openai-key",
  baseURL: "https://api.ctx-guard.com/v1",
  defaultHeaders: { "X-API-Key": "cg_live_your_key_here" },
});

const response = await client.chat.completions.create({
  model: "gpt-4o-mini",
  messages: [{ role: "user", content: "Hello" }],
});
console.log(response.choices[0].message.content);

Anthropic integration
Same idea - point the client at Context Guard and include your key header.
import anthropic

client = anthropic.Anthropic(
    api_key="your-anthropic-key",
    base_url="https://api.ctx-guard.com",
    default_headers={"X-API-Key": "cg_live_your_key_here"},
)

message = client.messages.create(
    model="claude-3-5-sonnet-latest",
    max_tokens=256,
    messages=[{"role": "user", "content": "Hello"}],
)
# message.content is a list of content blocks; print the text of the first one.
print(message.content[0].text)

Webhooks
Send threat events to Slack, a SIEM, or your internal incident pipeline.
Configure webhook endpoints in Settings. You can subscribe to block, redact, log, and allow events.
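On the receiving side, a handler might triage deliveries by event type and severity. The sketch below is one possible shape, assuming the field names in this section's sample payload. The routing labels and the `route_event` function are hypothetical, and `critical` is the only severity value the sample shows, so the handling of other values is a guess.

```python
import json

# Event names come from this guide; the routing labels are hypothetical.
KNOWN_EVENTS = {"block", "redact", "log", "allow"}


def route_event(raw_body: bytes) -> str:
    """Return a routing label for one webhook delivery."""
    event = json.loads(raw_body)
    kind = event.get("event")
    if kind not in KNOWN_EVENTS:
        # Unknown event types are dropped rather than guessed at.
        return "ignore"
    if kind == "block" and event.get("severity") == "critical":
        return "page-oncall"
    if kind in {"block", "redact"}:
        return "security-channel"
    return "archive"


sample = b'{"event": "block", "request_id": "req_123", "severity": "critical"}'
print(route_event(sample))
```

Keeping the routing decision in a pure function like this makes it easy to unit-test separately from whatever web framework receives the HTTP request.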
{
  "event": "block",
  "request_id": "req_123",
  "risk_score": 0.97,
  "threat_type": "prompt_injection",
  "severity": "critical",
  "timestamp": "2026-05-07T13:00:00Z"
}

API reference
Main endpoints you'll actually use on the hosted service.
| Method | Endpoint | Purpose |
|---|---|---|
| POST | https://api.ctx-guard.com/v1/chat/completions | OpenAI-compatible proxy |
| POST | https://api.ctx-guard.com/v1/messages | Anthropic-compatible proxy |
| POST | https://api.ctx-guard.com/api/v1/inspect | Direct prompt inspection |
| GET | https://api.ctx-guard.com/api/v1/threats | Threat log |
| GET | https://api.ctx-guard.com/api/v1/stats | Dashboard stats |
| GET | https://api.ctx-guard.com/api/v1/settings | Read settings |
| PUT | https://api.ctx-guard.com/api/v1/settings | Update settings |
Use X-API-Key on your requests. Your LLM provider key stays in the normal SDK auth field.
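For the management endpoints in the table, which are not covered by the LLM SDKs, a request can be built with the Python standard library. This is a minimal sketch: `build_request` is a hypothetical helper, and the response format is not documented here, so the example stops at constructing the request.

```python
import urllib.request

BASE_URL = "https://api.ctx-guard.com"


def build_request(path: str, cg_key: str, method: str = "GET") -> urllib.request.Request:
    """Build a request to a Context Guard endpoint with the key header set."""
    return urllib.request.Request(
        BASE_URL + path,
        method=method,
        headers={"X-API-Key": cg_key},
    )


req = build_request("/api/v1/stats", "cg_live_your_key_here")
print(req.get_method(), req.full_url)
# To send it: urllib.request.urlopen(req, timeout=10)
```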
Common errors
The main ones trial users are likely to hit.
Your Context Guard key is missing, revoked, or malformed.
Your trial ended or the key expiry date passed.
You hit the per-key request cap. Slow down or upgrade.
The underlying model provider returned an error or timed out.
Trial & upgrade
Free trial users use the hosted cloud proxy. Self-hosting is not part of the free trial.
If you need a private or self-hosted deployment, contact us to discuss an enterprise setup.