LangChain Integration — Deep Dive
Detailed guide for integrating Faramesh with LangChain, covering interception patterns, policy enforcement, delegation examples, SDK usage, and production hardening.
Overview
This deep-dive describes practical patterns for integrating Faramesh into LangChain-based applications. It focuses on runtime governance: intercepting LLM requests, canonicalizing inputs, applying policy (FPL) checks, managing credentials and delegation, and operational concerns for production deployments.
Goals:
- Show how to intercept LangChain LLM calls in Python and Node to enforce policies.
- Provide canonicalization and policy examples for prompt control, tool use, and data exfiltration prevention.
- Demonstrate delegation grants for limited agent capabilities.
- Explain monitoring, metrics, and production hardening recommendations.
Audience: engineers building LLM-powered services that require governance, auditability, or credential brokering.
Architecture Summary
High-level flow when integrating Faramesh with LangChain:
- Application code calls a LangChain LLM wrapper (e.g., `OpenAI`, `ChatOpenAI`).
- A Faramesh interception layer (middleware, proxy, or sidecar) receives a canonical request:
  - `agent`: calling service identity
  - `input`: canonicalized prompt or messages
  - `tools`: declared tool invocations (if any)
  - `secrets`: references to credentials (never raw secrets)
- Faramesh canonicalizes inputs (deterministic, normalized whitespace, placeholders) and computes a request fingerprint.
- The policy engine evaluates FPL rules for allow/deny/modify.
- If allowed, Faramesh may mutate the request (insert redactions, add metadata), supply ephemeral credentials via credential-sequestration, or deny the request with an audit record.
- The LLM call proceeds using the provided (or original) network call method. All decisions are recorded to the audit ledger with a signed DPR entry.
Integration options:
- In-process middleware (Python/Node): wrap the LangChain LLM client and call Faramesh SDK to evaluate policies synchronously.
- Local sidecar (HTTP/UNIX socket): application sends canonical request to local Faramesh daemon which returns a decision.
- Network proxy / gateway (MCP interception): especially for centralized fleets and multi-service topologies.
Trade-offs:
- In-process: lowest latency, but requires embedding governance libraries into the application.
- Sidecar/proxy: language-agnostic, centralizes policy, easier to update policies without deploying apps.
- Gateway: powerful for multi-tenant, but introduces a network hop and requires robust auth between services.
Recommended Integration Patterns
1) In-process Python middleware (preferred for small teams)
- Wrap LangChain `LLM` classes via subclassing or a custom `LLM` wrapper.
- Use the Faramesh Python SDK to call `evaluate_request()`, which returns an allow/deny decision and optional `mutations` or `credential` material.
- Keep canonicalization deterministic: use the SDK helper `canonicalize_text()`.
Example wrapper (conceptual):
```python
from langchain.llms import OpenAI
from faramesh import FarameshClient

class FarameshGuardedOpenAI(OpenAI):
    def __init__(self, *args, faramesh_client: FarameshClient, **kwargs):
        super().__init__(*args, **kwargs)
        self.faramesh = faramesh_client

    def _call(self, prompt, stop=None):
        req = {
            "agent": self.faramesh.agent_id,
            "input": self.faramesh.canonicalize_text(prompt),
            "meta": {"client": "langchain", "model": self.model_name},
        }
        decision = self.faramesh.evaluate_request(req)
        if not decision.allow:
            raise Exception(f"Request denied: {decision.reason}")
        # Apply safe mutations if present
        prompt = decision.mutate_text(prompt) if decision.mutations else prompt
        # If a credential was provided, inject it via SDK-managed transport
        with self.faramesh.credential_scope(decision.credential):
            return super()._call(prompt, stop=stop)
```

Notes:
- Use context managers for ephemeral credentials so secrets are not logged or leaked.
- Keep the SDK call timeout bounded (e.g., 200–500ms) to avoid request stalls.
2) Local sidecar (Unix socket / HTTP)
- Start a Faramesh local daemon configured with `--agent-id=...`.
- The app canonicalizes the request and POSTs it to `/v1/inspect` or a similar endpoint.
- The daemon returns a decision and optional mutations.
Advantages:
- Language-agnostic: any LangChain runtime (Python, Node) can call the socket.
- Centralized auditing per-host.
Caveats:
- Ensure robust authentication between app and sidecar (local mTLS or token).
- Preserve privacy: do not send raw user PII outside the isolation boundary unless necessary.
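A minimal sidecar client, under the assumptions above, might look like the following sketch. The port (8787), bearer token, `/v1/inspect` path, and response shape are illustrative assumptions, not a documented API:

```python
import json
import urllib.request

SIDECAR_URL = "http://127.0.0.1:8787"  # hypothetical local sidecar address

def build_inspect_request(agent: str, canonical_input: str) -> urllib.request.Request:
    """Build the POST to the sidecar's (assumed) /v1/inspect endpoint."""
    body = json.dumps({"agent": agent, "input": canonical_input}).encode()
    return urllib.request.Request(
        f"{SIDECAR_URL}/v1/inspect",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer local-dev-token",  # use mTLS or a real token in prod
        },
        method="POST",
    )

def evaluate_via_sidecar(agent: str, canonical_input: str, timeout: float = 0.5) -> dict:
    """Send the request with a bounded timeout and decode the decision."""
    req = build_inspect_request(agent, canonical_input)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())  # e.g. {"allow": true, "reason": null}
```

The bounded `timeout` matters: a stalled sidecar must not stall application traffic (see the fail-safe guidance under Production Hardening).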
3) Network gateway / MCP interception
- Useful for multi-service platforms where a single ingress can enforce policies.
- Configure gateway to intercept outgoing LLM calls, canonicalize, and call the policy engine.
Best practices:
- Use `bearer_or_mtls` edge auth modes to allow flexible deployment topologies.
- Implement retries and timeouts so gateway failures fail safe (deny, or shadow with an alert).
Canonicalization Guidance for LangChain
Canonicalization is critical to produce stable policy evaluations and sensible audit entries.
Rules:
- Normalize Unicode to NFC.
- Strip trailing/leading whitespace and collapse multi-space sequences to single spaces except within code blocks.
- Remove variable session metadata (timestamps, ephemeral IDs) or replace with placeholder tokens.
- For chat APIs, canonicalize the `role`/`content` structure in a stable order.
- For tool calls, represent tools as structured elements: `tool:{name} args:{...}` rather than free text.
Example canonicalization for chat messages:
Input (user):

```text
User: Can you access https://internal.example.com/data and summarize?
```

Canonicalized form (for policy):

```text
[USER_MESSAGE] Can you access <URL_REDACTED> and summarize?
```

Policy authors should write rules against canonical shapes rather than raw text where possible.
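For illustration only (the SDK helper `canonicalize_text()` should be preferred in practice), a toy canonicalizer implementing the rules above might look like:

```python
import re
import unicodedata

URL_RE = re.compile(r"https?://\S+")

def canonicalize_message(role: str, text: str) -> str:
    """Toy canonicalizer: NFC-normalize, redact URLs, collapse whitespace,
    and emit a stable [ROLE_MESSAGE] shape for policy evaluation."""
    text = unicodedata.normalize("NFC", text)
    text = URL_RE.sub("<URL_REDACTED>", text)
    # Note: this also collapses whitespace inside code blocks; a production
    # canonicalizer should skip fenced code, per the rules above.
    text = re.sub(r"\s+", " ", text).strip()
    return f"[{role.upper()}_MESSAGE] {text}"
```

Determinism is the key property: the same logical input must always produce the same canonical form, or fingerprints and cached decisions become unstable.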
Example FPL Policies for LangChain Scenarios
- Deny outbound URL fetches from non-privileged agents:

```
agent "untrusted-webhook" {
  when true {
    deny if input.match("<URL_REDACTED>")
  }
}
```

- Allow tool use only when a standing grant exists:

```
agent "qa-service" {
  when true {
    allow if has_grant("tooling:fetch_internal_data")
    else deny with message("Tooling grant required")
  }
}
```

- Redact secrets or credential patterns before sending to the model:

```
agent "*" {
  when true {
    mutate input = redact_secrets(input)
  }
}
```

- Shadow mode for canary rollout:

```
agent "experimental-model*" {
  when true {
    shadow if percent(5)
  }
}
```

Shadow entries are recorded in the audit ledger but not enforced; use them to collect false positives before enabling enforcement.
Delegation Example (Ephemeral Tool Grant)
Use delegation grants to give a LangChain agent short-lived permission to call internal tools.
Flow:
- The operator creates a standing grant: `delegate grant create --to qa-service --scope tooling:fetch_internal_data --ttl 1h`.
- Faramesh stores the grant in `delegate_grants` and returns a token `del_<b64>.<hmac>`.
- The LangChain runtime requests the token via an admin flow or via an OIDC flow delegated to the agent.
- For each request, the app attaches the grant token and Faramesh validates it server-side.
In code (pseudo):
```python
# Requesting ephemeral usage
decision = faramesh.evaluate_request({
    "agent": "qa-service",
    "input": canonical_prompt,
    "delegation_token": "del_...",
})
if not decision.allow:
    raise Exception("Denied")
# else proceed
```

Best practice: keep grants minimal in scope and lifespan.
SDK Usage Patterns (Python)
- Use `FarameshClient.evaluate_request()` for synchronous checks.
- Use the `FarameshClient.credential_scope(...)` context manager to obtain ephemeral credentials with automatic revocation and expiry handling.
- Use non-blocking health checks and local caches for policy data to reduce evaluation latency.
Example:
```python
from faramesh import FarameshClient

client = FarameshClient(socket_path="/run/faramesh.sock")

with client.credential_scope("vault:read:secrets/db") as creds:
    result = llm.call(prompt, api_key=creds.api_key)
```

Caching:
- Cache allow decisions for idempotent requests where safe (e.g., model temperature=0 deterministic tasks).
- TTL should be conservative (seconds to minutes) depending on risk appetite.
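The caching guidance above can be sketched as a small TTL cache keyed by a deterministic request fingerprint. This is an illustration, not an SDK feature; note the conservative TTL and lazy expiry:

```python
import hashlib
import time

class DecisionCache:
    """Small TTL cache for allow decisions, keyed by request fingerprint.
    Only safe for idempotent, low-risk requests; keep TTLs short."""

    def __init__(self, ttl_seconds: float = 30.0):
        self.ttl = ttl_seconds
        self._entries: dict[str, tuple[float, dict]] = {}

    @staticmethod
    def fingerprint(agent: str, canonical_input: str) -> str:
        # Deterministic fingerprint over the canonical form
        return hashlib.sha256(f"{agent}\n{canonical_input}".encode()).hexdigest()

    def get(self, key: str):
        entry = self._entries.get(key)
        if entry and time.monotonic() - entry[0] < self.ttl:
            return entry[1]
        self._entries.pop(key, None)  # drop expired entries lazily
        return None

    def put(self, key: str, decision: dict) -> None:
        self._entries[key] = (time.monotonic(), decision)
```

Cache only allow decisions for deterministic, low-risk tasks; deny decisions should always re-evaluate so policy fixes take effect immediately.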
LangChain-Specific Hooks and Tooling
LangChain offers multiple extension points where Faramesh can be attached:
- `LLM` wrappers (synchronous calls)
- `Agent` tool handlers (before/after tool invocation)
- `Chains` (pre/post-process chains)
Key places to enforce policy:
- Pre-LLM call: deny or mutate prompts
- Pre-tool call: check tool-specific grants
- Post-LLM response: redact or detect exfiltration (e.g., secret patterns) and record an audit event
Example: a tool wrapper that checks rights before allowing `requests.get` calls from tools.
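A sketch of such a wrapper follows. The `evaluate_request` payload and the decision fields (`allow`, `reason`) follow the SDK surface described earlier, but the exact interface is an assumption:

```python
# Illustrative tool guard: run a policy check before the wrapped tool executes.
def guarded_tool(faramesh, agent_id: str, tool_name: str, tool_fn):
    """Wrap a LangChain tool callable so every invocation is policy-checked."""
    def wrapper(tool_input: str):
        decision = faramesh.evaluate_request({
            "agent": agent_id,
            "input": faramesh.canonicalize_text(tool_input),
            "meta": {"tool": tool_name},
        })
        if not decision.allow:
            raise PermissionError(f"Tool '{tool_name}' denied: {decision.reason}")
        return tool_fn(tool_input)
    return wrapper
```

The same shape works for pre-tool grant checks: have the policy side require `has_grant("tooling:...")` for the relevant agent, as in the FPL examples above.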
Observability and Metrics
Expose the following metrics from the Faramesh sidecar/SDK for scraping by Prometheus:
- `faramesh_requests_total{decision=allow|deny|shadow}`
- `faramesh_policy_eval_duration_seconds{policy_pack=...}` (histogram)
- `faramesh_credential_issue_total{type=vault|aws|gcp}`
- `faramesh_audit_dpr_signed_total`
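To make the exposition format concrete, here is a stdlib-only sketch that renders counters like those above in Prometheus text format. In production, use the official `prometheus_client` library instead:

```python
# Minimal counter registry rendered in Prometheus text exposition format.
from collections import Counter

_counters: Counter = Counter()

def inc(metric: str, **labels) -> None:
    """Increment one labelled counter series."""
    label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
    _counters[f"{metric}{{{label_str}}}"] += 1

def render() -> str:
    """Render all counter series, one 'name{labels} value' line each."""
    return "\n".join(f"{series} {value}" for series, value in sorted(_counters.items()))
```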
Tracing:
- Instrument the `evaluate_request` path with OpenTelemetry spans.
- Tag spans with `agent_id`, `policy_result`, and `request_fingerprint`.
Logging:
- Only log request metadata and canonical forms (never raw secrets).
- Keep redaction deterministic so logs are consistent and searchable.
Production Hardening
- Fail-closed vs fail-open: choose deny-by-default for high-risk flows. For low-risk exploratory flows, start in shadow mode, then opt into enforcement.
- Timeouts: SDK calls to Faramesh should have conservative timeouts. If the interceptor times out, default to either deny or shadow per policy.
- Credential brokering: never return raw long-lived secrets. Use ephemeral credentials (Vault AppRole short TTL or STS AssumeRole with short session).
- Rate-limiting: apply per-agent quotas for LLM calls to prevent runaway cost or abuse.
- Canary rollout: enable rules in shadow mode for a small percentage and gradually increase coverage.
- Key rotation: rotate DPR signing keys and HMAC keys and rehearse audit verification.
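The timeout guidance above can be sketched as a wrapper that bounds the policy call and applies a per-policy default on timeout. The client interface and decision dictionary shape are assumptions:

```python
# Fail-safe evaluation: bound the policy call and map timeouts to a default
# ("deny" for high-risk flows, "shadow" for low-risk ones).
from concurrent.futures import ThreadPoolExecutor, TimeoutError as EvalTimeout

_pool = ThreadPoolExecutor(max_workers=4)

def evaluate_with_fallback(client, request: dict, timeout_s: float = 0.5,
                           on_timeout: str = "deny") -> dict:
    """Evaluate a request, defaulting to deny (or shadow) if Faramesh stalls."""
    future = _pool.submit(client.evaluate_request, request)
    try:
        return future.result(timeout=timeout_s)
    except EvalTimeout:
        future.cancel()
        if on_timeout == "shadow":
            # Record-only: allow the call but flag it for audit review
            return {"allow": True, "shadow": True, "reason": "policy-eval-timeout"}
        return {"allow": False, "reason": "policy-eval-timeout"}
```

Which default applies on timeout should itself be policy-driven, so operators can tighten it per agent without redeploying the application.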
Debugging Tips
- If a request is unexpectedly denied, reproduce it by sending the canonicalized request to the Faramesh `inspect` endpoint.
- Use shadow mode to collect examples without breaking traffic.
- Collect the `request_fingerprint` from logs/audit to correlate model outputs with policy decisions.
- Verify delegation tokens in the `delegate_grants` table for expiry and scope.
Example End-to-End (Python)
- The app starts the Faramesh sidecar with agent identity `qa-service`.
- The app configures LangChain to use the `FarameshGuardedOpenAI` wrapper.
- The app calls the chain; the wrapper sends the canonical request to the sidecar.
- Sidecar evaluates policy, returns allow and ephemeral Vault credential.
- Wrapper injects the credential into the tool client and performs the LLM call.
- Sidecar emits audit DPR record with reference to the request fingerprint.
FAQ
Q: Should I send raw user data to Faramesh for evaluation?
A: Prefer canonicalized and redacted forms. If raw PII must be evaluated, ensure minimal retention and consent.
Q: Does this add latency?
A: Yes. Typical in-process SDK calls add ~10–50 ms; a sidecar/proxy adds an extra network hop. Use local caches for low-risk decisions to offset this.
Q: How to test policy changes safely?
A: Use shadow mode, run unit tests with canonicalized fixtures, and use synthetic traffic to validate false-positive rates.
Checklist Before Enabling Enforcement
- Run policies in shadow mode for 1–2 weeks.
- Monitor `deny`/`shadow` rates and review false positives.
- Ensure credential brokering is audited and rotated.
- Add Prometheus metrics and dashboards for `faramesh_policy_eval_duration_seconds` and `faramesh_requests_total`.
- Confirm that DPR audit verification passes with the current key set.
References
- See `faramesh-core/docs/fpl/LANGUAGE_REFERENCE.md` for FPL details.
- See `faramesh-core/docs/guides/DELEGATION_GRANTS.md` for grant flows.
- SDK reference: `faramesh-python-sdk` and `faramesh-node-sdk`.