
faramesh test

Replay policy fixtures against the compiled engine. Assert decisions for known tool calls in CI and locally.

faramesh test runs policy fixtures (recorded or hand-written tool calls with expected decisions) against the compiled policy. It's how you write unit tests for governance.fms and how you make sure a policy refactor doesn't silently change a decision.

Use it locally while iterating, and in CI on every PR.

Usage

Terminal
faramesh test [--dir DIR] [--fixture FILE] [--fail-fast] [--update]
| Flag | Description |
| --- | --- |
| --dir DIR | Stack directory. Defaults to the current directory. |
| --fixture FILE | Run a single fixture file. Without it, faramesh test globs tests/*.fmtest.json under the stack. |
| --fail-fast | Stop on the first mismatch. |
| --update | Update fixtures' expected fields to match actual decisions. Use carefully: it rewrites your tests. |

Fixture format

A fixture is a JSON file describing one or more tool calls and the expected decision for each:

tests/refunds.fmtest.json
{
  "fixture":  "refunds",
  "agent_id": "support-bot",
  "cases": [
    {
      "name":    "small refund permits",
      "tool":    "stripe/refund",
      "args":    { "amount": 80, "card_number": "4242…" },
      "expect":  { "effect": "PERMIT", "rule_ref": "governance.fms:14" }
    },
    {
      "name":    "large refund defers",
      "tool":    "stripe/refund",
      "args":    { "amount": 8000 },
      "expect":  { "effect": "DEFER" }
    },
    {
      "name":    "payouts always deny",
      "tool":    "stripe/payouts",
      "args":    { "amount": 100 },
      "expect":  { "effect": "DENY", "code": "POLICY_DENY" }
    }
  ]
}

Fields:

| Field | Description |
| --- | --- |
| agent_id | Matches an agent "<id>" { … } in governance.fms. |
| cases[].tool | Tool name as the SDK would send. |
| cases[].args | Action payload. Conditions evaluate against this. |
| cases[].expect.effect | PERMIT, DEFER, or DENY. |
| cases[].expect.code | Optional denial code (POLICY_DENY, RATE_EXCEEDED, …). |
| cases[].expect.rule_ref | Optional rule reference (file:line, as in governance.fms:14). Helps catch silent rule reordering. |
| cases[].principal | Override principal.* claims for this case. |
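
As a rough illustration of the fields above, the following Python sketch assembles a fixture document programmatically and prints it as JSON. The helper names make_case and make_fixture are hypothetical and not part of faramesh; the sketch only assumes the field layout shown in tests/refunds.fmtest.json.

```python
import json

# Effects listed in the fields table above.
VALID_EFFECTS = {"PERMIT", "DEFER", "DENY"}


def make_case(name, tool, args, effect, code=None, rule_ref=None):
    """Build one entry for the "cases" array (hypothetical helper)."""
    if effect not in VALID_EFFECTS:
        raise ValueError(f"unknown effect: {effect!r}")
    expect = {"effect": effect}
    # code and rule_ref are optional, so only emit them when given.
    if code is not None:
        expect["code"] = code
    if rule_ref is not None:
        expect["rule_ref"] = rule_ref
    return {"name": name, "tool": tool, "args": args, "expect": expect}


def make_fixture(fixture, agent_id, cases):
    """Wrap cases in the top-level fixture object (hypothetical helper)."""
    return {"fixture": fixture, "agent_id": agent_id, "cases": list(cases)}


refunds = make_fixture(
    "refunds",
    "support-bot",
    [
        make_case("large refund defers", "stripe/refund", {"amount": 8000}, "DEFER"),
        make_case("payouts always deny", "stripe/payouts", {"amount": 100},
                  "DENY", code="POLICY_DENY"),
    ],
)

# Write this output to tests/refunds.fmtest.json to use it as a fixture.
print(json.dumps(refunds, indent=2))
```

Generating fixtures this way can help when many cases differ only in one argument, though hand-written JSON remains the format faramesh test reads.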

Output

$ faramesh test
Compiling governance.fms ... ok
Running 3 fixtures, 12 cases ...

  refunds
    ✓ small refund permits
    ✓ large refund defers
    ✗ payouts always deny
        expected: DENY (POLICY_DENY)
        got:      PERMIT (matched rule governance.fms:11)
        diff:     rule re-ordering allowed payouts unexpectedly

  budgets
    ✓ daily under cap
    ✓ daily at warning threshold

  egress
    ✓ allowed host
    ✓ denied host

11 passed, 1 failed

A failed test prints the expected and actual decisions plus a one-line explanation when one is obvious.

Exit codes

| Code | Meaning |
| --- | --- |
| 0 | All cases pass. |
| 1 | At least one case failed. |
| 2 | Compile failed (same family as check). |
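
A wrapper script can branch on these exit codes. The sketch below is one possible approach in Python, assuming faramesh is on PATH; the function names describe_exit and run_policy_tests are hypothetical, and only the --dir flag and the three exit codes documented above are relied on.

```python
import subprocess

# Exit codes as documented in the table above.
EXIT_MEANINGS = {
    0: "all cases pass",
    1: "at least one case failed",
    2: "compile failed",
}


def describe_exit(code):
    """Translate a faramesh test exit code into a short message."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_policy_tests(stack_dir="."):
    """Run faramesh test for a stack directory and report the outcome.

    Assumes the faramesh binary is installed and on PATH.
    """
    result = subprocess.run(["faramesh", "test", "--dir", stack_dir])
    return result.returncode, describe_exit(result.returncode)
```

Calling run_policy_tests(".") returns the raw code alongside its description, so a caller can distinguish a failing case (1) from a policy that did not compile (2).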

CI integration

.github/workflows/policy.yml
- run: faramesh check --strict
- run: faramesh test
- run: faramesh plan --format json > plan.json

faramesh test runs against the compiled AST in your repo. It doesn't need a running daemon and doesn't write to the WAL.

Recording fixtures from real traffic

For complex stacks, hand-writing fixtures is tedious. You can record them from a running daemon instead:

Terminal
faramesh test record --since 1h --out tests/recorded.fmtest.json

This pulls every decision from the WAL in the time window, normalizes the args, and writes a fixture that uses the actual decisions as the expected values. Commit the file; future faramesh test runs will fail if any of those decisions change.
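
Before committing a recorded fixture, it can be worth skimming what it asserts. The Python sketch below counts cases per tool and expected effect. It assumes a recorded file follows the same layout described under Fixture format, which the source does not state explicitly; summarize and summarize_file are hypothetical helper names.

```python
import json
from collections import Counter


def summarize(fixture):
    """Count cases per (tool, expected effect) pair in a fixture dict."""
    return Counter(
        (case["tool"], case["expect"]["effect"]) for case in fixture["cases"]
    )


def summarize_file(path):
    """Load a fixture file (assumed to be in the documented format) and summarize it."""
    with open(path, encoding="utf-8") as fh:
        return summarize(json.load(fh))


# In-memory sample shaped like the refunds fixture above.
sample = {
    "fixture": "recorded",
    "agent_id": "support-bot",
    "cases": [
        {"name": "a", "tool": "stripe/refund", "args": {"amount": 80},
         "expect": {"effect": "PERMIT"}},
        {"name": "b", "tool": "stripe/refund", "args": {"amount": 60},
         "expect": {"effect": "PERMIT"}},
        {"name": "c", "tool": "stripe/payouts", "args": {"amount": 100},
         "expect": {"effect": "DENY"}},
    ],
}

for (tool, effect), count in sorted(summarize(sample).items()):
    print(f"{tool:16} {effect:7} {count}")
```

An unexpected pair in the summary, such as a PERMIT for a tool that should always be denied, suggests the recording captured a decision you would not want to lock in as expected.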

Updating fixtures after a policy change

If you intentionally changed a rule and need to update the expected decisions:

Terminal
faramesh test --update

Diffs are printed before the rewrite so you can review them. Treat --update like committing test snapshots: review the diff before pushing.
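
For a second look beyond the printed diff, the two versions of a fixture can be compared directly. The Python sketch below lists cases whose expected effect changed. It assumes case names are unique within a fixture and unchanged by --update; changed_expectations is a hypothetical helper, not a faramesh feature.

```python
def expected_effects(fixture):
    """Map each case name to its expected effect."""
    return {case["name"]: case["expect"]["effect"] for case in fixture["cases"]}


def changed_expectations(before, after):
    """Return (name, old_effect, new_effect) for cases present in both versions."""
    old, new = expected_effects(before), expected_effects(after)
    return [
        (name, old[name], new[name])
        for name in sorted(old.keys() & new.keys())
        if old[name] != new[name]
    ]


# Sample data: the same fixture before and after an update.
before = {"cases": [
    {"name": "small refund permits", "expect": {"effect": "PERMIT"}},
    {"name": "large refund defers", "expect": {"effect": "DEFER"}},
]}
after = {"cases": [
    {"name": "small refund permits", "expect": {"effect": "PERMIT"}},
    {"name": "large refund defers", "expect": {"effect": "DENY"}},
]}

for name, old_effect, new_effect in changed_expectations(before, after):
    print(f"{name}: {old_effect} -> {new_effect}")
```

In practice the before and after dicts would be loaded with json.load from the committed and the rewritten fixture files.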

Difference from faramesh plan

| Command | What it does |
| --- | --- |
| faramesh test | Asserts known cases produce known decisions. Fails the build on regression. |
| faramesh plan | Reports diffs and replay deltas. Does not fail on legitimate changes. |

Use both: test for the cases you care about, plan for the rest of history.
