Development
This is the orientation page for building, testing, and contributing to probectl
from source. It tells you what toolchain to install, how the repo is laid out as a
Go workspace, which make target does what, and which CI gates your change has to
pass. If you just want to deploy probectl, see install.md
instead.
Toolchain
Install these once. Exact pinned versions for the tools probectl installs for you
(golangci-lint, buf, the protobuf plugins) live at the top of the
Makefile and are fetched by make tools / make proto-tools —
you do not pin them yourself.
| Tool | Version | Used for |
|---|---|---|
| Go | 1.26.4 (exact) | control plane, agents, CLI (the patch is pinned — see build/toolchain.md) |
| Python | 3.11+ (CI runs 3.12) | the BGP analyzer (analyzer/) |
| Docker + Buildx | recent | dev stack + multi-arch images |
| golangci-lint | v2.12.2 | Go linting (installed via make tools) |
| buf | v1.50.0 | protobuf lint + codegen (installed via make proto-tools) |
The Go version is pinned to the exact patch (
1.26.4), not a loose1.26. That is intentional — it keepsgovulncheck's standard-library scan honest and keeps the FIPS build working underGOTOOLCHAIN=local. Details:build/toolchain.md.
Go workspace & modules
The repo is a go.work workspace tying together two modules:
.— the primary modulegithub.com/imfeelingtheagi/probectl(cmd/,internal/,pkg/). Production code and unit tests live here../test— the black-box integration harness. Kept as a separate module on purpose: its heavy, test-only dependencies stay out of the main module'sgo.mod/go.sum, so production builds never pull them. These tests talk to services over the wire, not throughinternal/.
make commands that span modules iterate both (the GO_MODULE_DIRS loop in the
Makefile). Production builds use each module's own go.mod; the workspace is a
local/CI convenience.
Make targets
Run make help for the authoritative, self-documenting list. The ones you'll
reach for most:
| Target | What it does |
|---|---|
make build |
Build all binaries into ./bin (version stamped via -ldflags) |
make build-cross |
Cross-compile every binary for linux amd64 + arm64 (smoke) |
make run |
Run probectl-control locally |
make test |
Unit tests across all workspace modules (-race) |
make test-isolation |
Cross-tenant isolation gate (-tags=isolation) |
make test-integration |
Integration tests (-tags=integration; needs a DB / dev stack) |
make test-python |
pytest for the analyzer (incl. Hypothesis property tests) |
make cover-gate |
Per-package coverage floor on service-free packages (scripts/check_coverage.sh) |
make fuzz-smoke |
Run each Go fuzz target briefly to catch crashers |
make lint |
lint-go + lint-python |
make fmt |
Auto-format Go (gofmt) and Python (ruff --fix, black) |
make proto |
buf lint + generate Go (+ gRPC) from proto/ |
make proto-tools |
Install protobuf codegen tools (buf + Go plugins, pinned) |
make migrate |
Apply DB migrations via probectl-control migrate |
make vuln |
govulncheck over Go dependencies |
make images |
Multi-arch (amd64/arm64) images for every component |
make compose-up / compose-down |
Start / stop the local dev dependency stack |
make tools |
Install pinned dev tools (golangci-lint) |
make ci |
lint + test + test-isolation (the core gates locally) |
make ciruns the core gates fast and locally. It is not the full CI suite — the integration, isolation-against-real-DBs, eBPF-kernel-matrix, coverage, and supply-chain gates run in GitHub Actions (next section). Most per-surface gates have an identically-namedmaketarget you can run yourself:editions-gate,fips-gate,openapi-gate,migration-gate,helm-gate,gitops-gate,terraform-gate,cover-gate,perf-smoke,e2e.
CI jobs (.github/workflows/ci.yml)
The CI job names are a contract — every pull request runs them, and they are
how a change earns its way to main. This is the full list; ci.yml is the
source of truth.
| Job | Gate |
|---|---|
action-pins |
every workflow action is commit-SHA-pinned + the supply-pins gate (no :latest image refs under deploy/, no unpinned installs) |
secret-scan |
gitleaks over the working tree (.gitleaks.toml allow-lists the deliberate redaction-test fixtures) |
no-devauth-in-release |
the dev-auth bypass is physically absent from release binaries — symbol + literal absence, and the binary refuses PROBECTL_AUTH_MODE=dev at boot |
lint-go |
gofmt + go vet + golangci-lint + crypto/editions/SQL/TLS import guards |
lint-python |
ruff check + black --check |
editions-gate |
ee/ import guard + the core-only build/test (-tags probectl_core) |
fips-gate |
FIPS artifact builds + power-on self-test passes |
test-go |
unit tests (-race) + backup/erasure tests + fuzz smoke (parsers must not crash) + bench smoke + cross-compile (incl. the endpoint agent's Linux/macOS/Windows builds) |
rca-eval |
the AI root-cause eval set through the real pipeline; blocking floors: answer accuracy ≥ 0.85, citation precision ≥ 0.85 |
coverage |
per-package coverage floor on service-free packages (make cover-gate) |
test-python |
analyzer pytest (incl. Hypothesis property tests) + hash-lock drift refusal |
browser-worker |
the Playwright synthetic worker: real-browser scripted-login smoke inside the official Playwright image |
openapi-gate |
spec is valid OpenAPI 3.1 and the registered /v1 routes exactly match it (no undocumented routes) |
migration-gate |
expand/contract migrations only — rejects destructive/blocking schema changes |
helm-gate |
chart hardening for every profile + the agent chart (make helm-gate), kubeconform on the rendered charts, GitOps manifest validation (make gitops-gate), compose config validation |
terraform-gate |
terraform fmt -check + terraform validate of the module's example root |
ebpf-kernel-matrix |
the BPF programs load and attach on real LTS kernels (5.15/6.6, amd64 + arm64, plus a lockdown-hardened entry) under QEMU |
ebpf-image-live |
the shipped eBPF-agent image is the live -tags ebpf build, not the fixture replayer |
cross-tenant-isolation |
permanent tenant-isolation gate (a non-negotiable), against real Postgres (TLS) + ClickHouse |
integration |
migrations are idempotent + /readyz passes + result pipeline, against a real Postgres + Prometheus |
perf-smoke |
ingest baseline + pooled multi-tenant query-latency smoke |
backup-drill |
the backup → wipe → restore drill executes against real stores |
failover-drill |
timed kill-the-primary → promote-the-replica drill (RTO/RPO measured) |
load-smoke |
S-tier full-stack load smoke through real Kafka + Prometheus |
proto |
buf lint + buf breaking vs main (additive-only wire contract) + generated-code drift |
web |
typecheck + lint + the a11y / theme-swap / no-hardcoded-token gates + npm audit + production build |
dependency-scan |
govulncheck + Trivy filesystem scan (vulnerabilities only) |
image-scan |
Trivy image scan (vulnerabilities only) |
build-images |
multi-arch image build for every component (Buildx + QEMU) |
commitlint |
Conventional Commits (pull requests only) |
dco |
every commit carries a Signed-off-by trailer (pull requests only) |
sbom |
CycloneDX SBOM of the Go module graph, retained 90 days (informational, not a merge gate) |
verify-all |
the umbrella check: red unless every verification job it depends on concluded green |
Trivy is vulnerability-only here, by design. Secret scanning is the separate
secret-scanjob (gitleaks), which owns the.gitleaks.tomlallow-list for the deliberate fake secrets inside redaction tests. Trivy can't see that allow-list, so pointing it at secrets would re-flag those intentional fixtures — hencescanners: vulnin both Trivy jobs.
Two more workflows run outside the pull-request loop: nightly.yml (the
black-box full-stack e2e via make e2e, ingest benchmarks, and the M-profile
scale-gate regression guard — too slow to run per-PR) and
security-scan.yml (weekly govulncheck / npm audit / Trivy with
retained evidence artifacts, so a newly-disclosed CVE in an unchanged pin
still goes red on its own).
CI itself cannot make merges blocking — that is branch protection, a GitHub
repository setting documented in
ops/branch-protection.md. Independently of it, the
release pipeline refuses to publish any tag whose CI run isn't green (see
releasing.md).
Testing layers
probectl tests in layers, fast-to-slow, each catching a different class of bug.
Local test DSNs (documented dev-only plaintext)
Integration and isolation tests fall back to sslmode=disable connection strings
only when PROBECTL_DATABASE_URL is unset — the local-dev convenience path
against the test/ compose stack. CI never uses those fallbacks: every DB-backed
job starts Postgres with TLS under a per-run test CA and connects
sslmode=verify-full (scripts/ci_pg_tls.sh), the production posture. The shipped
deploy recipes are sslmode=require or stricter.
- Unit (
make test,-race) — hermetic, table-driven; the default fast path. - Integration (
make test-integration,-tags=integration) — against real Kafka (in-process kfake), Postgres, ClickHouse, and Prometheus, plus in-process HTTPS/DNS servers and loopback sockets for the probes. The DNS/HTTP/TLS canary behaviour (success / 5xx / slow / expired-cert / DNSSEC-bogus) lives here. - Fuzz (
make fuzz-smoke) — Go fuzz targets over the untrusted-input parsers (ICMP / Time-Exceeded / MPLS ininternal/path, the BGP-event ingest ininternal/bgp). The invariant is "never panic", and the bridge must additionally never publish a tenant-less event under fuzzing — the fail-closed tenancy rule (see the non-negotiables). CI runs a short smoke; run longer locally with-fuzztime. - Property (Hypothesis, in the analyzer suite) — the MRT parser and RPKI validator are checked over thousands of generated inputs (robustness + round-trip + soundness), the Python counterpart to the Go fuzzers.
- Coverage gate (
make cover-gate→ thecoverageCI job) — a per-package statement-coverage floor on the service-free logic / parser / probe packages (scripts/check_coverage.sh). The stateful DB/transport packages are gated for correctness by theintegrationandcross-tenant-isolationjobs instead — a stronger guarantee than a coverage percentage — so they are not floored here.
Commits
Use Conventional Commits (e.g. feat(canary): add ICMP network test), and
sign off every commit (git commit -s, which adds the Signed-off-by trailer the
dco CI job requires). See ../CONTRIBUTING.md. You can
pre-load the message template with:
git config commit.template .gitmessage