Threat-intel enrichment
What this is
Think of probectl as a security guard who already watches everyone walking through the building. Threat-intel enrichment hands that guard a wanted list — public feeds of known-bad IP addresses, hostnames, malicious server certificates, and malicious TLS client fingerprints (JA3). When something probectl already saw on the network matches an entry on that list, probectl raises a flag.
Concretely: probectl matches observed network activity — peer IPs, target hostnames, server certificates, and client JA3 fingerprints — against public threat-intel feeds, and surfaces each match as a confidence-scored, source-attributed threat-plane signal that lands on the unified incident timeline.
It lives in internal/opendata (the feed adapters and the shared indicator
store) and internal/control/threatintel.go (the consumer that scores live
results and raises signals).
The one rule that shapes everything: a signal, not an IPS
probectl never blocks traffic and never sits inline. A match is a lead — scored, tunable, and suppressible — not an enforcement action (detection is a signal, never an IPS — one of probectl's non-negotiables). The wanted list tells the guard "keep an eye on that person"; it does not let the guard slam a door. Why hold this line? Because an inline blocker that acts on a noisy public feed will eventually drop legitimate traffic (a CDN that once hosted malware, a Tor exit that is not actually attacking you). probectl is an observability tool: it tells a human why to look, and the human decides.
Status: off by default
Enabling threat-intel makes outbound fetches to the configured feeds. That
crosses probectl's default of zero outbound calls (the no-phone-home
non-negotiable), so it is gated
behind PROBECTL_THREATINTEL_ENABLED and ships off. When disabled, no feed
is ever contacted and no indicator-matching code runs — verified in
BuildThreatIntel, which returns (nil, nil, false) unless the flag is set
(internal/control/threatintel.go).
How it works
Threat-intel reuses the open-data enrichment framework (internal/opendata)
and its provenance/acceptable-use model. The flow has three moving parts:
- Feed adapters each fetch one public feed over hardened TLS and normalize
its lines into IOCs (indicators of compromise) — a normalized
{type, value, source, category, confidence, license}record (internal/opendata/feeds.go). - A shared, tenant-agnostic
IOCStoreis rebuilt atomically on every refresh (internal/opendata/intelstore.go). "Atomically" means the new index is built off to the side and swapped in under a lock, so a query never sees a half-loaded list. - The threat plane scores already-captured observations against that store and emits a tenant-scoped signal for each hit.
Feeds are ingested once and shared across all tenants (the wanted list is the same for everyone). The match itself lands on a tenant-scoped incident record, so the tenant boundary is enforced where the match lands, not in the shared store.
%%{init: {'theme':'base','themeVariables':{'background':'#0d1117','primaryColor':'#161b22','primaryTextColor':'#e6edf3','primaryBorderColor':'#3b82f6','lineColor':'#8b949e','secondaryColor':'#21262d','tertiaryColor':'#0d1117','clusterBkg':'#161b22','clusterBorder':'#30363d','fontFamily':'ui-monospace, SFMono-Regular, Menlo, monospace'},'flowchart':{'curve':'basis','nodeSpacing':55,'rankSpacing':55,'padding':12}}}%%
flowchart LR
subgraph feeds["Public threat-intel feeds (read-only, over TLS)"]
SH["Spamhaus DROP\n(CIDR)"]
FE["Feodo Tracker\n(C2 IPs)"]
SB["SSLBL\n(cert SHA1 + JA3)"]
UH["URLhaus\n(URLs)"]
TOR["Tor exit list\n(IPs)"]
FH["FireHOL L1\n(IP/CIDR)"]
end
feeds -->|"Fetch() → []IOC\n(hardened TLS, untrusted)"| REF["IntelRefresher\n(per-source last-good,\ngraceful degradation)"]
REF -->|"atomic rebuild"| STORE[("IOCStore\nip · cidr · domain · url · cert · ja3")]
RES["probectl.network.results\n(synthetic + flow)"] --> IPC["IOCConsumer\nscore peer IP / host"]
TLS["TLS observer\n(captured leaf + JA3)"] --> AN["threat.Analyzer\nWithIntel(store)"]
STORE --> IPC
STORE --> AN
IPC -->|"ioc.* signal\n+ source attribution"| COR["incident.Correlator\n(tenant-scoped)"]
AN -->|"tls.malicious_cert / _ja3"| COR
COR --> TL["Unified incident timeline\n+ alerting"]
The two scoring paths
There are two independent places where a captured observation meets the wanted list:
- IP / host (
IOCConsumer, over theprobectl.network.resultstopic): for every result, probectl takes the peer address and scores it. An IP is matched two ways — exactly, and against any containing CIDR block (so a single bad/24flags every address inside it). A hostname target is matched exactly (lowercased) against any domain-type indicators in the store — the index exists and is scored, though none of today's built-in feeds emitdomainindicators (they emit IPs, CIDRs, URLs, and cert/JA3 fingerprints). A trailing:portis stripped first (peerHostusesnet.SplitHostPort), so198.51.100.7:443scores as198.51.100.7. - Certificate / JA3 (
threat.Analyzer.WithIntel, ininternal/threat/analyze.go): the already-captured leaf certificate's SHA-1 fingerprint (computed viacrypto.CertSHA1) and the client's JA3 are scored against the SSLBL feeds. This reuses TLS data probectl already observed — it never re-handshakes with the target.
The "already captured" part matters: probectl is not making new connections to test things. It scores what it saw passively, which keeps it observe-only.
Severity comes from feed confidence
Each feed entry carries a confidence number (0–100). probectl maps that to an
incident severity (severityForConfidence), and that severity is always
tunable downstream:
| Confidence | Severity |
|---|---|
| >= 80 | critical |
| 60–79 | warning |
| < 60 | info |
So a botnet-C2 IP (confidence 90) becomes a critical incident, while a
Tor-exit IP (confidence 50) becomes an info context note rather than an
alarm. That gradient is deliberate: not everything on a public list deserves a
page at 3 a.m.
Every signal carries provenance attributes so an analyst can see why it
fired and can tune or suppress it: intel.source, intel.category,
intel.type, intel.indicator, intel.confidence, and intel.license
(emitted in iocSignal).
Feeds and acceptable-use matrix
Each feed carries machine-readable provenance and acceptable-use (AUP) terms in
its Descriptor().AUP (internal/opendata/feeds.go). As with open-data
enrichment, these terms are not a constraint on private development or
single-tenant open-source use — they gate only commercial / MSP resale
(reselling probectl to many customers; see editions.md).
Resolve redistribution terms before enabling provider mode commercially.
| Feed | name |
IOC type | Category | Confidence | License / terms | Commercial use |
|---|---|---|---|---|---|---|
| Spamhaus DROP | spamhaus_drop |
CIDR | spam / hijacked netblocks | 90 | Spamhaus DROP (free) | allowed-with-attribution |
| Feodo Tracker (abuse.ch) | feodo_tracker |
IP | botnet C2 | 90 | abuse.ch CC0 | allowed |
| SSLBL certs (abuse.ch) | sslbl |
cert SHA1 | malicious cert | 95 | abuse.ch CC0 | allowed |
| SSLBL JA3 (abuse.ch) | sslbl_ja3 |
JA3 | malicious JA3 | 85 | abuse.ch CC0 | allowed |
| URLhaus (abuse.ch) | urlhaus |
URL | malware URL | 85 | abuse.ch CC0 | allowed |
| Tor exit list | tor_exit |
IP | tor exit | 50 | Tor Project (CC0) | allowed |
| FireHOL level 1 | firehol_level1 |
IP / CIDR | aggregate blocklist | 75 | aggregate (mixed terms) | restricted |
FireHOL aggregates many upstream feeds with mixed licenses, so its descriptor marks it
restrictedfor resale (CommercialRestricted). Verify upstream terms before commercial redistribution.
Set PROBECTL_THREATINTEL_FEEDS to a comma-separated subset of the name
column, or leave it empty to load all built-in feeds.
Reliability and accuracy caveats
- Graceful degradation — a down or rate-limited external source must never
break core function: the
IntelRefresherkeeps each source's last-good indicators. A feed that is down, rate-limited, or malformed leaves the prior indicators in place — it never empties the store and never breaks a core path. (SeeRefreshinintelrefresher.go: a failedFetchlogs a warning and the union is rebuilt from the retained sets.) - Freshness: indicators are only as current as the last successful refresh
(
PROBECTL_THREATINTEL_REFRESH, default 6h). A signal reflects feed state at refresh time, not real time. - False positives: public feeds carry stale or shared-infrastructure
entries (a CDN IP once used for C2; a Tor exit is not inherently malicious).
Treat matches as leads to triage, weigh
intel.confidence, and tune/suppress noisy sources. This is enrichment, not adjudication. - Untrusted input: every feed is fetched over TLS with certificate
validation (never disabled) via
crypto.HardenedHTTPClientand parsed as untrusted — malformed indicators are skipped, not trusted.
Configuration
| Variable | Default | Description |
|---|---|---|
PROBECTL_THREATINTEL_ENABLED |
false |
master switch (outbound feed fetches); off ⇒ no indicator code runs |
PROBECTL_THREATINTEL_REFRESH |
6h |
feed refresh cadence |
PROBECTL_THREATINTEL_FEEDS |
(all) | comma-separated feed names to load; empty ⇒ all built-in feeds |
Security guardrails upheld
These are probectl's standing non-negotiables, as they apply here:
- Signal, not IPS — confidence-scored, tunable, suppressible; no inline block.
- No phone-home — off by default; outbound only when explicitly enabled.
- Tenant-scoped — feeds are shared (ingested once); matches land on
tenant-scoped incidents, carrying each result's
tenant_id. - Graceful and untrusted — last-good caching; TLS-verified fetch; untrusted parse.
- AUP/provenance tracked per feed for MSP/commercial resale.
- FIPS crypto abstraction — the cert SHA-1 fingerprint goes through
crypto.CertSHA1; no hash primitive is imported outsideinternal/crypto.
The triage surface
Attributed matches are also retained as tenant-scoped, in-memory detections
(newest first, bounded per tenant). The recognizer keys on threat-plane signals
that carry intel.* provenance — and the same store accepts the NDR-lite
detector signals (see ndr.md) without a separate pipeline, because
both flow through one recognizer (DetectionFromSignal in
internal/threat/detections.go).
They are served at GET /v1/threat/detections (RBAC threat.read; the
response carries a detections_running honesty flag so a caller can tell the
store is wired). Each detection carries the flagged entity, the matched
indicator, the attributing feed with confidence + category + license
(verbatim provenance), and the correlated incident id — the pivot into
the incident timeline (/incidents?incident=<id> is a supported deep link).
The web surface lives on /security, above the certificate inventory:
severity/source/text filters, plus a provenance detail view that states plainly
that feeds can list benign infrastructure and that probectl never blocks.
The detection store is in-memory and rebuilds from the stream;
the durable trail is the incident record plus the SIEM export
(see siem.md).