Technical Briefing · Confidential · Updated April 2026

ARIA: AI Rapid Investigation Agent

A multi-agent AI SOC platform with an MSSP operating layer. Tenancy is enforced at the ingress channel — every source ships with a per-channel token; payload content never decides the tenant. Raw events flow through a canonical parser (with cross-source delegation when one vendor wraps another) into enrichment, scoring and explainable triage, where an asset resolver anchors the investigator on ground-truth device profiles instead of guessing roles from device names. Deterministic analyst-authored rules short-circuit triage for benign chatter; the rest lands in a 7-stage LLM + rules pipeline with MITRE-mapped verdicts and policy-gated remediation. Native SIEM search, a Detection Engineering workbench, a Security Intelligence operating layer, and an Admin Control Plane round out the analyst UI. Tenant-safe by default; every destructive action audit-logged. Runs on-prem — nothing leaves the box by default.

FastAPI routes: 160+
Postgres tables: 30+
Alert → verdict: < 90s
Investigation stages: 7
Worker processes: 19
SIEM adapters: 11

Video tour

3-minute 30-second captioned walkthrough of every module and new feature. Generated from a demo environment.

1680×1050 · 3 min 30 s · Confidential HuntJacq Labs © 2026

Technical deck

Eight-slide briefing for technical review: architecture, pipeline, AI, security, integrations, resilience, workflows, economics. Navigate with the ← / → arrow keys inside the frame.

Product walkthrough

Thirty-six-slide, auto-playing, captioned walkthrough of the product with keyboard navigation. Each slide explains what you're looking at and which feature it demonstrates.

Self-playing slideshow

Opens in a new tab. 7 s per slide, ← / → to navigate, P to pause.

Or browse the screenshots

Click any tile below to open full-size. 36 images at 1680×1050 @ 2× density.

What's new — April 2026 releases

A two-sprint cycle that rebuilt tenancy from the ingress up, made triage explainable, anchored the investigator on ground-truth device profiles, and hardened the MSSP control plane. Grouped by concern.

Tenancy & isolation

  • Per-source ingest tokens — tenancy resolved at the door, never from payload content.
  • Dedicated ingest service on port 8001 as a security boundary (separate systemd unit).
  • Server-side tenant_scope choke point applied to CMDB, Devices, Campaigns, Detection Rules, Saved Searches, Scheduled Reports.
  • Cross-tenant admin access is explicit, role-gated, audit-logged, UI-banner surfaced.
  • Tenant hierarchy columns (parent / child) for MSP-of-MSP scenarios.

Canonical pipeline & explainability

  • Canonical parser schema across all sources; enrichment → scoring → triage is one code path.
  • Multi-parser delegation — a wrapper parser can hand off to the inner vendor's parser (e.g. firewall alarms relayed through a Wazuh manager).
  • score_reasons[] audit trail on every alert; triage_reason human-readable sentence; trigger_reason on every investigation.
  • Rescored 8,400+ historical wrapped alerts after parser delegation landed — average score delta +25.71 points.
  • Per-tenant triage threshold (30-80) with platform default fallback.

Asset resolver & CMDB

  • Periodic 4-hour discovery loop + onboarding-time one-shot sync per integration.
  • Device resolver attaches ground-truth device_profile (role / OS / manufacturer / confidence) to every alert before the LLM sees it.
  • Investigator prompt renders a DEVICE PROFILE block above free text — no more "hallucinate OS from device name."
  • SIEM pseudo-agent filter: skips agent id 000 (the manager itself) and GCs stale entries automatically.
  • Per-agent syscollector walk pulls MAC for stable identity across DHCP churn.
  • Alert-mined discovery: src-side private IPs + hostnames only, dedup against discovered devices.

Deterministic noise suppression

  • Rules layer runs before the canonical pipeline — analyst-authored auto_close / suppress actions short-circuit triage.
  • Pre-seeded global rules for Apple/iCloud, Google/Microsoft/Zoom, NTP, DoH connectivity checks, Amazon benign endpoints, firewall ad/tracker blocks, SIEM keep-alives, agent heartbeats.
  • Rules are per-tenant or global; analysts add more without touching code.
  • Immutable versioning with one-click rollback.

Per-tenant workers

  • Queue naming aria.<concern>.<company_id>; per-tenant processes alongside stage workers.
  • Supervisor queries the tenant roster on startup; new tenant = supervisor restart, no container changes.
  • Noisy tenants can't starve quiet tenants; blast radius of a pause/restart is one tenant.
  • Designed to scale beyond 10 tenants on a single host without container-per-tenant blowup.

Saved-search alerting

  • Any saved search becomes a scheduled alert — cron + timezone + threshold + multi-recipient.
  • Six cron presets (5m / 15m / hourly / daily 9 / weekdays 9 / Monday 9) plus arbitrary expressions.
  • Timezone select seeded from Intl.supportedValuesOf with curated fallback.
  • Ownership + admin RBAC gate matches delete rule; runs on tenant timezone, not server's.

Security hardening (this cycle)

  • Authentication added on all CMDB, Devices, Detection Rules PATCH/DELETE/export, and Campaigns resolve endpoints.
  • Destructive operations (sync, purge) moved to admin-only.
  • Related-alerts join tenant-scoped so a shared private IP cannot leak cross-tenant alert titles.
  • Client-supplied company_id on create ignored for non-admin users — forced to the caller's tenant.
  • Per-source ingest token lookup with TTL cache; payload company_id deliberately dropped.

Ops & observability

  • Split aria-ingest service as a systemd unit — security boundary, separate blast radius.
  • /health, /healthz, /api/health all respond (prevents external-probe false alarms).
  • Prometheus /metrics + pg + OS backups + logrotate + pytest regression suite.
  • Health-check script updated to the new 19-worker layout.
  • Firewalla direct poller auto-starts on backend boot.

Pipeline architecture

Every event follows the same path. Tenant resolution happens at the ingress door; deterministic rules run before the expensive pipeline; every downstream stage reads a canonical schema so adapters add without touching downstream code.

Ingest path (in order)

  • Sources: Vector / Filebeat shippers and API pollers (per tenant).
  • aria-ingest :8001: X-Aria-Source-Token → (company_id, ds_id).
  • Canonical parser: per source, plus cross-source delegation.
  • Rules engine: auto_close · suppress; short-circuits the pipeline. auto_close → persisted, never reaches the LLM.
  • Enrichment: IOC · asset · MITRE · UEBA · rule FP-rate.
  • Score & triage: src × derived · bands · reasons.

Triage bands

  • auto_closed (< 30): logged · zero LLM cost · retrievable in dashboards.
  • correlating (30 – thresh): 10-min attach window · per-device case batching.
  • investigated (≥ thresh): 7-stage LLM pipeline · policy-gated remediation.
  • escalated (≥ 85): fast-path notification · on-call + email + PDF.

Investigated → 7 stages

  • ① Triage: MITRE map.
  • ② ThreatIntel: VT · AbuseIPDB · TAXII.
  • ③ Knowledge: RAG history.
  • ④ Forensics: timeline · campaign.
  • ⑤ Investigator: LLM verdict.
  • ⑥ Remediation: policy-gated.
  • ⑦ Validation: post-action recheck.

Device profile anchoring

Before the investigator runs, the enricher does a (tenant, MAC → hostname → IP) lookup against the tenant-scoped devices table. A match returns a device_profile (role, OS, manufacturer, confidence, discovery source) that is rendered at the top of the LLM prompt as a DEVICE PROFILE block. The LLM is instructed not to guess an OS or role that the block contradicts. Discovery adapters populate the devices table via a 4-hour loop plus a one-shot sync at onboarding.
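A minimal sketch of the MAC → hostname → IP lookup order, assuming an in-memory dictionary in place of the tenant-scoped devices table. The names DeviceProfile, DEVICES, resolve_device and render_profile_block, and the sample rows, are hypothetical and only illustrate the fallback order and the tenant scoping.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DeviceProfile:
    role: str
    os: str
    manufacturer: str
    confidence: float
    discovery_source: str


# Hypothetical stand-in for the devices table, keyed by
# (company_id, key_type, key_value). Sample rows are invented.
DEVICES = {
    ("acme", "mac", "aa:bb:cc:dd:ee:ff"): DeviceProfile(
        "printer", "embedded-linux", "HP", 0.9, "firewalla"),
    ("acme", "ip", "10.0.0.7"): DeviceProfile(
        "workstation", "windows", "Dell", 0.6, "wazuh"),
}


def resolve_device(company_id: str, mac: Optional[str] = None,
                   hostname: Optional[str] = None,
                   ip: Optional[str] = None) -> Optional[DeviceProfile]:
    """Try MAC first, then hostname, then IP, always inside one tenant."""
    for key_type, value in (("mac", mac), ("hostname", hostname), ("ip", ip)):
        if value:
            profile = DEVICES.get((company_id, key_type, value.lower()))
            if profile is not None:
                return profile
    return None


def render_profile_block(profile: Optional[DeviceProfile]) -> str:
    """Render the block that would sit above the free text in the prompt."""
    if profile is None:
        return "DEVICE PROFILE: unknown (no match in devices table)"
    return (
        "DEVICE PROFILE\n"
        f"  role: {profile.role}\n"
        f"  os: {profile.os}\n"
        f"  manufacturer: {profile.manufacturer}\n"
        f"  confidence: {profile.confidence}\n"
        f"  discovery_source: {profile.discovery_source}"
    )
```

Because the tenant is part of every key, the same private IP in another tenant resolves to nothing rather than to the wrong device.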

Rules engine as the override

Analyst-authored auto_close / suppress rules are deterministic, not a model input. They fire before the canonical pipeline, so they never contend with scoring. Each rule has a natural key, priority, action, and optional suppression window, and is versioned with one-click rollback. Pre-seeded rules cover the long tail of benign SaaS chatter so the downstream pipeline only sees what actually needs thinking about.
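The rule evaluation described above could look roughly like the following sketch. The Rule fields mirror the attributes named in the text (natural key, priority, action); the predicate callables, the "lower number wins" priority order, and the sample rules are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Rule:
    natural_key: str
    priority: int                      # assumption: lower number wins
    action: str                        # "auto_close" or "suppress"
    matches: Callable[[dict], bool]    # hypothetical predicate form
    company_id: Optional[str] = None   # None means a global rule


# Invented sample rules: one global, one scoped to tenant "acme".
RULES: List[Rule] = [
    Rule("global.ntp", 10, "auto_close",
         lambda a: a.get("dst_port") == 123),
    Rule("acme.heartbeat", 5, "suppress",
         lambda a: a.get("title") == "agent heartbeat", company_id="acme"),
]


def first_matching_rule(alert: dict, company_id: str,
                        rules: List[Rule] = RULES) -> Optional[Rule]:
    """Return the highest-priority global or same-tenant rule that matches."""
    applicable = [r for r in rules if r.company_id in (None, company_id)]
    for rule in sorted(applicable, key=lambda r: r.priority):
        if rule.matches(alert):
            return rule
    return None  # no rule fired; the alert continues into the pipeline
```

A None result means no deterministic override applies, so the alert goes on to enrichment and scoring as usual.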

Per-tenant workers

Queues are named aria.<concern>.<company_id>. The supervisor queries the tenant roster on startup and launches worker processes per tenant. The seven stage workers (triage, intel, knowledge, forensics, investigator, remediation, validation) are shared across tenants; per-tenant consumers isolate customer-level failure modes. A single backend host comfortably runs 19 worker processes for three tenants, and the design scales horizontally past ten tenants.
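As an illustration of the naming scheme, the sketch below expands a tenant roster into per-tenant queue names. The concern names and tenant ids are hypothetical; only the aria.<concern>.<company_id> pattern comes from the text.

```python
from typing import Iterable, List


def queue_name(concern: str, company_id: str) -> str:
    """Build one queue name following aria.<concern>.<company_id>."""
    return f"aria.{concern}.{company_id}"


def plan_queues(tenant_roster: Iterable[str],
                concerns: Iterable[str]) -> List[str]:
    """List every per-tenant queue a supervisor would need at startup."""
    concerns = list(concerns)
    return [queue_name(c, t) for t in tenant_roster for c in concerns]


if __name__ == "__main__":
    # Hypothetical roster and concerns, for illustration only.
    for name in plan_queues(["tenant-a", "tenant-b"], ["enrich", "score"]):
        print(name)
```

Since the roster is read once at startup, adding a tenant changes only the input to plan_queues, which matches the "supervisor restart, no container changes" behavior described above.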

Platform capabilities

What the platform does, organized by concern. Every surface is tenant-safe by default and audit-logged for destructive actions.

Multi-agent investigation pipeline

Seven stages per alert: Triage (MITRE mapping) → Threat Intel (VT/AbuseIPDB/TAXII/CVE) → Knowledge (RAG over history) → Forensics (timeline + campaign) → Investigator (LLM verdict) → Remediation (policy-gated) → Validation (post-action re-check). Learning runs async to generate Sigma rule candidates. Typical end-to-end: 60-90s.

Channel-based tenant isolation

Tenancy is an ingress property, never a payload property. Every data source on a tenant carries a unique ingest token; the ingress API resolves X-Aria-Source-Token to (company_id, data_source_id) at the door before the event body is touched. A dedicated ingest service (port 8001) forms the security boundary — compromising the analyst UI can't forge events, and no upstream relay can spoof another tenant by rewriting a field. Partner-portal wizard emits shipper-config snippets (Vector / Filebeat / Fluent Bit / curl) per source; token shown once.
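A rough sketch of resolving the token at the door, assuming a simple TTL cache in front of a token lookup. TokenResolver, ingest, the 60-second TTL and the dictionary-backed lookup are hypothetical; the header name and the (company_id, data_source_id) pair come from the text.

```python
import time
from typing import Callable, Dict, Optional, Tuple

Tenant = Tuple[str, str]  # (company_id, data_source_id)


class TokenResolver:
    """Cache token lookups for a short TTL (the TTL value is an assumption)."""

    def __init__(self, lookup: Callable[[str], Optional[Tenant]],
                 ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._lookup = lookup
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache: Dict[str, Tuple[float, Tenant]] = {}

    def resolve(self, token: str) -> Optional[Tenant]:
        now = self._clock()
        hit = self._cache.get(token)
        if hit is not None and hit[0] > now:
            return hit[1]
        tenant = self._lookup(token)
        if tenant is not None:
            self._cache[token] = (now + self._ttl, tenant)
        return tenant


def ingest(headers: dict, payload: dict, resolver: TokenResolver) -> dict:
    """Stamp tenancy from the token; any payload company_id is dropped."""
    tenant = resolver.resolve(headers.get("X-Aria-Source-Token", ""))
    if tenant is None:
        raise PermissionError("unknown or missing source token")
    company_id, data_source_id = tenant
    event = {k: v for k, v in payload.items() if k != "company_id"}
    event.update(company_id=company_id, data_source_id=data_source_id)
    return event
```

The payload's own company_id is removed before the resolved one is written, so a relay that rewrites that field has no effect on which tenant receives the event.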

Canonical parser + explainable triage

Raw vendor events are normalized into a CanonicalAlert schema, then enriched (IOC, asset, MITRE, rule-FP-rate, frequency, UEBA), scored (source confidence × derived confidence, combined via max + 0.3·min), and triaged into auto-closed / correlating / investigated / escalated bands with a per-tenant threshold (30-80). Every alert carries score_reasons[] (audit trail of what moved the number) and a human-readable triage_reason; every investigation carries a trigger_reason. No opaque "the model said so."
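One possible reading of the scoring and banding, sketched under stated assumptions: confidences are on a 0-100 scale, the combination is max + 0.3·min clamped to 100, and the platform default threshold is 60. The clamp, the scale and the default value are assumptions; the 0.3 weight, the 30 and 85 cut-offs and the 30-80 threshold range come from the text.

```python
from typing import Optional


def combine(source_conf: float, derived_conf: float) -> float:
    """Combine two confidences as max + 0.3 * min, capped at 100."""
    hi = max(source_conf, derived_conf)
    lo = min(source_conf, derived_conf)
    return min(100.0, hi + 0.3 * lo)


def band(score: float, tenant_threshold: Optional[float] = None,
         platform_default: float = 60.0) -> str:
    """Map a score to a triage band using the per-tenant threshold."""
    thresh = platform_default if tenant_threshold is None else tenant_threshold
    thresh = min(80.0, max(30.0, thresh))  # keep within the 30-80 range
    if score >= 85:
        return "escalated"
    if score >= thresh:
        return "investigated"
    if score >= 30:
        return "correlating"
    return "auto_closed"


if __name__ == "__main__":
    s = combine(50, 20)   # 50 + 0.3 * 20
    print(s, band(s, tenant_threshold=50), band(s))
```

With this weighting the stronger signal dominates and the weaker one can only nudge the score upward, so agreement between source and derived confidence raises an alert's band but a single weak signal cannot lower it.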

Multi-parser delegation

Real-world pipelines relay one vendor's events through another — for example, a firewall alarm arriving through a Wazuh manager as a wrapped payload. The Wazuh parser sniffs the wrapper (data.source == firewalla-msp or data._type starts with ALARM_), lazy-loads the Firewalla parser, and delegates normalization and scoring, then stamps the relay trail (agent id, rule id, manager) as tags for audit. Adding a new source = one parser file + one fixture.
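The wrapper check and hand-off can be sketched as below. The two sniffing conditions are taken from the text; the function names, the shape of the returned alert, the tag format, and the agent / rule / manager field layout are assumptions for illustration, and parse_firewalla is a stand-in rather than the real parser.

```python
def is_wrapped_firewalla(event: dict) -> bool:
    """Detect a Firewalla alarm relayed inside a Wazuh event."""
    data = event.get("data") or {}
    return (data.get("source") == "firewalla-msp"
            or str(data.get("_type", "")).startswith("ALARM_"))


def parse_firewalla(data: dict) -> dict:
    """Hypothetical stand-in for the inner vendor's parser."""
    return {"vendor": "firewalla",
            "title": data.get("_type", "unknown"),
            "tags": []}


def parse_wazuh(event: dict) -> dict:
    """Normalize a Wazuh event, delegating when it wraps a Firewalla alarm."""
    if is_wrapped_firewalla(event):
        # In a real module the inner parser would be imported lazily here.
        alert = parse_firewalla(event["data"])
        # Stamp the relay trail as tags so the hop stays auditable.
        alert["tags"] += [
            f"relay.agent_id:{event.get('agent', {}).get('id', '?')}",
            f"relay.rule_id:{event.get('rule', {}).get('id', '?')}",
            f"relay.manager:{event.get('manager', {}).get('name', '?')}",
        ]
        return alert
    return {"vendor": "wazuh",
            "title": event.get("rule", {}).get("description", "unknown"),
            "tags": []}
```

Keeping the sniffing logic in one predicate means a new wrapped vendor needs only another predicate and parser pair, in line with the "one parser file + one fixture" claim.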

Asset-anchored investigation

A device resolver kills the hallucinate-the-OS-from-a-device-name class of false positives. Integration adapters (Firewalla, Wazuh) push discovered devices into a tenant-scoped devices table with role, OS, manufacturer, model, confidence, and discovery source; a periodic 4-hour loop refreshes. At enrichment time, (tenant, MAC → hostname → IP) lookup attaches a ground-truth device_profile to the alert. The investigator prompt renders it as a DEVICE PROFILE block above free-text, so the LLM is anchored on role before it reasons.

Deterministic noise suppression

Analyst-authored rules are the override, not a suggestion. A rules layer runs before the canonical pipeline and can auto_close or suppress benign chatter (Apple/iCloud, Google/Microsoft/Zoom, NTP, DoH connectivity checks, Wazuh keep-alives, ad/tracker blocks) short-circuiting triage entirely. Rules are global or per-tenant and ship pre-seeded; analysts can add more without touching code. The pipeline downstream only sees what actually needs thinking about.

Saved-search alert scheduling

Any saved search can be promoted to an alert with a cron expression, timezone, hit threshold, and multi-recipient email list. Six presets cover the common cadences (5 min / 15 min / hourly / daily 9 AM / weekdays 9 AM / Monday 9 AM); arbitrary cron strings are supported. Ownership and admin RBAC match the delete rule — only the search owner or an admin can configure alerting. Backend croniter runner fires on the tenant's timezone, not the server's.
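For reference, the six presets map naturally onto standard five-field cron expressions. The mapping below is a sketch: the preset keys and the resolve_schedule helper are hypothetical, and the expressions are the conventional cron spellings of those cadences rather than values confirmed by the product.

```python
# Conventional five-field cron spellings (minute hour day month weekday).
CRON_PRESETS = {
    "every_5_min": "*/5 * * * *",
    "every_15_min": "*/15 * * * *",
    "hourly": "0 * * * *",
    "daily_9am": "0 9 * * *",
    "weekdays_9am": "0 9 * * 1-5",
    "monday_9am": "0 9 * * 1",
}


def resolve_schedule(preset_or_expr: str) -> str:
    """Return a cron expression from a preset key or a raw expression."""
    expr = CRON_PRESETS.get(preset_or_expr, preset_or_expr).strip()
    if len(expr.split()) != 5:
        raise ValueError(f"expected a five-field cron expression: {expr!r}")
    return expr
```

The field-count check is only a shallow sanity test; full validation and next-run calculation in the tenant's timezone would be left to the cron runner.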

Per-tenant workers

Queues are named aria.<concern>.<company_id>; a supervisor queries the tenant roster on startup and spawns dedicated worker processes per tenant alongside stage workers. Noisy tenants don't starve quiet ones, and tenant-level pause / restart operates on a blast-radius of one. Designed to scale horizontally to 10+ tenants without a container-per-tenant blowup.

Native SIEM-style search

Built-in search tab with field-aware DSL (severity:critical AND source:wazuh), visual filter builder, right-side schema panel, cursor-paginated infinite scroll, SQL pushdown into JSONB for sub-100ms first page, and saved searches with private / tenant / global visibility. Timechart and top-value aggregations share the same filter state. Any saved search can be promoted to a scheduled alert (cron + timezone + threshold + multi-recipient). This is the single search surface — no external index to operate, no split-brain with a parallel search engine.
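To make the pushdown idea concrete, here is a deliberately small sketch that turns an AND-only query such as severity:critical AND source:wazuh into a parameterized WHERE fragment over a JSONB column. The column name payload, the %s placeholder style, and the AND-only grammar are assumptions; the real DSL is presumably richer.

```python
from typing import List, Tuple


def dsl_to_sql(query: str, column: str = "payload") -> Tuple[str, List[str]]:
    """Translate 'field:value AND field:value' into SQL plus parameters."""
    clauses: List[str] = []
    params: List[str] = []
    for term in query.split(" AND "):
        field, sep, value = term.strip().partition(":")
        if not sep or not field.isidentifier() or not value:
            raise ValueError(f"cannot parse term: {term!r}")
        # Field and value are both bound parameters, never interpolated.
        clauses.append(f"{column} ->> %s = %s")
        params += [field, value]
    return " AND ".join(clauses), params


if __name__ == "__main__":
    where, args = dsl_to_sql("severity:critical AND source:wazuh")
    print(where)
    print(args)
```

Binding the field names as parameters keeps user-typed search text out of the SQL string itself, which matters when the same surface serves multiple tenants.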

Detection Engineering

Closes the loop Alert → Verdict → Rule. Per-rule TP/FP/escalation rate, noisy-rule ranking by fp_rate × log(1+hits), explainable tuning recommendations with evidence, replay against historical alerts, MITRE coverage with gap detection, candidate-rule review queue (LearningAgent output), immutable rule versioning with one-click rollback, and Sigma import with structural validation.
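The noisy-rule ranking formula fp_rate × log(1+hits) is simple enough to sketch directly. The natural logarithm is assumed since the text does not name a base, and the rule ids and numbers are invented sample data.

```python
import math
from typing import Dict, List


def noise_score(fp_rate: float, hits: int) -> float:
    """fp_rate x log(1 + hits); the log damps very high-volume rules."""
    return fp_rate * math.log(1 + hits)


def rank_noisy(rules: List[Dict]) -> List[Dict]:
    """Sort rules from noisiest to quietest."""
    return sorted(rules,
                  key=lambda r: noise_score(r["fp_rate"], r["hits"]),
                  reverse=True)


if __name__ == "__main__":
    sample = [  # invented sample data
        {"id": "A", "fp_rate": 0.9, "hits": 5},
        {"id": "B", "fp_rate": 0.5, "hits": 1000},
        {"id": "C", "fp_rate": 1.0, "hits": 0},
    ]
    print([r["id"] for r in rank_noisy(sample)])
```

The logarithm means a rule with a moderate false-positive rate and heavy volume can outrank a rule that is almost always wrong but rarely fires, which is usually where tuning effort pays off first.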

Security Intelligence

IOCs move from "enrich and show" to scored + prioritized + correlated. Threat score 0-100 with explainable components (source confidence, sighting volume, confirmed TPs, freshness, lifecycle overrides). Lifecycle states: active / stale / expired / suppressed / trusted. Campaign severity + confidence (confirmed / probable / unknown). Feed sync health, hunt-suggestion generation, intel ⇄ rule coverage gaps, per-tenant threat landscape.

MSSP tenancy + governance

Single choke point for tenant isolation. Every query scoped server-side from the JWT; client-supplied tenant parameters are ignored. Cross-tenant admin access is explicit, role-gated, audit-logged, and surfaced via UI banner. Per-tenant white-label branding (logo, display name, primary color, report footer) flows into every report payload and PDF export. Per-tenant per-action policy overrides with effective-policy visualization. MSP-of-MSP tenant hierarchy via parent-child wiring.

Admin Control Plane

Not just settings — governance. Worker lifecycle (start / stop / restart / restart-all) with confirm dialogs + audit. Aggregated system alerts across 6 categories. Cost dashboard (tokens + USD per day, per tenant, top drivers). Active session list with one-click revoke. Runtime config editor with type validation. RabbitMQ queue depth. Policy editor showing effective policy per tenant.

Dashboards & Reports

KPI strip with click-through to filtered investigations. Per-tenant comparison with grades A-D. Alert-lifecycle funnel. SLA breach root-cause analysis (backlog / slow investigation / escalation delay). Data coverage per source. Automation coverage. Decision-engine metrics. Dashboard narrative that summarizes "what changed" vs the previous period, plus z-score alert-volume anomaly detection. Reports carry executive-ready narrative on every type; scheduled delivery via cron + email; PDF export with tenant branding; QoQ / YoY period comparison with graceful fallback when history is insufficient.
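A z-score check on alert volume might look like the sketch below. The sample standard deviation, the cutoff of 3.0, and the choice to return None when history is too short or flat are assumptions, not documented platform behavior.

```python
import statistics
from typing import Optional, Sequence


def volume_zscore(history: Sequence[float],
                  current: float) -> Optional[float]:
    """Z-score of the current count against past counts, or None."""
    if len(history) < 2:
        return None  # not enough history to estimate spread
    mean = statistics.fmean(history)
    stdev = statistics.stdev(history)  # sample standard deviation
    if stdev == 0:
        return None  # flat history; z-score is undefined
    return (current - mean) / stdev


def is_anomalous(history: Sequence[float], current: float,
                 cutoff: float = 3.0) -> bool:
    """Flag volumes at least `cutoff` standard deviations from the mean."""
    z = volume_zscore(history, current)
    return z is not None and abs(z) >= cutoff
```

Using the absolute value flags sudden drops as well as spikes, since a source going quiet can be as significant as a flood of alerts.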

Observability & trust

Evidence chain: SHA-256 hash-chained custody per investigation. Audit log covers every authentication, authorization, config change, policy override, cross-tenant access, rule rollback, IOC override, session revoke, and remediation action. Search telemetry for SLO tracking. Email delivery log — no silent failures. Per-agent tracing via Langfuse when configured. Compliance / Audit report type exports the audit trail with event-type summary and narrative.
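To illustrate what a SHA-256 hash-chained custody log involves, here is a minimal sketch. The record layout, the all-zero genesis value, and the canonical JSON serialization are assumptions; the only detail taken from the text is that each investigation's evidence entries are chained with SHA-256.

```python
import hashlib
import json
from typing import Dict, List

GENESIS = "0" * 64  # assumed starting value for an empty chain


def entry_hash(prev_hash: str, record: Dict) -> str:
    """Hash the previous hash together with a canonical form of the record."""
    body = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(chain: List[Dict], record: Dict) -> None:
    """Append a record, linking it to the hash of the previous entry."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"record": record, "prev_hash": prev,
                  "hash": entry_hash(prev, record)})


def verify(chain: List[Dict]) -> bool:
    """Recompute every link; any edited or reordered entry breaks the chain."""
    prev = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev:
            return False
        if entry["hash"] != entry_hash(prev, entry["record"]):
            return False
        prev = entry["hash"]
    return True
```

Because each hash covers the one before it, changing an early record invalidates every later entry, which is what makes the chain useful as custody evidence.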