Technical Briefing · Confidential · April 2026
HuntJacq Labs

ARIA: AI Rapid Investigation Agent

A multi-agent SOC platform: 8 specialized AI agents, 147 API routes, 30+ DB tables, 11 SIEM adapters — ingests alerts, runs a full investigation pipeline on-prem, and produces a verdict + evidence + action in < 90 seconds.

147 FastAPI routes · OpenAPI at /openapi.json
30+ PostgreSQL tables · verifiable in schema.py
< 90 s end-to-end · alert to verdict
20/20 health-check.sh checkpoints pass
ARIA · Technical Briefing
CONFIDENTIAL
02 / 08

Deployment Architecture

Single-server or container stack. On-prem or private cloud. Nothing leaves the box by default.

nginx · TLS edge

/ → React SPA (frontend/dist)
/api/* → FastAPI app
/opensearch/* → event search UI

FastAPI · 147 routes

Agent orchestrator (LangGraph)
Adapter registry (11 SIEMs)
7 message-queue worker consumers
Auto-ingest polling + watchdog

Local LLM runtime · GPU

qwen3.5 — investigation reasoning
nomic-embed-text — RAG

Cost posture

Zero cloud cost by default
Cloud LLM fallback only if key set
PostgreSQL: 30+ tables · primary store
Message bus: pipeline queues + DLQ
In-memory cache: dedup + rate-limit
OpenSearch: event search index
Search UI: embedded Kibana-style explorer
Ingest pipeline: Filebeat/Vector → alert ingest
External calls only when explicitly enabled: cloud LLM · VT/AbuseIPDB/TAXII · SMTP/webhook. Everything else is local.
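
The "off unless explicitly enabled" posture can be illustrated with a small settings object. This is a minimal sketch, not ARIA's actual configuration: the class name, the field names, and the `allowed_egress` helper are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EgressSettings:
    # Hypothetical flag names; every external integration defaults to off.
    cloud_llm_api_key: str = ""          # cloud LLM fallback only if a key is set
    threat_intel_enabled: bool = False   # VT / AbuseIPDB / TAXII
    notifications_enabled: bool = False  # SMTP / webhook


def allowed_egress(settings: EgressSettings) -> set[str]:
    """Return the categories of outbound calls this configuration permits."""
    allowed: set[str] = set()
    if settings.cloud_llm_api_key:
        allowed.add("cloud_llm")
    if settings.threat_intel_enabled:
        allowed.add("threat_intel")
    if settings.notifications_enabled:
        allowed.add("notifications")
    return allowed
```

With default settings the returned set is empty, which matches the "nothing leaves the box by default" claim.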

The Investigation Pipeline

Every alert runs this chain. Each stage writes to agent_events — full audit trail, shown in the Pipeline tab.

1. TriageAgent · ≈ 5 ms · Deterministic
   MITRE ATT&CK mapping · kill-chain stage · priority score
2. ThreatIntelAgent · ≈ 5 s · External API
   VirusTotal + AbuseIPDB + TAXII + CVE/NVD
3. KnowledgeAgent · ≈ 10 s · RAG / Vectors
   nomic-embed-text · cosine similarity · top-3 past cases
4. ForensicsAgent · ≈ 10 s · Deterministic
   Timeline · IOC pivot · lateral movement · campaign grouping
5. InvestigatorAgent · ≈ 28 s (dominant) · LLM
   LangGraph + qwen3.5 · verdict + confidence 0–100
6. RemediationAgent · ≈ 3 s · PolicyGated
   block_ip / isolate_host / disable_user · HITL or autonomous
7. ValidationAgent · ≈ 3 s · Deterministic
   Re-check IOCs · recurrence detection · final confidence
8. LearningAgent · async · Non-blocking
   Sigma rule generation · runbook update per MITRE tactic
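
The KnowledgeAgent's retrieval step (cosine similarity, top-3 past cases) can be sketched in plain Python. This is an illustration under assumptions: the function names and the in-memory dictionary of case vectors are hypothetical, and a real deployment would obtain the vectors from nomic-embed-text and likely use a vector index.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_cases(query_vec: list[float],
                case_vecs: dict[str, list[float]],
                k: int = 3) -> list[tuple[str, float]]:
    """Return the k past cases most similar to the query, best first."""
    scored = [(cosine(query_vec, vec), case_id) for case_id, vec in case_vecs.items()]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(case_id, score) for score, case_id in scored[:k]]
```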
Output per investigation: verdict + evidence chain (SHA-256) + ticket + Sigma rule + runbook + Langfuse trace. Pipeline total timeout: 300 s. An InvestigatorAgent failure fails the whole investigation; a failure in any other agent is tolerated and the pipeline continues with partial context.
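
The failure and timeout semantics can be sketched as a sequential runner. This is a simplified illustration, not the LangGraph orchestrator: `run_pipeline`, the event dictionary shape, and the choice to stop immediately after an InvestigatorAgent failure are assumptions, and the deadline is only checked between stages rather than preempting a running one.

```python
import time
from typing import Callable

PIPELINE_TIMEOUT_S = 300              # total pipeline timeout from the briefing
CRITICAL_STAGE = "InvestigatorAgent"  # the only stage whose failure is fatal

Stage = tuple[str, Callable[[dict], dict]]


def run_pipeline(alert: dict, stages: list[Stage],
                 timeout_s: float = PIPELINE_TIMEOUT_S) -> dict:
    """Run stages in order, recording one event per stage (stand-in for agent_events)."""
    context: dict = {"alert": alert}
    events: list[dict] = []
    status = "completed"
    deadline = time.monotonic() + timeout_s
    for name, fn in stages:
        if time.monotonic() > deadline:
            events.append({"stage": name, "ok": False, "error": "pipeline timeout"})
            status = "failed"
            break
        started = time.monotonic()
        try:
            context.update(fn(context))  # each stage returns new context keys
            ok, error = True, None
        except Exception as exc:         # non-critical failures leave partial context
            ok, error = False, str(exc)
        events.append({
            "stage": name, "ok": ok, "error": error,
            "duration_ms": round((time.monotonic() - started) * 1000),
        })
        if not ok and name == CRITICAL_STAGE:
            status = "failed"            # assumption: stop once the investigation fails
            break
    return {"status": status, "context": context, "events": events}
```

The async LearningAgent is left out here; in this sketch it would be dispatched after the loop without blocking the result.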

AI & LLM Architecture

AI for reasoning, not ground truth. Deterministic rules handle safety; LLM handles judgment.

● Tier 1 · Primary

Ollama qwen3.5:latest

On-prem GPU · zero cost · no external call

● Tier 2 · Fallback

OpenRouter (claude-sonnet-4-6)

Cloud · only if API key configured · cost tracked

● Tier 3 · Degraded

Rules-only mode

No LLM · confidence capped at 50 · always available
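
The three tiers amount to a fallback chain. The sketch below is illustrative only: `investigate` and its callable parameters are hypothetical stand-ins for the Ollama client, the OpenRouter client (passed as `None` when no API key is configured), and the rules engine.

```python
from typing import Callable, Optional

RULES_ONLY_CONFIDENCE_CAP = 50  # cap stated for Tier 3

Backend = Callable[[dict], dict]  # returns {"verdict": ..., "confidence": ...}


def investigate(alert: dict, primary: Backend,
                fallback: Optional[Backend], rules: Backend) -> dict:
    """Try Tier 1, then Tier 2 if configured, then fall back to rules-only."""
    for backend in (primary, fallback):
        if backend is None:
            continue  # Tier 2 is skipped when no cloud key is set
        try:
            return {**backend(alert), "verdict_source": "ai"}
        except Exception:
            continue  # this tier is down; try the next one
    result = rules(alert)
    capped = min(result["confidence"], RULES_ONLY_CONFIDENCE_CAP)
    return {**result, "confidence": capped, "verdict_source": "rules_only"}
```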

Token accounting: prompt_tokens · completion_tokens · total_tokens stored in investigations table
Cost tracking: estimated_cost_usd = 0 for local Ollama · populated for cloud calls · visible in Admin
Verdict source: verdict_source ∈ {ai, rules_only, analyst} · every override tracked for accuracy benchmarking
Full trace: LLM response + tool calls stored in investigation.trace · shown in Events tab step-by-step
Shadow mode: AI runs full pipeline without executing actions. Analyst compares verdicts. Shadow metrics drive the L1 Readiness score (accuracy 40% · FN rate 20% · MTTR 15% · cost 10% · uptime 15%).
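
The L1 Readiness weighting can be written as a weighted sum. The weights come from the slide; everything else is an assumption for illustration, in particular that each component has already been normalized to a 0–1 "higher is better" score (for example, the FN-rate component as 1 minus the false-negative rate). The briefing does not specify the normalization.

```python
# Weights from the briefing; they sum to 100%.
WEIGHTS = {
    "accuracy": 0.40,
    "fn_rate": 0.20,
    "mttr": 0.15,
    "cost": 0.10,
    "uptime": 0.15,
}


def l1_readiness(components: dict[str, float]) -> float:
    """Weighted score on a 0-100 scale from 0-1 component scores (assumed normalization)."""
    if set(components) != set(WEIGHTS):
        raise ValueError(f"expected components: {sorted(WEIGHTS)}")
    for name, value in components.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return round(100 * sum(WEIGHTS[name] * components[name] for name in WEIGHTS), 1)
```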

Security, Compliance & Evidence

Every action is audited. Every investigation produces a tamper-evident chain of custody.

Password auth

bcrypt · per-password random salt · force-change flag

MFA

TOTP (pyotp) · QR provisioning · per-user enforcement

SSO

SAML 2.0 SP · IdP metadata · ACS callback · auto-user-creation

Sessions

JWT HS256 · 30-min inactivity · 2-min expiry warning in UI

Brute-force

5 failed logins → 30 s lockout with countdown UI

RBAC

9 permissions · role defaults + per-user overrides · server-side enforcement

Multi-tenancy

company_id-scoped · cross-tenant reads blocked · admin sees all

API keys

Per-tenant · shown once · SHA-256 fingerprint stored
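
The brute-force rule (5 failed logins → 30 s lockout) can be sketched with an in-memory throttle. This is an assumption-laden illustration: the `LoginThrottle` class is hypothetical, state is per process rather than in the shared cache, and the failure counter resets once a lockout starts.

```python
import time

MAX_FAILURES = 5      # failed logins before lockout
LOCKOUT_SECONDS = 30  # lockout duration


class LoginThrottle:
    def __init__(self, clock=time.monotonic):
        self._clock = clock  # injectable so the behavior can be tested
        self._failures: dict[str, int] = {}
        self._locked_until: dict[str, float] = {}

    def seconds_remaining(self, user: str) -> float:
        """Seconds left on the lockout (what a countdown UI would display)."""
        return max(0.0, self._locked_until.get(user, 0.0) - self._clock())

    def record_failure(self, user: str) -> None:
        count = self._failures.get(user, 0) + 1
        if count >= MAX_FAILURES:
            self._locked_until[user] = self._clock() + LOCKOUT_SECONDS
            count = 0  # start counting afresh after the lockout
        self._failures[user] = count

    def record_success(self, user: str) -> None:
        self._failures.pop(user, None)
        self._locked_until.pop(user, None)
```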

SHA-256 Evidence Chain

hash = sha256(prev_hash + event_type + actor + data_json + timestamp)
UI shows chain-integrity badge · PDF export for compliance archives.
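
The hash formula above can be turned into an append-and-verify sketch. The field concatenation follows the formula on the slide; the all-zero genesis hash, the canonical JSON encoding (`sort_keys`, compact separators), and the function names are assumptions made for this illustration.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # assumed prev_hash for the first event


def link_hash(prev_hash: str, event_type: str, actor: str,
              data: dict, timestamp: str) -> str:
    """sha256(prev_hash + event_type + actor + data_json + timestamp)."""
    data_json = json.dumps(data, sort_keys=True, separators=(",", ":"))
    payload = prev_hash + event_type + actor + data_json + timestamp
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_event(chain: list[dict], event_type: str, actor: str,
                 data: dict, timestamp: str) -> None:
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    chain.append({
        "event_type": event_type, "actor": actor, "data": data,
        "timestamp": timestamp, "prev_hash": prev_hash,
        "hash": link_hash(prev_hash, event_type, actor, data, timestamp),
    })


def verify_chain(chain: list[dict]) -> bool:
    """Recompute every link; any edited or reordered event breaks the chain."""
    prev_hash = GENESIS_HASH
    for event in chain:
        expected = link_hash(prev_hash, event["event_type"], event["actor"],
                             event["data"], event["timestamp"])
        if event["prev_hash"] != prev_hash or event["hash"] != expected:
            return False
        prev_hash = event["hash"]
    return True
```

A check like `verify_chain` is the kind of test that could back a chain-integrity badge: altering any stored field changes that event's hash and invalidates every later link.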

Audit + NIST IR Mapping

Every authentication event, config change, and remediation approval is written to audit_log. Each investigation has a NIST IR tab covering 6 phases (Detect → Analyze → Contain → Eradicate → Recover → Post).


Integrations & Remediation Safety

11 SIEM adapters through one interface. Every destructive action passes a 7-step safety gate.

✓ Firewalla MSP (live) · Wazuh · Elastic Security · Splunk Cloud · CrowdStrike Falcon · AWS CloudTrail · SentinelOne · Fortinet FortiGate · Palo Alto PAN-OS · Darktrace · MS Defender/Sentinel

Ticketing

ClickUp (live) · Jira · ServiceNow · Linear · PagerDuty

Threat Intel

VirusTotal · AbuseIPDB · TAXII 2.x · CVE / NVD

Notifications

SMTP (Gmail / iCloud / SES / Postmark / Mailgun) · Slack · Teams · PagerDuty webhook

1. Autonomy-mode gate (shadow / assisted / supervised / autonomous)
2. Asset criticality from CMDB: prod / critical always require manual approval
3. Rollback plan required: no action without a stored rollback procedure
4. Per-tenant allow / deny lists: configurable block-target restrictions
5. Confirmation dialog in UI for all destructive operations
6. Audit row: actor + action + target + impact + rollback plan + result
7. Rollback endpoint: POST /api/remediation/{id}/rollback
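
The decision part of the gate (steps 1 to 4) can be sketched as a pure function. This is a hypothetical illustration: the dataclasses, the outcome strings, and the treatment of assisted and supervised modes as "needs approval" are assumptions, since the briefing does not define how those two modes differ.

```python
from dataclasses import dataclass, field


@dataclass
class ActionRequest:
    action: str              # e.g. "block_ip", "isolate_host", "disable_user"
    target: str
    asset_criticality: str   # from CMDB, e.g. "prod", "critical", "standard"
    rollback_plan: str = ""  # stored rollback procedure, empty if missing


@dataclass
class TenantPolicy:
    autonomy_mode: str = "shadow"  # shadow / assisted / supervised / autonomous
    deny_targets: set[str] = field(default_factory=set)


def evaluate(request: ActionRequest, policy: TenantPolicy) -> str:
    """Return "log_only", "deny", "needs_approval", or "execute"."""
    if policy.autonomy_mode == "shadow":
        return "log_only"        # shadow mode never executes actions
    if not request.rollback_plan:
        return "deny"            # no action without a rollback procedure
    if request.target in policy.deny_targets:
        return "deny"            # per-tenant deny list
    if request.asset_criticality in {"prod", "critical"}:
        return "needs_approval"  # always manual for prod / critical assets
    if policy.autonomy_mode == "autonomous":
        return "execute"
    return "needs_approval"      # assumption for assisted / supervised
```

Steps 5 to 7 (confirmation dialog, audit row, rollback endpoint) sit outside this function and would wrap whatever outcome it returns.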

Resilience, Observability & Operations

Every failure mode has a defined behavior. health-check.sh verifies all 20 checkpoints in under 60 seconds.

Backend crash: workers keep queueing · nginx returns 502 · supervisor/systemd restarts the backend
Worker dies: supervisor.py auto-restarts with exponential backoff · durable queue retains messages
Postgres down: health returns degraded · ingestion pauses · frontend shows banner
Ollama down: falls through to OpenRouter · if also down → rules-only (confidence capped at 50)
Single-agent failure: an InvestigatorAgent failure fails the investigation · a failure in any other agent is tolerated and the pipeline continues with partial context
Mid-pipeline reboot: on startup, investigations left in status=running for more than 12 min are marked failed · analyst can Re-investigate
Full host reboot: recover-aria.sh idempotently starts only missing tiers · health-check.sh verifies 20 points
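
The mid-pipeline reboot rule can be sketched as a startup sweep. This is an illustration over in-memory records, not the actual recovery code: the function name and the `started_at` field are assumptions, and a real implementation would run the equivalent update against the investigations table.

```python
from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(minutes=12)  # threshold from the briefing


def mark_stale_failed(investigations: list[dict], now: datetime) -> list:
    """Mark investigations stuck in status=running past the threshold as failed.

    Returns the ids that were changed so they can be surfaced for Re-investigate.
    """
    failed_ids = []
    for inv in investigations:
        if inv["status"] == "running" and now - inv["started_at"] > STALE_AFTER:
            inv["status"] = "failed"
            failed_ids.append(inv["id"])
    return failed_ids
```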

Verify from the server

# Expected summary: 20 pass / 0 fail / 0 warn
./health-check.sh

# Expected response: {"status":"ok"}
curl -sk https://localhost/api/health

# Expected count of OpenAPI paths: 147
curl .../openapi.json | jq '.paths|length'

Observability signals

• Langfuse spans per agent (services/tracing.py)
• Pipeline trace JSON per investigation (Events tab)
• agent_events.duration_ms per stage (SLA-by-phase API)
• Worker health: supervisor ping · RabbitMQ queue depth
• Audit log with compound filters + CSV export
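
The per-stage duration signal lends itself to a simple aggregation. The sketch below is hypothetical: it assumes agent_events rows are available as dictionaries with `stage` and `duration_ms` keys, and the output shape is illustrative rather than the actual SLA-by-phase API response.

```python
from collections import defaultdict


def duration_by_stage(agent_events: list[dict]) -> dict[str, dict]:
    """Aggregate duration_ms per stage: count, average, and maximum."""
    buckets: dict[str, list[int]] = defaultdict(list)
    for event in agent_events:
        buckets[event["stage"]].append(event["duration_ms"])
    return {
        stage: {
            "count": len(values),
            "avg_ms": sum(values) / len(values),
            "max_ms": max(values),
        }
        for stage, values in buckets.items()
    }
```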

Analyst Workflows + Investment Thesis

A complete SOC platform — not a point tool. The business case is the cost of the analysts it replaces.

Triage

Dashboard KPIs → Alerts → Investigations list

Investigate

12-tab detail: Summary, Graph, Pipeline, NIST IR, Custody, Comments, Tasks…

Respond

PolicyGate-validated actions · confirmation modal · rollback · audit trail

Hunt

Hunt Workbench + SIEM-style Alert Search + IOC Registry + Campaigns

Collaborate

Comments (@mentions, type badges) + Tasks (assignable, priority, due dates)

Report

7 report types · scheduled + email distribution · SLA breach drill-down

$120K/yr: cost of one L1 analyst
~$200/mo: ARIA infra cost per MSSP tenant
$8–25K/mo: target revenue per MSSP customer
85%+: target gross margin
Live validation: HuntingJacq (jayakrishnancp.com) — 525+ investigations complete · 44 Firewalla-managed devices · Firewalla + ClickUp live · auto-polling every 60 s · 20 / 20 health-check passes.