Security Architecture

Beyond SIEM Pipes and Federated Queries

Why the security data problem isn't about moving data or querying it in place—it's about understanding what the data means

Setu Research
February 12, 2026 · 10 min read

The Two Poles of the Security Data Debate

The security industry has spent a decade arguing about data. Specifically: where should it live, how should it move, and who should pay for the storage?

Two recent entrants have staked out opposing positions on this question, and both are worth examining—not because they're wrong, but because they're solving a problem that's already downstream of the one that matters.

Monad is building the perfect pipe. A security data pipeline with 250+ source connectors, OCSF normalization, in-flight transforms, and conditional routing to any SIEM, data lake, or analytics platform. Think of it as Fivetran for security telemetry. Their pitch: your tools generate the data, your SIEM needs the data, and Monad makes the plumbing invisible.

Vega is eliminating the pipe entirely. Their Security Analytics Mesh (SAM) uses federated queries to analyze data wherever it already lives—your S3 buckets, your Snowflake tables, your vendor APIs—without copying a single byte into a central store. Their pitch: stop paying to move and duplicate data when you can just ask questions across it.

Both approaches are technically sound. Both solve real pain points. And both leave the hardest problem in security completely untouched.


The Pipe Problem: Monad's Bet

Monad's architecture is elegant in its simplicity:

| Stage | What Happens |
|---|---|
| Ingest | 250+ connectors pull from security tools (EDR, IdP, cloud, SaaS) |
| Normalize | Raw events mapped to OCSF schema |
| Transform | In-flight enrichment, filtering, deduplication |
| Route | Conditional delivery to SIEM, lake, or archive based on rules |
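
To make the stages concrete, here is a minimal Python sketch of a normalize, transform, and route flow. It is not Monad's API and it does not use the real OCSF schema: the `Event` fields, the `normalize` mapping, and the severity threshold in `route` are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class Event:
    """Hypothetical common event shape (a stand-in for an OCSF class)."""
    source: str
    activity: str
    severity: int  # assumed 0-5 scale
    raw_id: str


def normalize(raw: dict) -> Event:
    """Map a vendor-specific record onto the common shape."""
    return Event(
        source=raw["vendor"],
        activity=raw.get("action", "unknown").lower(),
        severity=int(raw.get("sev", 0)),
        raw_id=str(raw["id"]),
    )


def transform(events: Iterable[Event]) -> Iterator[Event]:
    """In-flight deduplication keyed on (source, raw_id)."""
    seen = set()
    for event in events:
        key = (event.source, event.raw_id)
        if key in seen:
            continue
        seen.add(key)
        yield event


def route(event: Event) -> str:
    """Conditional delivery: the threshold of 3 is an assumed rule."""
    return "siem" if event.severity >= 3 else "archive"


def run_pipeline(raw_records: Iterable[dict]) -> dict:
    destinations = {"siem": [], "archive": []}
    for event in transform(normalize(r) for r in raw_records):
        destinations[route(event)].append(event)
    return destinations


raw = [
    {"vendor": "edr", "id": 1, "action": "Process_Start", "sev": 4},
    {"vendor": "edr", "id": 1, "action": "Process_Start", "sev": 4},  # duplicate
    {"vendor": "firewall", "id": 7, "action": "ALLOW", "sev": 1},
]
out = run_pipeline(raw)
```

In this toy run the duplicate EDR record is dropped, the high-severity event goes to the SIEM, and the low-severity firewall event goes to the archive. Nothing in the flow knows who the identity behind an event is or what it can reach.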

The value proposition is clear: enterprises spend 30–40% of their SIEM budget on ingestion costs. Monad lets you filter noise before it hits Splunk, route cold data to cheap storage, and normalize everything to a common schema so your detection rules aren't vendor-specific.

This is genuinely useful. Any security team that's watched their Splunk license balloon because a single misconfigured firewall started logging every packet knows the pain Monad addresses.

But here's the question Monad doesn't answer: what do you do with the data once it arrives?

Monad delivers normalized events to your SIEM. Your SIEM runs correlation rules written by your detection engineering team. Those rules fire alerts. Your SOC triages those alerts. And the fundamental bottleneck—that your analysts lack the context to distinguish a true positive from noise—remains exactly where it was.

Picture a perfectly plumbed pipeline delivering 10,000 normalized events per second to a SIEM that generates 500 alerts per day, more than a team of 6 analysts can keep up with. That is not a solved problem. It's a well-formatted unsolved problem.


The Mesh Problem: Vega's Bet

Vega takes the opposite approach and asks: why move data at all?

Their Security Analytics Mesh connects to data sources in place and executes federated queries across them. An analyst asks "show me all failed authentication attempts across all identity providers in the last 24 hours" and the mesh fans the query out to Okta, Azure AD, and AWS IAM simultaneously, aggregates the results, and returns a unified view.

The technical implementation is thoughtful:

  • Two-stage aggregation: Local pre-aggregation at each source reduces data transfer
  • Probabilistic data structures: HyperLogLog and Count-Min Sketch for approximate analytics without full scans
  • AI-assisted analysis: Jupyter-backed agents that write and execute analytical queries
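
The two-stage aggregation idea can be sketched in a few lines of Python. This is an illustration of the general pattern, not Vega's implementation: the in-memory `sources` dictionary stands in for real connectors, the event fields are assumed, and exact `Counter` objects are used where a production system might ship probabilistic sketches instead.

```python
from collections import Counter


def local_preaggregate(events):
    """Stage 1: runs next to the data and returns a small per-user summary."""
    return Counter(e["user"] for e in events if e["outcome"] == "failure")


def merge_partials(partials):
    """Stage 2: the coordinator combines per-source summaries."""
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts together
    return total


# Hypothetical per-source event samples.
sources = {
    "okta": [
        {"user": "alice", "outcome": "failure"},
        {"user": "alice", "outcome": "failure"},
        {"user": "bob", "outcome": "success"},
    ],
    "azure_ad": [
        {"user": "alice", "outcome": "failure"},
        {"user": "carol", "outcome": "failure"},
    ],
    "aws_iam": [
        {"user": "bob", "outcome": "failure"},
    ],
}

partials = [local_preaggregate(events) for events in sources.values()]
combined = merge_partials(partials)
top_offender = combined.most_common(1)[0]
```

Only the per-user counts cross the network, never the raw events, which is where the transfer savings come from.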

The cost savings are real. If your security data already lives in S3 and Snowflake, querying it in place at $5/TB scanned is dramatically cheaper than copying it into a SIEM at $15/GB/day ingested.

But Vega has its own blind spot: federated queries can only answer questions you know to ask.

Security analytics isn't a SQL problem. The questions that matter, such as "what's the Expected Compromise Impact (ECI) if this service account is compromised?" or "which identities have transitive access to production databases through nested group memberships?", aren't queries you run against event logs. They require a fundamentally different data model: a graph.


What Both Approaches Miss: The Identity Graph

Here's what neither a perfect pipe nor a federated mesh gives you:

1. Relationship-Aware Context

When a detection rule fires on "service account accessed production S3 bucket at 3 AM," the critical question isn't "did this event happen?" (both Monad and Vega answer that). The critical question is: "what else can this identity reach, and how bad is it if this is an attacker?"

Answering that requires traversing an identity graph—from the service account to its role bindings, to the resources those roles can access, to the data classification of those resources, to the other identities that share those access paths. That's not a log event. It's a graph traversal.
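
As a rough illustration of what "a graph traversal" means here, the sketch below runs a breadth-first search over a tiny adjacency list. The node names and edges are invented for the example, and a real identity graph would carry typed edges and far more nodes, but the shape of the question is the same.

```python
from collections import deque

# Hypothetical identity graph: an edge means "can assume" or "can access".
GRAPH = {
    "sa:build-bot": ["role:deployer"],
    "role:deployer": ["bucket:prod-artifacts", "role:db-reader"],
    "role:db-reader": ["db:prod-orders"],
    "user:alice": ["role:db-reader"],
    "bucket:prod-artifacts": [],
    "db:prod-orders": [],
}


def reachable(graph, start):
    """Breadth-first search: every node reachable from start, with hop count."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, []):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    del dist[start]  # the start node is not part of its own blast radius
    return dist


blast_radius = reachable(GRAPH, "sa:build-bot")
```

Here the service account reaches the production database in three hops through a role it never touches directly, which is exactly the kind of transitive path a single log event does not reveal.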

2. Exposure Quantification Before Detection

Both Monad and Vega operate in the detection paradigm: events happen, alerts fire, analysts investigate. But the highest-value security question is upstream of detection: "which identities represent unacceptable exposure right now, before anything bad happens?"

An identity with standing admin access to 47 production databases, no MFA, and a password last rotated 400 days ago is a critical finding regardless of whether any suspicious event has occurred. No amount of log normalization or federated querying surfaces this. It requires continuous posture analysis across identity providers, cloud IAM, and entitlement systems.
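
A posture check of this kind needs no events at all, only configuration state. The following is a minimal sketch under stated assumptions: the `IdentityPosture` fields and the 365-day rotation limit are hypothetical, not a description of any product's data model or policy.

```python
from dataclasses import dataclass
from datetime import date, timedelta

MAX_PASSWORD_AGE_DAYS = 365  # assumed policy limit


@dataclass
class IdentityPosture:
    """Hypothetical snapshot of one identity's configuration state."""
    name: str
    standing_admin_targets: int
    mfa_enabled: bool
    password_last_rotated: date


def posture_findings(identity, today):
    """Return human-readable findings derived purely from configuration."""
    findings = []
    if identity.standing_admin_targets > 0:
        findings.append(
            f"standing admin access to {identity.standing_admin_targets} resources"
        )
    if not identity.mfa_enabled:
        findings.append("no MFA")
    age_days = (today - identity.password_last_rotated).days
    if age_days > MAX_PASSWORD_AGE_DAYS:
        findings.append(f"password last rotated {age_days} days ago")
    return findings


today = date(2026, 2, 12)
example = IdentityPosture(
    name="svc-admin",
    standing_admin_targets=47,
    mfa_enabled=False,
    password_last_rotated=today - timedelta(days=400),
)
findings = posture_findings(example, today)
```

All three findings fire for this identity even though no suspicious event has been logged.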

3. Non-Human Identity Visibility

With non-human identities outnumbering humans 144:1, the majority of your attack surface isn't generating the kind of telemetry that either Monad or Vega handles well. API keys used in automated pipelines often produce no interactive authentication logs. Service account tokens don't trigger MFA challenges. OAuth grants between SaaS applications operate largely out of sight.

These identities live in configuration state, not event streams. You need to discover them through API enumeration and entitlement analysis, not log collection.


The Setu Approach: Understand First, Then Detect

Setu doesn't compete with Monad on plumbing or Vega on query federation. We operate at a different layer entirely: identity-aware exposure management.

| Capability | Monad | Vega | Setu |
|---|---|---|---|
| Data normalization (OCSF) | Yes | No (queries raw) | Yes |
| 250+ source connectors | Yes | Partial | Growing |
| Federated query across sources | No | Yes | No (graph-native) |
| Identity exposure graph | No | No | Yes |
| Non-human identity discovery | No | No | Yes |
| ECI quantification | No | No | Yes |
| Attack path analysis | No | No | Yes |
| Continuous trust scoring | No | No | Yes |
| STIX/TAXII threat intel correlation | No | No | Yes |

The architecture is fundamentally different. Where Monad builds a pipeline and Vega builds a query mesh, Setu builds a continuously updated identity exposure graph that correlates:

  • Human identities across IdPs (Okta, Azure AD, Google Workspace)
  • Non-human identities (service accounts, API keys, OAuth tokens, CI/CD secrets)
  • Entitlements and permissions across cloud providers and SaaS applications
  • Device posture from MDM and EDR
  • Vulnerability and misconfiguration data from scanners
  • Threat intelligence via STIX/TAXII feeds

Every node in this graph has a computed ECI (Expected Compromise Impact)—a 0–100 score quantifying the damage potential if that identity is compromised. Every edge represents a real access path an attacker could traverse. The graph updates continuously, not when a query is executed.


Why This Matters for Detection Too

This isn't an argument against detection. It's an argument that detection without identity context is noise.

When Setu ingests security events (and yes, we normalize to OCSF), those events are immediately correlated against the identity graph. A "failed login" event for a service account with an ECI of 12 and access to a single dev bucket is informational. The same event for a service account with an ECI of 94 and transitive access to every production database is critical.
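
The enrichment step can be sketched as a simple lookup. This is an illustration only: the `ECI_BY_IDENTITY` table, the identity names, and the priority thresholds (80 and 40) are assumptions, not Setu's actual scoring bands.

```python
# Hypothetical ECI scores, as if read from the identity graph.
ECI_BY_IDENTITY = {
    "sa:dev-uploader": 12,
    "sa:prod-migrator": 94,
}


def prioritize(event, eci_lookup):
    """Attach the identity's ECI to an event and derive a triage priority."""
    eci = eci_lookup.get(event["identity"])
    if eci is None:
        priority = "unknown"  # identity missing from the graph
    elif eci >= 80:           # assumed threshold
        priority = "critical"
    elif eci >= 40:           # assumed threshold
        priority = "elevated"
    else:
        priority = "informational"
    return {**event, "eci": eci, "priority": priority}


low = prioritize({"type": "failed_login", "identity": "sa:dev-uploader"}, ECI_BY_IDENTITY)
high = prioritize({"type": "failed_login", "identity": "sa:prod-migrator"}, ECI_BY_IDENTITY)
```

The two input events are identical apart from the identity, yet they land at opposite ends of the triage queue.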

That context—which Monad can't provide because it only moves data, and Vega can't provide because it only queries data—is what turns a noisy alert stream into actionable intelligence.

Closed-Loop Remediation

The graph doesn't just score exposure. It recommends specific actions to reduce it:

  • Permission right-sizing: "This service account has 47 permissions but has only used 3 in the last 90 days. Here are the 44 to remove."
  • Access path elimination: "Removing this single role binding reduces the ECI of 12 identities by an average of 34 points."
  • Credential rotation: "These 23 API keys haven't been rotated in over 365 days and have access to production systems."

Each remediation is scored by its impact on aggregate ECI, so security teams can prioritize the changes that reduce the most exposure with the least operational disruption.
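
Two of these ideas reduce to small, checkable operations. The sketch below shows permission right-sizing as a set difference and remediation ranking as a sort. The `Remediation` fields, the disruption scale, and the choice to rank by ECI reduction per unit of disruption are all assumptions for illustration; the source does not specify how the two factors are combined.

```python
from dataclasses import dataclass


def unused_permissions(granted, used_recently):
    """Right-sizing: permissions granted but not exercised in the window."""
    return granted - used_recently


granted = {f"perm:{i}" for i in range(47)}
used_last_90_days = {"perm:0", "perm:1", "perm:2"}
to_remove = unused_permissions(granted, used_last_90_days)  # 44 permissions


@dataclass
class Remediation:
    description: str
    eci_reduction: float  # aggregate ECI points removed (hypothetical)
    disruption: int       # assumed 1 (trivial) to 5 (risky) scale, must be > 0


def rank(remediations):
    """Order by exposure removed per unit of disruption, highest first."""
    return sorted(
        remediations,
        key=lambda r: r.eci_reduction / r.disruption,
        reverse=True,
    )


candidates = [
    Remediation("rotate stale API keys", eci_reduction=10.0, disruption=2),
    Remediation("remove one role binding", eci_reduction=34.0, disruption=1),
    Remediation("revoke unused permissions", eci_reduction=20.0, disruption=1),
]
ranked = rank(candidates)
```

With these made-up numbers the single role binding comes first, since it removes the most exposure for the least disruption.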


The Emerging Stack

We don't see Monad and Vega as competitors. We see them as potential complements—useful layers in a stack that's still being defined:

| Layer | Function | Example |
|---|---|---|
| Data Movement | Get telemetry where it needs to go efficiently | Monad, Cribl, Amazon Security Lake |
| Data Analytics | Query and analyze security events at scale | Vega, Snowflake, your SIEM |
| Exposure Management | Understand identity relationships and quantify risk continuously | Setu |
| Response | Automate remediation based on exposure context | SOAR, Setu Remediation |

You need good plumbing (Monad's strength). You need efficient querying (Vega's strength). But you also need a system that continuously answers: "given everything we know about every identity, every access path, and every asset, where is our exposure highest, and what's the most efficient action to reduce it?"

That's the layer Setu occupies.


Summary

The security data problem isn't about moving data faster or querying it in place. It's about understanding what the relationships between identities, permissions, and resources mean for your actual exposure.

Monad and Vega solve real problems at the data layer. Setu solves the problem that exists after the data question is settled: what does this access actually mean, and what should we do about it?

The answer lives in the graph.
