Agentic Security

What is Parse-at-Query? Inverting the Traditional SIEM Ingestion Model

Traditional SIEMs force teams to discard logs they can't afford to parse. Parse-at-query stores everything raw, parses only what's queried, and makes full log coverage economically viable.
Published on
March 6, 2026

Every security team has made this trade-off: which logs do we keep, and which do we throw away? The math is simple and painful. Traditional SIEMs charge by ingest volume, and parsing every log at the point of ingestion is the single biggest cost driver in the pipeline. So teams make cuts. They drop logs, reduce firewall verbosity, and sample cloud telemetry. The result is a large visibility gap.

Parse-at-query architecture eliminates this trade-off entirely.

What Is Parse-at-Query Architecture?

Parse-at-query is an architectural pattern that inverts the traditional log management model. Instead of structuring and indexing every log line the moment it enters the platform, raw log data is stored in its original format: unmodified, unstructured, and cheap to retain. Parsing (the computationally expensive work of extracting fields, normalizing schemas, and building searchable structures) happens only when an analyst or automation actually queries the data. With AI, there is no longer a need for a federated data layer with identical formatting; agents can parse the data on demand in its original format.

Compare this to the conventional approach. In a traditional SIEM, every log goes through an Extract-Transform-Load (ETL) pipeline at ingest. The platform parses the raw data, maps fields to a schema, indexes everything, and stores the structured output. This is expensive. Indexing and parsing at scale require significant compute, and the structured data that results is typically 3 to 5 times larger than the raw source. Storage costs multiply. Licensing costs follow.

Parse-at-query flips this model: store everything cheaply, extract structure on demand.

The Cost Inversion

Legacy: Parse-at-Ingest

1. Raw logs ingested
2. Parse, normalize, index (expensive compute on every log line)
3. Structured storage (3 to 5x data inflation after indexing)
4. Analyst runs query (only searches what you paid to keep)

Modern: Parse-at-Query

1. Raw logs ingested
2. Compressed object storage (low cost, full fidelity, no parsing)
3. Analyst runs query (targets a specific time range and source)
4. Parse and normalize on demand (compute cost only on queried data)

The difference is not subtle. In the parse-at-ingest model, you pay to process 100% of your logs whether anyone ever looks at them or not. In the parse-at-query model, you pay to process only the data that someone actually needs to examine. For most organizations, the percentage of log data that is ever actively queried in a given month sits well below 5%.
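To make the inversion concrete, here is a back-of-the-envelope cost sketch. All of the numbers (per-GB parse cost, storage rates, 4x inflation, 5% queried fraction) are illustrative assumptions, not vendor pricing:

```python
# Hypothetical cost model: parse-at-ingest vs. parse-at-query.
# Every rate below is an illustrative assumption, not real pricing.

def parse_at_ingest_cost(daily_gb, parse_cost_per_gb, inflation, storage_cost_per_gb):
    """Every log line is parsed and indexed; storage holds the inflated output."""
    parsing = daily_gb * parse_cost_per_gb
    storage = daily_gb * inflation * storage_cost_per_gb
    return parsing + storage

def parse_at_query_cost(daily_gb, queried_fraction, parse_cost_per_gb, raw_storage_cost_per_gb):
    """Raw logs stored cheaply; parse compute is spent only on queried data."""
    parsing = daily_gb * queried_fraction * parse_cost_per_gb
    storage = daily_gb * raw_storage_cost_per_gb
    return parsing + storage

# 1 TB/day, 4x index inflation, under 5% of data ever queried:
ingest_model = parse_at_ingest_cost(1000, 0.50, 4, 0.10)
query_model = parse_at_query_cost(1000, 0.05, 0.50, 0.02)
```

Under these assumed rates, the parse-at-ingest model costs roughly 20x more per day, and the gap widens as the queried fraction shrinks.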

Why Parse-at-Query Matters for Security Teams

The implications go beyond cost savings. Parse-at-query changes what is economically possible in a security program.

Complete log retention becomes viable
When storage costs drop by an order of magnitude, the calculus around what to keep and what to discard changes completely. DNS query logs, full NetFlow data, verbose cloud API audit trails—data sources that teams routinely exclude from their SIEM due to cost—can now be retained indefinitely. This is not a luxury. It is the difference between being able to reconstruct an attacker's full kill chain during incident response and hitting a dead end because the relevant telemetry was discarded three weeks ago.

Schema decisions do not have to be made upfront
In traditional platforms, the parsing rules and field mappings must be defined before data is ingested. If a new log source appears or a vendor changes their log format, someone has to write new parsers, test them, and deploy them before the data becomes useful. Parse-at-query eliminates this bottleneck. The raw data is already stored. When a new detection use case requires a previously unextracted field, AI can apply parsing logic to data that has been sitting in storage for months.
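As a minimal sketch of what "parsing months-old data with a new rule" looks like, the snippet below applies a regex parser written today to raw lines that were stored unparsed. The log format and the extracted field are invented for illustration:

```python
import re

# Schema-on-read sketch: raw lines were stored with no parser defined;
# a field is extracted only once a new detection use case needs it.
# Log format and field names here are illustrative.

raw_logs = [
    '2025-09-01T12:00:00Z sshd[812]: Failed password for admin from 203.0.113.7',
    '2025-09-01T12:00:05Z sshd[812]: Accepted password for alice from 198.51.100.4',
]

def extract_field(lines, pattern, group):
    """Apply a parser written today to raw data stored long ago."""
    regex = re.compile(pattern)
    return [m.group(group) for line in lines if (m := regex.search(line))]

# A "source IP on failed logins" parser, defined after the fact:
failed_src_ips = extract_field(raw_logs, r'Failed password .* from (\S+)', 1)
```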

Investigation decoupled from ingest overhead
SOC analysts do not care about ingest pipelines. They care about getting answers. When an analyst is investigating an alert, they are typically only querying a narrow slice of data: a specific IP range, a time window, a particular user. Parse-at-query allows for broader investigation visibility by enabling more logs to be ingested with less overhead cost. 

Where Parse-at-Query Changes the Game

Consider a few scenarios where this architecture makes a material difference.

A threat intelligence feed identifies a new indicator of compromise: a domain that was used for command-and-control six months ago. Under the traditional SIEM model, the team may have chosen not to ingest those DNS logs to manage costs, meaning they aren't available for the investigation. Under a lower-cost parse-at-query model, the team could have afforded to keep those logs; they would be sitting in raw storage, ready to be parsed on demand, and the investigation could proceed immediately.

An organization is onboarding a new cloud service that generates audit logs in a proprietary format. With parse-at-ingest, the team needs to structure that data in a federated data layer before any of it is usable. With parse-at-query, the raw logs start flowing into storage immediately. Agents can parse them on demand without the need for a federated data layer. Parsers can be developed and refined iteratively against real data, with no risk of losing events during the development window.

A compliance audit requires demonstrating that specific access patterns were monitored over the past 12 months. The organization had been logging the relevant events, but the traditional SIEM's retention policy capped storage at 90 days due to cost constraints. With those cost-driven retention caps removed, 12-month or multi-year retention becomes economically viable.

What About Real-Time Detection and Query Performance?

Parse-at-query does not mean parse-only-at-query. The most common objection from security practitioners is that real-time detection and alerting depend on structured data being available immediately. That's true, and any viable architecture accounts for it.

The answer is a hybrid approach: a lightweight streaming layer handles real-time detections on high-priority fields as logs arrive, while full parse-at-query applies to investigation, threat hunting, and historical analysis. SOC teams still get sub-second alerting on IOC hits and correlation rules, without paying to fully parse and index every field of every log at ingest.
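A toy sketch of that hybrid split: a shallow indicator scan runs on each line as it arrives, while the untouched raw line goes to storage for later parse-at-query. The IOC list, log lines, and in-memory "storage" are all stand-ins:

```python
# Hybrid ingest sketch: real-time IOC matching on raw lines, no full parse.
# IOC_DOMAINS, the log lines, and the in-memory stores are illustrative.

IOC_DOMAINS = {'evil.example.net'}

raw_store = []  # stands in for compressed object storage
alerts = []     # stands in for the real-time alerting path

def ingest(line):
    raw_store.append(line)          # always keep the raw line, unparsed
    for ioc in IOC_DOMAINS:         # shallow substring scan only
        if ioc in line:
            alerts.append((ioc, line))

ingest('2026-03-06T10:00:01Z dns query evil.example.net from 10.1.1.4')
ingest('2026-03-06T10:00:02Z dns query intranet.corp from 10.1.1.5')
```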

Here are some common concerns about parse-at-query that are worth discussing: 

Query performance
“If every investigation query requires parsing raw logs on the fly, won't it be slow?” In practice, the data being queried during an active investigation is a narrow slice: a specific time window, an IP range, a user identity. Parsing a targeted subset of raw data is fast. For the small percentage of log data that gets actively queried in a given month (typically under 5%), the performance trade-off is negligible compared to the cost savings on the other 95%.
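The pattern can be sketched as a cheap string filter followed by an expensive parse, so only the analyst's slice is ever structured (the log format and fields here are illustrative):

```python
# Query-time parsing sketch: filter raw lines by a cheap timestamp-prefix
# check first, then run the costly parser only on the surviving slice.

raw = [
    '2026-03-01T09:15:00Z src=10.0.0.5 action=allow',
    '2026-03-01T09:16:00Z src=10.0.0.9 action=deny',
    '2026-03-02T11:00:00Z src=10.0.0.5 action=deny',
]

def kv_parse(line):
    """The 'expensive' step: structure a raw line into fields."""
    ts, rest = line.split(' ', 1)
    fields = dict(p.split('=', 1) for p in rest.split())
    fields['timestamp'] = ts
    return fields

def query_slice(lines, day_prefix, parser):
    """Cheap filter first, costly parsing second."""
    return [parser(line) for line in lines if line.startswith(day_prefix)]

hits = query_slice(raw, '2026-03-01', kv_parse)
# Only 2 of the 3 stored lines were ever parsed.
```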

Sensitive data in raw logs
Storing unmodified logs raises fair questions about PII and field-level access controls. Targeted redaction policies can operate on raw data at ingest without requiring full parsing, so compliance requirements are met before data hits storage.
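For example, a targeted pattern-based redaction can run on raw lines at ingest without any schema or field extraction; the SSN pattern below is just one illustrative rule:

```python
import re

# Redaction-without-parsing sketch: scrub PII-shaped substrings from the
# raw line before it hits storage. The pattern and log line are illustrative.

SSN = re.compile(r'\b\d{3}-\d{2}-\d{4}\b')

def redact(line):
    return SSN.sub('[REDACTED-SSN]', line)

stored = redact('user=jdoe ssn=123-45-6789 action=update')
```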

Log integrity
Raw, unmodified logs actually strengthen chain-of-custody arguments. Combined with immutable storage and cryptographic verification, the original record is preserved exactly as it was generated.
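One simple way to get cryptographic verification over raw lines is a hash chain, sketched below; a production deployment would pair this with immutable (WORM) storage rather than an in-memory list:

```python
import hashlib

# Hash-chain sketch: each line's digest folds in the previous digest, so
# modifying any stored line changes the final chain value.

def chain(lines):
    digest = b''
    for line in lines:
        digest = hashlib.sha256(digest + line.encode()).digest()
    return digest.hex()

original = ['event A', 'event B']
tampered = ['event A', 'event B (edited)']
```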

Schema consistency across sources
Anyone who has tried to correlate events across dozens of log sources knows the pain of inconsistent field names. This is where agentic parsing earns its value: micro-agents can dynamically resolve field mappings across sources at query time, handling the normalization that traditional SIEMs do at ingest, but without locking teams into rigid schemas that break when vendors change formats.
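A static version of that query-time mapping looks like the sketch below; in the agentic model described here, the per-source maps would be resolved dynamically rather than hand-maintained:

```python
# Query-time normalization sketch: two sources name the same fields
# differently; records are mapped to shared names only when queried.
# The source names, field maps, and records are illustrative.

FIELD_MAPS = {
    'firewall': {'src': 'source_ip', 'dst': 'dest_ip'},
    'proxy':    {'client_address': 'source_ip', 'server': 'dest_ip'},
}

def normalize(source, record):
    mapping = FIELD_MAPS[source]
    return {mapping.get(k, k): v for k, v in record.items()}

a = normalize('firewall', {'src': '10.0.0.1', 'dst': '8.8.8.8'})
b = normalize('proxy', {'client_address': '10.0.0.1', 'server': '8.8.8.8'})
# Both records now correlate on the same field names.
```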

The Takeaway

The security industry has spent two decades accepting that comprehensive log visibility is too expensive. Parse-at-query architecture proves that the expense was never in the storage; it was in the parsing model. By deferring the expensive work to query time, organizations can retain every log source at a fraction of the traditional cost, investigate historical events without hitting retention walls, and stop making coverage decisions based on licensing pressure rather than security requirements.

This architectural foundation makes Strike48's approach to complete log coverage economically viable. Instead of forcing teams to choose between visibility and budget, parse-at-query removes the choice. Ingest everything. Parse what you need. Pay for what you use.

Get a demo of Strike48 to learn how we make full log coverage economically viable for everyone.
