Entiovi · Hatsya Practice · Discipline 04

Real-Time
Analytics.

The Discipline Of Acting On Data While It Still Matters — Engineered To The Latency The Outcome Actually Demands.

Most enterprise data is consumed late. Reports are produced after the close, dashboards are refreshed overnight, and decisions are made on yesterday's view of the world. For a great many decisions, that latency is acceptable — financial reporting, monthly reviews, strategic planning. For a growing class of decisions, it is not. A fraudulent transaction stopped within seconds is fraud prevented; the same transaction caught the next morning is fraud absorbed. A supply-chain exception detected at the loading dock is a delivery rerouted; the same exception detected in the next-day report is a customer commitment missed. Real-time analytics is the discipline of engineering the data path — ingestion, processing, state, serving, and consumption — to operate inside the latency window where the outcome is still recoverable. It is not streaming for its own sake. It is the deliberate matching of architecture to the half-life of the decision.

What Entiovi means by
real-time analytics.

Real-time has become a marketing word, and the engineering discipline behind it has been blurred in the process. In Hatsya engagements, real-time analytics is defined precisely — by the latency budget the use case actually requires, and by the architecture engineered to meet that budget reliably. A use case with a latency budget of seconds requires sub-second event-time processing, stateful streaming, and an analytical store that responds within the budget. A use case with a budget of minutes requires near-real-time micro-batch ingestion, materialised view freshness, and a fast warehouse query path. Both are real-time relative to the decision — and conflating them is what produces over-engineered systems for one use case and under-engineered systems for another.

The discipline is therefore a sequence of explicit decisions. What is the decision being supported? What is its half-life? What is the cost of being late? What is the cost of being wrong? What state must the system maintain? What guarantees — exactly-once, at-least-once, ordered, idempotent — does the consumer require? Hatsya engagements answer those questions before architecture is selected, and the architecture then follows. The result is real-time systems engineered to the actual latency the outcome demands, with the cost envelope, operational complexity, and reliability profile that flow from that target — not a generic streaming stack carried over from a previous project.

The boundary with the layers underneath is deliberate. Data Engineering & Pipelines builds the streaming ingestion and CDC paths. AI-Ready Data Platforms provides the real-time analytical engines and feature stores. Real-Time Analytics is the discipline that designs the end-to-end real-time use case across both — ingestion, state, processing, serving, and consumption — against a defined latency budget and a defined business outcome.

Key service
components.

Hatsya's real-time analytics practice is structured around six service components, each addressing a distinct part of the latency-bounded data path.

Streaming ingestion and event backbone

Event buses engineered for the throughput, retention, ordering, and replayability the workload requires — Apache Kafka, Confluent Cloud, Apache Pulsar, Amazon Kinesis, Azure Event Hubs, Google Pub/Sub. Topic design, partitioning, schema registry, and exactly-once delivery semantics are decisions made deliberately, not inherited from a default. The event backbone is the single replayable substrate every downstream real-time use case is built on.
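The ordering guarantee above rests on key-based partitioning: every event for a given key is routed to the same partition, so per-key order survives parallel consumption. A minimal sketch of the idea, using a CRC32 hash as an illustrative stand-in (Kafka's default partitioner uses murmur2, and the `card-*` keys are hypothetical):

```python
import zlib

def partition_for(key: bytes, num_partitions: int) -> int:
    # Stable hash of the key, modulo the partition count. Kafka's default
    # partitioner uses murmur2; zlib.crc32 stands in here for illustration.
    return zlib.crc32(key) % num_partitions

# All events for one entity land on one partition, so per-key ordering
# is preserved even though partitions are consumed in parallel.
events = [(b"card-41", "auth"), (b"card-7", "auth"), (b"card-41", "settle")]
routed: dict[int, list] = {}
for key, event in events:
    routed.setdefault(partition_for(key, 12), []).append((key, event))
```

Changing `num_partitions` later reshuffles key-to-partition assignments, which is one reason partition count is a deliberate up-front decision rather than a default.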

Stream processing and complex event processing

Stateful stream processing on Apache Flink, Spark Structured Streaming, and Kafka Streams — with windowing, joins, aggregations, watermarks, and state backends engineered to the workload. Streaming SQL on Materialize, RisingWave, ksqlDB, and Flink SQL where declarative semantics fit. Complex event processing for pattern detection, sequence matching, and anomaly recognition where the value is in the relationship between events, not in any single event.
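The interaction of windows, watermarks, and allowed lateness is easier to see in miniature than in prose. A simplified sketch, not Flink's actual API: the watermark trails the maximum event time seen by the allowed lateness, a window is emitted only once the watermark passes its end, and later events for a closed window are dropped (Flink would route them to a side output).

```python
from collections import defaultdict

def tumbling_windows(events, size_s=60, allowed_lateness_s=10):
    """Count events per key in tumbling event-time windows, closing each
    window only once the watermark has passed the window's end."""
    open_windows = defaultdict(int)      # (key, window_start) -> count
    watermark = float("-inf")
    emitted = []
    for key, event_ts in events:
        watermark = max(watermark, event_ts - allowed_lateness_s)
        window_start = (event_ts // size_s) * size_s
        if window_start + size_s <= watermark:
            continue                     # too late: a side output in Flink
        open_windows[(key, window_start)] += 1
        for k_ws in sorted(open_windows):
            if k_ws[1] + size_s <= watermark:
                emitted.append((*k_ws, open_windows.pop(k_ws)))
    return emitted, dict(open_windows)
```

The `open_windows` dictionary is exactly the state a production state backend must checkpoint, which is why stateful operators carry recovery obligations that stateless transforms do not.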

Real-time analytical engines

Sub-second analytical engines tuned for the workload shape — ClickHouse for high-cardinality OLAP, Apache Pinot for user-facing analytics on event streams, Apache Druid for time-series aggregations, StarRocks and Apache Doris for hybrid analytical workloads. Each is selected against the query pattern, the cardinality, the freshness budget, and the cost envelope — and operated alongside, not in place of, the warehouse and lakehouse.

Streaming features and online ML inference

Streaming feature pipelines that maintain online and offline parity — Feast, Tecton, Databricks Feature Store, and bespoke streaming feature paths — so models score in production on the same features they trained on. Online inference services with sub-100-millisecond latency budgets, fed by feature stores that themselves operate inside the latency budget. Drift detection, freshness monitoring, and skew alarms wired into the same observability surface as the rest of the platform.
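Online/offline parity is ultimately a code-sharing discipline: the same feature function is applied to historical windows at training time and to the live window at serving time, so skew is impossible by construction. A minimal sketch with hypothetical feature names:

```python
def txn_velocity_features(amounts: list[float]) -> dict:
    """Feature logic defined once, so batch training and online serving
    compute identical values (feature names are illustrative)."""
    n = len(amounts)
    return {
        "txn_count_1h": n,
        "txn_avg_1h": sum(amounts) / n if n else 0.0,
        "txn_max_1h": max(amounts, default=0.0),
    }

# Offline: applied over historical windows to build training rows.
training_row = txn_velocity_features([12.0, 99.5, 7.25])
# Online: applied to the live window held in the feature store.
online_row = txn_velocity_features([12.0, 99.5, 7.25])
assert training_row == online_row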

Real-time consumption surfaces

Operational control towers, customer-facing analytics, alerting and exception services, agent and workflow triggers, and event-driven UIs — each engineered to the user-perceived latency budget. Push-based delivery (WebSockets, SSE, push notifications) where the user must not refresh; pull-based delivery with sub-second response times where polling fits. Consumption surfaces inherit the same metric definitions and governance posture as the rest of the BI estate — so the operational view does not contradict the analytical one.
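For the push-based path, Server-Sent Events is the lightest-weight option: a long-lived HTTP response carrying newline-delimited frames. A minimal sketch of the `text/event-stream` wire format (the KPI payload is hypothetical):

```python
import json
from typing import Optional

def sse_event(data: dict, event: Optional[str] = None,
              event_id: Optional[str] = None) -> str:
    """Serialise one Server-Sent Events frame: optional id and event
    lines, a data line, terminated by a blank line."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")    # lets the client resume after reconnect
    if event is not None:
        lines.append(f"event: {event}")    # named event type for the listener
    lines.append(f"data: {json.dumps(data)}")
    return "\n".join(lines) + "\n\n"
```

Browsers consume this with the standard `EventSource` API and reconnect automatically, sending the last seen `id` back, which is what makes SSE a good fit for dashboards the user must never refresh.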

State, observability, and reliability engineering

Checkpointing, state backends, exactly-once configuration, watermark strategy, dead-letter queues, replay procedures, and disaster-recovery design are engineered in. Latency, lag, throughput, and freshness SLIs are instrumented end-to-end. Real-time pipelines fail in ways that batch pipelines do not — and Hatsya engagements treat that failure modelling as a first-class engineering deliverable, not a post-deployment discovery.
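The dead-letter pattern mentioned above can be reduced to a few lines: bound the retries per event, then park the failure for offline replay rather than letting one poison message block the partition. A simplified consumer-loop sketch:

```python
def consume(events, handler, max_retries=3):
    """At-least-once consumption with a dead-letter queue: each event is
    retried a bounded number of times, then parked for offline replay
    rather than blocking the rest of the partition."""
    dead_letter = []
    for event in events:
        for attempt in range(1, max_retries + 1):
            try:
                handler(event)
                break                     # processed: move to next event
            except Exception as exc:
                if attempt == max_retries:
                    # Park the event with its error for later diagnosis.
                    dead_letter.append({"event": event, "error": repr(exc)})
    return dead_letter
```

In production the dead-letter list would be a separate topic, and its depth is itself an SLI worth alerting on.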

Architecture & delivery
considerations.

Real-time architecture decisions amplify both value and cost. Hatsya engagements anchor every decision to the latency budget and the failure profile.

01

Latency budget defined per use case, not per stack

Sub-second, single-digit seconds, minutes, and tens of minutes are different architectural regimes. The use case is sized to one of them, and the architecture is engineered to that regime — not over-engineered for milliseconds when seconds are sufficient, and not under-engineered for seconds when sub-second is required. The latency budget is a contract with the consumer, not an aspiration.
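Treating the budget as a contract means decomposing it across hops and judging it end to end. A sketch with entirely hypothetical figures, showing that individual hops may vary so long as the total path honours the contract:

```python
# Hypothetical per-hop decomposition of an end-to-end latency budget.
BUDGET_MS = {
    "ingest": 50, "process": 120, "store": 60,
    "query": 150, "network": 70, "render": 50,
}
TOTAL_BUDGET_MS = 500   # the contract with the consumer

def within_budget(measured_ms: dict) -> bool:
    """The budget is judged end to end, not per component: one hop may
    overrun as long as the whole path stays inside the contract."""
    return sum(measured_ms.values()) <= TOTAL_BUDGET_MS

# The per-hop allocation must itself fit inside the contract.
assert sum(BUDGET_MS.values()) <= TOTAL_BUDGET_MS
```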

02

Stateful versus stateless processing

Stateless transformations are simple, recoverable, and cheap. Stateful processing — windows, joins, sessionisation, deduplication, sequence detection — introduces correctness, ordering, and recovery complexity that must be designed for. Hatsya engagements identify which parts of the pipeline must be stateful, isolate them from the parts that need not be, and engineer the state backends and checkpoint strategy specifically for the stateful tier.
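The isolation described above can be made concrete: a stateless enrichment is a pure function of one event and recovers trivially, while deduplication must remember what it has seen and therefore drags checkpoint and recovery obligations with it. A minimal sketch with hypothetical field names:

```python
# Stateless step: a pure function of a single event, trivially recoverable.
def enrich(event: dict) -> dict:
    return {**event, "amount_usd": event["amount"] * event.get("fx", 1.0)}

# Stateful step: deduplication needs memory of what it has seen, so it
# carries checkpoint and recovery obligations the stateless step does not.
class Deduplicator:
    def __init__(self):
        self.seen: set[str] = set()   # the state a backend must checkpoint

    def admit(self, event: dict) -> bool:
        if event["id"] in self.seen:
            return False              # duplicate: suppress
        self.seen.add(event["id"])
        return True
```

In a real pipeline the `seen` set would be bounded (a TTL or a window), since unbounded state is itself a failure mode.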

03

Exactly-once, at-least-once, and idempotency

Exactly-once is not free — it constrains throughput, increases operational cost, and depends on cooperation across producer, broker, and consumer. Where the consumer can be made idempotent, at-least-once with idempotent writes is often the better trade. Where the consumer cannot, exactly-once is engineered in — with the corresponding cost accepted. The decision is explicit per pipeline.
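The idempotent-consumer trade can be shown in a few lines: if the sink is a keyed upsert, redelivering an event produces the same final state, so at-least-once delivery upstream is safe without broker-level exactly-once. A sketch with a hypothetical transaction schema:

```python
class IdempotentSink:
    """Keyed upsert writes: replaying an event leaves the table in the
    same state, so at-least-once delivery upstream is safe without
    broker-level exactly-once semantics."""
    def __init__(self):
        self.table: dict[str, str] = {}

    def write(self, event: dict) -> None:
        # Upsert by natural key; redeliveries overwrite with identical values.
        self.table[event["txn_id"]] = event["status"]

sink = IdempotentSink()
for event in [{"txn_id": "t1", "status": "ok"},
              {"txn_id": "t1", "status": "ok"}]:   # redelivery
    sink.write(event)
```

The trade breaks down when writes are not naturally keyed (append-only counters, for example), which is exactly where exactly-once earns its cost.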

04

Lambda, kappa, and hybrid architectures

Pure streaming (kappa) is elegant where it fits; lambda — streaming for latency, batch for accuracy and reprocessing — still earns its place where regulatory, accounting, or correctness requirements demand a deterministic batch view alongside the streaming one. Most production estates are hybrid by necessity, and Hatsya engagements design that hybrid posture explicitly rather than letting it accumulate by accident.
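The lambda serving rule is simple enough to state in code: prefer the deterministic batch view wherever it exists, and fall back to the streaming view for periods the batch job has not yet covered. A sketch with hypothetical figures:

```python
# Speed layer: incremental counts over the live stream; fast, but
# approximate if events were lost or duplicated (figures are illustrative).
speed = {"2024-06-01": 9_998, "2024-06-02": 4_117}

# Batch layer: deterministic recompute from the replayable log; slower,
# authoritative, and not yet available for the current day.
batch = {"2024-06-01": 10_000}

def serve(day: str) -> int:
    """Prefer the batch view once it exists; fall back to the streaming
    view for days the batch job has not yet covered."""
    return batch.get(day, speed.get(day, 0))
```

The reconciliation step, comparing `speed` against `batch` where both exist, is also a useful correctness monitor for the streaming path itself.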

05

Real-time engine selection by workload shape

ClickHouse, Pinot, Druid, StarRocks, Materialize, and RisingWave each have a defensible workload niche — and a workload niche they fit poorly. Engagements profile the queries, the cardinality, the freshness expectation, the concurrency, and the cost envelope, then select the engine that survives the production load. The wrong choice is recoverable; recognising the wrong choice early matters.

06

Schema evolution, backwards compatibility, and replay

Real-time pipelines run for years; the schemas they carry will change. Schema registry discipline, compatibility rules, contract testing, and replay strategies are engineered in. A pipeline that cannot be replayed safely after a schema change is a pipeline that will eventually corrupt downstream state in a way that is expensive to diagnose.
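The compatibility rule that makes replay safe can be sketched directly: a new reader can decode old records only if every field it adds carries a default. This is a simplified stand-in for a schema registry's backward-compatibility check, with hypothetical schemas:

```python
def backward_compatible(old_schema: dict, new_schema: dict) -> bool:
    """A new reader can decode old records if every field it requires
    either existed in the old schema or carries a default (a simplified
    stand-in for a registry's BACKWARD compatibility mode)."""
    for field, spec in new_schema.items():
        if field not in old_schema and "default" not in spec:
            return False
    return True

v1 = {"txn_id": {"type": "string"}, "amount": {"type": "double"}}
v2 = {**v1, "currency": {"type": "string", "default": "USD"}}   # safe
v3 = {**v1, "merchant": {"type": "string"}}   # no default: breaks replay
```

Enforcing this check in CI, before a producer can register a new schema version, is what turns compatibility from a convention into a contract.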

07

Cost discipline at the streaming tier

Streaming infrastructure is more expensive per byte than batch — and the cost compounds when retention, partitioning, and replication are sized without intent. Hatsya engagements right-size every component, monitor unit economics per pipeline, and identify workloads where the latency requirement does not justify the streaming cost. Some real-time workloads, on examination, do not need to be real-time.

Business
use cases.

Real-time analytics engagements deliver the most consequential return where the value of an event decays inside minutes — and where the business has been operating without that capability.

Outcomes
for clients.

Hatsya real-time engagements are evaluated on the operational outcomes they unlock and the latency budgets they reliably hold.

01

Decisions move from hours to seconds

Use cases that previously waited for the next batch cycle now act inside the operational window where the outcome is still recoverable. The behavioural change — operations teams responding to live signals rather than reviewing yesterday's report — is the most durable outcome of these engagements.

02

Fraud, exception, and incident response measurably accelerated

Mean time to detection collapses; mean time to action collapses. Engagements typically deliver 60–80 percent reductions in detection latency for fraud, AML, and operational exception use cases — with the corresponding loss-reduction and customer-satisfaction outcomes.

03

Customer-facing analytics deliver inside the user's expectation

Sub-second OLAP on multi-billion-row event datasets, served at high concurrency, lets product teams build analytics into the customer experience itself — replacing the historical pattern of exporting data for an analyst to interpret a day later.

04

Online ML inference becomes a platform capability

Streaming feature pipelines, online feature stores, and low-latency inference services let machine learning operate where it produces measurable revenue — inside the user's decision window — rather than as a batch scoring job that arrives too late to influence anything.

05

Operational control towers replace batch reporting cycles

Manufacturing, supply-chain, customer-support, and revenue-operations functions move from waiting for daily reports to operating against sub-minute KPIs — with the operating discipline and the decision-cycle compression that flows from it.

06

Streaming cost held inside an envelope

Right-sized brokers, partitioning discipline, retention strategy, and engine-per-workload selection mean real-time capability is delivered without the unbounded cost growth that has historically accompanied streaming programmes. Engagements typically hold steady-state streaming spend within 15–25 percent of the original target.

Proof points
<300ms fraud decisioning on streaming card transactions — replacing a 4-hour batch fraud cycle and reducing fraud losses by 41 percent in the first six months.
30s telemetry refresh on a real-time supply-chain control tower across 22 distribution sites — exception alerting and route-replanning workflows replacing a daily batch reporting cycle.
9B+ row event stream queried at sub-second latency, embedded inside a B2B SaaS product surface across 400+ enterprise tenants.
<100ms online ML scoring latency on a streaming feature store with full batch-online parity — enabling dynamic pricing across 18 markets without offline-online training-serving skew.
12,000+ asset telemetry streams under industrial IoT analytics — anomaly detection, predictive intervention, and sub-minute operational visibility replacing a daily reporting cadence.

Why
Entiovi.

Real-time analytics is one of the disciplines where a stack carried over from a previous project causes the most damage. Hatsya engagements begin from a different posture.

Architecture sized to the latency budget — not to the toolkit

Engagements begin with the use case, the decision half-life, and the failure profile. The streaming stack and analytical engine follow that analysis. Entiovi has no incentive to recommend a particular streaming platform over another.

Workload-led engine selection across the real-time landscape

Kafka, Pulsar, Kinesis, Event Hubs, Pub/Sub; Flink, Spark Structured Streaming, Kafka Streams, Materialize, RisingWave, ksqlDB; ClickHouse, Pinot, Druid, StarRocks. Each has a niche it serves well, and each has a niche it fits poorly. Hatsya engagements profile the workload before selecting the engine.

Streaming engineered with the operational discipline of platform teams

Checkpointing, state backends, schema evolution, replay strategy, dead-letter handling, lag and freshness SLIs, and disaster-recovery procedures are designed in from day one. Real-time systems fail in ways batch systems do not — and the engagements account for that explicitly.

End-to-end latency design, not point-tool optimisation

The latency budget runs from event generation to consumer-perceived response. Hatsya engagements engineer every hop in that path — ingestion, processing, state, storage, query, network, render — against the overall budget rather than optimising each component in isolation.

Online ML and AI workloads as first-class consumers

Streaming feature pipelines, online feature stores, and inference services are engineered as part of the same fabric — with the consistency, governance, and observability properties that production AI workloads require. Real-time AI is treated as a use case the platform must serve, not as a parallel programme.

Cost discipline as a delivery constraint

Real-time programmes are where unmanaged streaming spend most often appears. Hatsya engagements right-size each component, identify workloads where the latency requirement does not justify the streaming cost, and hold steady-state spend inside the original envelope.

Acting on data
while it still matters.

There is a class of decisions that batch analytics will never serve. The fraudulent transaction, the exception in the loading bay, the customer about to abandon the session, the asset about to fail — each of these has a window in which action is still possible, and a moment after which the answer becomes a post-mortem. Real-time analytics is the discipline that engineers the data path to operate inside that window.

Done well, it is invisible — the systems just hold their latency budgets, the decisions just get made on time, and the operating model adapts to a tempo it could not previously support. Done badly, it is an expensive parallel estate that produces unreliable answers more quickly than the batch one. Hatsya engagements are built around the difference.

Entiovi's team will assess, in a structured two-week engagement, the current state of an organisation's real-time data path, the use cases that justify the architecture, the latency budgets they actually require, and the steady-state cost envelope inside which the capability must operate.

Inside the latency window.


Entiovi · Hatsya Practice · Discipline 04