ClickHouse for Real-Time Analytics: How to Run Low‑Latency Queries on Billions of Rows

If your dashboards feel sluggish, your product analytics time out, or your log analysis takes minutes when you need milliseconds, you’re likely hitting the limits of a row‑store database designed for transactions, not analytics. ClickHouse flips that model on its head.
This guide explains how ClickHouse delivers low‑latency analytical queries at massive scale, when to use it, how to model your data, and the patterns that keep queries consistently fast. You’ll also find practical SQL examples, performance tips, and a roadmap to get from proof of concept to production.
For an architectural deep dive with patterns and do’s/don’ts, explore this companion guide: ClickHouse for lightning‑fast analytics: architecture, best practices, and real‑world patterns. If you’re new to the engine, this overview is also helpful: ClickHouse: real-time analytics in a columnar database.
What is ClickHouse?
ClickHouse is an open-source, columnar OLAP database engineered for real‑time analytics. It’s built to scan, aggregate, and filter billions of rows in milliseconds, serving sub‑second queries for workloads like:
- Product analytics and event tracking
- Log and observability analytics
- IoT and time‑series metrics
- Ad‑tech, growth experiments, and A/B testing
- Financial tick data and pricing analysis
Unlike transactional databases (OLTP), ClickHouse shines when most queries are reads, joins are minimized or optimized, and aggregations dominate.
Why ClickHouse Is So Fast
ClickHouse’s speed isn’t magic; it’s architectural:
- Columnar storage: Reads only the columns needed by your query, massively reducing I/O.
- Vectorized execution + SIMD: Processes data in CPU‑friendly batches using vector instructions.
- Compression: Advanced codecs (LZ4, ZSTD) minimize disk and I/O cost; data stays compressed until it is actually needed (see the sketch after this list).
- Data skipping indexes: Min‑max and Bloom filter indexes let queries “skip” data blocks that can’t match filters.
- MergeTree storage engine family: High‑performance append/write path and background merges keep data read‑optimized.
- Approximate aggregations: Functions like uniqCombined and quantileTDigest deliver near‑exact results at a fraction of the cost.
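To make the compression point concrete, codecs can be declared per column in the table DDL. A minimal sketch; the database, table, and columns here are illustrative and not part of the schema used later in this guide:
```sql
-- Illustrative only: per-column codecs reduce the bytes read from disk.
CREATE DATABASE IF NOT EXISTS example;

CREATE TABLE example.sensor_metrics
(
    ts DateTime CODEC(Delta, ZSTD(3)),        -- delta-encode timestamps, then compress with ZSTD
    sensor_id UInt32 CODEC(ZSTD(1)),
    reading Float64 CODEC(Gorilla, ZSTD(1))   -- Gorilla works well for slowly changing floats
)
ENGINE = MergeTree
ORDER BY (sensor_id, ts);
```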
Core Building Blocks You’ll Use
The MergeTree Family
MergeTree is the backbone of ClickHouse. You’ll often use variants tailored to your needs:
- MergeTree: General‑purpose table engine.
- SummingMergeTree: Collapses rows with the same sorting key during merges, summing their numeric columns.
- AggregatingMergeTree: Stores and merges aggregate states (e.g., uniqState, sumState).
- ReplacingMergeTree / CollapsingMergeTree: Handle versioned/upsert‑like or event‑sourced data patterns.
- ReplicatedMergeTree: Adds replication for high availability.
Key design choices:
- PARTITION BY: Typically by time (e.g., month). Enables pruning and TTL policies.
- ORDER BY: Defines the primary sort key for read‑in‑order scans and efficient skipping. Align it with your filter/sort patterns.
- TTL: Lifecycle rules to automatically delete or move old data to cheaper storage.
Materialized Views and Pre‑Aggregations
Materialized Views (MVs) compute and store ready‑to‑query aggregates in near real time. Combine them with SummingMergeTree or AggregatingMergeTree to answer common queries instantly.
Dictionaries
External dictionaries cache dimension data (like user attributes) in memory for fast lookups, avoiding heavy joins on large tables.
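As a rough sketch of how that looks in practice, a dictionary is declared once and then read with dictGet instead of a join. The analytics.user_profiles source table and its columns are hypothetical, and analytics.events stands for the events table modeled later in this guide:
```sql
-- Hypothetical dimension source; the dictionary caches it in memory and refreshes it periodically.
CREATE DICTIONARY analytics.user_dim
(
    user_id UInt64,
    plan String,
    signup_country String
)
PRIMARY KEY user_id
SOURCE(CLICKHOUSE(host 'localhost' port 9000 user 'default' password '' db 'analytics' table 'user_profiles'))
LAYOUT(HASHED())
LIFETIME(MIN 300 MAX 600);

-- Enrich events without joining the dimension table.
SELECT
    dictGet('analytics.user_dim', 'plan', user_id) AS plan,
    count() AS events
FROM analytics.events
GROUP BY plan;
```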
Distributed Queries, Shards, and Replicas
- Sharding spreads data across nodes to scale horizontally.
- Replication keeps copies for fault tolerance.
- Distributed tables let you query shards as one logical table.
- Modern clusters typically use ClickHouse Keeper (a built‑in ZooKeeper replacement) for coordination.
A Practical Data Model for Event Analytics
Start with an immutable, append‑only events table. Denormalize high‑cardinality lookups when possible; use dictionaries for small dimensions.
```sql
CREATE TABLE analytics.events
(
event_date Date DEFAULT toDate(ts),
ts DateTime64(3),
user_id UInt64,
session_id UUID,
path String,
referrer LowCardinality(String),
device LowCardinality(String),
country FixedString(2),
revenue Decimal(12,2) DEFAULT 0,
latency_ms UInt32
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(event_date)
ORDER BY (event_date, user_id, ts)
SETTINGS index_granularity = 8192;
```
Why this works:
- Partition by month keeps partitions manageable and easy to age out.
- Order by (date, user_id, ts) aligns with common filters, enabling read‑in‑order and skipping.
- LowCardinality optimizes storage for common categorical fields.
Add TTL to control retention:
```sql
ALTER TABLE analytics.events
MODIFY TTL event_date + INTERVAL 180 DAY DELETE;
```
Add a Bloom filter index for selective filters:
```sql
ALTER TABLE analytics.events
ADD INDEX idx_path_bf path TYPE bloom_filter(0.01) GRANULARITY 64;
```
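One detail to keep in mind: a newly added skip index is only built for parts written after the ALTER. For data already on disk, it has to be materialized explicitly (this reuses the index name from the snippet above):
```sql
-- Build the Bloom filter index for existing parts.
ALTER TABLE analytics.events MATERIALIZE INDEX idx_path_bf;
```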
Pre‑Aggregate What You’ll Query Most
Suppose you frequently need daily metrics by country and device. Create a Materialized View writing into a SummingMergeTree:
```sql
CREATE TABLE analytics.daily_metrics
(
event_date Date,
country FixedString(2),
device LowCardinality(String),
events UInt64,
users UInt64,
revenue Decimal(12,2)
)
ENGINE = SummingMergeTree
PARTITION BY toYYYYMM(event_date)
ORDER BY (event_date, country, device);
CREATE MATERIALIZED VIEW analytics.mv_daily_metrics
TO analytics.daily_metrics AS
SELECT
toDate(ts) AS event_date,
country,
device,
count() AS events,
uniqExact(user_id) AS users,
sum(revenue) AS revenue
FROM analytics.events
GROUP BY event_date, country, device;
```
Querying daily_metrics is now effectively instant, even with billions of raw events. One caveat: SummingMergeTree simply sums rows that share a sorting key, and users is a uniqExact computed per insert block, so a user who appears in several inserts is counted more than once. When you need mergeable unique counts, store aggregate states in an AggregatingMergeTree instead.
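A minimal sketch of that variant, reusing the same source schema. The table and view names are illustrative; swap uniqState for uniqExactState if you need exact rather than approximate uniques:
```sql
CREATE TABLE analytics.daily_uniques
(
    event_date Date,
    country FixedString(2),
    device LowCardinality(String),
    users AggregateFunction(uniq, UInt64),         -- partial aggregation state, merges correctly over time
    revenue AggregateFunction(sum, Decimal(12,2))
)
ENGINE = AggregatingMergeTree
PARTITION BY toYYYYMM(event_date)
ORDER BY (event_date, country, device);

CREATE MATERIALIZED VIEW analytics.mv_daily_uniques
TO analytics.daily_uniques AS
SELECT
    toDate(ts) AS event_date,
    country,
    device,
    uniqState(user_id) AS users,
    sumState(revenue) AS revenue
FROM analytics.events
GROUP BY event_date, country, device;

-- Reading back requires the -Merge combinators.
SELECT
    event_date,
    uniqMerge(users) AS unique_users,
    sumMerge(revenue) AS total_revenue
FROM analytics.daily_uniques
GROUP BY event_date
ORDER BY event_date;
```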
Query Patterns That Stay Sub‑Second
- Fast distincts: use approximate functions where acceptable.
- Percentiles: use quantile* functions (e.g., quantileTDigest).
- Window functions: supported when needed, but pre‑aggregation is often faster.
Examples:
```sql
-- Approximate DAU by device using uniqCombined
SELECT
toDate(ts) AS d,
device,
uniqCombined(user_id) AS dau
FROM analytics.events
WHERE ts >= now() - INTERVAL 14 DAY
GROUP BY d, device
ORDER BY d, device;
-- P95 latency by endpoint
SELECT
path,
quantileTDigest(0.95)(latency_ms) AS p95_latency_ms,
count() AS hits
FROM analytics.events
WHERE ts >= now() - INTERVAL 24 HOUR
GROUP BY path
ORDER BY p95_latency_ms DESC
LIMIT 20;
```
Getting Data In Fast (Streaming and Batch)
ClickHouse ingests from many sources:
- Streaming: Kafka, RabbitMQ, HTTP streaming
- Batch: Parquet, CSV, JSON via files, S3, or HTTP
- Connectors: JDBC/ODBC, Python, Go, Java, Rust, and more
Kafka Engine example:
```sql
CREATE TABLE ingest.events_kafka
(
ts DateTime64(3),
user_id UInt64,
path String,
device LowCardinality(String),
country FixedString(2),
revenue Decimal(12,2),
latency_ms UInt32
)
ENGINE = Kafka
SETTINGS kafka_broker_list = 'broker:9092',
kafka_topic_list = 'events',
kafka_group_name = 'ckh_events_consumer',
kafka_format = 'JSONEachRow';
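-- Columns missing from the Kafka table (e.g., session_id, referrer) fall back to their defaults in analytics.events.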
CREATE MATERIALIZED VIEW ingest.mv_events
TO analytics.events
AS SELECT * FROM ingest.events_kafka;
```
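For batch loads and backfills, the s3 table function can read Parquet directly. A sketch only: the bucket URL and credentials are placeholders, and it assumes the files carry the same columns, in the same order, as analytics.events:
```sql
-- Backfill historical events from Parquet files in S3 (placeholder path and credentials).
INSERT INTO analytics.events
SELECT *
FROM s3(
    'https://example-bucket.s3.amazonaws.com/events/2024/*.parquet',
    'AWS_ACCESS_KEY_ID',
    'AWS_SECRET_ACCESS_KEY',
    'Parquet'
);
```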
Working from MongoDB? You can stream collections into ClickHouse (no Kafka required) with this practical approach: How to stream MongoDB to ClickHouse in real time — no code, no Kafka.
Cluster Design: Scale Out Without Surprises
- Start small (single node) to validate schema and query patterns.
- Add replicas for HA; add shards to scale throughput.
- Create local tables per node with ReplicatedMergeTree; expose a Distributed table for unified queries (see the sketch below).
- Use a sharding key that balances load (e.g., cityHash64(user_id)).
Tip: Keep shard count modest early on. Over‑sharding increases operational overhead and cross‑shard latency.
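Concretely, the split between local and Distributed tables might look like the sketch below. The cluster name analytics_cluster and the {shard}/{replica} macros are assumptions that must match each node's configuration:
```sql
-- Local (per-shard) table; each replica stores its own copy.
CREATE TABLE analytics.events_local ON CLUSTER analytics_cluster
(
    event_date Date DEFAULT toDate(ts),
    ts DateTime64(3),
    user_id UInt64,
    session_id UUID,
    path String,
    referrer LowCardinality(String),
    device LowCardinality(String),
    country FixedString(2),
    revenue Decimal(12,2) DEFAULT 0,
    latency_ms UInt32
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/analytics/events', '{replica}')
PARTITION BY toYYYYMM(event_date)
ORDER BY (event_date, user_id, ts);

-- Logical table that fans queries (and optionally writes) out across shards.
CREATE TABLE analytics.events_dist ON CLUSTER analytics_cluster
AS analytics.events_local
ENGINE = Distributed(analytics_cluster, analytics, events_local, cityHash64(user_id));
```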
Performance Tuning That Actually Matters
- Align ORDER BY with filters and sorting. This is the single most important design decision.
- Use LowCardinality for categorical strings.
- Choose index_granularity based on query selectivity (8192 is a solid default).
- Prefer approximate aggregates for large cardinalities (uniqCombined vs uniqExact where possible).
- Pre‑aggregate common views with MVs to cut query cost.
- Tune parallelism: max_threads, max_bytes_before_external_group_by/sort for memory‑heavy queries (see the example after this list).
- Avoid tiny parts: batch inserts on the client or use asynchronous inserts; keep an eye on part counts and background merge settings.
- Use system tables (system.query_log, system.part_log) to analyze hotspots and part churn.
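For the parallelism and memory knobs, per‑query SETTINGS are often enough. The values below are illustrative starting points, not recommendations:
```sql
-- Cap threads and let a heavy GROUP BY spill to disk instead of running out of memory.
SELECT
    device,
    uniqCombined(user_id) AS users
FROM analytics.events
WHERE ts >= now() - INTERVAL 30 DAY
GROUP BY device
SETTINGS
    max_threads = 8,
    max_bytes_before_external_group_by = 10000000000;
```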
Data Lifecycle and Cost Control
- TTL DELETE for strict retention; TTL TO VOLUME for hot/warm/cold tiering (see the sketch after this list).
- Offload cold partitions to S3 with on‑prem or cloud setups.
- Compress with ZSTD for better storage efficiency.
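A sketch of what tiering can look like, assuming a storage policy named hot_to_cold with a volume named cold exists in the server configuration (both names are hypothetical):
```sql
-- Attach the (hypothetical) tiered storage policy to the table.
ALTER TABLE analytics.events
MODIFY SETTING storage_policy = 'hot_to_cold';

-- Keep 30 days hot, then move parts to the cold volume, then delete after 180 days.
ALTER TABLE analytics.events
MODIFY TTL
    event_date + INTERVAL 30 DAY TO VOLUME 'cold',
    event_date + INTERVAL 180 DAY DELETE;
```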
Security and Governance
- RBAC with users, roles, and fine‑grained grants.
- Row‑level security via row policies or filtered views (see the sketch after this list).
- TLS for client/server; network isolation at the infrastructure layer.
- Audit via system.query_log and system.query_thread_log.
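A minimal sketch of those pieces combined; the role, user, and policy names and the password are placeholders:
```sql
-- Read-only role scoped to the analytics database.
CREATE ROLE analyst;
GRANT SELECT ON analytics.* TO analyst;

-- A dashboard user that inherits the role (placeholder password).
CREATE USER dash_user IDENTIFIED WITH sha256_password BY 'change-me';
GRANT analyst TO dash_user;

-- Row-level security: this user only sees US traffic.
CREATE ROW POLICY us_only ON analytics.events
FOR SELECT USING country = 'US' TO dash_user;
```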
ClickHouse vs. Alternatives (Quick Perspective)
- ClickHouse vs. BigQuery/Snowflake: ClickHouse excels at real‑time, sub‑second interactive analytics on your own infra; BigQuery/Snowflake shine for elastic, serverless batch analytics and mixed workloads.
- ClickHouse vs. Apache Druid/Pinot: All are built for fast OLAP. ClickHouse is general‑purpose with rich SQL and broad ecosystem; Druid/Pinot often target specific streaming/real‑time use cases with different operational tradeoffs.
Real‑World Use Cases
- Product analytics: Funnels, retention, cohorts, feature adoption—interactive and instant.
- Observability: Logs, metrics, traces with fast filters and drilldowns.
- IoT: Millions of device signals per minute with percentile and anomaly queries.
- Ad‑tech and growth: Low‑latency audience segmentation, attribution, and bidding analysis.
- Billing/telemetry: Near real‑time aggregations with exact or approximate distincts.
Common Pitfalls to Avoid
- Misaligned ORDER BY: If it doesn’t match filters/sorts, you’ll sacrifice data skipping and speed.
- Too much normalization: Large joins on massive tables can erase performance gains; denormalize strategically.
- Excessive tiny parts: Many small inserts increase merge overhead; batch on the client or buffer server‑side (see the sketch after this list).
- Over‑sharding early: More shards ≠ faster. Start simple; scale out deliberately.
- Overusing exact distinct: uniqExact is expensive at scale—prefer uniqCombined where acceptable.
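If the client can't batch, server‑side asynchronous inserts buffer small writes before they become parts. A sketch: the literal row is only an example, and the settings can also be applied per user profile:
```sql
-- Buffer small inserts server-side instead of creating one part per insert.
INSERT INTO analytics.events
    (ts, user_id, session_id, path, referrer, device, country, revenue, latency_ms)
SETTINGS async_insert = 1, wait_for_async_insert = 1
VALUES (now64(3), 42, generateUUIDv4(), '/pricing', 'google.com', 'mobile', 'US', 0, 120);
```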
A Practical 7‑Step Blueprint
1) Define high‑value queries and SLAs (latency, concurrency, retention).
2) Choose PARTITION BY (time‑based) and ORDER BY (filter‑aligned) keys.
3) Start with a single node; validate schema, indexes, and query speed.
4) Add Materialized Views for pre‑aggregations that reflect your dashboards.
5) Build ingestion (Kafka, HTTP, or batch) with durable, idempotent delivery.
6) Introduce replication and sharding once you trust your patterns.
7) Add governance: RBAC, auditing, and lifecycle policies (TTL, tiering).
Quick Start Queries to Validate Design
`sql
-- How many parts? (too many = merge pressure)
SELECT table, count() AS total_parts
FROM system.parts
WHERE database = 'analytics' AND active
GROUP BY table;
-- Top heavy queries
SELECT normalized_query_hash, any(query) AS sample_query, count() AS cnt, avg(query_duration_ms) AS avg_ms
FROM system.query_log
WHERE type = 'QueryFinish'
AND event_time >= now() - INTERVAL 1 HOUR
GROUP BY normalized_query_hash
ORDER BY avg_ms DESC
LIMIT 10;
-- Partition health
SELECT table, partition, sum(rows) AS total_rows, sum(bytes_on_disk) AS total_bytes
FROM system.parts
WHERE database = 'analytics' AND active
GROUP BY table, partition
ORDER BY total_rows DESC
LIMIT 20;
```
Where to Go Next
- Validate your schema with actual queries—then optimize ORDER BY and pre‑aggregations.
- Add streaming ingestion and a Distributed layer as concurrency grows.
- Keep an eye on system tables to tune merges, memory, and parallelism.
If you need a deeper architectural walkthrough and proven patterns, revisit: ClickHouse for lightning‑fast analytics: architecture, best practices, and real‑world patterns, plus this introduction: ClickHouse: real-time analytics in a columnar database. Starting from MongoDB? Here’s a streamlined path: How to stream MongoDB to ClickHouse in real time — no code, no Kafka.
FAQ: ClickHouse, Low‑Latency Analytics, and Real‑Time Data
1) Is ClickHouse an OLTP database?
No. ClickHouse is an OLAP database optimized for analytical reads and aggregations. It can handle high write throughput, but it does not provide the transactional guarantees or row‑level update patterns typical of OLTP systems like PostgreSQL or MySQL.
2) How should I choose PARTITION BY and ORDER BY?
- PARTITION BY: Use time (daily or monthly) to enable pruning and TTL retention. Monthly is common; daily suits very high ingest volumes or short retention windows.
- ORDER BY: Match your most common WHERE filters and sorting. If you always filter by event_date and user_id, include them first in ORDER BY. This alignment unlocks read‑in‑order scans and data skipping.
3) Do I need pre‑aggregations (Materialized Views)?
For interactive dashboards and repeated queries, yes. MVs cut query cost drastically by computing common aggregates as data arrives, letting you serve sub‑second responses at scale.
4) How does ClickHouse achieve fast distinct counts?
It offers approximate algorithms like uniqCombined and exact options like uniqExact. For large cardinalities and dashboards, uniqCombined is usually indistinguishable in practice and far faster.
5) What’s the best way to ingest streaming data?
Kafka Engine is the most common, feeding Materialized Views that write into MergeTree tables. For simpler pipelines or specific sources, you can ingest via HTTP streaming, file drops (Parquet/CSV), or specialized tools. If you’re pulling from MongoDB, consider the no‑code path described here: streaming MongoDB to ClickHouse in real time without Kafka.
6) How do I scale ClickHouse horizontally?
Use sharding and replication:
- Create ReplicatedMergeTree tables on each shard.
- Expose a Distributed table that routes queries.
- Pick a sharding key that balances data (e.g., cityHash64(user_id)).
Start with replication for HA, then add shards as concurrency and data grow.
7) Can ClickHouse handle joins?
Yes, but design carefully. Denormalize frequently used dimensions; use dictionaries for small reference data. Large joins across massive tables can degrade performance—test and benchmark.
8) What about data retention and cost control?
Use TTL rules to delete old data or move it to colder storage tiers. Offload cold partitions to S3‑backed volumes where appropriate. Compression (ZSTD) and LowCardinality columns keep storage and I/O efficient.
9) How do I monitor performance issues?
Leverage system tables:
- system.query_log and system.query_thread_log for slow queries and resource usage
- system.parts and system.part_log for part counts and merge health
- system.metrics and system.events for engine‑level stats
Correlate slow queries with ORDER BY alignment, index usage, and merge pressure.
10) When should I consider alternatives?
- If you need serverless elasticity with minimal ops and mostly batch workloads, consider BigQuery or Snowflake.
- If you have specialized real‑time streaming constraints tied to an existing stack (like Pinot/Druid), compare operational tradeoffs.
For most real‑time analytics on large volumes with strict latency targets, ClickHouse is a leading choice.
Ready to make dashboards fly and queries feel instant? Start with a simple MergeTree table, align your ORDER BY with real filters, and add pre‑aggregations where they count—you’ll be surprised how far that gets you.