This guide covers ClickHouse indexing strategies that improve query performance in analytics applications built with Spring Boot.
1. Use Primary Key with MergeTree
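ClickHouse has no traditional B-tree indexes. Instead, a MergeTree table stores rows on disk sorted by its
sorting key (ORDER BY), and the primary key defaults to that sorting key unless PRIMARY KEY is given separately.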
CREATE TABLE sales (
id UInt32,
product String,
amount Float32,
region String,
ts DateTime
) ENGINE = MergeTree()
ORDER BY (region, ts);
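Because rows are stored sorted by (region, ts), a filter on region, or on region plus a ts range, reads only a contiguous slice of the table. A filter on ts alone usually benefits less, since region comes first in the key.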
2. Use Partitioning for Time-Based Data
For time-series analytics, partition by month or day. Monthly partitions are usually granular enough; very fine partitioning produces many small parts.
CREATE TABLE metrics (
service String,
value Float32,
ts DateTime
) ENGINE = MergeTree()
PARTITION BY toYYYYMM(ts)
ORDER BY (service, ts);
Queries that filter on ts read only the partitions whose months overlap the requested range; all other partitions are skipped entirely.
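To make the pruning concrete, here is a minimal Java sketch that lists the monthly partition ids a time-range filter could touch. The class and method names (MonthlyPartitions, partitionsTouched) are hypothetical, and the code only mirrors the numeric form produced by toYYYYMM; it does not talk to ClickHouse.

```java
import java.time.LocalDateTime;
import java.time.YearMonth;
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: which toYYYYMM(ts) partitions overlap a time range. */
final class MonthlyPartitions {

    /** Same numeric form as toYYYYMM, e.g. 2024-03-15 becomes 202403. */
    static int toYYYYMM(LocalDateTime ts) {
        return ts.getYear() * 100 + ts.getMonthValue();
    }

    /** Partition ids that a filter start <= ts <= end could touch. */
    static List<Integer> partitionsTouched(LocalDateTime start, LocalDateTime end) {
        List<Integer> ids = new ArrayList<>();
        YearMonth last = YearMonth.from(end);
        // Walk month by month from the start month to the end month, inclusive.
        for (YearMonth m = YearMonth.from(start); !m.isAfter(last); m = m.plusMonths(1)) {
            ids.add(m.getYear() * 100 + m.getMonthValue());
        }
        return ids;
    }

    public static void main(String[] args) {
        LocalDateTime start = LocalDateTime.of(2023, 12, 20, 0, 0);
        LocalDateTime end = LocalDateTime.of(2024, 2, 3, 12, 0);
        System.out.println(partitionsTouched(start, end)); // [202312, 202401, 202402]
    }
}
```

A query over that window would need at most three partitions of the metrics table, however many months the table holds in total.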
3. Sparse Index with Primary Key
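ClickHouse keeps a sparse index over the primary key: one entry (a mark) per granule of rows, 8192 rows by default, rather than one entry per row. Put the columns you filter on most often in
WHERE clauses at the front of the key.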
Example:
ORDER BY (region, ts) helps queries like:
SELECT sum(amount) FROM sales WHERE region = 'US' AND ts >= now() - INTERVAL 7 DAY;
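The following Java sketch models how a sparse index narrows a lookup. It is a simplified illustration, not ClickHouse's implementation: the class SparseIndexSketch, its methods, the single-column key, and the tiny granule size of 4 are all assumptions made for readability.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Simplified model of a sparse primary index over one sorted key column. */
final class SparseIndexSketch {

    /** One mark per granule: the key of the granule's first row. */
    static List<String> buildMarks(List<String> sortedKeys, int granuleSize) {
        List<String> marks = new ArrayList<>();
        for (int row = 0; row < sortedKeys.size(); row += granuleSize) {
            marks.add(sortedKeys.get(row));
        }
        return marks;
    }

    /** Returns {firstGranule, lastGranule} inclusive, or null if no granule can hold the key. */
    static int[] candidateGranules(List<String> marks, String key) {
        int firstGe = marks.size(); // first mark >= key
        int lastLe = -1;            // last mark <= key
        // A linear scan keeps the sketch short; a real index would binary-search the marks.
        for (int i = 0; i < marks.size(); i++) {
            int cmp = marks.get(i).compareTo(key);
            if (cmp >= 0 && firstGe == marks.size()) firstGe = i;
            if (cmp <= 0) lastLe = i;
        }
        // Matching rows may begin in the middle of the granule before the first mark >= key.
        int lo = Math.max(0, firstGe - 1);
        return lastLe < lo ? null : new int[] {lo, lastLe};
    }

    public static void main(String[] args) {
        List<String> regions = List.of(
                "APAC", "APAC", "APAC", "EU",
                "EU", "EU", "EU", "US",
                "US", "US", "US", "US");
        List<String> marks = buildMarks(regions, 4);
        System.out.println(marks);                                          // [APAC, EU, US]
        System.out.println(Arrays.toString(candidateGranules(marks, "EU"))); // [0, 1]
    }
}
```

Only the candidate granules are read and filtered row by row, which is why a filter on the leading key column touches a small part of the table.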
4. Secondary Index (Data Skipping Indexes)
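For columns that are not part of the sorting key, ClickHouse supports data skipping indexes, which store a small summary per block of granules so that non-matching blocks can be skipped.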
ALTER TABLE sales ADD INDEX idx_region (region) TYPE set(100) GRANULARITY 4;
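The set(100) index records up to 100 distinct region values per block of 4 granules, and blocks whose set lacks the requested region are skipped. ADD INDEX applies only to newly written parts; run ALTER TABLE sales MATERIALIZE INDEX idx_region to build the index for existing data. In the sales table above, region already leads the sorting key, so a skipping index pays off mainly on columns outside that key.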
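As a rough illustration of how a set-type skipping index decides which blocks to read, consider this Java sketch. The names (SetSkipIndexSketch, build, blocksToRead) are hypothetical, rowsPerBlock stands in for index granularity multiplied by GRANULARITY, and the rule that an overflowing block cannot be skipped is modelled with a null entry.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Simplified model of a set(maxDistinct) data skipping index. */
final class SetSkipIndexSketch {

    /** Per-block summary: the distinct values, or null when the block exceeds maxDistinct. */
    static List<Set<String>> build(List<String> column, int rowsPerBlock, int maxDistinct) {
        List<Set<String>> index = new ArrayList<>();
        for (int start = 0; start < column.size(); start += rowsPerBlock) {
            int end = Math.min(start + rowsPerBlock, column.size());
            Set<String> distinct = new HashSet<>(column.subList(start, end));
            index.add(distinct.size() <= maxDistinct ? distinct : null);
        }
        return index;
    }

    /** Block numbers that must still be read for the predicate column = value. */
    static List<Integer> blocksToRead(List<Set<String>> index, String value) {
        List<Integer> blocks = new ArrayList<>();
        for (int i = 0; i < index.size(); i++) {
            Set<String> distinct = index.get(i);
            // A null summary gives no information, so the block cannot be skipped.
            if (distinct == null || distinct.contains(value)) blocks.add(i);
        }
        return blocks;
    }

    public static void main(String[] args) {
        List<String> column = List.of(
                "US", "US", "EU",
                "EU", "EU", "EU",
                "APAC", "US", "EU",
                "APAC", "APAC", "APAC");
        List<Set<String>> index = build(column, 3, 2);
        System.out.println(blocksToRead(index, "US")); // [0, 2]
    }
}
```

Block 2 holds three distinct values, more than the limit of 2, so it is always read; this is why the set size should comfortably exceed the number of distinct values expected per block.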
5. Bloom Filter for High-Cardinality Columns
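A bloom_filter index suits equality and IN lookups on high-cardinality columns such as product. For substring or token matching (LIKE, hasToken), use the ngrambf_v1 or tokenbf_v1 index types instead.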
ALTER TABLE sales ADD INDEX idx_product (product) TYPE bloom_filter GRANULARITY 64;
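The idea behind a Bloom filter can be sketched in a few lines of Java. This is a toy version under stated assumptions: the class BloomFilterSketch, the bit-array size, and the hashing scheme derived from String.hashCode are illustrative choices and are not what ClickHouse uses internally.

```java
import java.util.BitSet;

/** Toy Bloom filter: may report false positives, never false negatives. */
final class BloomFilterSketch {
    private final BitSet bits;
    private final int size;
    private final int hashes;

    BloomFilterSketch(int size, int hashes) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashes = hashes;
    }

    /** i-th bit position via double hashing: (h1 + i * h2) mod size. */
    private int position(String value, int i) {
        int h1 = value.hashCode();
        int h2 = (h1 >>> 16) | 1; // odd step taken from the high bits; weak, but enough for a demo
        return Math.floorMod(h1 + i * h2, size);
    }

    void add(String value) {
        for (int i = 0; i < hashes; i++) {
            bits.set(position(value, i));
        }
    }

    /** false means definitely absent; true means possibly present. */
    boolean mightContain(String value) {
        for (int i = 0; i < hashes; i++) {
            if (!bits.get(position(value, i))) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        BloomFilterSketch filter = new BloomFilterSketch(1024, 3);
        filter.add("laptop");
        filter.add("phone");
        System.out.println(filter.mightContain("laptop")); // true
    }
}
```

When the filter for a block answers false for the requested product, that block can be skipped safely; an answer of true only means the block has to be read and checked.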
6. Combine Indexing with Spring Boot Queries
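Spring Data repository queries benefit from these keys and indexes only when their WHERE clause filters on the indexed columns, as here with region and ts:
// @Param (org.springframework.data.repository.query.Param) binds each named placeholder explicitly.
@Query(value = "SELECT sum(amount) FROM sales WHERE region = :region AND ts BETWEEN :start AND :end", nativeQuery = true)
Double salesByRegionAndTime(@Param("region") String region, @Param("start") LocalDateTime start, @Param("end") LocalDateTime end);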