This guide shows how to scale ClickHouse clusters with Spring Boot for high-performance queries and analytics.
1. Cluster Setup in ClickHouse
With a multi-node cluster (here named my_cluster) defined in the ClickHouse server configuration, create a replicated local table on every node:
CREATE TABLE events_cluster ON CLUSTER my_cluster (
    id UInt64,
    type String,
    ts DateTime
) ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/events', '{replica}')
ORDER BY ts;
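ReplicatedMergeTree keeps a copy of each shard's data on every replica of that shard, so the table stays available if a single replica fails. The {shard} and {replica} macros are filled in from each server's own configuration.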
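To make the role of the macros concrete, here is a minimal Java sketch of the substitution. The class name, the expand helper, and the macro values (01, replica-a, replica-b) are hypothetical and chosen only for illustration; ClickHouse performs this expansion itself on each server.

```java
import java.util.Map;

class ReplicaPathDemo {
    // Replace each {name} placeholder with the value from this node's macros.
    static String expand(String template, Map<String, String> macros) {
        String result = template;
        for (Map.Entry<String, String> m : macros.entrySet()) {
            result = result.replace("{" + m.getKey() + "}", m.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        String path = "/clickhouse/tables/{shard}/events";
        // Hypothetical macro values for two replicas of the same shard.
        Map<String, String> nodeA = Map.of("shard", "01", "replica", "replica-a");
        Map<String, String> nodeB = Map.of("shard", "01", "replica", "replica-b");
        // Both nodes resolve the same path and differ only in the replica name.
        System.out.println(expand(path, nodeA) + " as " + expand("{replica}", nodeA));
        System.out.println(expand(path, nodeB) + " as " + expand("{replica}", nodeB));
    }
}
```

Replicas of one shard share the same path and register under different replica names, which is how they find each other and exchange data.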
2. Distributed Table for Queries
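Create a distributed table on top of the local tables. It stores no data itself: reads are fanned out to events_cluster on every shard, and writes are routed to a shard by the sharding key (rand() here).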
CREATE TABLE events_distributed ON CLUSTER my_cluster AS events_cluster
ENGINE = Distributed(my_cluster, default, events_cluster, rand());
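The routing rule can be sketched in a few lines of Java: the sharding expression is taken modulo the total shard weight, and the row goes to the shard whose weight interval contains the remainder. This is a simplified model for illustration; the class name, method name, and the example weights (9 and 10) are assumptions, not part of the setup above.

```java
import java.util.List;

class ShardRouter {
    // Returns the index of the shard whose half-open weight interval
    // [sum of previous weights, sum of previous weights + weight)
    // contains shardingValue mod totalWeight.
    static int pickShard(long shardingValue, List<Integer> weights) {
        long total = 0;
        for (int w : weights) {
            total += w;
        }
        if (total <= 0) {
            throw new IllegalArgumentException("total weight must be positive");
        }
        // The sharding value is treated as unsigned.
        long remainder = Long.remainderUnsigned(shardingValue, total);
        long upper = 0;
        for (int i = 0; i < weights.size(); i++) {
            upper += weights.get(i);
            if (remainder < upper) {
                return i;
            }
        }
        throw new IllegalStateException("unreachable: remainder < total");
    }

    public static void main(String[] args) {
        // Hypothetical weights: shard 0 has weight 9, shard 1 has weight 10.
        List<Integer> weights = List.of(9, 10);
        for (long v : new long[] {8, 9, 19}) {
            System.out.println(v + " -> shard " + pickShard(v, weights));
        }
    }
}
```

With equal weights this reduces to a plain modulo over the number of shards, so a uniformly random key such as rand() spreads rows evenly.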
3. Spring Boot Configuration
Point the JDBC URL at a node that hosts the distributed table, or at a load balancer in front of the cluster:
spring.datasource.url=jdbc:clickhouse://clickhouse-proxy:8123/default
spring.datasource.driver-class-name=com.clickhouse.jdbc.ClickHouseDriver
4. Repository for Cluster Queries
public interface EventRepository extends JpaRepository<Event, Long> {
    @Query(value = "SELECT type, count(*) FROM events_distributed GROUP BY type", nativeQuery = true)
    List<Object[]> countByType();
}
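Since countByType() returns raw Object[] rows, callers usually convert them to something typed. The following is a minimal sketch of such a conversion; the CountRows class and toCounts method are hypothetical helpers. It reads the count through Number, on the assumption that the exact Java type returned for a UInt64 column depends on the driver version.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CountRows {
    // Converts [type, count] rows into a map, preserving row order.
    static Map<String, Long> toCounts(List<Object[]> rows) {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (Object[] row : rows) {
            String type = (String) row[0];
            // Read via Number so Long, BigInteger, and similar types all work.
            long count = ((Number) row[1]).longValue();
            counts.put(type, count);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Stand-in for the result of eventRepository.countByType().
        List<Object[]> rows = new ArrayList<>();
        rows.add(new Object[] {"click", 3L});
        rows.add(new Object[] {"view", java.math.BigInteger.valueOf(5)});
        System.out.println(toCounts(rows));
    }
}
```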
5. Batch Inserts into Cluster
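Insert through the distributed table so that ClickHouse routes each row to a shard according to the sharding key. Sending rows in batches of 1000 avoids creating many small parts.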
jdbcTemplate.batchUpdate(
    "INSERT INTO events_distributed (id, type, ts) VALUES (?, ?, ?)",
    events,   // Collection<Event> to insert
    1000,     // rows per JDBC batch
    (ps, e) -> {
        ps.setLong(1, e.getId());
        ps.setString(2, e.getType());
        // Assumes getTs() returns a java.time.LocalDateTime
        ps.setTimestamp(3, Timestamp.valueOf(e.getTs()));
    }
);
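The batch size argument means the collection is sent in chunks of at most 1000 rows, with one JDBC batch per chunk. The chunking itself can be sketched as follows; BatchSplitter and split are hypothetical names, and this only models how the rows are grouped, not how Spring executes them.

```java
import java.util.ArrayList;
import java.util.List;

class BatchSplitter {
    // Splits items into consecutive sublists of at most batchSize elements.
    static <T> List<List<T>> split(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            batches.add(items.subList(start, end));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            rows.add(i);
        }
        // 2500 rows with a batch size of 1000 give batches of 1000, 1000, 500.
        for (List<Integer> batch : split(rows, 1000)) {
            System.out.println(batch.size());
        }
    }
}
```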
6. Performance Tips
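- Choose a sharding key that spreads rows evenly across shards, such as rand() or a hash of a high-cardinality column.
- Enable compression in the JDBC driver to reduce the amount of data sent over the network.
- Monitor background merges and replication lag with system.merges and system.replication_queue.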
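To illustrate why the choice of sharding key matters for balance, the sketch below counts rows per shard for two hypothetical keys over three equally weighted shards: the event id, and a hash of the event type when only two types exist. The class name, the sample data, and the use of String.hashCode as a stand-in hash are all assumptions made for the example.

```java
import java.util.Arrays;

class ShardBalance {
    // Counts how many rows land on each shard when the shard index is
    // shardingValue mod shards (equal shard weights assumed).
    static int[] rowsPerShard(long[] shardingValues, int shards) {
        int[] counts = new int[shards];
        for (long v : shardingValues) {
            counts[(int) Math.floorMod(v, (long) shards)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int n = 9000;
        long[] byId = new long[n];
        long[] byType = new long[n];
        for (int i = 0; i < n; i++) {
            byId[i] = i; // high-cardinality key
            // Low-cardinality key: only two distinct values exist.
            byType[i] = (i % 2 == 0 ? "click" : "view").hashCode();
        }
        System.out.println("by id:   " + Arrays.toString(rowsPerShard(byId, 3)));
        System.out.println("by type: " + Arrays.toString(rowsPerShard(byType, 3)));
    }
}
```

A key with only two distinct values can reach at most two shards, so at least one of the three shards stays empty, while the id key fills all three equally.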