This guide shows how to run high-performance analytical queries against a sharded, replicated ClickHouse cluster from a Spring Boot application.

1. Cluster Setup in ClickHouse

With a multi-node cluster (my_cluster here) defined in the ClickHouse server configuration, create a replicated local table on every node:
CREATE TABLE events_cluster ON CLUSTER my_cluster (
  id UInt64,
  type String,
  ts DateTime
) ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/events', '{replica}')
ORDER BY ts;
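ReplicatedMergeTree copies the table to every replica of a shard, so the data stays available if a node fails. The {shard} and {replica} macros must be defined in each server's configuration.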
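To make the role of the {shard} and {replica} placeholders concrete, here is a minimal Java sketch of the substitution. It only imitates the idea with plain string replacement and is not ClickHouse's implementation. The class name and the macro values (01, 02, node-a, and so on) are hypothetical.

```java
import java.util.Map;

/** Illustrative only: imitates how {shard}/{replica} placeholders are filled per server. */
final class MacroExpansionSketch {

    /** Replaces every {name} placeholder with its value from the macros map. */
    static String expand(String template, Map<String, String> macros) {
        String result = template;
        for (Map.Entry<String, String> m : macros.entrySet()) {
            result = result.replace("{" + m.getKey() + "}", m.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        String path = "/clickhouse/tables/{shard}/events";
        // Hypothetical macro values: two replicas of shard 01, one replica of shard 02.
        Map<String, String> nodeA = Map.of("shard", "01", "replica", "node-a");
        Map<String, String> nodeB = Map.of("shard", "01", "replica", "node-b");
        Map<String, String> nodeC = Map.of("shard", "02", "replica", "node-c");

        System.out.println(expand(path, nodeA)); // /clickhouse/tables/01/events
        System.out.println(expand(path, nodeB)); // /clickhouse/tables/01/events
        System.out.println(expand(path, nodeC)); // /clickhouse/tables/02/events
    }
}
```

Replicas of the same shard resolve to the same coordination path and differ only in their replica name, while each shard gets its own path.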

2. Distributed Table for Queries

Create a distributed table that routes queries across shards.
CREATE TABLE events_distributed ON CLUSTER my_cluster AS events_cluster
ENGINE = Distributed(my_cluster, default, events_cluster, rand());
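The last argument of the Distributed engine is the sharding key. As a rough sketch of how a key value maps to a shard, the Java below takes the remainder of the key divided by the total shard weight and matches it against consecutive weight intervals. The class name and the weights are assumptions for illustration; the real routing happens inside ClickHouse.

```java
/** Illustrative only: remainder-based shard selection for a Distributed table. */
final class ShardSelectionSketch {

    /**
     * Returns the index of the shard that owns the given sharding-key value.
     * The remainder of key / totalWeight is matched against the intervals
     * [0, w0), [w0, w0 + w1), and so on.
     */
    static int shardFor(long key, int[] weights) {
        long total = 0;
        for (int w : weights) {
            if (w <= 0) throw new IllegalArgumentException("weights must be positive");
            total += w;
        }
        if (total == 0) throw new IllegalArgumentException("at least one shard is required");
        // Sharding expressions such as rand() are unsigned, so use unsigned remainder.
        long remainder = Long.remainderUnsigned(key, total);
        long upper = 0;
        for (int i = 0; i < weights.length; i++) {
            upper += weights[i];
            if (remainder < upper) return i;
        }
        throw new IllegalStateException("unreachable when all weights are positive");
    }

    public static void main(String[] args) {
        int[] weights = {1, 2}; // hypothetical: the second shard takes twice as many rows
        int[] rowsPerShard = new int[weights.length];
        for (long key = 0; key < 9; key++) {
            rowsPerShard[shardFor(key, weights)]++;
        }
        System.out.println(rowsPerShard[0] + " / " + rowsPerShard[1]); // 3 / 6
    }
}
```

With rand() as the key, rows are spread in proportion to the weights. A key derived from a column would instead send equal column values to the same shard.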

3. Spring Boot Configuration

Set the JDBC URL to a node that hosts the distributed table, or to a load balancer in front of the cluster:
spring.datasource.url=jdbc:clickhouse://clickhouse-proxy:8123/default
spring.datasource.driver-class-name=com.clickhouse.jdbc.ClickHouseDriver

4. Repository for Cluster Queries

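import java.util.List;
import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.data.jpa.repository.Query;

public interface EventRepository extends JpaRepository<Event, Long> {
    // Native query against the distributed table; each row is [type, count].
    @Query(value = "SELECT type, count(*) FROM events_distributed GROUP BY type", nativeQuery = true)
    List<Object[]> countByType();
}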
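The Object[] rows returned by countByType() are awkward to work with directly. Below is a small sketch that turns them into a map. It assumes that column 0 holds the type and that column 1 arrives as some java.lang.Number subtype, which depends on the driver version. The class and method names are hypothetical.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: converts [type, count] rows into a type -> count map. */
final class CountRowsSketch {

    static Map<String, Long> toCounts(List<Object[]> rows) {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (Object[] row : rows) {
            String type = String.valueOf(row[0]);
            // Assumption: the count column is returned as a Number (Long, BigInteger, ...).
            long count = ((Number) row[1]).longValue();
            counts.merge(type, count, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Object[]> rows = new ArrayList<>();
        rows.add(new Object[] {"click", 42L});
        rows.add(new Object[] {"view", BigInteger.valueOf(7)});
        System.out.println(toCounts(rows)); // {click=42, view=7}
    }
}
```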

5. Batch Inserts into Cluster

Insert through the distributed table so that ClickHouse routes each row to a shard according to the sharding key. The third argument sends the rows in batches of 1000:
jdbcTemplate.batchUpdate(
  "INSERT INTO events_distributed (id, type, ts) VALUES (?, ?, ?)",
  events,
  1000,
  (ps, e) -> {
      ps.setLong(1, e.getId());
      ps.setString(2, e.getType());
      ps.setTimestamp(3, Timestamp.valueOf(e.getTs()));
  }
);
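The batch size of 1000 means the collection is processed in consecutive chunks of at most 1000 rows. The sketch below reproduces that chunking in plain Java to show how many round trips a given list produces. It illustrates the idea and is not Spring's internal code. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: splits a list into consecutive batches of a fixed maximum size. */
final class BatchPartitionSketch {

    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) throw new IllegalArgumentException("batchSize must be positive");
        List<List<T>> batches = new ArrayList<>();
        for (int from = 0; from < items.size(); from += batchSize) {
            int to = Math.min(from + batchSize, items.size());
            // Copy the view so each batch is independent of the source list.
            batches.add(new ArrayList<>(items.subList(from, to)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> events = new ArrayList<>();
        for (int i = 0; i < 2500; i++) events.add(i);

        List<List<Integer>> batches = partition(events, 1000);
        System.out.println(batches.size());        // 3
        System.out.println(batches.get(2).size()); // 500
    }
}
```

ClickHouse generally handles a few large inserts better than many small ones, so it may be worth trying larger batch sizes.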

6. Performance Tips

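  • Choose a sharding key that spreads rows evenly across shards. rand() balances writes, while a hash of a column such as cityHash64(id) keeps rows with the same value on one shard.
  • Enable compression in the JDBC driver to reduce the amount of data sent over the network.
  • Monitor background merges in system.merges and replication backlog in system.replication_queue.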
