This guide explains how to use ClickHouse with Spring Boot for ETL workloads: extracting data from a source database, transforming it in Java, and loading it into ClickHouse in efficient batches.

1. Extract Data from Source

Spring Boot can connect to several databases from the same application. Configure the PostgreSQL source in application.properties:
spring.datasource.url=jdbc:postgresql://localhost:5432/source_db
spring.datasource.username=source_user
spring.datasource.password=secret
Use JdbcTemplate to read records:
List<Map<String, Object>> rows = jdbcTemplate.queryForList("SELECT * FROM orders");
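For large source tables, pulling everything back with a single queryForList call keeps the whole result set in memory. A minimal sketch of keyset pagination instead, assuming orders has a monotonically increasing numeric id column (the column name, page size, and class name are illustrative):

import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import org.springframework.jdbc.core.JdbcTemplate;

public class OrderExtractor {

    private static final int PAGE_SIZE = 10_000;   // illustrative page size

    private final JdbcTemplate sourceJdbcTemplate;

    public OrderExtractor(JdbcTemplate sourceJdbcTemplate) {
        this.sourceJdbcTemplate = sourceJdbcTemplate;
    }

    // Reads orders one page at a time, keyed on the last seen id,
    // so memory use stays bounded regardless of table size.
    public void extract(Consumer<List<Map<String, Object>>> pageHandler) {
        long lastId = 0;
        while (true) {
            List<Map<String, Object>> page = sourceJdbcTemplate.queryForList(
                "SELECT * FROM orders WHERE id > ? ORDER BY id LIMIT ?",
                lastId, PAGE_SIZE);
            if (page.isEmpty()) {
                break;
            }
            pageHandler.accept(page);   // transform and load this page
            lastId = ((Number) page.get(page.size() - 1).get("id")).longValue();
        }
    }
}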

2. Transform Data in Java

Apply transformations before loading:
List<Event> events = rows.stream()
    .map(r -> new Event(
        ((Number) r.get("id")).longValue(),           // id may arrive as Integer or Long depending on the column type
        ((String) r.get("status")).toUpperCase(),     // normalize status casing
        ((Timestamp) r.get("created_at")).toLocalDateTime()
    ))
    .toList();                                        // requires Java 16+
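The snippet assumes an Event type carrying the three transformed fields; a minimal sketch as a plain value class (the name and accessors are assumptions that match the getters used in the insert step below):

import java.time.LocalDateTime;

// Simple carrier for one transformed row; fields mirror the target events table.
public final class Event {

    private final Long id;
    private final String status;
    private final LocalDateTime ts;

    public Event(Long id, String status, LocalDateTime ts) {
        this.id = id;
        this.status = status;
        this.ts = ts;
    }

    public Long getId() { return id; }
    public String getStatus() { return status; }
    public LocalDateTime getTs() { return ts; }
}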

3. Load into ClickHouse

Configure the ClickHouse datasource. Note that spring.clickhouse.* is a custom property prefix; Spring Boot only binds spring.datasource.* automatically, so these values must be wired to a second DataSource bean yourself (see the configuration sketch after the properties):
spring.clickhouse.url=jdbc:clickhouse://localhost:8123/default
spring.clickhouse.driver-class-name=com.clickhouse.jdbc.ClickHouseDriver
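A minimal sketch of the two-datasource configuration this implies. Once any DataSource bean is declared manually, Spring Boot stops auto-configuring the default one, so both the PostgreSQL source and the ClickHouse sink are defined here; the bean names are illustrative:

import javax.sql.DataSource;
import org.springframework.boot.autoconfigure.jdbc.DataSourceProperties;
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Primary;
import org.springframework.jdbc.core.JdbcTemplate;

@Configuration
public class EtlDataSourceConfig {

    // PostgreSQL source, bound from spring.datasource.*
    @Bean
    @Primary
    @ConfigurationProperties("spring.datasource")
    public DataSourceProperties sourceProperties() {
        return new DataSourceProperties();
    }

    @Bean
    @Primary
    public DataSource sourceDataSource() {
        return sourceProperties().initializeDataSourceBuilder().build();
    }

    @Bean
    @Primary
    public JdbcTemplate sourceJdbcTemplate() {
        return new JdbcTemplate(sourceDataSource());
    }

    // ClickHouse sink, bound from the custom spring.clickhouse.* prefix
    @Bean
    @ConfigurationProperties("spring.clickhouse")
    public DataSourceProperties clickHouseProperties() {
        return new DataSourceProperties();
    }

    @Bean
    public DataSource clickHouseDataSource() {
        return clickHouseProperties().initializeDataSourceBuilder().build();
    }

    @Bean
    public JdbcTemplate clickHouseJdbcTemplate() {
        return new JdbcTemplate(clickHouseDataSource());
    }
}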
Batch insert the transformed events through the ClickHouse JdbcTemplate (not the PostgreSQL one):
clickHouseJdbcTemplate.batchUpdate(
    "INSERT INTO events (id, status, ts) VALUES (?, ?, ?)",
    events,
    500,                               // rows per JDBC batch
    (ps, e) -> {
        ps.setLong(1, e.getId());
        ps.setString(2, e.getStatus());
        ps.setTimestamp(3, Timestamp.valueOf(e.getTs()));
    }
);
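As a design note, ClickHouse generally prefers fewer, larger inserts over many small ones, since every INSERT creates a new data part that has to be merged in the background; if each run produces only a few hundred rows, consider buffering several batches into a single INSERT of a few thousand rows.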

4. Automate ETL Jobs

Use Spring @Scheduled tasks for recurring ETL; note that scheduling only runs once @EnableScheduling is present on a configuration class (see the sketch after the snippet):
@Scheduled(fixedRate = 60000)
public void runETL() {
    extractTransformLoad();
}
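A minimal sketch of the scheduled job as a component, assuming a hypothetical EtlService that bundles the extract, transform, and load steps from the previous sections:

import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.annotation.EnableScheduling;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

// Scheduling is opt-in: without @EnableScheduling somewhere in the context,
// @Scheduled methods are never invoked.
@Configuration
@EnableScheduling
class SchedulingConfig {
}

@Component
class EtlJob {

    private final EtlService etlService;   // illustrative service holding the ETL steps

    EtlJob(EtlService etlService) {
        this.etlService = etlService;
    }

    // Runs once a minute, measured from the start of the previous run.
    @Scheduled(fixedRate = 60_000)
    public void runETL() {
        etlService.extractTransformLoad();
    }
}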

5. Verify Data

Query aggregated results in ClickHouse to confirm the load completed as expected:
SELECT status, count(*) FROM events GROUP BY status;
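The same check can be run from the application at the end of each job, assuming the clickHouseJdbcTemplate bean from the configuration sketch above:

// Count loaded rows per status as a cheap post-load sanity check.
List<Map<String, Object>> counts = clickHouseJdbcTemplate.queryForList(
    "SELECT status, count() AS cnt FROM events GROUP BY status");
counts.forEach(row ->
    System.out.println(row.get("status") + " -> " + row.get("cnt")));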
