📊 Initial Performance Assessment

Exasol vs ClickHouse Performance Comparison on TPC-H

🔒 Security Note: Sensitive information (passwords, IP addresses) has been sanitized. Placeholders like <EXASOL_DB_PASSWORD>, <PRIVATE_IP>, and <PUBLIC_IP> are used throughout. Substitute with your actual credentials when reproducing.

Overview

We conducted performance testing of clickhouse against exasol using the TPC-H benchmark at scale factor 10.

Key Finding: Clickhouse demonstrated 8× slower median query performance compared to Exasol, highlighting significant optimization opportunities.

The Baseline: Exasol

Exasol achieved a median query runtime of 67.9ms across all TPC-H queries, establishing a competitive performance baseline for analytical workload processing.

Clickhouse 25.9.3.48 - System Under Test

Hardware Specifications

  • Cloud Provider: AWS
  • Region: eu-west-1
  • Instance Type: r5d.4xlarge
  • CPU: Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
  • CPU Cores: 16 vCPUs
  • Memory: 124.4GB RAM

Software Configuration

  • Database: clickhouse 25.9.3.48
  • Setup Method: native
  • Data Directory: /data/clickhouse

Installation & Configuration

Clickhouse 25.9.3.48    Installation Steps

Storage Configuration

# Stop existing RAID array at /dev/md0 if present
sudo mdadm --stop /dev/md0 2>/dev/null || true
# Clear RAID superblock on /dev/nvme1n1
sudo mdadm --zero-superblock /dev/nvme1n1 2>/dev/null || true
# Clear RAID superblock on /dev/nvme2n1
sudo mdadm --zero-superblock /dev/nvme2n1 2>/dev/null || true
# Create RAID0 array from 2 devices
yes | sudo mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/nvme1n1 /dev/nvme2n1
# Wait for RAID array /dev/md0 to be ready
sudo mdadm --wait /dev/md0 2>/dev/null || true
# Create mdadm configuration directory
sudo mkdir -p /etc/mdadm
# Save RAID configuration
sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf
# Format /dev/md0 with ext4 filesystem
sudo mkfs.ext4 -F /dev/md0
# Create mount point /data
sudo mkdir -p /data
# Mount /dev/md0 to /data
sudo mount /dev/md0 /data
# Set ownership of /data to ubuntu:ubuntu
sudo chown -R ubuntu:ubuntu /data
# Create ClickHouse data directory under /data
sudo mkdir -p /data/clickhouse
# Set ownership of /data/clickhouse to clickhouse:clickhouse
sudo chown -R clickhouse:clickhouse /data/clickhouse

Prerequisites

# Update package lists
sudo apt-get update
# Install prerequisite packages for secure repository access
sudo apt-get install -y apt-transport-https ca-certificates curl gnupg

Repository Setup

# Add ClickHouse GPG key to system keyring
curl -fsSL 'https://packages.clickhouse.com/rpm/lts/repodata/repomd.xml.key' | sudo gpg --dearmor -o /usr/share/keyrings/clickhouse-keyring.gpg
# Add ClickHouse official repository to APT sources
ARCH=$(dpkg --print-architecture) && echo "deb [signed-by=/usr/share/keyrings/clickhouse-keyring.gpg arch=${ARCH}] https://packages.clickhouse.com/deb stable main" | sudo tee /etc/apt/sources.list.d/clickhouse.list
# Update package lists with ClickHouse repository
sudo apt-get update

Installation

# Install ClickHouse server and client version <SERVER_IP>
sudo apt-get install -y clickhouse-server=25.9.3.48 clickhouse-client=25.9.3.48

Configuration

# Create custom ClickHouse configuration file
sudo tee /etc/clickhouse-server/config.d/benchmark.xml > /dev/null << 'EOF'
<clickhouse>
    <listen_host>::</listen_host>
    <path>/data/clickhouse</path>
    <max_server_memory_usage>106897735680</max_server_memory_usage>
    <max_concurrent_queries>8</max_concurrent_queries>
    <background_pool_size>16</background_pool_size>
    <background_schedule_pool_size>16</background_schedule_pool_size>
    <max_table_size_to_drop>50000000000</max_table_size_to_drop>
</clickhouse>
EOF

User Configuration

# Configure ClickHouse user profile with password and query settings
sudo tee /etc/clickhouse-server/users.d/benchmark.xml > /dev/null << 'EOF'
<clickhouse>
    <users>
        <default replace="true">
            <password_sha256_hex>2cca9d8714615f4132390a3db9296d39ec051b3faff87be7ea5f7fe0e2de14c9</password_sha256_hex>
            <networks>
                <ip>::/0</ip>
            </networks>
        </default>
    </users>
    <profiles>
        <default>
            <max_threads>16</max_threads>
            <max_memory_usage>45000000000</max_memory_usage>
            <max_bytes_before_external_sort>73492193280</max_bytes_before_external_sort>
            <max_bytes_before_external_group_by>73492193280</max_bytes_before_external_group_by>
            <join_use_nulls>1</join_use_nulls>
            <allow_experimental_correlated_subqueries>1</allow_experimental_correlated_subqueries>
            <optimize_read_in_order>1</optimize_read_in_order>
            <max_insert_threads>8</max_insert_threads>
        </default>
    </profiles>
</clickhouse>
EOF

Service Management

# Start ClickHouse server service
sudo systemctl start clickhouse-server
# Enable ClickHouse server to start on boot
sudo systemctl enable clickhouse-server

Tuning Parameters

  • memory_limit: 48g
  • max_threads: 16
  • max_memory_usage: 45.0GB
Data Directory: /data/clickhouse

Performance Results

Overall Performance Summary

Metric Clickhouse Exasol Difference
Median Runtime 525.0ms 67.9ms 8× slower
Average Runtime 827.4ms 86.6ms 10× slower
Fastest Query 52.5ms 13.8ms 3.8× slower
Slowest Query 6098.9ms 255.1ms 23.9× slower

Selected Query Highlights

The following queries demonstrate the performance characteristics observed during testing:

Queries with Largest Performance Gaps

  • Q21: Clickhouse 6071.5ms vs Exasol 108.2ms (57× slower)
  • Q08: Clickhouse 1582.3ms vs Exasol 33.5ms (48× slower)
  • Q17: Clickhouse 719.9ms vs Exasol 16.2ms (45× slower)

Queries with Competitive Performance

  • Q16: Clickhouse 185.2ms vs Exasol 232.1ms (0.8× slower)
  • Q15: Clickhouse 91.8ms vs Exasol 82.7ms (1.1× slower)

Analysis & Optimization Opportunities

Based on these initial results, several areas for investigation and potential optimization emerge:

1. Query Execution Planning

The significant performance variance across different query types suggests opportunities for query optimizer tuning and execution strategy refinement.

2. Resource Utilization

Analyzing memory allocation, CPU utilization patterns, and I/O characteristics could reveal bottlenecks limiting performance on analytical workloads.

3. Configuration Tuning

Database-specific parameters, cache settings, and parallelism configurations warrant detailed examination to maximize hardware utilization.

4. Data Structure Optimization

Table layout, partitioning strategies, and index usage patterns may offer substantial performance improvements for specific query patterns.

Key Questions

Download Data

📊

Query Results (Tested Systems)

Raw benchmark results for tested systems (CSV format)

Download
📈

Summary Statistics (Tested Systems)

Aggregated performance metrics for tested systems (JSON)

Download
💻

Clickhouse System Information

Hardware specifications captured during the benchmark (JSON)

Download
📦

Benchmark Package

Self-contained reproduction package for tested systems

Download

Query Details