
Star Schema Benchmark (SSB, 2009)

The Star Schema Benchmark is loosely based on the tables and queries of TPC-H but, unlike TPC-H, it uses a star schema layout. The bulk of the data sits in one very large fact table, which is surrounded by several small dimension tables. The queries join the fact table with one or more dimension tables to apply filter criteria, e.g. MONTH = 'JANUARY'.
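To make the star schema idea concrete, here is a minimal sketch that joins a tiny fact table with one dimension table and filters on the dimension. It uses Python's built-in sqlite3 module rather than ClickHouse, and the table names, column names, and rows (`lineorder`, `date_dim`, `lo_revenue`, and so on) are invented for illustration; they are not the benchmark's actual schema.

```python
import sqlite3


def build_toy_star_schema():
    """Create an in-memory database with one fact and one dimension table.

    All names and rows are hypothetical and far smaller than real SSB data.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE date_dim (
            d_datekey INTEGER PRIMARY KEY,
            d_month   TEXT,
            d_year    INTEGER
        );
        CREATE TABLE lineorder (
            lo_orderkey  INTEGER,
            lo_orderdate INTEGER,  -- refers to date_dim.d_datekey
            lo_revenue   INTEGER
        );
        INSERT INTO date_dim VALUES
            (19930115, 'JANUARY', 1993),
            (19930220, 'FEBRUARY', 1993);
        INSERT INTO lineorder VALUES
            (1, 19930115, 100),
            (2, 19930115, 250),
            (3, 19930220, 400);
        """
    )
    return conn


def revenue_for_month(conn, month):
    """Join the fact table with the date dimension and filter on the month."""
    row = conn.execute(
        """
        SELECT COALESCE(SUM(lo.lo_revenue), 0)
        FROM lineorder AS lo
        JOIN date_dim AS d ON lo.lo_orderdate = d.d_datekey
        WHERE d.d_month = ?
        """,
        (month,),
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    conn = build_toy_star_schema()
    print(revenue_for_month(conn, "JANUARY"))  # prints 350
```

The filter is applied to the small dimension table, while the aggregated values come from the fact table. The benchmark queries follow the same pattern at a much larger scale.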

First, check out the Star Schema Benchmark repository and compile the data generator:

Then generate the data. The -s parameter specifies the scale factor. For example, -s 100 generates 600 million rows.
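As a rough aid for choosing a scale factor, the figure above can be turned into a small estimate. This sketch assumes that the row count grows linearly with the scale factor, which the text does not state explicitly, and the function name `estimated_rows` is hypothetical.

```python
# Derived from the statement that -s 100 yields 600 million rows.
# Assumption: the row count scales linearly with the scale factor.
ROWS_PER_SCALE_UNIT = 600_000_000 // 100  # 6 million rows per unit


def estimated_rows(scale_factor):
    """Estimate the number of generated rows for an integer scale factor."""
    if scale_factor <= 0:
        raise ValueError("scale_factor must be positive")
    return ROWS_PER_SCALE_UNIT * scale_factor


if __name__ == "__main__":
    for sf in (1, 10, 100):
        print(sf, estimated_rows(sf))
```

Under this assumption, a scale factor of 1 corresponds to about 6 million rows, which is handy for a quick trial run before generating the full data set.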

Now create tables in ClickHouse:

The data can be imported as follows:

In many ClickHouse use cases, multiple tables are converted into a single denormalized flat table. This step is optional. The queries below are listed both in their original form and rewritten for the denormalized table.
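The idea of denormalization can be sketched on toy data: the join is performed once when the flat table is built, so later queries read a single table. The example below again uses Python's sqlite3 module instead of ClickHouse, and every table name, column name, and row is hypothetical.

```python
import sqlite3


def build_tables():
    """Create a toy fact table and one dimension table (hypothetical names)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customer (c_custkey INTEGER PRIMARY KEY, c_region TEXT);
        CREATE TABLE lineorder (
            lo_orderkey INTEGER,
            lo_custkey  INTEGER,
            lo_revenue  INTEGER
        );
        INSERT INTO customer VALUES (1, 'ASIA'), (2, 'EUROPE');
        INSERT INTO lineorder VALUES (1, 1, 100), (2, 1, 200), (3, 2, 50);
        """
    )
    return conn


def make_flat_table(conn):
    """Materialize the join once into a denormalized table."""
    conn.execute(
        """
        CREATE TABLE lineorder_flat AS
        SELECT lo.lo_orderkey, lo.lo_revenue, c.c_region
        FROM lineorder AS lo
        JOIN customer AS c ON lo.lo_custkey = c.c_custkey
        """
    )


def revenue_by_region_joined(conn):
    """Original form: join the fact table with the dimension per query."""
    return conn.execute(
        """
        SELECT c.c_region, SUM(lo.lo_revenue)
        FROM lineorder AS lo
        JOIN customer AS c ON lo.lo_custkey = c.c_custkey
        GROUP BY c.c_region
        ORDER BY c.c_region
        """
    ).fetchall()


def revenue_by_region_flat(conn):
    """Rewritten form: the same aggregate without any join."""
    return conn.execute(
        """
        SELECT c_region, SUM(lo_revenue)
        FROM lineorder_flat
        GROUP BY c_region
        ORDER BY c_region
        """
    ).fetchall()


if __name__ == "__main__":
    conn = build_tables()
    make_flat_table(conn)
    print(revenue_by_region_joined(conn))
    print(revenue_by_region_flat(conn))
```

Both functions return the same rows. The flat variant trades extra storage for simpler queries with no join at query time.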

The queries are generated by ./qgen -s <scaling_factor>. Example queries for s = 100:

Q1.1

Denormalized table:

Q1.2

Denormalized table:

Q1.3

Denormalized table:

Q2.1

Denormalized table:

Q2.2

Denormalized table:

Q2.3

Denormalized table:

Q3.1

Denormalized table:

Q3.2

Denormalized table:

Q3.3

Denormalized table:

Q3.4

Denormalized table:

Q4.1

Denormalized table:

Q4.2

Denormalized table:

Q4.3

Denormalized table: