YMatrix vs Greenplum TPC-H Benchmark Comparison Report

Test Overview


This report compares the performance of YMatrix and Greenplum (GPDB) in analytical query scenarios using the TPC-H benchmark. The results show that YMatrix significantly outperforms GPDB at both the 100x and 1,000x data scales, delivering overall speedups of roughly 13x and 12x, respectively.

TPC-H is a decision support benchmark consisting of a set of business-oriented ad hoc queries and concurrent data modifications. The selected queries and data patterns are broadly applicable across industries. This benchmark demonstrates the capability of decision support systems to analyze large volumes of data, execute highly complex queries, and answer critical business questions—reflecting multiple aspects of database system query processing capabilities.

Test Environment


Hardware Configuration

| Data Scale | Machine | vCPU | RAM | Bandwidth | EBS |
|---|---|---|---|---|---|
| 100 | AWS EC2, m5.8xlarge | 32 | 128 GB | 10 Gbps | gp3, IOPS = 3000, throughput = 125 MB/s |
| 1000 | AWS EC2, m5.8xlarge | 32 | 128 GB | 10 Gbps | io1, IOPS = 64000 |

Software Configuration

The YMatrix TPC-H benchmark uses a single-node deployment with the default configuration automatically selected by the product. On the hardware described above, this corresponds to 6 Segment nodes.

  • Kernel version: 3.10.0-1160.66.1.el7.x86_64
  • OS version: CentOS Linux release 7.9.2009
  • YMatrix: Enterprise edition matrixdb5-5.0.0+enterprise_5.0.0
  • Greenplum: Open-source greenplum-db-6.23.3-rhel7-x86_64
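
The 6-segment layout noted above can be confirmed from the catalog. A minimal sketch (uses the standard Greenplum-compatible gp_segment_configuration table; the postgres database name is just a convenient default):

    # Count primary segments; content >= 0 excludes the master/coordinator
    psql -d postgres -c "
      SELECT count(*) AS primary_segments
      FROM gp_segment_configuration
      WHERE role = 'p' AND content >= 0;"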

Test Methodology


Tests were conducted at two data scales: 100x and 1,000x. Both YMatrix and GPDB were configured with 6 Segments. All test results were obtained by YMatrix engineers using identical machine configurations.

Key differences between the 100x and 1,000x scale tests:

  • At the 100x scale, data fits entirely in memory (128GB), so gp3 storage was used. YMatrix tested both lz4 and zstd compression formats. Since the open-source GPDB does not support quicklz, only zstd compression was used for GPDB.
  • At the 1,000x scale, higher-performance io1 storage was used. Both YMatrix and GPDB used zstd compression.
  • In this test, statement_mem=1GB was used for the 100x scale, and statement_mem=2GB for the 1,000x scale.
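
One way to apply the statement_mem setting cluster-wide is via gpconfig. This is only a sketch and may differ from how the benchmark tool sets it; note that raising statement_mem to 2GB assumes max_statement_mem already permits that value.

    # 100x scale runs
    gpconfig -c statement_mem -v 1GB
    # 1,000x scale runs (assumes max_statement_mem allows 2GB)
    gpconfig -c statement_mem -v 2GB
    # Reload the configuration without restarting the cluster
    gpstop -u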

Test Data


TPC-H 100x Scale Data

| Table | Records | YMatrix (lz4) | YMatrix (zstd) | GPDB (zstd) |
|---|---|---|---|---|
| nation | 25 | 5 MB | 5 MB | 1 MB |
| region | 5 | 4 MB | 4 MB | 1 MB |
| part | 20,000,000 | 1 GB | 690 MB | 592 MB |
| partsupp | 80,000,000 | 4 GB | 3 GB | 3 GB |
| supplier | 1,000,000 | 97 MB | 70 MB | 55 MB |
| customer | 15,000,000 | 1 GB | 969 MB | 861 MB |
| orders | 150,000,000 | 7 GB | 5 GB | 4 GB |
| lineitem | 600,037,902 | 34 GB | 19 GB | 18 GB |

TPC-H 1,000x Scale Data

| Table | Records | YMatrix (zstd) | GPDB (zstd) |
|---|---|---|---|
| nation | 25 | 5 MB | 1 MB |
| region | 5 | 4 MB | 1 MB |
| part | 200,000,000 | 5 GB | 5 GB |
| partsupp | 800,000,000 | 29 GB | 31 GB |
| supplier | 10,000,000 | 616 MB | 538 MB |
| customer | 150,000,000 | 8 GB | 8 GB |
| orders | 1,500,000,000 | 46 GB | 46 GB |
| lineitem | 5,999,989,709 | 185 GB | 184 GB |
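
Table sizes of this kind can be collected with the standard size functions. A minimal sketch (database and table names follow the examples used in this report):

    # Report on-disk size for the largest TPC-H tables
    psql -d tpch_s100 -c "
      SELECT relname,
             pg_size_pretty(pg_total_relation_size(oid)) AS size
      FROM pg_class
      WHERE relname IN ('lineitem', 'orders', 'partsupp')
      ORDER BY pg_total_relation_size(oid) DESC;"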

Test Procedure


  1. Prepare the test environment

  2. Download the TPC-H benchmark tool

    git clone https://github.com/ymatrix-data/TPC-H.git

Note!
The YMatrix TPC-H tool is open source and easy to use. Try it today.

  3. Run the tpch.sh script
  • Set database environment variables, including PORT and DATABASE:

     export PGPORT=5432
     export PGDATABASE=tpch_s100
  • Execute the tpch.sh script to generate the tpch_variable.sh configuration file. Use the -d parameter to specify the database type (e.g., matrixdb, greenplum, postgresql) and -s to define the data scale:

     ./tpch.sh -d matrixdb -s 100
  • After modifying the configuration file, run tpch.sh. This script automatically generates data, creates tables, loads data, executes all TPC-H queries, and records execution times:

     ./tpch.sh

Note!
During data loading, the tpch.sh script uses MatrixGate for YMatrix and gpfdist for GPDB.
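
For GPDB, the gpfdist load path amounts to serving the generated flat files and pulling them in through a readable external table. The commands below are a minimal sketch; the directory, port, and external table name are illustrative, and the real statements are issued by tpch.sh:

    # Serve the dbgen output directory over HTTP (path and port are illustrative)
    gpfdist -d /data/tpch/flat_files -p 8081 &

    # Define an external table over the files and load the target table
    psql -d tpch_s100 -c "
      CREATE READABLE EXTERNAL TABLE ext_lineitem (LIKE lineitem)
      LOCATION ('gpfdist://localhost:8081/lineitem.tbl*')
      FORMAT 'TEXT' (DELIMITER '|');
      INSERT INTO lineitem SELECT * FROM ext_lineitem;"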

  4. Key Parameters in the TPC-H Benchmark Tool

Customize the tpch_variable.sh configuration file to meet specific needs:

  • RUN_GEN_DATA="true": Generate data
  • RUN_DDL="true": Create tables and indexes
  • RUN_LOAD="true": Load data
  • RUN_SQL="true": Execute all TPC-H queries
  • PREHEATING_DATA="true": Specify warm-up rounds to cache data files
  • SINGLE_USER_ITERATIONS="2": If PREHEATING_DATA="true" is enabled, TPC-H queries run three times. The first is a warm-up; the minimum of the last two runs is recorded.
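
Put together, a full first run might use a tpch_variable.sh fragment like the one below. This is only a sketch covering the parameters listed above; the inline comments are explanatory additions, not part of the generated file:

    RUN_GEN_DATA="true"            # generate the flat files with dbgen
    RUN_DDL="true"                 # create the tables and indexes
    RUN_LOAD="true"                # load the generated data
    RUN_SQL="true"                 # execute the 22 TPC-H queries
    PREHEATING_DATA="true"         # warm-up round before the timed runs
    SINGLE_USER_ITERATIONS="2"     # two timed iterations; the minimum is recorded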

Example:

To repeat TPC-H query execution, modify the following settings in the tpch_variable.sh configuration file:

    RUN_COMPILE_TPCH="false"
    RUN_GEN_DATA="false"
    RUN_INIT="false"
    RUN_LOAD="false"
    RUN_SQL="true"
    RUN_SINGLE_USER_REPORT="true"

Then re-run tpch.sh.

Test Results


Multiple test runs were performed for both YMatrix and GPDB. The best result from each system was used for comparison.

  • For TPC-H 100x, YMatrix used lz4 compression, while GPDB used zstd (compresslevel=1), since open-source GPDB does not support quicklz.
  • For TPC-H 1,000x, both systems used zstd compression (compresslevel=1).
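
As an illustration of what these storage choices look like in GPDB DDL, here is an abbreviated sketch; the real table definitions, column lists, and distribution keys are generated by the TPC-H tool, and the YMatrix lz4 tables use that product's own storage options:

    # Append-optimized, column-oriented table with zstd level 1 (columns truncated)
    psql -d tpch_s100 -c "
      CREATE TABLE lineitem_zstd_demo (
        l_orderkey bigint,
        l_partkey  bigint,
        l_quantity numeric
      )
      WITH (appendonly = true, orientation = column,
            compresstype = zstd, compresslevel = 1)
      DISTRIBUTED BY (l_orderkey);"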

Performance results show that YMatrix significantly outperforms GPDB—more than 10x faster at both 100x and 1,000x scales.

| Data Scale | GPDB (ms) | YMatrix (ms) | Speedup |
|---|---|---|---|
| 100 | 930,071 | 70,044 | 13.3x |
| 1000 | 15,273,254 | 1,265,478 | 12.1x |

Detailed results are shown below. At the 100x scale, YMatrix was tested with both lz4 and zstd compression. Because the available memory (128GB) could fully cache the dataset, the lz4 format, which compresses less aggressively but decompresses faster, delivered the better performance.

GPDB was tested with the ORCA optimizer both enabled and disabled. ORCA degraded performance at the 100x scale but improved it slightly at 1,000x. All YMatrix tests were conducted with ORCA disabled.
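
Switching between the two GPDB configurations in the table below is a one-line change, since ORCA is controlled by the optimizer GUC. A session-level sketch (a cluster-wide change would instead go through gpconfig):

    # Fall back to the Postgres planner for this session, then run a query
    psql -d tpch_s100 -c "SET optimizer = off; EXPLAIN SELECT count(*) FROM lineitem;"
    # Re-enable ORCA
    psql -d tpch_s100 -c "SET optimizer = on; EXPLAIN SELECT count(*) FROM lineitem;"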

| Query (ms) | YMatrix TPC-H 100 planner lz4 | YMatrix TPC-H 100 planner zstd | GPDB TPC-H 100 ORCA zstd | GPDB TPC-H 100 planner zstd | YMatrix TPC-H 1000 planner zstd | GPDB TPC-H 1000 ORCA zstd | GPDB TPC-H 1000 planner zstd |
|---|---|---|---|---|---|---|---|
| Q01 | 4,200 | 4,846 | 94,271 | 90,473 | 53,291 | 929,166 | 907,474 |
| Q02 | 1,228 | 1,417 | 13,134 | 11,005 | 12,960 | 163,967 | 132,898 |
| Q03 | 4,409 | 4,860 | 31,654 | 32,057 | 50,194 | 406,933 | 456,339 |
| Q04 | 4,965 | 4,947 | 40,743 | 30,522 | 103,699 | 492,440 | 429,417 |
| Q05 | 4,405 | 5,226 | 43,100 | 40,094 | 88,930 | 787,161 | 569,668 |
| Q06 | 183 | 254 | 4,066 | 3,995 | 2,852 | 40,141 | 38,985 |
| Q07 | 1,865 | 2,219 | 29,294 | 28,879 | 29,921 | 340,402 | 402,481 |
| Q08 | 2,239 | 3,123 | 51,852 | 49,998 | 41,305 | 610,720 | 650,542 |
| Q09 | 7,012 | 8,229 | 84,506 | 91,597 | 248,033 | 1,072,529 | 1,719,890 |
| Q10 | 3,861 | 4,469 | 61,953 | 28,238 | 64,568 | 810,094 | 395,927 |
| Q11 | 470 | 569 | 5,937 | 10,010 | 6,475 | 54,006 | 97,012 |
| Q12 | 2,319 | 2,486 | 27,271 | 30,032 | 26,964 | 326,579 | 335,811 |
| Q13 | 4,610 | 4,458 | 34,345 | 26,018 | 72,861 | 631,285 | 651,340 |
| Q14 | 588 | 696 | 5,591 | 3,318 | 7,277 | 48,476 | 47,320 |
| Q15 | 1,310 | 1,249 | 9,579 | 12,001 | 31,236 | 93,387 | 172,448 |
| Q16 | 1,471 | 1,584 | 8,493 | 22,038 | 25,295 | 141,958 | 492,614 |
| Q17 | 1,613 | 1,960 | 154,488 | 143,057 | 28,158 | 3,299,179 | 3,272,970 |
| Q18 | 7,225 | 6,950 | 78,451 | 89,587 | 93,391 | 1,064,011 | 1,276,977 |
| Q19 | 3,225 | 4,173 | 22,224 | 21,027 | 40,080 | 217,796 | 208,500 |
| Q20 | 850 | 1,004 | 24,920 | 24,818 | 9,596 | 293,892 | 421,818 |
| Q21 | 10,219 | 10,529 | 149,483 | 128,112 | 205,788 | 2,427,732 | 2,420,413 |
| Q22 | 1,777 | 1,858 | 19,866 | 13,196 | 22,603 | 226,963 | 172,399 |
| SUM | 70,044 | 77,107 | 995,221 | 930,071 | 1,265,478 | 14,478,829 | 15,273,254 |