This test compared the performance of YMatrix and ClickHouse (CK for short) in single-table query analysis scenarios. The results show that query time on YMatrix was 24% and 37% lower than on CK at the two data scales tested.
SSB (Star Schema Benchmark) is a star-schema test set commonly used in the industry. It is a "cost-performance ratio" evaluation standard for analysis scenarios, derived from TPC-H. The SSB benchmark defines 13 query scenarios that examine the overall performance of an analytical database from different angles and dimensions.
As the community has developed, the original design of SSB no longer fits most of today's data warehouse usage patterns. A recent Apache Druid benchmark proposed an SSB variant based on the original version: the data from the original tables is joined into a single wide table, and the SSB test queries are then run against that wide table. The latest SSB benchmark results released by ClickHouse use the same approach.
| Machine | vCPU | RAM | Bandwidth | EBS |
| --- | --- | --- | --- | --- |
| AWS EC2, m5.8xlarge | 64 | 128GB | 10Gbps | gp3, iops = 3000, throughput = 125MB/s |
To reproduce ClickHouse's latest test results as closely as possible, YMatrix's SSB benchmark uses the same test environment and scenario as [ClickHouse](https://altinity.com/blog/clickhouse-nails-cost-efficiency-challenge-against-druid-rockset), that is, a single-node deployment, and uses the default deployment layout that YMatrix selects automatically. On the hardware described above, this gives 6 Segment nodes.
Operating system kernel: 3.10.0-1127.19.1.el7.x86_64
Operating system version: CentOS Linux release 7.8.2003
YMatrix: Enterprise edition, matrixdb-5.0.0.beta.2+enterprise-1. For cluster deployment, see the YMatrix official documentation and YMatrix's SSB benchmark tool.
ClickHouse: 22.7.2.15
YMatrix used the same SSB benchmark model as ClickHouse and tested two data scales, 100 times and 1000 times (the SSB data scale is derived from the TPC-H scale factor).
At 100 times, the wide table holds about 600 million rows; at 1000 times, it holds about 6 billion rows.
The ClickHouse result at 100 times the data scale is taken from ClickHouse's officially published data. No official result has been published at 1000 times the data scale, so that result was measured by YMatrix engineers on the same machine configuration.
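The row counts quoted above scale linearly with the scale factor. As a rough sketch, assuming about 6 million wide-table rows per unit of scale factor (a figure inferred from the counts above, not one reported by the tool), the expected sizes can be estimated like this. The function name `rows_for_scale` is made up for this illustration.

```shell
# Approximate wide-table row count for a given scale factor.
# Assumption: roughly 6,000,000 rows per unit of scale factor.
rows_for_scale() {
  echo $(( 6000000 * $1 ))
}

rows_for_scale 100     # 100 times data
rows_for_scale 1000    # 1000 times data
```

Actual row counts produced by the data generator may differ slightly from these round numbers.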
Prepare the test environment. As described in the "Hardware Environment" section above, initialize the virtual machine environment on AWS, then follow the [YMatrix official documentation](https://ymatrix.cn/doc/5.2/install/mx5_cluster/mx5_cluster) to install the test cluster.
Download the SSB benchmark tool
git clone https://github.com/ymatrix-data/ssb.git
Notes!
YMatrix's SSB tool is open source and easy to use, and you are welcome to try it out.
Check the environment
./validate_environment.sh
Generate test data. Passing -s 100 generates data at 100 times the base size.
./generate_data.sh -s 100
Import test data. The tool supports data sets of multiple sizes; -s 100 selects the generated 100 times data set and imports it into YMatrix. Two import methods are currently supported, MatrixGate and COPY. MatrixGate is the default and supports highly concurrent, distributed, streaming, and batch writes. To change the import method, specify it with -t.
./import_data.sh -s 100
Generate wide table data. YMatrix makes some necessary adjustments to the table structure and queries in the SSB benchmark. To build the wide table, run:
./generate_flat_table.sh -s 100
Perform SSB benchmarking
./ssb.sh -s 100
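The steps above can be chained into one script that stops at the first failure. This is a minimal sketch, not part of the SSB tool: it assumes it is run from inside the cloned `ssb` directory, and the names `SCALE`, `DRY_RUN`, `run_step`, and `run_all` are introduced here for illustration only. By default it only prints the commands; set `DRY_RUN=0` to execute them.

```shell
#!/usr/bin/env bash
# Sketch: run the SSB steps from this page in order, stopping on the first error.
# SCALE, DRY_RUN, run_step, and run_all are hypothetical names for this example.
set -euo pipefail

SCALE="${SCALE:-100}"      # value passed to -s
DRY_RUN="${DRY_RUN:-1}"    # 1 = print commands only, 0 = execute them

run_step() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

run_all() {
  run_step ./validate_environment.sh
  run_step ./generate_data.sh -s "$SCALE"
  run_step ./import_data.sh -s "$SCALE"
  run_step ./generate_flat_table.sh -s "$SCALE"
  run_step ./ssb.sh -s "$SCALE"
}

run_all
```

For the larger run, the same script could be invoked with `SCALE=1000 DRY_RUN=0`, provided the machine has enough disk space for the generated data.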
The final complete test results are as follows:
| Data Scale | ClickHouse (ms) | YMatrix (ms) | Improvement |
| --- | --- | --- | --- |
| 100 times data | 1112 | 840 | 24% |
| 1000 times data | 5794 | 3670 | 37% |
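The percentage column in the results table can be reproduced from the two timing columns if it is read as the relative reduction in query time, (ClickHouse − YMatrix) / ClickHouse. That reading is an interpretation, but it matches both rows. The helper name `improvement` below is made up for this check.

```shell
# Relative reduction in query time, rounded to a whole percentage.
# Arguments: baseline time in ms (ClickHouse), candidate time in ms (YMatrix).
improvement() {
  awk -v base="$1" -v cand="$2" \
    'BEGIN { printf "%.0f\n", (base - cand) / base * 100 }'
}

improvement 1112 840     # 100 times data
improvement 5794 3670    # 1000 times data
```

Before rounding, the two values are about 24.5% and 36.7%.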