MxBench is a data loading and query benchmarking tool for MatrixDB. It can quickly generate random data, automatically create tables, and perform serial or concurrent data loading and querying based on user-defined configurations such as number of devices, time range, and number of metrics. Users can flexibly configure metric types, collection frequency, null value ratio, randomness level, query thread count, and custom queries. MxBench can be run via command line or configuration file. The tool is located at bin/mxbench under the MatrixDB installation directory.
A running MatrixDB cluster is required.
Since MxBench calls commands such as createdb, gpconfig, and gpstop, users must set up the necessary environment variables to ensure these commands execute correctly.
Specifically, run source<MatrixDB安装目录>/greenplum_path.sh``, and properly set the following environment variables:
Additionally, test-run commands like createdb mxbench, gpconfig -s log_rotation_size, gpstop -rai to verify they execute successfully.
To quickly try MxBench on a local development machine, use either a configuration file or command-line interface.
Use the following configuration file, name it mxbench.conf, and run:
mxbench --config mxbench.conf.
Note: The benchmark-parallel parameter should match machine capability; it is recommended to set it less than or equal to the number of CPU cores.
[database]
db-database = "testdb1"
db-master-port = 5432
[global]
# Enable progress monitoring (default is true)
watch = true
# Directory to store generated DDL, GUC recommendations, queries, etc.
workspace = "/tmp/mxbench"
table-name = "table1"
# Number of devices
tag-num = 25000
# Metric data type; supported: int4, int8, float4, float8
metrics-type = "float8"
# Total number of metrics; if greater than 998, first 997 are simple columns,
# remaining stored in JSON format in a column named 'ext'
total-metrics-count = 300
# Start timestamp for generated data; ts-end must be later than ts-start
ts-start = "2022-04-19 09:00:00"
ts-end = "2022-04-19 09:03:00"
# Interval between metric collections (in seconds)
ts-step-in-second = "2"
[generator]
generator = "telematics"
[generator.telematics]
# Number of batches per device per timestamp uploaded to DB (merged into one tuple)
generator-batch-size = 1
# Percentage of delayed data (1–100), with timestamps moved back by 1 hour
generator-disorder-ratio = 0
# Null value ratio in generated data (1–100)
generator-empty-value-ratio = 90
# Data randomness level: OFF/S/M/L; default is "OFF"; M is suitable for general scenarios
generator-randomness = "OFF"
[writer]
writer = "stdin"
[benchmark]
benchmark = "telematics"
[benchmark.telematics]
# Array of query concurrency levels
benchmark-parallel = [8]
# Predefined queries: single-tag latest, 10-tag latest, single-tag detail
benchmark-run-query-names = ["SINGLE_TAG_DETAIL_QUERY"]
# Number of times each query runs per round, or duration in seconds
# To enable runtime, set run-times to 0:
benchmark-run-times = 0
benchmark-runtime-in-second = "30"
Alternatively, run MxBench using the command line. The following command is equivalent to running MxBench with the above configuration file:
mxbench run \
--db-database "testdb1" \
--db-master-host "localhost" \
--db-master-port 5432 \
--db-user "mxadmin" \
--workspace "/tmp/mxbench" \
--table-name "table1" \
--ts-start "2022-04-19 09:00:00" \
--ts-end "2022-04-19 09:02:00" \
--ts-step-in-second 2 \
--generator "telematics" \
--writer "stdin" \
--benchmark "telematics" \
--benchmark-run-query-names "SINGLE_TAG_DETAIL_QUERY" \
--benchmark-parallel 8 \
--benchmark-run-times 0 \
--benchmark-runtime-in-second 30
As shown in the example configuration file, the configuration is divided into the following sections:
Global Settings
Plugin-based Configurations
Below are detailed explanations of parameters in each module:
[database]
# Database name; MxBench creates it if it does not exist; no error if already exists
db-database = "postgres"
## Hostname where MatrixDB master instance resides
db-master-host = "localhost"
## Port number of MatrixDB master instance
db-master-port = 5432
# MatrixDB user password
db-password = ""
# MatrixDB username
db-user = "mxadmin"
[global]
# Directory for MxBench-generated files including DDL, GUC restore scripts,
# recommended GUC changes, CSV data files, and query files.
# MxBench creates this directory if it does not exist; errors if path exists but is not a directory.
# Permissions may need attention.
# Each MxBench run creates a subdirectory named with Unix timestamp under this path.
# Default: "/tmp/mxbench"
workspace = "/tmp/mxbench"
# Whether to dump CSV data files; options: true or false.
# Default: false — executes DDL, data loading, and queries.
# Under the timestamped subdirectory, generates:
# 1. mxbench_ddl.sql: DDL script;
# 2. mxbench_gucs_setup.sh: Recommended GUC settings script (may require restart to take effect);
# 3. mxbench_gucs_backup.sh: Backup of current system GUCs; not generated if identical to recommendations;
# 4. mxbench_<benchmark-plugin>_query.sql: Queries generated by the benchmark plugin.
# If set to true, DDL, loading, and queries are not executed; additionally generates:
# 5. mxbench_<generator-plugin>_data.csv: CSV data file from the generator plugin.
dump = false
# Whether to enable progress monitoring; options: true or false.
# (Default) If true, prints execution progress of writer and benchmark modules every 5 seconds.
# Set to false to disable.
watch = true
# Whether to run data loading and querying simultaneously; options: true or false.
# If true: Runs mixed workload of loading and querying.
# Queries loop until data loading finishes.
# (Default) If false: Loads data first, then runs queries.
simultaneous-loading-and-query = false
# Path to custom DDL file.
# (Default) If empty, DDL is auto-generated based on other configurations.
ddl-file-path = ""
# Type of metrics. Default: "float8" (double precision).
# Only supports "int4", "int8", "float4", "float8".
metrics-type = "float8"
# Schema name; default: "public".
schema-name = "public"
# Table name. Default: ""; must be manually set.
# If a table with the same name exists in the specified database and schema, MxBench exits with an error.
table-name = "test_table"
# Number of devices. Default: 25000.
tag-num = 25000
# Total number of metrics. Default: 300.
# If greater than 998, excess metrics are stored as JSON in a column named "ext".
# For example, if set to 1000, c0~c997 hold 998 simple columns,
# and the remaining two are stored as JSON in the "ext" column.
total-metrics-count = 300
# Start timestamp for generated data. Generated data may precede this due to delayed reporting.
ts-start = "2022-04-25 09:00:00"
# End timestamp for generated data. ts-end must be later than ts-start, otherwise error.
ts-end = "2022-04-25 09:01:00"
# Interval between metric collections (in seconds). Default: "1".
ts-step-in-second = "1"
Generates telematics scenario data. Use generator="telematics".
[generator]
generator = "telematics"
[generator.telematics]
# Number of batches per device per timestamp sent to MatrixDB.
# For example, if set to 5, data for device "tag1" at "2022-04-25 09:00:03"
# is split into 5 rows and merged into one tuple in DB.
# Default: 1 — no splitting.
generator-batch-size = 1
# Ratio of delayed reporting data. Value: 1–100.
# Default: 0 — no delayed data.
generator-disorder-ratio = 0
# Null value ratio per row. Value: 1–100. Default: 90% — 90% of metrics are NULL.
generator-empty-value-ratio = 90
# Data randomness level: OFF/S/M/L. Default: OFF.
generator-randomness = "OFF"
Reads data from CSV files. Use generator="file".
[generator]
generator = "file"
[generator.file]
# Absolute paths to CSV data files. Accepts an array for multiple files.
generator-file-paths = []
# Same as [generator.telematics]; setting these is optional,
# but helps generate better DDL. Has no effect if custom DDL is used.
generator-batch-size = 1
generator-empty-value-ratio = 90
Does not generate or load any data. Use generator="nil".
[generator]
generator = "nil"
Starts MxGate via HTTP and loads data.
[writer]
writer = "http"
[writer.http]
# Whether to use gzip compression when sending HTTP messages. Default: false.
writer-use-gzip = false
# Concurrency level when sending data to MxGate.
writer-parallel = 8
Starts MxGate via stdin and loads data.
[writer]
writer = "stdin"
Does not start MxGate or write data.
[writer]
writer = "nil"
[benchmark]
benchmark = "telematics"
[benchmark.telematics]
# List of telematics queries to execute sequentially.
# Available queries:
# 1. "SINGLE_TAG_LATEST_QUERY"
# 2. "MULTI_TAG_LATEST_QUERY"
# 3. "SINGLE_TAG_DETAIL_QUERY"
# These correspond to:
# "SINGLE_TAG_LATEST_QUERY": Get latest metric values for one device;
# "MULTI_TAG_LATEST_QUERY": Randomly select 10 devices and get their latest metric values;
# "SINGLE_TAG_DETAIL_QUERY": Get metric values for one device over a time range.
# Note: For very wide tables, retrieving all metrics at once may exceed DBMS limits.
# Parameters below allow tuning the number of metrics and time range for "SINGLE_TAG_DETAIL_QUERY".
# Example: [ "SINGLE_TAG_LATEST_QUERY", "MULTI_TAG_LATEST_QUERY", "SINGLE_TAG_DETAIL_QUERY" ] runs all three.
# Remove names to skip corresponding queries. Invalid names are ignored.
# Default: empty — no predefined queries executed.
benchmark-run-query-names = [ "SINGLE_TAG_DETAIL_QUERY" ]
# Custom queries, separated by commas.
# Example: ["SELECT COUNT(*) FROM t1", "SELECT MAX(ts) FROM t1"]
# Default: empty — no custom queries executed.
benchmark-custom-queries = []
# Query concurrency levels. Multiple values allowed, executed in order.
# Example: [1, 8] runs queries first with concurrency 1, then 8.
# Default: empty.
benchmark-parallel = [8]
# Number of times each query runs per concurrency level. Used for latency and TPS statistics.
# Default: 0.
benchmark-run-times = 0
# Duration (seconds) each query runs per concurrency level. Statistics based on executions within this period.
# Only effective when benchmark-run-times is 0. Default: 60 seconds.
benchmark-runtime-in-second = "60"
Set benchmark = "nil" to skip query execution.
[benchmark]
benchmark = "nil"
[database]
db-database = "testdb2"
db-master-port = 5432
[global]
# Enable progress monitoring (default is true)
watch = true
# Directory for generated DDL, GUC best practices, queries, etc.
workspace = "/home/mxadmin/mxbench/workspace"
# Whether to run data loading and querying simultaneously
simultaneous-loading-and-query = true
table-name = "table2"
# Number of devices
tag-num = 20000
# Metric data type; supports int4, int8, float4, float8
metrics-type = "float8"
# Number of metrics; if >998, first 997 are simple columns,
# rest stored as JSON in a column named "ext"
total-metrics-count = 5000
# Start timestamp for generated data; ts-end must be later than ts-start
ts-start = "2022-04-19 00:00:00"
ts-end = "2022-04-19 00:01:00"
[generator]
generator = "telematics"
[generator.telematics]
# Number of batches per device per timestamp; merged into one tuple in DB
generator-batch-size = 1
# Ratio of delayed data (1–100), timestamp shifted back by 1 hour
generator-disorder-ratio = 0
# Null value ratio in generated data (1–100)
generator-empty-value-ratio = 90
# Data randomness level: OFF/S/M/L; default: "OFF"
generator-randomness = "OFF"
[writer]
writer = "stdin"
[benchmark]
benchmark = "telematics"
[benchmark.telematics]
# Array of query concurrency levels
benchmark-parallel = [64]
# Provided queries: single-tag latest, 10-tag latest, single-tag detail
benchmark-run-query-names = [ "SINGLE_TAG_LATEST_QUERY", "MULTI_TAG_LATEST_QUERY", "SINGLE_TAG_DETAIL_QUERY" ]
# Number of runs or duration per query per round; set runs to 0 to enable duration:
benchmark-run-times = 0
benchmark-runtime-in-second = "60"
[database]
db-database = "testdb3"
db-master-port = 5432
[generator]
# Read data from CSV file
generator = "file"
[generator.file]
generator-file-paths = ["/home/mxadmin/mxbench/data.csv"]
[global]
table-name = "table3"
watch = true
workspace = "/home/mxadmin/mxbench/workspace"
ddl-file-path = "/home/mxadmin/mxbench/ddl.sql"
[writer]
writer = "stdin"
[benchmark]
benchmark = "nil"
Running MxBench with Example Config 1 is equivalent to:
mxbench run \
--db-database "testdb2" \
--db-master-port 5432 \
--db-user "mxadmin" \
--workspace "/home/mxadmin/mxbench/workspace" \
--simultaneous-loading-and-query \
--table-name "table2" \
--tag-num 25000 \
--metrics-type "float8" \
--total-metrics-count 5000 \
--ts-start "2022-04-19 00:00:00" \
--ts-end "2022-04-19 00:01:00" \
--generator "telematics" \
--generator-batch-size 1 \
--generator-disorder-ratio 0 \
--generator-empty-value-ratio 90 \
--generator-randomness "OFF" \
--writer "stdin" \
--benchmark "telematics" \
--benchmark-run-query-names "SINGLE_TAG_LATEST_QUERY" \
--benchmark-run-query-names "MULTI_TAG_LATEST_QUERY" \
--benchmark-run-query-names "SINGLE_TAG_DETAIL_QUERY" \
--benchmark-parallel 64 \
--benchmark-run-times 0 \
--benchmark-runtime-in-second 60
Running MxBench with Example Config 2 is equivalent to:
mxbench run \
--db-database "testdb3" \
--db-master-port 5432 \
--workspace "/home/mxadmin/mxbench/workspace" \
--ddl-file-path "/home/mxadmin/mxbench/ddl.sql" \
--table-name "table3" \
--generator "file" \
--generator-file-paths "/home/mxadmin/mxbench/data.csv" \
--writer "stdin" \
--benchmark "nil"
Load only, no query
Set benchmark = "nil".
Query only, no load
Set generator = "nil".
Run loading and querying simultaneously
Set simultaneous-loading-and-query = true in global settings.
Generate and dump CSV data file
Set dump = true in global settings. Output file is in <unix-timestamp> subdirectory under workspace.
View generated DDL and queries
Check the <unix-timestamp> subdirectory under the workspace directory.
Use custom DDL
Specify the absolute path to the DDL file in ddl-file-path under global settings.
Run custom queries
Specify custom queries in benchmark-custom-queries under telematics benchmark, enclosed in quotes. Parameter randomization is not supported.
Skip recommended GUCs and retain existing ones
When MxBench detects differences between current and recommended GUCs, it prompts and asks whether to reset GUCs and restart the database. Enter "N" to keep original GUCs. MxBench will then confirm whether to proceed. Enter "Y" to continue.
What are the parameter validity requirements?
In global settings:
ts-end must be later than ts-start;table-name and schema-name must not be empty;tag-num must be greater than 0;ts-step-in-second must not be 0.Example:
● Stdin Writer Report
● period start: 2022-04-29 10:08:11, end: 2022-04-29 10:08:16, period: 5.000129734s
● count written in total: 637025 rows/ 1500000 rows 42%, 637025 rows in this period
● size written in total: 160878718 bytes/ 360000003 bytes 44%, 160878718 bytes in this period
● size written to MxGate in total: 350075440 bytes, 350075440 bytes in this period
● Telematics Benchmark Report
● stats for query SINGLE_TAG_LATEST_QUERY with parallel 8: progress: 100%
● stats for query MULTI_TAG_LATEST_QUERY with parallel 8: progress: 43%
Explanation: Consists of two parts — writer and benchmark progress reports.
Execution progress of a query at a given concurrency level (parallel). When queries run concurrently with data loading, queries may loop multiple times before loading completes. This report shows progress of the most recent iteration only.
--- SPLIT ---
3.2.1 writer
┌───────────────────────────────────────────────────────┐
│ Summary Report for Stdin Writer │
├─────────────────────────────────┬─────────────────────┤
│ start time: │ 2022-04-27 13:29:01 │
├─────────────────────────────────┼─────────────────────┤
│ stop time: │ 2022-04-27 13:29:58 │
├─────────────────────────────────┼─────────────────────┤
│ size written to MxGate (bytes): │ 848333400 │
├─────────────────────────────────┼─────────────────────┤
│ lines inserted: │ 1500000 │
├─────────────────────────────────┼─────────────────────┤
│ compress ratio: │ 1.561276 : 1 │
└─────────────────────────────────┴─────────────────────┘
For each query, a report is generated per parallel setting and printed in real time.
┌─────────────────┬───────────────┐
│ Overall Latency │ 29.948370582s │
│ Average Latency │ 13.723304ms │
│ P75 Latency │ 14.35972ms │
│ P50 Latency │ 13.654556ms │
│ P25 Latency │ 12.925912ms │
│ TPS │ 582 │
└─────────────────┴───────────────┘
Summary report:
┌───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ Summary Report for Telematics Benchmark │
├─────────────────────┬─────────────────────────────────────┬─────────────────────────────────────┬─────────────────────────────────────┤
│ Parallel\Query Name │ SINGLE_TAG_LATEST_QUERY │ MULTI_TAG_LATEST_QUERY │ SINGLE_TAG_DETAIL_QUERY │
├─────────────────────┼─────────────────────────────────────┼─────────────────────────────────────┼─────────────────────────────────────┤
│ 8 │ ┌─────────────────┬───────────────┐ │ ┌─────────────────┬───────────────┐ │ ┌─────────────────┬───────────────┐ │
│ │ │ Overall Latency │ 30.008625133s │ │ │ Overall Latency │ 36.407016561s │ │ │ Overall Latency │ 29.948370582s │ │
│ │ │ Average Latency │ 23.81758ms │ │ │ Average Latency │ 7.270212105s │ │ │ Average Latency │ 13.723304ms │ │
│ │ │ P75 Latency │ 24.911171ms │ │ │ P75 Latency │ 8.139682559s │ │ │ P75 Latency │ 14.35972ms │ │
│ │ │ P50 Latency │ 23.411248ms │ │ │ P50 Latency │ 7.161237157s │ │ │ P50 Latency │ 13.654556ms │ │
│ │ │ P25 Latency │ 20.899622ms │ │ │ P25 Latency │ 6.642459805s │ │ │ P25 Latency │ 12.925912ms │ │
│ │ │ TPS │ 335 │ │ │ TPS │ 1 │ │ │ TPS │ 582 │ │
│ │ └─────────────────┴───────────────┘ │ └─────────────────┴───────────────┘ │ └─────────────────┴───────────────┘ │
│ │ progress: 100% │ progress: 100% │ progress: 100% │
└─────────────────────┴─────────────────────────────────────┴─────────────────────────────────────┴─────────────────────────────────────┘