MxBench is a data loading and query benchmarking tool for MatrixDB. It can quickly generate random data, automatically create tables, and perform serial or concurrent data loading and querying based on user-defined configurations such as number of devices, time range, and number of metrics. Users can flexibly configure metric types, collection frequency, null value ratio, randomness level, query thread count, and custom query statements. MxBench can be run via command line or configuration file. The tool is located in the bin/mxbench directory of the MatrixDB installation.
A running MatrixDB cluster is required.
Since MxBench calls commands such as createdb, gpconfig, and gpstop, users must set up the necessary environment variables to ensure these commands execute correctly.
Specifically, run source<MatrixDB安装目录>/greenplum_path.sh``, and properly set the following environment variables:
Additionally, test-run commands like createdb mxbench, gpconfig -s log_rotation_size, gpstop -rai to confirm they execute successfully.
To quickly try MxBench on a local development machine, use either a configuration file or command-line interface.
Use the following configuration file, name it mxbench.conf, and run:
mxbench --config mxbench.conf.
Note: The benchmark-parallel parameter should match your machine's capabilities; it is recommended to set it less than or equal to the number of CPU cores.
[database]
db-database = "testdb1"
db-master-port = 5432
[global]
# Enable progress monitoring (default is true)
watch = true
# Directory for generated DDL, GUC recommendations, queries, etc.
workspace = "/tmp/mxbench"
table-name = "table1"
# Number of devices
tag-num = 25000
# Metric data type; supports int4, int8, float4, float8
metrics-type = "float8"
# Total number of metrics; if >998, first 997 are simple columns,
# remaining stored as JSON in a column named 'ext'
total-metrics-count = 300
# Data generation start timestamp; ts-end must be later than ts-start
ts-start = "2022-04-19 09:00:00"
ts-end = "2022-04-19 09:03:00"
# Metrics collected every N seconds
ts-step-in-second = "2"
[generator]
generator = "telematics"
[generator.telematics]
# Number of rows per device per timestamp uploaded; merged into one tuple in DB
generator-batch-size = 1
# Percentage of out-of-order data (1–100), with timestamps moved back by 1 hour
generator-disorder-ratio = 0
# Null value ratio in generated data (1–100)
generator-empty-value-ratio = 90
# Data randomness level: OFF/S/M/L; default is "OFF"; M is suitable for general scenarios
generator-randomness = "OFF"
[writer]
writer = "stdin"
[benchmark]
benchmark = "telematics"
[benchmark.telematics]
# Array of query concurrency levels
benchmark-parallel = [8]
# Provided queries: single-tag latest, 10-tag latest, single-tag detail
benchmark-run-query-names = ["SINGLE_TAG_DETAIL_QUERY"]
# Number of times or duration each query runs per round; set count to 0 to enable time-based execution
benchmark-run-times = 0
benchmark-runtime-in-second = "30"
Alternatively, run MxBench using the command line. The following command is equivalent to running MxBench with the above configuration file:
mxbench run \
--db-database "testdb1" \
--db-master-host "localhost" \
--db-master-port 5432 \
--db-user "mxadmin" \
--workspace "/tmp/mxbench" \
--table-name "table1" \
--ts-start "2022-04-19 09:00:00" \
--ts-end "2022-04-19 09:02:00" \
--ts-step-in-second 2 \
--generator "telematics" \
--writer "stdin" \
--benchmark "telematics" \
--benchmark-run-query-names "SINGLE_TAG_DETAIL_QUERY" \
--benchmark-parallel 8 \
--benchmark-run-times 0 \
--benchmark-runtime-in-second 30
As shown in the example configuration file, the configuration is divided into the following sections:
Global configurations:
database: Database-related settings.global: Table structure, data volume information, and administrative options such as whether to dump data files or actually run tests, and whether to enable progress monitoring.Plugin-based configurations:
generator: Data generator:
telematics (default): Generates data for telematics scenarios.file: Reads data from files.nil: Does not generate any data.writer: Writes generated data to MatrixDB via MxGate:
http (default): Starts MxGate via HTTP and loads data.stdin: Starts MxGate via stdin and loads data.nil: Does not start MxGate or write data.benchmark: Plugin-based module to generate and execute query statements:
telematics (default): Generates and executes common queries in telematics scenarios; supports custom queries.nil: Does not generate or execute any queries.Below are detailed explanations of parameters in each module.
[database]
# Database name; MxBench creates it if it does not exist; no error if already exists
db-database = "postgres"
## Hostname of the MatrixDB master instance
db-master-host = "localhost"
## Port number of the MatrixDB master instance
db-master-port = 5432
# Password for MatrixDB user
db-password = ""
# MatrixDB username
db-user = "mxadmin"
[global]
# Directory for MxBench-generated files including DDL, GUC restore scripts,
# recommended GUC changes, CSV data files, and query files.
# MxBench creates this directory if it does not exist; errors if it exists but is not a directory.
# Permissions may need attention.
# Each MxBench run creates a subdirectory named with a Unix timestamp under this path.
# Default is "/tmp/mxbench"
workspace = "/tmp/mxbench"
# Whether to dump CSV data files; options are true or false.
# Default is false: executes DDL, data loading, and queries.
# In the Unix-timestamp-named subdirectory under workspace, generates:
# 1. mxbench_ddl.sql: DDL script
# 2. mxbench_gucs_setup.sh: Recommended GUC settings script (may require restart to take effect)
# 3. mxbench_gucs_backup.sh: Backup of current system GUCs (not generated if same as recommendations)
# 4. mxbench_<benchmark-plugin>_query.sql: Queries generated by the benchmark plugin
# If set to true, does not execute DDL, loading, or queries; additionally generates:
# 5. mxbench_<generator-plugin>_data.csv: CSV data file from the generator plugin
dump = false
# Whether to enable progress monitoring; options are true or false.
# (Default) If true, prints progress of writer and benchmark modules every 5 seconds.
# Set to false to disable.
watch = true
# Whether to run data loading and querying simultaneously; options are true or false.
# If true: runs mixed workload of loading and querying.
# Queries loop until data loading finishes.
# (Default) If false: runs data loading first, then queries.
simultaneous-loading-and-query = false
# Path to a custom DDL file, if needed.
# (Default) If empty, MxBench generates DDL based on other configurations.
ddl-file-path = ""
# Type of metrics. Default is "float8" (double precision).
# Only supports "int4", "int8", "float4", "float8".
metrics-type = "float8"
# Schema name; default is "public".
schema-name = "public"
# Table name. Default is ""; must be manually set.
# If a table with the same name exists in the specified database and schema, MxBench exits with an error.
table-name = "test_table"
# Number of devices. Default is 25000.
tag-num = 25000
# Total number of metrics. Default is 300.
# If greater than 998, excess metrics are stored as JSON in a column named "ext".
# For example, if set to 1000, c0–c997 store 998 metrics as simple columns,
# and the remaining two are stored as JSON in the "ext" column.
total-metrics-count = 300
# Start timestamp for generated data. Generated data may be earlier due to delayed reporting.
ts-start = "2022-04-25 09:00:00"
# End timestamp for generated data. ts-end must be later than ts-start, otherwise an error occurs.
ts-end = "2022-04-25 09:01:00"
# Interval in seconds between metric collections. Default is "1".
ts-step-in-second = "1"
Generates telematics scenario data. Use generator="telematics".
[generator]
generator = "telematics"
[generator.telematics]
# Number of rows per device per timestamp sent to MatrixDB.
# For example, if set to 5, data for device "tag1" at "2022-04-25 09:00:03"
# is split into 5 rows and upserted into one tuple in the database.
# Default is 1 (no splitting).
generator-batch-size = 1
# Ratio of out-of-order (delayed) data. Value from 1 to 100.
# Default is 0 (no delayed data).
generator-disorder-ratio = 0
# Null value ratio per row. Value from 1 to 100. Default is 90 (90% nulls).
generator-empty-value-ratio = 90
# Data randomness level: OFF/S/M/L. Default is OFF.
generator-randomness = "OFF"
Reads data from CSV files. Use generator="file".
[generator]
generator = "file"
[generator.file]
# Absolute paths to CSV data files. Accepts an array for multiple files.
generator-file-paths = []
# Same as [generator.telematics]; setting these is optional,
# but proper values help generate better DDL. Has no effect if custom DDL is used.
generator-batch-size = 1
generator-empty-value-ratio = 90
Does not generate or load any data. Use generator="nil".
[generator]
generator = "nil"
Starts MxGate via HTTP and loads data.
[writer]
writer = "http"
[writer.http]
# Whether to use gzip compression when sending HTTP messages. Default is false.
writer-use-gzip = false
# Concurrency level when sending data to MxGate.
writer-parallel = 8
Starts MxGate via stdin and loads data.
[writer]
writer = "stdin"
Does not start MxGate or write data.
[writer]
writer = "nil"
[benchmark]
benchmark = "telematics"
[benchmark.telematics]
# List of query names to execute sequentially from the telematics plugin.
# Available queries:
# 1. "SINGLE_TAG_LATEST_QUERY"
# 2. "MULTI_TAG_LATEST_QUERY"
# 3. "SINGLE_TAG_DETAIL_QUERY"
# These correspond to:
# "SINGLE_TAG_LATEST_QUERY": Get latest metric values for one device.
# "MULTI_TAG_LATEST_QUERY": Randomly select 10 devices and get their latest metric values.
# "SINGLE_TAG_DETAIL_QUERY": Get metric values for one device over a time range.
# Note: For very wide tables, retrieving all metrics at once may not be supported.
# Parameters below allow tuning the number of metrics and time range for "SINGLE_TAG_DETAIL_QUERY".
# Example: [ "SINGLE_TAG_LATEST_QUERY", "MULTI_TAG_LATEST_QUERY", "SINGLE_TAG_DETAIL_QUERY" ] runs all three.
# Remove names to skip corresponding queries. Invalid names are ignored.
# Default is empty (no queries executed).
benchmark-run-query-names = [ "SINGLE_TAG_DETAIL_QUERY" ]
# Custom queries, separated by commas.
# Example: ["SELECT COUNT(*) FROM t1", "SELECT MAX(ts) FROM t1"]
# Default is empty (no custom queries).
benchmark-custom-queries = []
# Query concurrency levels; multiple values trigger sequential runs at each level, separated by commas.
# Example: [1, 8] runs queries first at concurrency 1, then at 8.
# Default is empty.
benchmark-parallel = [8]
# Number of times each query runs at each concurrency level; used for latency and TPS statistics.
# Default is 0.
benchmark-run-times = 0
# Duration (in seconds) each query runs at each concurrency level; used for latency and TPS statistics.
# Only effective when benchmark-run-times is 0. Default is 60.
benchmark-runtime-in-second = "60"
Set benchmark = "nil" if no queries should be executed.
[benchmark]
benchmark = "nil"
[database]
db-database = "testdb2"
db-master-port = 5432
[global]
# Enable progress monitoring (default is true)
watch = true
# Directory for generated DDL, GUC best practices, queries, etc.
workspace = "/home/mxadmin/mxbench/workspace"
# Whether to run data loading and querying simultaneously
simultaneous-loading-and-query = true
table-name = "table2"
# Number of devices
tag-num = 20000
# Metric data type; supports int4, int8, float4, float8
metrics-type = "float8"
# Number of metrics; if >998, first 997 are simple columns,
# rest stored as JSON in a column named "ext"
total-metrics-count = 5000
# Data generation start timestamp; ts-end must be later than ts-start
ts-start = "2022-04-19 00:00:00"
ts-end = "2022-04-19 00:01:00"
[generator]
generator = "telematics"
[generator.telematics]
# Number of rows per device per timestamp; merged into one tuple in DB
generator-batch-size = 1
# Percentage of delayed data (1–100), timestamp shifted back by 1 hour
generator-disorder-ratio = 0
# Null value ratio in generated data (1–100)
generator-empty-value-ratio = 90
# Data randomness level: OFF/S/M/L; default is "OFF"
generator-randomness = "OFF"
[writer]
writer = "stdin"
[benchmark]
benchmark = "telematics"
[benchmark.telematics]
# Array of query concurrency levels
benchmark-parallel = [64]
# Provided queries: single-tag latest, 10-tag latest, single-tag detail
benchmark-run-query-names = [ "SINGLE_TAG_LATEST_QUERY", "MULTI_TAG_LATEST_QUERY", "SINGLE_TAG_DETAIL_QUERY" ]
# Number of times or duration each query runs per round; set count to 0 to enable time-based execution
benchmark-run-times = 0
benchmark-runtime-in-second = "60"
[database]
db-database = "testdb3"
db-master-port = 5432
[generator]
# Read data from CSV file
generator = "file"
[generator.file]
generator-file-paths = ["/home/mxadmin/mxbench/data.csv"]
[global]
table-name = "table3"
watch = true
workspace = "/home/mxadmin/mxbench/workspace"
ddl-file-path = "/home/mxadmin/mxbench/ddl.sql"
[writer]
writer = "stdin"
[benchmark]
benchmark = "nil"
Running MxBench with Example Config 1 is equivalent to:
mxbench run \
--db-database "testdb2" \
--db-master-port 5432 \
--db-user "mxadmin" \
--workspace "/home/mxadmin/mxbench/workspace" \
--simultaneous-loading-and-query \
--table-name "table2" \
--tag-num 25000 \
--metrics-type "float8" \
--total-metrics-count 5000 \
--ts-start "2022-04-19 00:00:00" \
--ts-end "2022-04-19 00:01:00" \
--generator "telematics" \
--generator-batch-size 1 \
--generator-disorder-ratio 0 \
--generator-empty-value-ratio 90 \
--generator-randomness "OFF" \
--writer "stdin" \
--benchmark "telematics" \
--benchmark-run-query-names "SINGLE_TAG_LATEST_QUERY" \
--benchmark-run-query-names "MULTI_TAG_LATEST_QUERY" \
--benchmark-run-query-names "SINGLE_TAG_DETAIL_QUERY" \
--benchmark-parallel 64 \
--benchmark-run-times 0 \
--benchmark-runtime-in-second 60
Running MxBench with Example Config 2 is equivalent to:
mxbench run \
--db-database "testdb3" \
--db-master-port 5432 \
--workspace "/home/mxadmin/mxbench/workspace" \
--ddl-file-path "/home/mxadmin/mxbench/ddl.sql" \
--table-name "table3" \
--generator "file" \
--generator-file-paths "/home/mxadmin/mxbench/data.csv" \
--writer "stdin" \
--benchmark "nil"
Load only, no query
Set benchmark = "nil".
Query only, no load
Set generator = "nil".
Run loading and querying simultaneously
Set simultaneous-loading-and-query = true in the global section.
Generate and dump CSV data file
Set dump = true in global; output file is in <unix-timestamp> subdirectory under workspace.
View generated DDL and queries
Check the <unix-timestamp> subdirectory under the workspace directory.
Use custom DDL
Specify the absolute path to the DDL file in ddl-file-path under global.
Run custom queries
Add custom queries in benchmark-custom-queries under telematics benchmark, enclosed in quotes. Parameter randomization is not supported.
Do not apply recommended GUCs; keep existing GUCs
When MxBench detects differences between current and recommended GUCs, it prompts whether to reset GUCs and restart. Enter "N" to keep original GUCs. MxBench will then confirm whether to proceed. Enter "Y" to continue.
What are the parameter validity requirements?
In global configuration:
ts-end must be later than ts-start.table-name and schema-name must not be empty.tag-num must be greater than 0.ts-step-in-second must not be 0.Example:
● Stdin Writer Report
● period start: 2022-04-29 10:08:11, end: 2022-04-29 10:08:16, period: 5.000129734s
● count written in total: 637025 rows/ 1500000 rows 42%, 637025 rows in this period
● size written in total: 160878718 bytes/ 360000003 bytes 44%, 160878718 bytes in this period
● size written to MxGate in total: 350075440 bytes, 350075440 bytes in this period
● Telematics Benchmark Report
● stats for query SINGLE_TAG_LATEST_QUERY with parallel 8: progress: 100%
● stats for query MULTI_TAG_LATEST_QUERY with parallel 8: progress: 43%
Explanation: The report consists of two parts—writer and benchmark progress.
period start, period end, period: Start time, end time, and duration of the reporting window.count written: Total rows written vs. expected, percentage complete, and rows written in this period.size written: Total bytes written vs. expected, percentage complete, and bytes written in this period.size written to MxGate: Bytes sent to MxGate after text conversion, and bytes sent in this period.Execution progress of a query at a given concurrency level (parallel). When queries run concurrently with data loading, they may loop multiple times before loading completes. This report shows progress for the most recent iteration only.
--- SPLIT ---
[TRANSLATION FAILED] 3.2.1 writer
┌───────────────────────────────────────────────────────┐
│ Summary Report for Stdin Writer │
├─────────────────────────────────┬─────────────────────┤
│ start time: │ 2022-04-27 13:29:01 │
├─────────────────────────────────┼─────────────────────┤
│ stop time: │ 2022-04-27 13:29:58 │
├─────────────────────────────────┼─────────────────────┤
│ size written to MxGate (bytes): │ 848333400 │
├─────────────────────────────────┼─────────────────────┤
│ lines inserted: │ 1500000 │
├─────────────────────────────────┼─────────────────────┤
│ compress ratio: │ 1.561276 : 1 │
└─────────────────────────────────┴─────────────────────┘
每条query,在每个parallel参数下都会产生一个报告,会实时打印出来。
┌─────────────────┬───────────────┐
│ Overall Latency │ 29.948370582s │
│ Average Latency │ 13.723304ms │
│ P75 Latency │ 14.35972ms │
│ P50 Latency │ 13.654556ms │
│ P25 Latency │ 12.925912ms │
│ TPS │ 582 │
└─────────────────┴───────────────┘
汇总报告:
┌───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ Summary Report for Telematics Benchmark │
├─────────────────────┬─────────────────────────────────────┬─────────────────────────────────────┬─────────────────────────────────────┤
│ Parallel\Query Name │ SINGLE_TAG_LATEST_QUERY │ MULTI_TAG_LATEST_QUERY │ SINGLE_TAG_DETAIL_QUERY │
├─────────────────────┼─────────────────────────────────────┼─────────────────────────────────────┼─────────────────────────────────────┤
│ 8 │ ┌─────────────────┬───────────────┐ │ ┌─────────────────┬───────────────┐ │ ┌─────────────────┬───────────────┐ │
│ │ │ Overall Latency │ 30.008625133s │ │ │ Overall Latency │ 36.407016561s │ │ │ Overall Latency │ 29.948370582s │ │
│ │ │ Average Latency │ 23.81758ms │ │ │ Average Latency │ 7.270212105s │ │ │ Average Latency │ 13.723304ms │ │
│ │ │ P75 Latency │ 24.911171ms │ │ │ P75 Latency │ 8.139682559s │ │ │ P75 Latency │ 14.35972ms │ │
│ │ │ P50 Latency │ 23.411248ms │ │ │ P50 Latency │ 7.161237157s │ │ │ P50 Latency │ 13.654556ms │ │
│ │ │ P25 Latency │ 20.899622ms │ │ │ P25 Latency │ 6.642459805s │ │ │ P25 Latency │ 12.925912ms │ │
│ │ │ TPS │ 335 │ │ │ TPS │ 1 │ │ │ TPS │ 582 │ │
│ │ └─────────────────┴───────────────┘ │ └─────────────────┴───────────────┘ │ └─────────────────┴───────────────┘ │
│ │ progress: 100% │ progress: 100% │ progress: 100% │
└─────────────────────┴─────────────────────────────────────┴─────────────────────────────────────┴─────────────────────────────────────┘