Basic Query

MatrixDB is developed based on PostgreSQL, so it supports the standard SELECT statements provided by PostgreSQL for querying, including complex statements such as WHERE, GROUP BY, ORDER BY, JOIN, etc.

The following is an example of how to use advanced analysis using statistics disk usage.

To be simplified, the amount of disk information we want to count only includes read and write speed, that is: read and write.

The index table pattern is as follows:

CREATE TABLE disk(
    time timestamp with time zone,
    tag_id int,
    read float,
    write float
)
Distributed by (tag_id);

In addition, create a disk_tags table to record the meta information related to the disk, including the operating system, host name, and area to which it belongs:

CREATE TABLE disk_tags(
    tag_id serial,
    os text,
    hostname text,
    region text
)
Distributed replicated;

1. Filter

Use the WHERE clause to filter the results:

  • Query records with read and write speeds greater than 10:

    SELECT * FROM disk WHERE read > 10 AND write > 10;

2. Sort

Use the ORDER BY clause to select columns for sorting, and the LIMIT clause can limit the number of result entries:

  • Query nearly 100 pieces of data:

    SELECT * FROM disk ORDER BY time DESC LIMIT 100;

3. Gather

MatrixDB provides the following basic aggregation functions:

  1. COUNT: The number of valid values
  2. SUM: Accumulated sum
  3. AVG: Average value of a column
  4. MIN: The minimum value of a column
  5. MAX: Maximum value of a column
  • Total data in 12 hours of query:

    SELECT COUNT(*) FROM disk WHERE time > NOW() - INTERVAL '12 hours';

  • Query the maximum read and write volume of a device with tag_id of 1: SELECT MAX(read), MAX(write) FROM disk WHERE tag_id = 1;

4. Grouping

Use the GROUP BY clause to group by columns, and for non-grouped columns, you need to use an aggregation function:

  • Query the total number of devices on each zone and host:

    SELECT region, hostname, COUNT(*) FROM disk_tags GROUP BY region, hostname;

5. Connect

Use the JOIN clause and specify the connection key with USING to perform multi-table join operations:

  • Query nearly 100 data and their host names:

    SELECT disk_tags.hostname, disk.* FROM disk JOIN disk_tags USING (tag_id) ORDER BY disk.time DESC LIMIT 100;