Quick onboard
Deployment
Data Modeling
Connecting
Migration
Query
Operations and Maintenance
Common Maintenance
Partition
Backup and Restore
Expansion
Mirroring
Resource Management
Security
Monitoring
Performance Tuning
Troubleshooting
Reference Guide
Tool guide
Data type
Storage Engine
Executor
Stream
DR (Disaster Recovery)
Configuration
Index
Extension
SQL Reference
This document describes the continuous aggregation feature in YMatrix.
Aggregate queries are common in real-world scenarios. Since they typically operate on large datasets and perform grouping and summarization, they are generally considered relatively expensive in terms of execution time and are often used in analytical workloads. In practice, aggregate queries are being executed more frequently, with increasing demands for lower response times.
Continuous aggregation is a mechanism designed to accelerate the response to such queries. Once a continuous aggregation view is created, the system automatically performs real-time aggregation during data ingestion. Because the aggregation results are transactionally synchronized with the source table, users are advised to query the continuous aggregation view directly whenever possible.
Continuous aggregation depends on the matrixts extension. First, create the extension:
=# CREATE EXTENSION matrixts;
The data table refers to the original source table:
=# CREATE TABLE metrics(
time timestamp,
tag_id int,
sensor float4
)
USING MARS3
DISTRIBUTED BY (tag_id)
ORDER BY (time,tag_id);
Continuous aggregation uses materialized views to store aggregated results. Use the WITH (CONTINUOUS) keyword when creating the view, along with the aggregation statement:
=# CREATE VIEW cv1 WITH (CONTINUOUS) AS
SELECT tag_id, COUNT(*), SUM(sensor)
FROM metrics GROUP BY tag_id;
=# INSERT INTO metrics VALUES(NOW(), 1, 1.1);
=# INSERT INTO metrics VALUES(NOW(), 1, 1.2);
=# INSERT INTO metrics VALUES(NOW(), 2, 2.1);
=# INSERT INTO metrics VALUES(NOW(), 2, 2.2);
With continuous aggregation enabled, the materialized view is automatically updated as data is inserted into the source table:
=# SELECT * FROM cv1 ORDER BY tag_id;
tag_id | count | sum
--------+-------+-----------
1 | 2 | 2.3000002
2 | 2 | 4.3
(2 rows)
The result from the materialized view matches that of aggregating the source table directly:
=# SELECT tag_id, COUNT(*), SUM(sensor)
FROM metrics
GROUP BY tag_id
ORDER BY tag_id;
tag_id | count | sum
--------+-------+-----------
1 | 2 | 2.3000002
2 | 2 | 4.3
(2 rows)
If the source table already contains data when the continuous aggregation view is created, the existing data is not synchronized by default. To enable synchronization, set the populate parameter to true explicitly during view creation (default is false):
=# CREATE VIEW cv1 WITH (CONTINUOUS, POPULATE=false) AS
SELECT tag_id, COUNT(*), SUM(sensor)
FROM metrics GROUP BY tag_id;
YMatrix provides UDFs (User Defined Functions) for maintaining continuous aggregation views:
matrixts_internal.analyze_continuous_view(rel REGCLASS)
Performs ANALYZE on the materialized view to update statistics, which helps the optimizer generate better execution plans.
SELECT matrixts_internal.analyze_continuous_view('cv1');
matrixts_internal.vacuum_continuous_view(rel REGCLASS, full BOOL)
VACUUMs the materialized view to reclaim space occupied by dead tuples and improve space reuse. If the second parameter is set to true, the table file is rebuilt to reduce storage footprint.
SELECT matrixts_internal.vacuum_continuous_view('cv1', true);
matrixts_internal.rebuild_continuous_view(rel REGCLASS)
Rebuilds the materialized view. This is useful when updates or deletions have been made to the source table and a full resynchronization is required.
SELECT matrixts_internal.rebuild_continuous_view('cv1');
Is the data in the materialized view synchronized with the source table in real time?
Does continuous aggregation guarantee ACID (Atomicity, Consistency, Isolation, Durability)?
How are UPDATEs and DELETEs handled?
last_not_null aggregate function for non-null latest values are supported.Does continuous aggregation support window functions?
Does continuous aggregation support partitioned tables?
Can continuous aggregation perform aggregation across multiple tables?
Can multiple continuous aggregation views be defined on a single table?
Does continuous aggregation apply compression parameters by default?