The Domino streaming computation engine is a streaming computing module introduced in YMatrix 6.0. It incrementally maintains computation results to improve the efficiency of data analysis and query performance.
The current database version defaults to Domino v1. Domino v2 is available as an experimental component in YMatrix 6.3.X; enabling it requires creating an extension and setting additional GUCs, as shown in the steps below.
```shell
# For version 6.3, manually enable Domino v2 support; a restart is required for the change to take effect
gpconfig -c domino.enabled -v on
gpstop -ar
```

```sql
-- Create the extension
CREATE EXTENSION domino;

-- 1. Set the stream version at session level; streams created in this session will be v2
SET mx_stream_default_version TO 2;

-- Verify the version of a stream
SELECT version FROM mx_stream WHERE streamrelid = 'your_stream_name'::regclass::oid;
```

```shell
# 2. Alternatively, enable v2 globally; reload the configuration to apply
gpconfig -c mx_stream_default_version -v 2
gpstop -u
```
For more information on GUC parameters, refer to Technical Parameters.
The Domino architecture consists of two main components: the execution framework and the computation engine.
Domino v2 aims to support more streaming tasks while maintaining the same query latency, addressing resource contention and high load issues in v1 when the number of streams increases.
The main differences lie in resource management and the execution model. v2 improves scalability and stability through the following optimizations:
Shared Decoder
Shared Worker Processes
Incremental Execution
Functionality: No significant functional differences between v1 and v2.
Compatibility: v1 and v2 can coexist. However, if the database is downgraded, v2 streams become unavailable, while v1 streams remain unaffected.
Domino v2 optimizes the execution framework with the following core modules:
Ticker
Splits XLog into manageable chunks. Controls the scope of each stream computation (minimum unit: one Tick), preventing large transactions that could slow the system. Note: Large transactions for historical data processing with WITH DATA are exempt from this restriction.
Scheduler
Manages stream scheduling. Reuses worker processes to limit overall CPU usage while improving individual worker utilization.
Decoder
Parses XLog and generates tuples and snapshots (TLog) for stream consumption.
TLog
Stores decoded change records. Acts as intermediate storage between the Decoder and stream computation, allowing stream processing to read and process changes.
| Capability | Supported | Notes |
|---|---|---|
| Upstream Table Storage Engine | HEAP / MARS3 | |
| Upstream Table Distribution Type | Hash / Random / Segment Set | Does not support master-only or replicated tables as upstream |
| Upstream Table Partitioning | Supported | |
| Stream Table Storage Engine | HEAP / MARS3 / AO | |
| Stream Table Distribution Key | Customizable, can differ from upstream | Best practice: use same distribution key. For aggregate streams, matching keys enable localized computation |
| Stream Table Storage Attributes | Independent selection of engine, partitioning, and distribution key | |
| Multi-Column Upstream Tables | Supported | Supports ≥ 300 columns |
| Multiple Streams per Table | Supported | Multiple streams can share the same upstream table |
| Dimensional Enrichment | Supported | Supports joining ≥ 10 dimension tables |
| Aggregation | Supported | Supports ≥ 32 grouping fields. Internally combines multiple field types into a composite type for aggregation |
| Upstream Table DDL | Not Supported | Creating indexes on upstream tables has no effect on downstream streams. Dropping indexes may break stream execution |
| Stream Table DDL | Not Supported | DDL operations on stream table columns are not supported. Rebuild the stream if changes are needed. Note: If the upstream table also undergoes DDL, stream rebuild is recommended |
| Stream Table Indexes | Supported | Indexes can be independently created and maintained on stream tables |
| Dimension Filtering | Supported | Supports filter conditions on dimension tables during join operations |
| Failover Support | Supported | Streams continue after segment failover; however, a small number of transactions at the switchover point may be lost |
| Performance Overhead | — | Minimal impact on upstream write performance; stream results are updated within seconds |
| Large Transaction Handling | Supported | Optimized batching and memory usage for transaction log decoding. More stable for large transactions. However, use streaming cautiously on tables with frequent large transactions |
| Historical Data Processing | Supported | Use WITH DATA when creating a stream to process existing upstream data. If the upstream table is very large, this creates a large transaction that blocks creation of other streams until complete |
| Dual-Stream JOIN | Supported | Supports non-equi joins; upstream tables may have different distribution keys; stream and upstream tables may have different distribution keys |
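The capabilities above (aggregation, dimensional enrichment, historical data processing) can be illustrated with a hedged sketch of an aggregate stream. The table and column names (`metrics`, `devices`, `device_avg`) are hypothetical, and the exact `CREATE STREAM` syntax should be confirmed against the SQL Reference:

```sql
-- Hypothetical schema: metrics (upstream fact table), devices (dimension table).
CREATE STREAM device_avg AS
SELECT m.device_id,
       d.region,
       avg(m.temperature) AS avg_temp      -- unique alias, as required for aggregate streams
FROM STREAMING metrics m
JOIN devices d ON d.device_id = m.device_id   -- dimensional enrichment
GROUP BY m.device_id, d.region                -- must include the stream table's distribution keys
WITH DATA;   -- also process existing upstream rows (runs as one large transaction)
```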
Stream table objects cannot be retrieved via JDBC metadata. Use dedicated SQL statements for querying.
Only superusers can create stream objects.
The SELECT clause in a stream definition must not contain duplicate column names. When using aggregate functions (especially in aggregate streams), assign unique aliases, e.g., `SELECT avg(col1) AS avg_col1, avg(col2) AS avg_col2`. Alternatively, use column projection in CREATE STREAM.
Avoid direct DML operations on stream tables (controlled by GUC mx_stream_internal_modify).
WITH clauses are not allowed in stream definitions.
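Since JDBC metadata does not expose stream objects, one way to list them is to query the `mx_stream` catalog directly. This sketch assumes only the `streamrelid` and `version` columns shown in the setup steps above; other columns of `mx_stream` are not assumed here:

```sql
-- List stream tables and their Domino versions via the mx_stream catalog
SELECT streamrelid::regclass AS stream_table,
       version
FROM mx_stream;
```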
Aggregate Stream Restrictions:
- The FROM STREAMING clause is allowed.
- GROUP BY keys must include all distribution keys of the corresponding stream table; expressions in GROUP BY are not allowed.
- The HAVING clause is not supported. Use a nested subquery with an outer WHERE as a workaround.
- Expressions over aggregate results are not supported: avg(col1)+1 is invalid, but avg(col+1) is supported.

Dual-Stream JOIN Restrictions:
- GROUP BY is not allowed in dual-stream JOIN computations.
- INNER JOIN between two upstream tables is supported. UNION, LEFT JOIN, and joins with non-stream tables are not supported.
- DISTINCT and aggregates such as SUM and MAX are not allowed.
- The ORDER BY clause is not supported.
- In multi-level streaming, intermediate streams cannot be aggregate streams.
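The HAVING workaround mentioned for aggregate streams can be sketched as follows. The `readings` table and its columns are hypothetical; the point is that the aggregation is wrapped in a subquery and the aggregate result is filtered with an outer WHERE instead of HAVING:

```sql
-- Instead of: SELECT device_id, avg(temp) ... GROUP BY device_id HAVING avg(temp) > 30
-- wrap the aggregation in a subquery and filter with WHERE:
SELECT *
FROM (
  SELECT device_id, avg(temp) AS avg_temp
  FROM readings
  GROUP BY device_id
) t
WHERE t.avg_temp > 30;
```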