Domino is the streaming computation engine introduced in YMatrix 6.0. It incrementally maintains computation results to improve data analysis efficiency and query performance.
Domino v2 is enabled by default starting from YMatrix 6.4.X and requires no additional configuration.
For more information on GUC parameters, refer to Technical Parameters.
The Domino architecture consists of two main components: the execution framework and the computation engine.
The primary goal of Domino v2 is to support more streaming tasks at the same query latency, addressing the resource contention and high load that v1 experiences as the number of streams grows. The main differences lie in resource management and execution mechanisms; Domino v2 improves scalability and stability through three key design changes:

- Shared Decoder
- Shared Worker Processes
- Incremental Execution (Small-step Processing)
Functionality
Domino v2 focuses on optimizing the execution framework. The core modules are:
Ticker
Splits XLog into manageable segments. Controls the scope of each streaming computation (minimum unit is one Tick), avoiding large transactions that could slow down the system. Note: Large transactions for historical data processing during WITH DATA stream creation are exempt from this limitation.
Scheduler
Manages stream scheduling. Reuses worker processes to cap overall CPU usage while improving individual worker CPU utilization.
Decoder
Parses XLog and generates tuples and snapshots (TLog) for stream consumption.
TLog
Stores decoded change records. Acts as intermediate storage between the Decoder and stream computation, allowing stream processing to read and process changes.

Compatibility
| Capability | Supported | Notes |
|---|---|---|
| Upstream Table Storage Engine | HEAP / MARS3 | |
| Upstream Table Distribution Type | Hash / Random / Segment Set | Does not support master-only or replicated tables as upstream tables |
| Upstream Table Partitioning | Supported | |
| Stream Table Storage Engine | HEAP / MARS3 / AO | |
| Stream Table Distribution Key | Flexible choice, can differ from upstream | Best practice: use the same distribution key. For aggregate streams, matching keys enable localized computation |
| Stream Table Storage Properties | Independent selection of engine, partitioning, and distribution key | |
| Multi-Column Upstream Tables | Supported | Supports upstream tables with ≥ 300 columns |
| Multiple Streams per Table | Supported | Multiple streams can share the same upstream table. "One table" refers to the same upstream source |
| Dimensional Enrichment | Supported | Supports joining ≥ 10 dimension tables |
| Aggregation | Supported | Supports grouping by ≥ 32 fields. Internally combines multiple field types into a composite type for aggregation |
| Upstream Table DDL | Not Supported | Creating indexes on upstream tables has no effect on downstream streams. Dropping indexes may break stream execution |
| Stream Table DDL | Not Supported | DDL operations on stream table columns (e.g., ADD/DROP COLUMN) are not supported. Rebuild the stream if changes are needed. Note: If the upstream table also undergoes DDL, stream rebuild is recommended |
| Stream Table Indexes | Supported | Indexes can be independently created and maintained on stream tables |
| Dimension Filtering | Supported | Supports filter conditions on dimension tables during dimensional enrichment |
| Failover Support | Supported | Streams continue working after segment failover. However, a small number of transactions at the switchover point may be lost |
| Performance Overhead | Low | Minimal impact on upstream write performance; stream results are updated within seconds |
| Large Transaction Handling | Supported | Optimized batching and memory usage during transaction log decoding improves stability for large transactions. However, use streaming cautiously on tables with frequent large transactions |
| Historical Data Processing | Supported | Use WITH DATA when creating a stream to process existing upstream data. If the upstream table is very large, this creates a long-running transaction that blocks creation of other streams until completion |
| Stream-to-Stream JOIN | Supported | Supports non-equi joins. Upstream tables may have different distribution keys. Stream and upstream tables can have different distribution keys |
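To make the capabilities above concrete, the following minimal sketch creates an upstream table, an aggregate stream, and a dimension-enriched stream. Table, column, and stream names are hypothetical, and the USING mars3 storage clause and the placement of WITH DATA are assumptions based on the descriptions above; CREATE STREAM and FROM STREAMING are the forms this document's restrictions refer to.

```sql
-- Hypothetical upstream table (HEAP and MARS3 engines are supported).
CREATE TABLE readings (
    ts  timestamptz,
    dev int,
    val float8
) USING mars3              -- assumed spelling of the MARS3 storage clause
  DISTRIBUTED BY (dev);

-- Regular (non-streaming) dimension table for enrichment.
CREATE TABLE devices (
    dev      int,
    dev_name text,
    active   boolean
);

-- Aggregate stream: GROUP BY covers the stream table's distribution key,
-- and every aggregate gets a unique alias. WITH DATA also processes
-- existing rows, which holds a long transaction on a large upstream table.
CREATE STREAM readings_avg AS
SELECT dev,
       avg(val) AS avg_val,
       count(*) AS cnt
FROM STREAMING readings
GROUP BY dev
WITH DATA;

-- Dimensional enrichment with a filter condition on the dimension table.
CREATE STREAM readings_named AS
SELECT r.ts, r.dev, r.val, d.dev_name
FROM STREAMING readings r
JOIN devices d ON d.dev = r.dev
WHERE d.active;
```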
Restrictions

- Stream table objects cannot be retrieved via JDBC metadata; use dedicated SQL statements to query them.
- Only superusers can create stream objects.
- The SELECT clause in a stream definition must not contain duplicate column names. In particular, for aggregate streams, give each aggregate a unique alias, such as select avg(col1) as avg_col1, avg(col2) as avg_col2. Alternatively, use a column projection list in the CREATE STREAM statement (see the sketch after this list).
- Avoid direct DML operations on stream tables (controlled by the GUC mx_stream_internal_modify).
- WITH clauses are not allowed in stream definitions.
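The aliasing rule and the projection-list alternative mentioned above look like this in practice. This is a minimal sketch with hypothetical table and column names; the parenthesized column list on CREATE STREAM is assumed to behave like the usual view-style projection list.

```sql
-- Unique aliases keep the SELECT list of the stream definition duplicate-free.
CREATE STREAM metrics_agg AS
SELECT dev,
       avg(col1) AS avg_col1,
       avg(col2) AS avg_col2
FROM STREAMING metrics
GROUP BY dev;

-- Alternative: name the output columns with a projection list instead.
CREATE STREAM metrics_agg2 (dev, avg_col1, avg_col2) AS
SELECT dev, avg(col1), avg(col2)
FROM STREAMING metrics
GROUP BY dev;
```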
Aggregate Stream Restrictions:

- Only one FROM STREAMING clause is allowed in the stream definition.
- The GROUP BY key must include all distribution keys of the corresponding stream table.
- Aggregation without a GROUP BY clause is not allowed.
- The HAVING clause is not supported, and nested subqueries with outer WHERE filters cannot simulate HAVING.
- Expressions over aggregate results are not supported: avg(col1)+1 is invalid, but avg(col+1) is allowed.

Stream-to-Stream JOIN Restrictions:

- GROUP BY is not allowed in stream-to-stream JOIN computations.
- Only INNER JOIN between two upstream tables is supported; UNION, LEFT JOIN, and joins with non-streaming tables are not supported.
- Aggregate and distinct operations such as DISTINCT, SUM, and MAX are not allowed.
- The ORDER BY clause is not supported.
- In multi-level streaming pipelines, intermediate streams cannot be aggregate streams.
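A sketch of a stream-to-stream JOIN that stays within the restrictions above: a single INNER JOIN between two upstream tables, a non-equi condition, and no aggregation, DISTINCT, GROUP BY, or ORDER BY. Table names are hypothetical, and repeating the STREAMING keyword for the second source is an assumption.

```sql
-- Stream-to-stream JOIN: INNER JOIN only; non-equi conditions are allowed.
CREATE STREAM orders_paid AS
SELECT o.order_id,
       o.amount,
       p.paid_at
FROM STREAMING orders o
INNER JOIN STREAMING payments p        -- assumed syntax for the second streaming source
        ON p.order_id = o.order_id
       AND p.paid_at >= o.created_at;  -- non-equi join condition
```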