Basic Capabilities

Capability Overview

Capability Supported Notes
Source Table Storage Engine HEAP / MARS3
Source Table Distribution Type Hash / Random / Segment Set Does not support master-only or replicated tables as source tables
Source Table Partitioning Supported
Stream Table Storage Engine HEAP / MARS3 / AO
Stream Table Distribution Key Custom distribution key allowed, can differ from source table Best practice is to keep it consistent. For aggregate streams, matching distribution keys enable localized stream computation
Stream Table Storage Properties Independent selection of storage engine, partitioning, and distribution key is supported
Multi-Column Support in Source Table Supported Supports source tables with ≥ 300 columns
One Table to Multiple Streams Supported Multiple streams at the same level can share the same source table. "One table" refers to the same upstream source table
Dimension Enrichment (Join) Supported Supports joining with ≥ 10 dimension tables
Aggregation Supported Supports grouping by ≥ 32 fields. Internally, multiple fields may be combined into a composite type for aggregation
Source Table DDL Not Supported Creating an index on the source table has no effect on downstream streams. Dropping an index may cause stream execution to fail
Stream Table DDL Not Supported DDL operations on stream table columns are not supported. Rebuild the stream if changes are needed. Note: If the source table also undergoes DDL changes, stream rebuild is recommended
Stream Table Indexes Supported Indexes can be independently created and maintained on stream tables
Dimension Filtering Supported Filter conditions on dimension tables are supported during dimension enrichment joins
Failover Support Supported Streams continue to operate after segment failover. However, a small number of transactions around the switchover time may be lost
Performance Overhead Minimal impact on source table writes; stream results delivered within seconds
Large Transaction Handling Supported Internal optimizations include batch processing and reduced memory usage during transaction log decoding, improving stability for large transactions. However, use stream processing cautiously on tables subject to large transactions
Historical Data Processing Supported When creating a stream, use the WITH DATA option to process existing source table data. If the source table is very large, this creates a massive transaction that blocks creation of other streams until completion
Stream-to-Stream JOIN Supported Supports non-equi joins, source tables with different distribution keys, and joins between source tables and streams with different distribution keys

Additional Limitations

  • Stream objects cannot be retrieved via JDBC metadata queries. Use dedicated SQL statements to query them.
  • Only superusers can create stream objects.
  • The SELECT clause in stream definition must not contain duplicate column names. Especially when using aggregate functions in aggregate streams, assign unique aliases such as select avg(col1) as avg_col1, avg(col2) as avg_col2. Alternatively, use field projection in the CREATE STREAM clause.
  • Avoid direct DML operations on stream tables (controlled by GUC mx_stream_internal_modify).
  • WITH clauses are not allowed in stream definitions.
  • Aggregate Stream Restrictions:
    • Only one FROM STREAMING clause is allowed in the stream definition.
    • The GROUP BY key must include all distribution keys of the corresponding stream table.
    • Global aggregation without GROUP BY is not allowed.
    • HAVING clause is not supported. Nested subqueries with outer WHERE clauses cannot emulate HAVING.
    • Aggregated expressions cannot be reused in other expressions. For example, avg(col1)+1 is invalid, but avg(col+1) is allowed.
  • Stream-to-Stream JOIN Restrictions:
    • GROUP BY is not allowed in stream-to-stream JOIN computations.
    • Only equi-inner joins between two source tables are supported. UNION, LEFT JOIN, and joins with non-streamed tables are not supported.
    • Aggregate functions such as DISTINCT, SUM, MAX are not allowed.
    • Window functions are not supported.
    • ORDER BY clause is not supported.
  • In multi-level streaming, intermediate streams cannot be aggregate streams.

Single-Stream Computation

  • Single Stream: A stream definition that declares only one object in the FROM STREAMING clause. The following example is also considered a single stream:
CREATE TABLE t3 (id int, c1 text, ts timestamp); 

CREATE STREAM s3(id, c1, ts, arrive_s3) 
AS ( 
SELECT *, clock_timestamp() 
FROM STREAMING t3 WHERE id > 7 
UNION ALL 
SELECT *, clock_timestamp() 
FROM STREAMING t3 WHERE id < 3 
WITH NO DATA 
) 
DISTRIBUTED BY (arrive_s3);

Multi-Stream Join Computation

  • Multi-Stream: A stream definition that includes multiple FROM STREAMING clauses. Used to dynamically track data from two or more source tables and generate new JOIN results.