YMatrix - Hyper-Converged Database

Core Features

YMatrix is a distributed database product developed from the open-source PostgreSQL/Greenplum ecosystem, featuring the following key characteristics:

  • Supports clusters of up to 100 nodes, enabling multi-node, multi-core parallel computing.
  • Supports online cluster expansion.
  • Provides financial-grade high availability with automatic failover within 3 seconds.
  • Suitable for TB to PB-scale data processing.
  • Integrates analytical, transactional, and time-series capabilities, widely used in smart manufacturing, finance, and connected vehicle scenarios.

In addition to the commercial edition, YMatrix also offers a free community edition. Your experience and feedback are welcome.

The "Hyper-Convergence" Concept

A hyper-converged database integrates transactional (OLTP), analytical (OLAP), time-series, and data lake capabilities into a single database system.

YMatrix’s hyper-convergence philosophy eliminates data processing fragmentation by unifying compute, storage, and network resources within one system. Depending on the source database type and version, the cluster topology, and business characteristics, YMatrix combines storage and execution engines on top of a shared database foundation. The result is a set of specialized micro-kernels optimized for write, storage, and query performance across diverse business scenarios.

Integrated Capabilities

YMatrix emphasizes full-scenario functionality and performance, including ingestion, querying, analytics, and machine learning. By integrating multiple capabilities into a single database, it addresses complex use cases and achieves multi-model support, scalability, and cost efficiency.

  • Analytical Capabilities
  • Transactional Capabilities
  • Time-Series Capabilities

Unified Interface

YMatrix uses SQL as the unified interface for all data services at the application layer.

Open Architecture

YMatrix provides strong extensibility.

On one hand, YMatrix continues to expand into new business scenarios such as connected vehicles, smart manufacturing, finance, and vector data processing. On the other hand, through features like machine learning and federated data access, heterogeneous and external data sources can run efficiently within YMatrix via database extensions.


By simplifying the infrastructure architecture, YMatrix reduces technology stack complexity, improves performance across diverse scenarios, lowers the risks of running and integrating multiple systems side by side, and helps enterprises build robust data governance frameworks.

Proprietary Core Technologies

YMatrix leverages several key self-developed technologies to realize its hyper-convergence vision.

Storage Engine: MARS3

Designed for OLAP, OLTP, and time-series workloads, MARS3 offers two modes: columnar and hybrid row-columnar storage, allowing users to choose based on workload needs. The hybrid mode ensures both high ingestion performance and efficient storage (including compression and health diagnostics). Both modes implement MVCC. For partitioned tables, MARS3 supports automatic partition management and automatic storage tiering.

Execution Engine: Vectorized

The vectorized execution engine is designed specifically for column-oriented storage engines such as MARS3, MARS2, and AOCO. It delivers one to two orders of magnitude better performance than traditional row-based execution engines for common queries.
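The difference between the two execution models can be illustrated with a small Python sketch. It is not YMatrix code: the table, column names, and function names are hypothetical, and pure Python will not reproduce the speedup quoted above. It only shows the control flow, where a row-based executor materializes and evaluates one row at a time while a vectorized executor runs each operator once per batch of column values.

```python
from typing import Dict, Iterator, List, Sequence

# Hypothetical sample table, stored column by column as a columnar engine would.
COLUMNS: Dict[str, Sequence[float]] = {
    "temperature": [20.5, 31.0, 28.4, 35.2, 19.9, 33.3],
    "pressure_kpa": [100, 120, 110, 140, 90, 130],
}


def row_at_a_time_sum(columns: Dict[str, Sequence[float]], threshold: float) -> float:
    """Row-based model: build one full row, filter it, then aggregate it."""
    total = 0
    row_count = len(columns["temperature"])
    for i in range(row_count):
        # Every column is touched for every row, even columns the query ignores.
        row = {name: values[i] for name, values in columns.items()}
        if row["temperature"] > threshold:
            total += row["pressure_kpa"]
    return total


def batches(columns: Dict[str, Sequence[float]], batch_size: int) -> Iterator[Dict[str, Sequence[float]]]:
    """Split every column into aligned slices of at most batch_size values."""
    row_count = len(next(iter(columns.values())))
    for start in range(0, row_count, batch_size):
        yield {name: values[start:start + batch_size] for name, values in columns.items()}


def vectorized_sum(columns: Dict[str, Sequence[float]], threshold: float, batch_size: int = 4) -> float:
    """Vectorized model: each operator processes a whole batch of one column."""
    total = 0
    for batch in batches(columns, batch_size):
        # Filter operator: one pass over a single column yields a selection vector.
        selection: List[int] = [i for i, t in enumerate(batch["temperature"]) if t > threshold]
        # Aggregate operator: reads only the column it needs, at the selected positions.
        pressure = batch["pressure_kpa"]
        total += sum(pressure[i] for i in selection)
    return total


row_result = row_at_a_time_sum(COLUMNS, 30.0)
vector_result = vectorized_sum(COLUMNS, 30.0)
```

Both functions return the same sum. In a real engine the per-batch loops run over contiguous column data in compiled code, which is where a vectorized executor gains over per-row operator calls.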

High Availability Architecture: ALOHA

ALOHA (Advanced Least Operation High Availability) is a cluster state management service introduced in YMatrix 5.X. Running independently from the main cluster, ALOHA can be configured with dedicated disks and monitoring. It ensures low-latency node status detection and management even under harsh conditions, completing automatic failover within 3 seconds.

Platform Capabilities

MatrixUI: Visual Installation and Operations

  • Graphical installation: Deploy a cluster in 10 minutes; simulate time-series write and query workloads in 3 minutes.
  • Graphical operations and monitoring: One-click health checks and cluster expansion within seconds.

MatrixGate: High-Concurrency Data Ingestion

  • Low latency, high concurrency: Ingests massive data volumes in parallel and compresses data to maximize bandwidth utilization, improving write speed by up to 100x.
  • Supports various data sources and formats.
  • Supports batch and streaming ingestion.
  • Supports UPSERT: Handles out-of-order and batched data merging in complex ingestion scenarios.
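The UPSERT behavior mentioned above can be pictured with a minimal Python sketch. MatrixGate's actual merge rules are not described here, so the sketch rests on two assumptions: rows are keyed by a hypothetical (device, timestamp) pair, and non-null fields in a later-arriving row overwrite the stored values while null fields leave them untouched. All names are illustrative.

```python
from typing import Dict, Iterable, Optional, Tuple

Key = Tuple[str, int]                 # assumed key: (device id, timestamp)
Row = Dict[str, Optional[float]]      # metric name -> value, None if not reported


def upsert(table: Dict[Key, Row], batch: Iterable[Tuple[str, int, Row]]) -> None:
    """Merge a batch into the table: insert new keys, update existing ones."""
    for device, ts, fields in batch:
        stored = table.setdefault((device, ts), {})
        for name, value in fields.items():
            if value is not None:
                # Assumed rule: a reported value replaces whatever was stored.
                stored[name] = value
            else:
                # A missing value never erases data that arrived earlier.
                stored.setdefault(name, None)


table: Dict[Key, Row] = {}

# First batch arrives out of order: timestamp 200 before timestamp 100.
upsert(table, [
    ("dev1", 200, {"temp": 21.5, "rpm": None}),
    ("dev1", 100, {"temp": 20.0, "rpm": 900.0}),
])

# A later batch carries the metric that was missing at timestamp 200.
upsert(table, [("dev1", 200, {"temp": None, "rpm": 950.0})])

# Reading in key order restores time order regardless of arrival order.
ordered_keys = sorted(table)
```

After both batches, the row for timestamp 200 holds both metrics, and the table still contains exactly two rows.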

MatrixShift: Peer-to-Peer Data Migration

  • Efficient peer-to-peer migration: Implements direct Segment-to-Segment data transfer, eliminating single-point bottlenecks common in traditional migration methods.
  • Full-scenario support: Supports full, incremental, and conditionally filtered migrations.
  • Greenplum replacement: Migrates data from Greenplum 4.3.X/5/6 clusters to YMatrix.

Enterprise-Grade Security

  • Authentication: Supports multiple authentication methods including trust, password, and PAM.
  • Access Control: Role-based access control (RBAC) simplifies user-permission mapping.
  • Encryption: Offers encryption at multiple levels, including encrypted password storage, field-level encryption, SSL host authentication, client-side encryption, network data encryption, cross-network password encryption, and tablespace encryption.
  • Auditing: Logs user login/logout events and database activities, with audit levels configurable by security requirements.
  • Resource Control: Enforces strict IP access restrictions; allows configuration of maximum concurrent connections per user; includes default connection timeout policies.

Enhanced Compatibility

  • Fully compatible with the PostgreSQL/Greenplum ecosystem and associated tools.

Supported Business Scenarios

Enterprise Data Warehouse Scenario

Powerful analytical computing capabilities

Traditional data warehouse workflows rely on the Hadoop ecosystem, storing historical data in Hadoop and using Spark for report computation, which results in complex pipelines.

YMatrix resolves this complexity through hyper-convergence while enhancing analytical performance. By supporting structured and unstructured data types and federated data access, YMatrix handles BI and reporting tasks in classic OLAP scenarios across finance, telecom, government, energy, and manufacturing. Advanced query optimizations such as vectorization, Runtime Filter, sliding windows, and continuous aggregation deliver superior analytical performance.

Complex Time-Series Analytics Scenario

Balancing high-speed ingestion, low-cost storage, and real-time querying

Time-series data demands high performance in ingestion, storage, and querying due to its real-time nature.

YMatrix is optimized for time-centric workloads. Physical time ordering in the MARS storage engine, asynchronous and batched uploads, and MatrixGate’s high-concurrency, high-throughput ingestion together support real-time data loading, querying, and transaction guarantees.

YMatrix supports graphical, non-disruptive expansion: clusters scale within seconds without service interruption, which ensures business continuity, minimizes downtime costs, and reduces operational risk.

Unified Technology Stack Scenario

Leveraging hyper-convergence to unify data pipelines

Data silos are common in traditional enterprises. Isolated data cannot be shared or utilized effectively, which hinders management, operations, and growth and blocks digital transformation.

YMatrix’s hyper-converged architecture has been successfully deployed in real-world production environments such as factory data platforms, enterprise group data warehouses, intelligent connected vehicles, and IoT device operations. It significantly lowers technical barriers in selection, procurement, deployment, and maintenance. For example, in smart manufacturing, a single YMatrix instance can collect, store, compute, model, query, and analyze data from ERP, MES, and equipment systems—all within one database.