YMatrix - The "Hyper-Converged" Database Trusted by Large Enterprises

YMatrix is an enterprise-grade distributed database product based on PostgreSQL. Integrating time-series, analytics (OLAP), transaction processing (OLTP), and AI capabilities into a single platform, YMatrix delivers full-scenario support, low cost, high performance, high availability, easy scalability, and compliance with security standards. With its "hyper-converged" architecture, YMatrix addresses the challenges of complex traditional systems and high operational costs, offering enterprises a unified data storage solution.

Full-Scenario Capabilities

Time-Series Scenarios

Optimized for time-series workloads, YMatrix provides high concurrency and is deeply tuned for applications such as connected vehicles and smart factories. It supports advanced SQL features like CTEs and window functions, along with native time-series functions. It enables out-of-order and batched writes in complex network environments. Cluster expansion with zero business interruption allows flexible scaling for growing data volumes. Cold data can be automatically offloaded to object storage, significantly reducing storage costs.

Analytics Scenarios

Supports TB to PB-scale data volumes, delivering reliable, high-performance data processing and serving for enterprise reporting and BI applications. Performs well on multi-table JOIN operations. Supports advanced analytical features such as window functions and materialized views. Beyond traditional batch processing, YMatrix introduces the Domino streaming engine, which enables real-time data processing via SQL in place of tools like Flink or Spark.

Transactional Scenarios

Provides full ACID compliance, ensuring financial-grade data reliability. Meets the stringent performance, correctness, and consistency requirements of critical systems such as finance and ERP. Supports stored procedures, triggers, and cross-site disaster recovery, making it suitable for complex OLTP use cases.

AI Scenarios

Enables vector search for large language models (LLMs), helping enterprises rapidly build AI agents on top of their business data. Supports in-database execution of PL/Python without requiring Spark, making full use of hardware resources and improving machine learning efficiency. Offers multimodal data management and hybrid search capabilities.
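
To make the idea of vector search concrete, here is a minimal, database-independent sketch in Python. It ranks stored embeddings by cosine similarity to a query embedding and returns the closest matches. The document names, the 3-dimensional vectors, and the `top_k` helper are all hypothetical, and this brute-force scan is only an illustration of the concept, not how YMatrix indexes or searches vectors.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def top_k(query, documents, k=2):
    """Return the ids of the k documents most similar to the query, best first."""
    scored = [(cosine_similarity(query, vec), doc_id)
              for doc_id, vec in documents.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored[:k]]


# Hypothetical embeddings; real ones would come from an embedding model.
documents = {
    "invoice_policy": [0.9, 0.1, 0.0],
    "vacation_policy": [0.1, 0.9, 0.0],
    "billing_faq": [0.8, 0.2, 0.1],
}
query = [1.0, 0.0, 0.0]
result = top_k(query, documents, k=2)
print(result)
```

In a production system, a vector index and quantization would typically replace the full scan so that the search stays fast as the number of stored vectors grows.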

Core Advantages

Full-Scenario Support

Single database handles time-series, analytics, transactional, and AI workloads
Full ACID compliance, with support for the SQL:2016 standard
Manages structured (relational tables), semi-structured (JSON, XML, Vector), and unstructured data (text, images, video) with mixed retrieval support
Supports vector storage, vector indexing, and quantization algorithms for rapid development of "enterprise AI agents"

Low Cost

Supports row-store (HEAP), column-store (AO), and hybrid row-column storage (MARS3)
Proprietary encoding-chain compression algorithm applies optimal compression per data type and pattern, achieving up to 20:1 compression ratio
Supports automatic storage tiering, moving cold data to object storage (S3) to reduce hardware costs
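
The encoding-chain algorithm mentioned above is proprietary, and its internals are not described here. As a generic illustration of why chaining encodings that match the data pattern can compress well, the following Python sketch applies delta encoding followed by run-length encoding to regularly spaced timestamps. The function names and sample data are hypothetical, and the two encodings are an assumption chosen for illustration, not the actual YMatrix chain.

```python
def delta_encode(values):
    """Keep the first value, then store each difference from the previous one."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas):
    """Invert delta_encode by accumulating the differences."""
    out = []
    total = 0
    for i, d in enumerate(deltas):
        total = d if i == 0 else total + d
        out.append(total)
    return out


def rle_encode(values):
    """Collapse runs of equal values into [value, count] pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return runs


def rle_decode(runs):
    """Invert rle_encode by expanding each [value, count] pair."""
    out = []
    for v, count in runs:
        out.extend([v] * count)
    return out


# Hypothetical sensor timestamps sampled every 10 seconds.
timestamps = [1700000000 + 10 * i for i in range(8)]

# Chain: delta encoding turns the series into a run of 10s, which RLE collapses.
encoded = rle_encode(delta_encode(timestamps))
decoded = delta_decode(rle_decode(encoded))
print(encoded)
```

Eight timestamps shrink to two pairs here because the deltas are constant; irregular data would benefit from a different chain, which is the motivation for choosing encodings per data type and pattern.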

High Performance

Parallel computing across multiple nodes and cores for batch data analysis
High-concurrency ingestion with support for out-of-order and batched writes in complex network environments
Integrated HTAP architecture enhances complex query and in-database analytics performance
Domino streaming engine enables real-time in-database data flow and fast processing, supporting second-level, real-time, and incremental analytics
Analytical results update in real time as the underlying data changes

High Availability

Cluster state managed via strongly consistent etcd
Financial-grade HA with automatic failover within 3 seconds
Supports disaster recovery clusters and incremental backup and restore, ensuring data availability under extreme conditions

Easy Scalability

Supports online, seamless cluster expansion via CLI or GUI
Scales to over 100 nodes, suitable for TB to PB-scale data processing
Fully compatible with PostgreSQL/Greenplum ecosystem and downstream toolchains

Security and Compliance

Access control: Role-based access control, row- and column-level security
Authentication: Trust, password, and PAM authentication methods
Encryption: Multiple levels of encryption including storage encryption (supports SM4 national cipher), field-level encryption, GSSAPI authentication, client-side encryption, SSL encrypted transmission, cross-network password encryption, and tablespace encryption
Security auditing: Logs user login/logout and database activities; audit levels configurable by security requirement
Resource control: Strict IP access restrictions, configurable maximum concurrent connections per user, and default connection timeout policies

Advanced Components

Visual Operations – MatrixUI

MatrixUI is a graphical operations and management tool designed for simplicity and comprehensive monitoring.

Graphical installation: Deploy a cluster in 10 minutes; simulate time-series write and query scenarios in 3 minutes
GUI-based monitoring and maintenance: One-click health checks, instant scaling, cluster inspection, Kafka ingestion configuration, and workload analysis

High-Concurrency Ingestion – MatrixGate

MatrixGate is a high-performance data loader that distributes data evenly across all segments for parallel ingestion.

Supports various data sources and formats
Enables batch and streaming data ingestion
Low latency, high concurrency: Achieves up to 100x faster ingestion by leveraging bandwidth and data compression
Supports UPSERT: Handles out-of-order and batched data merges efficiently, ideal for high-throughput, low-latency streaming scenarios
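
The UPSERT behavior described above can be pictured with a small Python sketch: rows arrive in batches, possibly out of order, and rows that share a key are merged into one record. The key `(device_id, ts)`, the column names, and the rule that `None` values never overwrite existing data are assumptions made for this illustration; they are not MatrixGate's documented merge semantics.

```python
def upsert_batch(table, batch):
    """Merge a batch of rows into `table`, keyed by (device_id, ts).

    Rows with a new key are inserted; rows with an existing key update
    only the columns for which they carry a non-None value.
    """
    for row in batch:
        key = (row["device_id"], row["ts"])
        merged = table.setdefault(key, {})
        for col, val in row.items():
            if val is not None:
                merged[col] = val
    return table


table = {}

# First batch arrives out of order: ts=20 before ts=10.
upsert_batch(table, [
    {"device_id": "car-1", "ts": 20, "speed": 62.0, "temp": None},
    {"device_id": "car-1", "ts": 10, "speed": 58.0, "temp": 71.5},
])

# A later batch delivers the missing temperature for ts=20.
upsert_batch(table, [
    {"device_id": "car-1", "ts": 20, "speed": None, "temp": 73.0},
])

# Reading back in key order yields one complete row per timestamp.
rows = [table[key] for key in sorted(table)]
print(rows)
```

Because the merge is keyed rather than append-only, the arrival order of the batches does not affect the final rows in this example.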

Incremental Backup – MatrixArchive

MatrixArchive captures a running YMatrix cluster’s data at a specific point in time, saving it according to defined rules to ensure data integrity and consistency. From these backup files, a fully functional YMatrix cluster can be restored, matching the original cluster’s state at that moment.

Peer-to-Peer Migration – MatrixShift

MatrixShift is a dedicated data migration tool supporting full, incremental, and conditional migrations between different versions of Greenplum and YMatrix. Features include high efficiency (peer-to-peer transfer, small-table optimization, data compression) and flexible configuration.

Migrates data from Greenplum to YMatrix
Full-scenario migration: Supports full, incremental, and filtered migrations
Efficient peer-to-peer transfer: Direct Segment-to-Segment data movement eliminates bottlenecks common in traditional migration approaches