YMatrix - The "Hyper-Converged" Database Trusted by Large Enterprises

YMatrix is an enterprise-grade distributed database product based on PostgreSQL. Integrating time-series, analytics (OLAP), transaction processing (OLTP), and AI capabilities into a single platform, YMatrix delivers full-scenario support, low cost, high performance, high availability, easy scalability, and compliance with security standards. With its "hyper-converged" architecture, YMatrix addresses the challenges of complex traditional systems and high operational costs, offering enterprises a unified data storage solution.

Full-Scenario Capabilities

Time-Series Scenarios

Optimized for time-series workloads, YMatrix supports high-concurrency access and is deeply tuned for applications such as connected vehicles and smart factories. It supports advanced SQL features such as CTEs and window functions, along with native time-series functions, and it accepts out-of-order and batched writes in complex network environments. Clusters can be expanded without interrupting business operations, so capacity scales with growing data volumes. Cold data can be automatically offloaded to object storage, significantly reducing storage costs.

Analytics Scenarios

Supports TB- to PB-scale data volumes, delivering reliable, high-performance data processing and serving for enterprise reporting and BI applications. Excels at multi-table JOIN operations and supports advanced analytical features such as window functions and materialized views. Beyond traditional batch processing, YMatrix introduces the Domino streaming engine, which enables real-time data processing via SQL and can replace tools such as Flink or Spark.

Transactional Scenarios

Provides full ACID compliance, ensuring financial-grade data reliability. Meets the stringent performance, correctness, and consistency requirements of critical financial and ERP systems. Supports stored procedures, triggers, and cross-site disaster recovery, making it suitable for complex OLTP use cases.

AI Scenarios

Enables vector search for large language models (LLMs), helping enterprises rapidly build AI agents using business data. Supports in-database execution of PL/Python without requiring Spark, fully utilizing hardware resources and improving machine learning efficiency. Offers multimodal data management and hybrid search capabilities.
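
To make the idea of vector search concrete, the sketch below ranks a handful of stored embeddings by cosine similarity to a query vector. It is a brute-force illustration in plain Python, not YMatrix's implementation or API: the document names, the three-dimensional embeddings, and the functions `cosine_similarity` and `top_k` are all hypothetical. A production system would typically use a vector index rather than scanning every row.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, rows, k=2):
    """Return the k (doc_id, score) pairs most similar to the query, best first."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in rows]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical embeddings standing in for vectors stored alongside business data.
DOCS = [
    ("invoice_policy", [0.9, 0.1, 0.0]),
    ("vacation_policy", [0.1, 0.9, 0.0]),
    ("sensor_manual", [0.0, 0.2, 0.9]),
]

# Hypothetical query embedding; a real one would come from an embedding model.
result = top_k([1.0, 0.0, 0.0], DOCS, k=2)
```

Here `result` holds the two closest documents and their scores, which an agent could pass to an LLM as retrieved context.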

Core Advantages

Full-Scenario Support

  • Single database handles time-series, analytics, transactional, and AI workloads
  • Full ACID compliance, with support for the SQL:2016 standard
  • Manages structured (relational tables), semi-structured (JSON, XML, vector), and unstructured data (text, images, video) with hybrid retrieval support
  • Supports vector storage, vector indexing, and quantization algorithms for rapid development of "enterprise AI agents"

Low Cost

High Performance

  • Parallel computing across multiple nodes and cores for batch data analysis
  • High-concurrency ingestion with support for out-of-order and batched writes in complex network environments
  • Integrated HTAP architecture enhances complex query and in-database analytics performance
  • Domino streaming engine enables real-time in-database data flow and fast processing, supporting real-time and incremental analytics with latency on the order of seconds
  • Analytical results update in real time as the underlying data changes
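
The difference between batch and incremental analytics can be sketched in a few lines. The example below keeps a running count and sum per key, so each new row updates the average in constant time instead of triggering a rescan. It is a generic illustration of incremental aggregation, not the Domino engine's design; the class `IncrementalAverage` and the key names are hypothetical.

```python
from collections import defaultdict


class IncrementalAverage:
    """Per-key running count and sum, so each new row updates the average in O(1)."""

    def __init__(self):
        self._count = defaultdict(int)
        self._total = defaultdict(float)

    def apply(self, key, value):
        """Fold one new row into the aggregate and return the updated average."""
        self._count[key] += 1
        self._total[key] += value
        return self._total[key] / self._count[key]

    def average(self, key):
        """Current average for the key, or None if no rows have arrived yet."""
        count = self._count.get(key, 0)
        return self._total[key] / count if count else None


agg = IncrementalAverage()
first = agg.apply("line_a", 10.0)   # one row seen for line_a
second = agg.apply("line_a", 20.0)  # result reflects the new row immediately
other = agg.apply("line_b", 4.0)    # keys are tracked independently
```

A batch job would recompute the average over all rows on a schedule; the incremental form does the same work one row at a time, which is what lets results track data changes.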

High Availability

Easy Scalability

  • Supports online, seamless cluster expansion via CLI or GUI
  • Scales to over 100 nodes, suitable for TB to PB-scale data processing
  • Fully compatible with PostgreSQL/Greenplum ecosystem and downstream toolchains

Security and Compliance

Advanced Components

Visual Operations – MatrixUI

MatrixUI is a graphical operations and management tool designed for simplicity and comprehensive monitoring.

  • Graphical installation: Deploy a cluster in 10 minutes; simulate time-series write and query scenarios in 3 minutes
  • GUI-based monitoring and maintenance: One-click health checks, instant scaling, cluster inspection, Kafka ingestion configuration, and workload analysis

High-Concurrency Ingestion – MatrixGate

MatrixGate is a high-performance data loader that distributes data evenly across all segments for parallel ingestion.

  • Supports various data sources and formats
  • Enables batch and streaming data ingestion
  • Low latency, high concurrency: Achieves up to 100x faster ingestion by making full use of network bandwidth and data compression
  • Supports UPSERT: Handles out-of-order and batched data merges efficiently, ideal for high-throughput, low-latency streaming scenarios
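
The two ideas above, spreading rows across segments and merging out-of-order data with UPSERT, can be sketched together. The example routes each row to a segment by hashing its key, then merges rows by `(device_id, ts)` so that a late or corrected row replaces the earlier one. This is a simplified model under stated assumptions, not MatrixGate's actual behavior: the segment count, the CRC32 hash, the "latest arrival wins" merge rule, and the names `segment_for`, `upsert_batch`, and `scan` are all hypothetical.

```python
import zlib

NUM_SEGMENTS = 4  # hypothetical cluster size


def segment_for(device_id, num_segments=NUM_SEGMENTS):
    """Stable hash of the key, so the same device always lands on the same segment."""
    return zlib.crc32(device_id.encode("utf-8")) % num_segments


def upsert_batch(segments, batch):
    """Route each row to its segment and merge by (device_id, ts).

    Assumed merge rule: the latest arrival for a key overwrites the earlier value.
    """
    for device_id, ts, value in batch:
        table = segments[segment_for(device_id, len(segments))]
        table[(device_id, ts)] = value


def scan(segments):
    """Read all rows back, ordered by device and timestamp."""
    rows = [(d, ts, v) for table in segments for (d, ts), v in table.items()]
    return sorted(rows)


segments = [dict() for _ in range(NUM_SEGMENTS)]
# First batch arrives out of timestamp order.
upsert_batch(segments, [("car-1", 3, 61.0), ("car-1", 1, 58.0)])
# Second batch carries a late row (ts=2) and a correction for ts=3.
upsert_batch(segments, [("car-1", 2, 60.0), ("car-1", 3, 62.5)])
rows = scan(segments)
```

After both batches, `rows` contains one row per timestamp in order, with the corrected value at `ts=3`. Because the merge is keyed rather than append-only, arrival order does not affect the final state for distinct keys.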

Incremental Backup – MatrixArchive

MatrixArchive captures a running YMatrix cluster’s data at a specific point in time, saving it according to defined rules to ensure data integrity and consistency. From these backup files, a fully functional YMatrix cluster can be restored, matching the original cluster’s state at that moment.

Peer-to-Peer Migration – MatrixShift

MatrixShift is a dedicated data migration tool supporting full, incremental, and conditional migrations between different versions of Greenplum and YMatrix. Features include high efficiency (peer-to-peer transfer, small-table optimization, data compression) and flexible configuration.

  • Migrates data from Greenplum to YMatrix
  • Full-scenario migration: Supports full, incremental, and conditional (filtered) migrations
  • Efficient peer-to-peer transfer: Direct Segment-to-Segment data movement eliminates bottlenecks common in traditional migration approaches

Learn More About YMatrix

YMatrix System Architecture
Quick Start Guide
Standard Cluster Deployment
Use Cases