YMatrix - Enterprise-Grade Hyper-Converged Database

What is YMatrix?


YMatrix is a hyper-converged database product developed by 4D Matrix (YMatrix) based on the classic open-source databases PostgreSQL and Greenplum. In addition to excelling in time-series scenarios, it also supports traditional use cases such as Online Transaction Processing (OLTP) and Online Analytical Processing (OLAP).
It meets enterprise requirements for high availability, security, high performance, automated operations, visualized installation, and data processing, ensuring reliable deployment for enterprise users.
Its core value lies in cost-effectiveness, ease of use, high read/write performance, high storage efficiency, and high availability.
YMatrix also offers a community edition—your experience and feedback are welcome.

What Are the Core Features of YMatrix?


YMatrix provides the following key features:

  1. Hyper-Converged Architecture
    The hyper-converged architecture of YMatrix addresses the "data silo" problem commonly found in traditional databases, enabling "one database, multiple uses." This is achieved through two main components: micro-kernels and MPP (Massively Parallel Processing).

    • Micro-Kernel: In YMatrix, a micro-kernel includes the storage engine and execution engine. Different micro-kernels are optimized for different scenarios. For example:

      • OLTP micro-kernel (HEAP storage engine + Volcano execution engine) suits transactional (TP) workloads.
      • Time-series micro-kernel (MARS2 storage engine + Vectorized execution engine) suits time-series workloads.
      The storage engine used by a micro-kernel is typically fixed, while the executor is selected based on the optimizer's cost evaluation. As a result, you can choose the best plugin combination for each business scenario, extending the database quickly and flexibly without compromising system stability.
    • MPP: A distributed MPP architecture, also known as a shared-nothing architecture, is a system in which two or more processors collaborate on a single operation, each with its own memory, operating system, and disk. YMatrix uses this architecture to distribute the database workload and use all system resources in parallel for query processing, which improves query performance.
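The shared-nothing idea described above can be illustrated with a minimal Python sketch. It is not YMatrix code: the segment count, the `device_id` distribution key, and all function names are assumptions made for illustration, and threads stand in for what would be separate machines in a real cluster. Rows are hash-distributed across segments, each segment aggregates only its own slice, and a coordinator merges the partial results.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
import zlib

NUM_SEGMENTS = 4  # hypothetical number of data nodes


def segment_for(key, num_segments=NUM_SEGMENTS):
    """Map a distribution key to a segment using a stable hash."""
    return zlib.crc32(key.encode("utf-8")) % num_segments


def distribute(rows, num_segments=NUM_SEGMENTS):
    """Split rows by device_id so that each segment owns its own slice."""
    segments = [[] for _ in range(num_segments)]
    for row in rows:
        segments[segment_for(row["device_id"], num_segments)].append(row)
    return segments


def local_aggregate(segment_rows):
    """Runs on one segment: partial [sum, count] per device."""
    partial = defaultdict(lambda: [0.0, 0])
    for row in segment_rows:
        acc = partial[row["device_id"]]
        acc[0] += row["value"]
        acc[1] += 1
    return dict(partial)


def parallel_average(rows, num_segments=NUM_SEGMENTS):
    """Coordinator: fan the work out to all segments, then merge the partials."""
    segments = distribute(rows, num_segments)
    with ThreadPoolExecutor(max_workers=num_segments) as pool:
        partials = list(pool.map(local_aggregate, segments))
    merged = defaultdict(lambda: [0.0, 0])
    for partial in partials:
        for device_id, (total, count) in partial.items():
            merged[device_id][0] += total
            merged[device_id][1] += count
    return {d: total / count for d, (total, count) in merged.items()}


rows = [
    {"device_id": "d1", "value": 10.0},
    {"device_id": "d2", "value": 4.0},
    {"device_id": "d1", "value": 20.0},
    {"device_id": "d3", "value": 7.0},
    {"device_id": "d2", "value": 6.0},
]
averages = parallel_average(rows)
```

Because every row with the same key lands on the same segment, no segment needs another segment's memory or disk to compute its partial result; only the small partial aggregates travel to the coordinator.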

  2. High Performance
    YMatrix delivers strong performance across data ingestion, time-series queries, OLAP, machine learning (ML), and OLTP. Key aspects include:

    • Data Ingestion: MatrixGate, a streaming ingestion tool, supports high-speed loading of various data types. It combines high concurrency, distributed processing, and both streaming and batch loading to handle real-time data ingestion in enterprise time-series scenarios, while providing full transactional guarantees.

    • Query Performance: Supports hybrid row-column storage. Built on the highly compressed MARS2 storage engine and powered by a cost-based optimizer (CBO), YMatrix selects the most efficient execution plan. Starting from version 5.0, the vectorized execution engine is enabled by default. Rigorous testing using benchmarks such as SSB (Star Schema Benchmark) and TSBS (Time Series Benchmark Suite) confirms that YMatrix delivers query performance far exceeding comparable products.

  3. High Availability

    • Automatic Failover: Thanks to YMatrix’s new automated operations mechanism (version 5.0 and above), when the cluster master node (Master) or data node (Segment) fails, the system automatically switches to the standby node, completing failover seamlessly.
    • Streaming Replication: Both Master and Segment nodes support streaming replication to keep data highly available.
  4. Simple and Easy to Use

    • Graphical Installation: Deploy a cluster in 10 minutes; simulate time-series write and query operations within 3 minutes.
    • Graphical Operations & Monitoring: Simple interface with rich information display. Enables one-click cluster scaling that completes within seconds.
  5. Enterprise-Grade Security
    YMatrix provides comprehensive 360-degree access security mechanisms, including authentication, privilege control, encryption, auditing, and resource management.

    • Authentication: Supports multiple methods, including trust authentication, password authentication, and PAM authentication.
    • Privilege Control: Implements Role-Based Access Control (RBAC), simplifying user-permission associations.
    • Encryption: Offers multi-level encryption options:
      • Encrypted password storage
      • Column-level encryption
      • SSL host authentication
      • Client-side encryption
      • Network data encryption
      • Password encryption over networks
      • Tablespace encryption
    • Auditing: Logs user login/logout events and database activities. Audit levels can be configured according to security requirements.
    • Resource Control: Enforces strict IP address access restrictions to ensure trusted sources; allows configuration of maximum concurrent connections per user; includes default connection timeout policies.
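The role-based model mentioned under Privilege Control can be sketched in a few lines of Python. This is a conceptual illustration only, not the YMatrix or PostgreSQL implementation; the class names, role names, and table names are all hypothetical. The point it shows is that privileges attach to roles rather than to individual users, so one grant to a role reaches every member.

```python
from dataclasses import dataclass, field


@dataclass
class Role:
    name: str
    # Set of (table, action) pairs, e.g. ("sensor_data", "SELECT")
    privileges: set = field(default_factory=set)


@dataclass
class User:
    name: str
    roles: list = field(default_factory=list)


def is_allowed(user, table, action):
    """A user may act on a table only if one of their roles grants it."""
    return any((table, action) in role.privileges for role in user.roles)


analyst = Role("analyst", {("sensor_data", "SELECT")})
ingest = Role("ingest", {("sensor_data", "INSERT")})

alice = User("alice", [analyst])
bob = User("bob", [analyst, ingest])

# Granting a privilege to the role updates every member at once.
analyst.privileges.add(("reports", "SELECT"))
```

Here `alice` can read `sensor_data` but not write to it, `bob` can do both, and the later grant on `reports` applies to both users without touching either user record.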
  6. Complete Ecosystem

    • Fully compatible with the upstream and downstream toolchains of the PostgreSQL / Greenplum ecosystem.

What Business Scenarios Does YMatrix Support?


  • Complex Data Processing Requiring a Converged Architecture
    In traditional industrial enterprises, massive amounts of data are often scattered across departments, systems, and applications due to organizational strategy, architectural design, or digital transformation efforts. These isolated data stores form “data silos”, which cannot interconnect or be effectively utilized. Beyond technical complexity, these silos severely hinder competitive advantage in business operations. Data isolation significantly restricts enterprise management, operations, and growth—it is a critical barrier to overcome in digital transformation.
    YMatrix’s hyper-converged architecture has been successfully deployed in real-world production environments such as factory data platforms, corporate group data warehouses, intelligent connected vehicles, and IoT device operations. It greatly reduces technical barriers related to selection, procurement, usage, and maintenance, receiving positive feedback. For example, in smart manufacturing, a single YMatrix database can collect, store, compute, model, query, and analyze data from ERP (Enterprise Resource Planning), MES (Manufacturing Execution Systems), and equipment systems.

  • Scenarios with Complex Time-Series Analysis Needs
    Time-series data forms the foundation of IoT, the Internet of Vehicles (connected vehicles), the Industrial Internet, and Smart Cities. Its defining characteristic is real-time processing, which places high demands on database write and storage performance. Enterprises must address challenges such as maintaining performance while controlling costs, safely and quickly scaling capacity to avoid data backlog, and lowering technical barriers to respond rapidly and accurately to evolving data needs.
    YMatrix is optimized for time-series workloads. Leveraging the MARS2 storage engine’s physical sorting, asynchronous uploads, and batched ingestion, combined with MatrixGate’s high-concurrency, distributed, streaming, and bulk-write capabilities, YMatrix exceeds expectations in real-time data ingestion, high-speed writes, real-time queries, and transactional consistency.
    YMatrix supports scaling through a graphical interface: a few simple operations expand the cluster within seconds. It also supports smooth scaling without service interruption, ensuring business continuity, minimizing downtime losses, and reducing risks.

  • Broad IoT Scenarios with Massive Devices
    Common broad IoT scenarios include smart campuses, smart homes, intelligent transportation, smart water systems, smart agriculture, and smart meteorology. A large number of devices generate massive volumes of data requiring efficient write, storage, and query capabilities. Storage cost (compression ratio) and access efficiency (decompression speed) are decisive factors in building stable data infrastructure. High-speed ingestion and real-time query performance directly impact end-user experience.
    Besides supporting PB-scale clusters, YMatrix employs patented Chain-Encoding compression technology. It allows business users to select optimal encoding schemes tailored to individual column characteristics, achieving the best cost-performance ratio. This reduces enterprise storage costs by over 50%, making massive data storage manageable.
    By combining hardware capabilities with MatrixGate’s high-concurrency, distributed, streaming, and bulk-write features, YMatrix ingests data with latency on the order of seconds.
    With full vectorization (starting from version 5.0), YMatrix achieves 1.24x the SSB performance of ClickHouse in testing, delivering world-class high-throughput, low-latency query performance.
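To make the idea of chaining encodings per column more concrete, here is a minimal Python sketch. It does not reproduce YMatrix's patented Chain-Encoding; it swaps in two generic, well-known encodings (delta encoding followed by run-length encoding) and all function names are hypothetical. It shows why the choice depends on the column: regularly spaced timestamps collapse almost entirely under delta plus run-length encoding, while a low-cardinality status column needs only run-length encoding.

```python
def delta_encode(values):
    """Keep the first value, then store differences between neighbours."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas):
    """Rebuild the original values by accumulating the differences."""
    out = []
    total = 0
    for i, d in enumerate(deltas):
        total = d if i == 0 else total + d
        out.append(total)
    return out


def rle_encode(values):
    """Collapse runs of equal values into [value, run_length] pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return runs


def rle_decode(runs):
    """Expand [value, run_length] pairs back into the full sequence."""
    return [v for v, n in runs for _ in range(n)]


def chain_encode(values):
    """Chain two encodings: delta first, then run-length on the deltas."""
    return rle_encode(delta_encode(values))


def chain_decode(runs):
    """Undo the chain in reverse order."""
    return delta_decode(rle_decode(runs))


# A timestamp column sampled every 10 seconds: 1000 values become 2 runs.
timestamps = [1700000000 + 10 * i for i in range(1000)]
encoded_ts = chain_encode(timestamps)

# A status column with long repeats: run-length encoding alone is enough.
status = ["ok"] * 500 + ["warn"] * 3
encoded_status = rle_encode(status)
```

In this sketch the 1000 timestamps reduce to two runs and the 503 status values to two runs, and both decode back to the original columns without loss. Real compression ratios depend on the data and on the encodings actually used.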

  • Traditional Data Warehouse OLAP Scenarios
    YMatrix is compatible with the PostgreSQL/Greenplum ecosystem and supports classic OLAP scenarios in industries such as finance, telecommunications, government, energy, and manufacturing. It enables Business Intelligence (BI) and reporting analytics.
    These scenarios typically involve non-time-series data and rely on the Hadoop ecosystem for data production and consumption: historical data is stored in Hadoop, and Spark computes the reporting metrics, which makes the overall process complex.
    YMatrix simplifies this workflow by integrating structured and unstructured data handling, federated data access, graphical Kafka stream integration, and hot/cold data separation. It provides an all-in-one solution for data consumption, along with automatic failover and recovery mechanisms—secure, simple, and easy to use.