YMatrix is a distributed database built on the open-source PostgreSQL/Greenplum database. Its core characteristics are described below.
In addition to the commercial edition, YMatrix offers a free community edition. You are welcome to try it and share your feedback: https://ymatrix.cn/download
A hyper-converged database is a database product that integrates transactional (OLTP), analytical (OLAP), time series, and data lake capabilities.
YMatrix's hyper-convergence philosophy rejects fragmented data processing and instead integrates computing, storage, and network resources into a single system. On top of the database's common core components, YMatrix combines different storage and execution engines into different microkernels, chosen according to the original database type and version, the cluster topology, and the characteristics of the specific business scenario. This delivers targeted improvements in write, storage, and query performance.
YMatrix holds that a database should focus on functionality and performance across the full range of scenarios, including writing, querying, analysis, and machine learning. Integrating these capabilities into a single database product lets it address a variety of complex scenarios and gives business applications multi-model support, scalability, and cost control.
· Analysis Capabilities
· Transaction Capabilities
· Time Series Capabilities
At the upper layer, YMatrix uses SQL as the unified interface to all data.
YMatrix is highly extensible.
On the one hand, YMatrix has expanded into a growing number of business scenarios over successive iterations, including the Internet of Vehicles, smart manufacturing, finance, and vector computing. On the other hand, it provides capabilities such as machine learning and data federation through database extensions, so that more heterogeneous and cross-source workloads can run efficiently on YMatrix.
YMatrix hyper-converged databases help users simplify their infrastructure architecture, significantly reducing the complexity of their technology stacks and improving the performance of data infrastructure in different scenarios. They also reduce the risks associated with the coexistence and interaction of multiple systems, thereby helping enterprises build a comprehensive data governance mechanism and fully unleash the digital potential of the data era.
YMatrix drives the implementation of the “hyper-convergence” concept in its products through a number of key proprietary technologies.
To serve analytical, transactional, and time series scenarios at the same time, the MARS3 storage engine offers two modes: column storage and row-column hybrid storage. Besides strong storage features (including compression and state diagnostics), the hybrid mode also sustains high-performance writes. Both modes implement MVCC. For partitioned tables, both support automatic partition management and automatic tiered storage (automatic downgrading of data to a lower storage tier).
The vectorized execution engine is a high-performance execution engine designed specifically for column-oriented storage engines (such as MARS3, MARS2, and AOCO). For common queries, it offers a performance improvement of one to two orders of magnitude compared to traditional row-oriented execution engines.
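The difference between row-at-a-time and vectorized execution can be illustrated with a minimal Python sketch. This is not YMatrix code: the table layout, the column names (`speed`, `fuel`, `temp`), and both functions are hypothetical, and the sketch only shows the general idea of evaluating a filter and an aggregate over column batches instead of over materialized rows.

```python
from typing import Dict, List

# Hypothetical columnar table: each column is stored as its own list.
Table = Dict[str, List[float]]


def sum_row_at_a_time(table: Table, threshold: float) -> float:
    """Row-oriented style: rebuild one full row at a time, then evaluate it."""
    total = 0.0
    for i in range(len(table["speed"])):
        # Every column is touched for every row, even columns the query ignores.
        row = {name: column[i] for name, column in table.items()}
        if row["speed"] > threshold:
            total += row["fuel"]
    return total


def sum_vectorized(table: Table, threshold: float, batch_size: int = 1024) -> float:
    """Vectorized style: read only the needed columns, one batch at a time."""
    speed, fuel = table["speed"], table["fuel"]
    total = 0.0
    for start in range(0, len(speed), batch_size):
        speed_batch = speed[start:start + batch_size]
        fuel_batch = fuel[start:start + batch_size]
        # Step 1: evaluate the predicate over the whole batch (a selection vector).
        selected = [i for i, s in enumerate(speed_batch) if s > threshold]
        # Step 2: aggregate only the selected positions of the other column.
        total += sum(fuel_batch[i] for i in selected)
    return total


table = {
    "speed": [30.0, 80.0, 120.0, 55.0],
    "fuel": [1.0, 2.0, 3.0, 4.0],
    "temp": [20.0, 21.0, 22.0, 23.0],  # never read by the vectorized version
}
```

Both functions compute the same result for the same input. The vectorized version never reads the `temp` column and amortizes per-row overhead across each batch, which is the property that makes this style a natural fit for column-oriented storage.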
ALOHA (Advanced Least Operation High Availability) is the cluster state management service introduced in YMatrix 5.X. It runs independently of the cluster, so its disks and monitoring can be configured separately. Even in harsh environments, it provides low-latency detection and management of node state and completes automatic failover within 3 seconds.
Powerful analytical computing capabilities
The primary query workload of a data warehouse is historical data analysis. Traditionally, this is handled with the Hadoop ecosystem for data production and consumption: historical data is first stored on the Hadoop platform, and Spark is then used to compute report metrics. The overall process is complex.
YMatrix uses its hyper-converged capabilities to remove this ecosystem complexity and applies targeted optimizations to analytical performance. By integrating structured and unstructured data types, federated data access, and related methods, it handles business intelligence (BI) and reporting workloads in classic OLAP scenarios such as finance, telecommunications, government, energy, and manufacturing. Query optimization techniques such as vectorization, Runtime Filter, sliding windows, and continuous aggregation provide its analytical computing power.
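As an illustration of the runtime filter idea in general, the following Python sketch joins a small dimension table to a large fact table and discards non-matching fact rows at scan time, before they reach the join. It is a simplified sketch and not YMatrix's implementation: the function name and the row layouts are hypothetical, and it uses an exact key set where a production engine would more likely use a compact structure such as a Bloom filter or a min/max range.

```python
from typing import Dict, Iterable, Iterator, List, Tuple


def hash_join_with_runtime_filter(
    dim_rows: Iterable[Tuple[int, str]],     # (region_id, region_name): small build side
    fact_rows: Iterable[Tuple[int, float]],  # (region_id, amount): large probe side
) -> List[Tuple[str, float]]:
    # Build phase: hash table on the join key of the small side.
    build: Dict[int, str] = {key: name for key, name in dim_rows}

    # The runtime filter is derived from the build side while the query runs.
    # Assumption: an exact key set stands in for a Bloom filter here.
    runtime_filter = set(build)

    def scan_fact() -> Iterator[Tuple[int, float]]:
        # The filter is applied inside the scan, so rows that cannot match
        # are dropped before they are passed up to the join operator.
        for key, amount in fact_rows:
            if key in runtime_filter:
                yield key, amount

    # Probe phase: only pre-filtered rows reach the join.
    return [(build[key], amount) for key, amount in scan_fact()]
```

The benefit grows with the selectivity of the join: the fewer fact rows have a matching key, the more work the scan-side filter saves downstream.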
Balancing high-speed writing, low-cost storage, and real-time queries
Due to the real-time nature of time series data, time series scenarios place high demands on database write, storage, and query capabilities.
YMatrix is optimized for time series data. The MARS series of storage engines provides physical sorting and supports uploads at different frequencies as well as batch uploads, and MatrixGate provides high-concurrency, high-performance batch writes. Together these let YMatrix meet enterprise time series requirements for real-time ingestion, real-time queries, and transactional guarantees.
YMatrix supports scaling through a graphical interface with simple operations, completing scaling operations in seconds. It also supports smooth scaling that does not interrupt business operations, which reduces downtime losses and lowers risk.
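Why physical sorting helps time series queries can be shown with a short Python sketch. It assumes rows are kept sorted by a hypothetical (device_id, timestamp) key, so a query for one device over a time range becomes two binary searches plus one contiguous read. The class and field names are illustrative only and do not describe the MARS on-disk format.

```python
from bisect import bisect_left, bisect_right
from typing import Iterable, List, Tuple

Row = Tuple[str, int, float]  # (device_id, timestamp, value)


class SortedSeries:
    """Rows kept physically ordered by (device_id, timestamp)."""

    def __init__(self, rows: Iterable[Row]) -> None:
        self.rows: List[Row] = sorted(rows, key=lambda r: (r[0], r[1]))
        # Sort keys are extracted once so each lookup is a pure binary search.
        self.keys: List[Tuple[str, int]] = [(r[0], r[1]) for r in self.rows]

    def range_scan(self, device_id: str, t_start: int, t_end: int) -> List[Row]:
        """Return rows of one device with t_start <= timestamp <= t_end."""
        lo = bisect_left(self.keys, (device_id, t_start))
        hi = bisect_right(self.keys, (device_id, t_end))
        # Matching rows are adjacent, so the result is one contiguous slice.
        return self.rows[lo:hi]
```

Without the sort order, the same query would have to inspect every row. With it, the cost of locating the range is logarithmic in the number of rows, and the rows that are read sit next to each other.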
Leveraging hyper-converged capabilities to integrate data pipelines
Data silos are common in traditional industrial enterprises. When data cannot circulate or be put to use, enterprise management, operations, and development are constrained, and the business struggles to gain a competitive advantage. This is a critical challenge that enterprise digital transformation must overcome.
YMatrix's hyper-converged architecture has been applied in production scenarios such as factory data foundations, data warehouses for large corporate groups, intelligent connected vehicles, and intelligent operations for IoT devices. It significantly lowers the technical barriers enterprises face in selection, procurement, use, and maintenance, and has received positive feedback. For example, in smart manufacturing, a single database can handle the collection, storage, computation, modeling, querying, and analysis of data from enterprise resource planning (ERP) systems, manufacturing execution systems (MES), and equipment.