The stable operation of a YMatrix cluster relies on three core process modules, each performing its own responsibilities while working collaboratively to form a complete distributed data processing architecture.
supervisord is the "general manager" of cluster processes: a resident system service that starts when the cluster is installed and runs with root privileges. Its core functions include launching and managing all other YMatrix service processes, maintaining persistent process state, ensuring high availability, and exposing RPC interfaces for external management operations. It is the foundation for starting and monitoring cluster processes.
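Because supervisord exposes RPC interfaces for management, the processes it manages can be inspected programmatically. Below is a minimal sketch that assumes compatibility with the standard supervisord XML-RPC API listening on localhost:9001; both the API and the endpoint are assumptions about this deployment rather than documented YMatrix behavior.

```python
# Minimal sketch: list the processes managed by the process manager.
# Assumes compatibility with the standard supervisord XML-RPC API and
# an HTTP endpoint at localhost:9001 -- both are assumptions; adjust
# the URL (or use the unix socket) to match your deployment.
from xmlrpc.client import ServerProxy

server = ServerProxy("http://localhost:9001/RPC2")

# getAllProcessInfo() is part of the standard supervisord RPC API;
# each entry describes one managed process (name, state, pid, ...).
for proc in server.supervisor.getAllProcessInfo():
    print(f"{proc['name']:<30} state={proc['statename']:<10} pid={proc['pid']}")
```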
The SOA service processes act as the "scheduling hub" of the cluster. Centered around the mxbox series of processes, they integrate the service capabilities required for distributed management: cluster topology management (cluster module), data shard scheduling (shard module), data replication and failover (replication module), and automated deployment (deployer module). They rely on etcd to store cluster metadata and keep it consistent across nodes.
The postgres processes serve as the "processing core" of the cluster, running on the Master, Standby, and Segment nodes. They receive and parse SQL requests, execute data storage and computation tasks, record logs, manage data persistence, and handle distributed transactions, supported by auxiliary processes for logging, storage, and monitoring.
Configuration file: /etc/matrixdb6/supervisor.conf

Core Position: The "functional carrier" of SOA services.
Main Modules and Functions

| Module | Core Function |
|---|---|
| mxbox cluster | Manages cluster topology, node joining/leaving, state synchronization, etc. |
| mxbox shard | Responsible for data shard allocation, scheduling, and load balancing, mapping shards to Segment nodes. |
| mxbox replication | Controls cluster data replication policies, achieving synchronization and failover between Master-Standby and Primary-Mirror. |
| mxbox deployer | Provides automated deployment capabilities, supporting node initialization, software installation, configuration synchronization, and other O&M operations. |
Key Role: As the "executor" of SOA services, the mxbox process family splits cluster management logic into independent modules, which reduces maintenance complexity, and collaborates with supervisord and etcd through standardized interfaces to achieve flexible expansion and efficient O&M of the cluster.
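Since the SOA modules persist cluster metadata in etcd, the keys they write can be listed with any etcd client. The sketch below uses the third-party python-etcd3 package against the default client port 2379; the endpoint, the library choice, and the key layout are assumptions, as YMatrix's internal key schema is not documented here.

```python
# Sketch: list the keys stored in the cluster's etcd instance.
# Assumes the third-party `etcd3` package (pip install etcd3) and the
# default etcd client port 2379; the real endpoint and key layout used
# by YMatrix are internal details and may differ.
import etcd3

client = etcd3.client(host="127.0.0.1", port=2379)

# Values are opaque for our purposes; printing the key names is enough
# to see what the SOA modules have written.
for _value, meta in client.get_all():
    print(meta.key.decode())
```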
Core Position: The "core engine" for data processing.
Node-Specific Configurations and Functions
| Node Type | Process Role (gp_role) | Core Configuration Example | Core Function |
|---|---|---|---|
| Master | Dispatch (Scheduling) | -p 5432 -D /mxdata/master/mxseg-1 | Receives client connections, generates distributed query plans, schedules QE execution, and aggregates query results. |
| Standby | Dispatch (Scheduling) | -p 5432 -D /mxdata/standby/mxseg-1 | Synchronizes Master data and provides failover capability, taking over the Master's functions when the primary node fails. |
| Segment (Primary) | Execute (Execution) | -p 6000 -D /mxdata/primary/mxseg0 | Stores raw data, executes Slice tasks dispatched by the QD, and sends WAL logs to its Mirror. |
| Segment (Mirror) | Execute (Execution) | -p 7001 -D /mxdata/mirror/mxseg6 | Synchronizes Primary data, provides data redundancy, and is promoted to Primary when the Primary fails. |
Derived Process Dependencies: Each postgres main process spawns auxiliary processes for log handling, storage optimization, and transaction assurance. Together these processes carry the core data processing capabilities, making them the "ultimate executors" of cluster data operations.
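The roles, ports, and data directories shown in the table can be read back from a running cluster through the gp_segment_configuration catalog, which YMatrix inherits from its Greenplum lineage. The sketch below assumes that catalog is available and uses placeholder connection parameters.

```python
# Sketch: list each instance's role, port, and data directory from the
# gp_segment_configuration catalog (inherited from Greenplum; assumed
# to be present). Host, user, and database are placeholders.
import psycopg2

conn = psycopg2.connect(host="mdw", port=5432, dbname="postgres", user="mxadmin")
with conn.cursor() as cur:
    cur.execute("""
        SELECT content, role, preferred_role, hostname, port, datadir
        FROM gp_segment_configuration
        ORDER BY content, role
    """)
    for content, role, preferred, host, port, datadir in cur.fetchall():
        # content = -1 identifies the Master/Standby pair; role 'p'/'m'
        # marks the current Primary or Mirror of each content id.
        print(f"content={content:3d} role={role} preferred={preferred} "
              f"{host}:{port} {datadir}")
conn.close()
```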
Core Position: The "frontline executor" for queries.
Core Functions:
Key Characteristics:
Core Position: The "guardian" against deadlocks in distributed transactions.
Core Functions:
Key Significance: As a distributed database, YMatrix frequently encounters cross-node transaction scenarios. This process effectively addresses the challenge of detecting deadlocks in distributed environments, ensuring stability during concurrent transaction executions.
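In the Greenplum lineage the detector is governed by the server settings gp_enable_global_deadlock_detector and gp_global_deadlock_detector_period; assuming YMatrix keeps these GUCs (an assumption, with placeholder connection parameters), they can be checked as follows.

```python
# Sketch: show the global deadlock detector settings. The GUC names
# come from the Greenplum lineage and are assumed to exist in YMatrix;
# connection parameters are placeholders.
import psycopg2

conn = psycopg2.connect(host="mdw", port=5432, dbname="postgres", user="mxadmin")
with conn.cursor() as cur:
    for guc in ("gp_enable_global_deadlock_detector",
                "gp_global_deadlock_detector_period"):
        cur.execute(f"SHOW {guc}")
        print(guc, "=", cur.fetchone()[0])
conn.close()
```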
Core Position: The "iron triangle" ensuring data consistency.
Division of Labor and Collaboration Logic:
- walwriter: Deployed on Master/Primary nodes as part of the postgres-derived processes; responsible for writing Write-Ahead Logs (WAL) from memory to disk, ensuring local data persistence.
- walsender: Deployed on Master/Primary nodes as part of the postgres-derived processes; responsible for sending WAL logs in real time to Standby/Mirror nodes.
- walreceiver: Deployed on Standby/Mirror nodes as part of the postgres-derived processes; responsible for receiving the WAL logs sent by walsender and writing them to local disk.

Core Value: Through real-time synchronization of WAL logs, data redundancy is achieved from Master to Standby and from Primary to Mirror. If a primary node fails, the standby node can quickly recover data from the WAL logs, ensuring high availability of the cluster.
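The walsender half of this trio can be observed from the Master (or a Primary) through the standard pg_stat_replication view, which reports one row per connected walreceiver. A sketch, assuming the view is exposed unchanged by YMatrix and using placeholder connection parameters:

```python
# Sketch: inspect active walsender sessions via the standard
# pg_stat_replication view (assumed unchanged in YMatrix); one row is
# shown per connected Standby/Mirror walreceiver. Connection
# parameters are placeholders.
import psycopg2

conn = psycopg2.connect(host="mdw", port=5432, dbname="postgres", user="mxadmin")
with conn.cursor() as cur:
    cur.execute("SELECT application_name, client_addr, state, sync_state "
                "FROM pg_stat_replication")
    for app, addr, state, sync in cur.fetchall():
        print(f"{app}: client={addr} state={state} sync={sync}")
conn.close()
```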
YMatrix cluster processes are rooted at supervisord and form a hierarchical tree. The associations and roles of the key processes are outlined below by core group:
```
supervisord (root process)
├─ SOA Services Layer
│  ├─ mxbox cluster (Cluster Topology Management)
│  ├─ mxbox shard (Sharding Scheduling Management)
│  ├─ mxbox replication (Data Replication and Failover)
│  ├─ mxbox deployer (Automated Deployment)
│  └─ etcd (Cluster Metadata Storage)
├─ System Management and Monitoring Layer
│  ├─ mxui (Web Management Interface)
│  ├─ cylinder (Periodic Maintenance Tasks)
│  └─ telegraf (Cluster Monitoring Data Collection and Metrics Exposure)
└─ Database Instance Layer (branching by node type)
   ├─ Master Nodes
   │  ├─ postgres (Primary Instance Process, gp_role=dispatch)
   │  │  ├─ Logging and Basic Processes (master logger, mxlogstat, etc.)
   │  │  ├─ Storage Optimization Processes (checkpointer, walwriter, etc.)
   │  │  ├─ Distributed Feature Processes (dtx recovery, global deadlock detector, etc.)
   │  │  └─ Query Scheduling Process (QD: Query Dispatcher)
   │  └─ matrixgate warden (Gateway Connection Management)
   ├─ Standby Nodes
   │  └─ postgres (Backup Instance Process, gp_role=dispatch)
   │     ├─ Basic Assurance Processes (logger, checkpointer, etc.)
   │     └─ Data Synchronization Processes (walreceiver, startup recovering)
   └─ Segment Nodes (Primary/Mirror)
      ├─ postgres (Instance Process, gp_role=execute)
      │  ├─ Basic Assurance Processes (logger, mxlogstat, etc.)
      │  └─ Query Execution Process (QE: Query Executor)
      ├─ Primary-Specific Process (walsender: sends WAL logs)
      └─ Mirror-Specific Processes (walreceiver: receives WAL logs; startup recovering: data recovery)
```
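To compare the diagram above with a live node, the actual process tree can be dumped with the standard Linux ps command; the sketch below filters its forest view for the process names mentioned in this section (which names actually appear depends on the deployment).

```python
# Sketch: print the parts of the local process tree that belong to the
# cluster. Uses standard procps `ps --forest` output; the set of names
# to highlight is an assumption about this deployment.
import subprocess

out = subprocess.run(["ps", "-e", "--forest", "-o", "pid,ppid,cmd"],
                     capture_output=True, text=True, check=True).stdout

names = ("supervisord", "mxbox", "etcd", "postgres", "mxui", "cylinder", "telegraf")
for line in out.splitlines():
    if any(name in line for name in names):
        print(line)
```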
Clients establish connections to the Master node via the libpq protocol. The QD (Query Dispatcher) then links to the QE (Query Executor) processes on all relevant Segment nodes, forming an end-to-end session link that enables collaborative execution of distributed queries.
```
Client Session
├─ Master Node: postmaster (listening for connections) → fork child process QD (session-specific scheduler)
│  └─ QD connects to each Segment node's postmaster through the libpq protocol
└─ Segment Nodes: postmaster → fork child process QE (session-specific executor)
   ├─ Bidirectional communication between QD and QE: libpq protocol (control commands/status feedback), interconnect (data exchange)
   └─ Session lifecycle: QD uniformly controls query parsing, plan distribution, result aggregation; QE executes specific data processing tasks
```
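Any libpq-based client produces exactly this structure: connecting to the Master causes postmaster to fork a dedicated QD backend for the session, and subsequent queries are dispatched by that QD to QEs on the Segments. A minimal sketch with psycopg2 (a libpq wrapper) that shows the PID of the session's QD backend; connection parameters are placeholders.

```python
# Sketch: open a libpq session to the Master and show the PID of the
# backend process (the QD) that postmaster forked for this session.
# Connection parameters are placeholders.
import psycopg2

conn = psycopg2.connect(host="mdw", port=5432, dbname="postgres", user="mxadmin")
with conn.cursor() as cur:
    cur.execute("SELECT pg_backend_pid()")
    print("QD backend PID on the Master:", cur.fetchone()[0])
    # Every query issued through this connection is parsed and planned
    # by that QD, which dispatches Slices to QE processes on the
    # Segment nodes over libpq and exchanges data via the interconnect.
conn.close()
```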
Core Function: Complex queries are divided into multiple parallel tasks, each executed independently by QE processes on different nodes, which significantly improves query processing efficiency.
```
Query Plan
└─ Divided into multiple Slices (independent processing units) based on Motion (data movement operations)
   ├─ Each Slice corresponds to an independent query processing task (e.g., data filtering, aggregation, joins)
   ├─ Physical carrier: a single QE process is responsible for executing one Slice
   └─ Hierarchical relationship: Producer Slice (data output) → Motion operation → Consumer Slice (data consumption)
```

```
Slice Task
└─ Gang (process group)
   ├─ Composition: all QE processes across different Segment nodes executing the same Slice task
   ├─ Characteristics: QE processes within the same Gang work synchronously, executing identical Slice logic
   └─ Role: ensures consistency in parallel execution of the same query unit in distributed scenarios, achieving cross-node collaborative data processing
```
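The Slice/Motion structure of a concrete plan is visible directly in EXPLAIN output, where Motion nodes (e.g. Gather Motion, Redistribute Motion) mark the boundaries between Slices executed by different Gangs. The sketch below prints such a plan; the table names and connection parameters are placeholders, and the plan shape depends on your schema, data, and optimizer settings.

```python
# Sketch: print a query plan and look for Motion/slice annotations.
# Table names and connection parameters are placeholders; the actual
# plan depends on your schema, data, and optimizer settings.
import psycopg2

conn = psycopg2.connect(host="mdw", port=5432, dbname="postgres", user="mxadmin")
with conn.cursor() as cur:
    cur.execute("""
        EXPLAIN
        SELECT o.customer_id, count(*)
        FROM orders o JOIN customers c ON c.id = o.customer_id
        GROUP BY o.customer_id
    """)
    # Each row of EXPLAIN output is one plan line; lines containing
    # "Motion" mark slice boundaries between producer and consumer Gangs.
    for (line,) in cur.fetchall():
        print(line)
conn.close()
```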