Deployment on Kubernetes

This document describes how to install and deploy YMatrix on a Kubernetes cluster.

1 Before Installation

1.1 Runtime Environment

  • Kubernetes cluster
    • Kubernetes version >= 1.20
    • RBAC enabled
    • DNS enabled
    • Available StorageClass for database storage
    • Internet access to download required packages and container images

1.2 Required Tools

You need the Kubernetes client kubectl and helm to access your Kubernetes cluster.
You must have sufficient permissions to create namespaces and deploy resources within them.
When installing components using helm, you can adjust parameters as needed for your Kubernetes environment.
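
A quick way to confirm that both tools are installed and that kubectl can reach your cluster (these commands only read version information and change nothing):

kubectl version
helm version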

1.3 Prerequisite Knowledge

Familiarity with the usage of kubectl and helm is required.

2 Install Dependencies

2.1 (Optional) Install OpenEBS to Provide StorageClass openebs-hostpath

If your Kubernetes cluster does not have a suitable StorageClass for YMatrix,
you can deploy OpenEBS to provide the openebs-hostpath StorageClass for YMatrix database storage.

See OpenEBS Helm installation guide

Run:

helm repo add openebs https://openebs.github.io/charts
helm repo update
helm upgrade --install openebs --namespace openebs openebs/openebs --create-namespace

This creates the openebs namespace and installs the latest version of OpenEBS in it.
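
After installation, you can confirm that the openebs-hostpath StorageClass is available (a quick check; the StorageClass name may differ if you customized the chart values):

kubectl get storageclass

The output should include an entry named openebs-hostpath.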

2.2 Install cert-manager (Third-party Dependency)

YMatrix depends on cert-manager to create and manage certificates and keys.
You must install cert-manager in your Kubernetes cluster for matrixdb-operator to function properly.
Typically, only one instance of cert-manager should be installed per Kubernetes cluster.
Ensure that cert-manager is not already installed before proceeding.

If cert-manager is already installed, verify its version is >= 1.6.0.

See cert-manager Helm installation guide

Run:

helm repo add jetstack https://charts.jetstack.io
helm repo update
helm upgrade --install \
  cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --create-namespace \
  --set installCRDs=true

This creates the cert-manager namespace and installs the latest version of cert-manager in it.
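
To verify the installation and the installed version (assuming the cert-manager namespace used above):

helm list --namespace cert-manager
kubectl get pods --namespace cert-manager

All cert-manager pods should reach the Running state before you continue.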

3 Install matrixdb-operator

matrixdb-operator is the tool used to deploy YMatrix on Kubernetes.
It must be installed in the cluster before deploying YMatrix.

See our Helm chart repository

Note!

matrixdb-operator and YMatrix versions have compatibility requirements; install the operator version that corresponds to the YMatrix version you plan to deploy.
For detailed version mapping, see YMatrix and YMatrix Operator Version Mapping

Run:

helm repo add ymatrix https://ymatrix-data.github.io/charts
helm repo update
helm upgrade --install \
  matrixdb-operator ymatrix/matrixdb-operator \
  --namespace matrixdb-system \
  --create-namespace \
  --version 0.13.0

This creates the matrixdb-system namespace and installs matrixdb-operator at the version pinned by --version (0.13.0 in this example).

Note!

If your Kubernetes cluster domain is not the default cluster.local, you must configure the matrixdb-operator installation to use your custom domain.

For example, if your cluster domain is custom.domain, add the following parameter to the matrixdb-operator helm upgrade command:

--set kubernetesClusterDomain=custom.domain

After installation, run:

helm list --namespace matrixdb-system

Expected output:

NAME                     NAMESPACE                       REVISION        UPDATED                                     STATUS          CHART                          APP VERSION
matrixdb-operator        matrixdb-system                 1               2022-08-08 17:45:52.919849 +0800 CST        deployed        matrixdb-operator-0.13.0        0.13.0

Verify the installed version of matrixdb-operator.

When deploying a database, ensure the container image's supported matrixdb-operator version matches the installed matrixdb-operator version.

4 Deploy a YMatrix Database Cluster

4.1 Prepare the YMatrix CRD Definition File

A sample file db0.yaml is shown below:

apiVersion: deploy.ymatrix.cn/v1
kind: MatrixDBCluster
metadata:
  name: db0 # Name of the YMatrix cluster
spec:
  image:
    repository: matrixdb/matrixdb-community # Change this for enterprise edition to your private registry
    tag: <DB-TAG-TO-SPECIFY> # Update tag to match a DB image compatible with your installed matrixdb-operator version
  master:
    enableStandby: true
    memory: "500Mi"
    cpu: "0.5"
    storageClassName: openebs-hostpath # Choose another supported StorageClass if needed
    storage: "1Gi"
    workerSelector: {}
  segments:
    count: 1
    enableMirror: false
    memory: "500Mi"
    cpu: "0.5"
    storageClassName: openebs-hostpath # Same as spec.master.storageClassName
    storage: "1Gi"
    workerSelector: {}
  gate:
    memory: "100Mi"
    cpu: "0.1"
    storageClassName: openebs-hostpath # Same as spec.master.storageClassName
    storage: "1Gi"
    workerSelector: {}

For detailed configuration options, refer to CRD Configuration Documentation.
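
Optionally, you can check the manifest for YAML and schema errors on the client side before deploying (a hedged example; a client-side dry run validates structure only, not runtime behavior):

kubectl apply --dry-run=client -f db0.yaml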

4.2 Prepare Namespace for YMatrix Deployment

Run the following command to create a namespace matrixdb-ns for YMatrix deployment:

kubectl create namespace matrixdb-ns

4.3 Deploy the YMatrix Database

With the following prepared:

  • The YMatrix CRD definition file, assumed to be named db0.yaml.
  • The target namespace for deployment, assumed to be matrixdb-ns.

Run:

kubectl apply -f db0.yaml --namespace matrixdb-ns

After successful execution, check the status of the deployed YMatrix cluster db0 with:

kubectl get mxdb --namespace matrixdb-ns
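
You can also watch the individual pods come up (assuming the matrixdb-ns namespace used above); press Ctrl+C once all pods report Running:

kubectl get pods --namespace matrixdb-ns -w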

5 Adjust Cluster Configuration

Before using the cluster, you may want to tune YMatrix parameters for performance optimization.

5.1 Configure Cluster Parameters Using gpconfig

5.1.1 Access the Master Node

gpconfig must be run on the Master node. The Master pod name follows the format <cluster-name>-0-0;
in the deployment above, the Master pod is db0-0-0.

Run:

kubectl exec -it db0-0-0 --namespace matrixdb-ns -- sudo -u mxadmin -i

to open a shell inside the Master pod.

5.1.2 Use gpconfig to Modify Parameters

In the shell, use gpconfig to view and modify configuration parameters.
For example, list all configurable parameters:

gpconfig -l

For more information on gpconfig, see: gpconfig Documentation
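
For example, to inspect and change a single parameter (shared_buffers is used here purely as an illustration; pick values appropriate for the memory limits in your CRD):

# Show the current value on master and segments
gpconfig -s shared_buffers
# Set a new value across the cluster
gpconfig -c shared_buffers -v 128MB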

5.1.3 Restart the Database

After making changes, restart the cluster for the new settings to take effect:

mxstop -a -r -M fast

5.1.4 Exit the Shell

Finally, run exit to exit the shell session on the Master pod.

6 Using the Cluster

6.1 Default Cluster Credentials

The default administrator credentials for the deployed YMatrix are:

Username    Password
mxadmin     changeme

Change the password before production use.
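
One way to change it is a standard ALTER USER statement run from a shell on the Master pod (a sketch; substitute a strong password of your own):

psql -d postgres -c "ALTER USER mxadmin WITH PASSWORD 'new-strong-password';"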

6.2 View Deployed Services

To connect to the database, identify the correct Service.

Run:

kubectl get svc --namespace matrixdb-ns

Sample output:

NAME           TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
db0            ClusterIP   None             <none>        22/TCP     7d22h
db0-cylinder   ClusterIP   172.20.148.232   <none>        4637/TCP   7d22h
db0-gate       ClusterIP   None             <none>        4617/TCP   7d22h
db0-pg         ClusterIP   172.20.216.189   <none>        5432/TCP   7d22h
db0-ui         ClusterIP   172.20.157.163   <none>        8240/TCP   7d22h

Here, db0-pg is the service for connecting to the database. db0-ui is for accessing the UI.

6.3 Connect Using PostgreSQL 12 psql Client (Replace Sample IP with Your Actual IP)

6.3.1 Direct Access to db0-pg Service

In environments where ClusterIP is accessible, run:

psql -h 172.20.216.189 -p 5432 -U mxadmin

to connect to the database. (Enter the password for mxadmin when prompted.)

6.3.2 Use port-forward

Use kubectl port-forward for temporary access to the service.
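
For example, forward the db0-pg service to a local port and connect through it (a temporary, per-session tunnel; adjust the service name and namespace to your deployment):

kubectl port-forward svc/db0-pg 5432:5432 --namespace matrixdb-ns
# In another terminal:
psql -h 127.0.0.1 -p 5432 -U mxadmin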

6.4 Access YMatrix UI (Replace Sample IP with Your Actual IP)

6.4.1 Direct Access to db0-ui Service

In environments with ClusterIP access, open http://172.20.157.163:8240 in a browser.
Log in with the mxadmin user password.

6.4.2 Use proxy

Use a proxy created via kubectl proxy for temporary access.
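
For example, using the standard Kubernetes API service proxy path (a sketch; adjust the namespace and service name to your deployment):

kubectl proxy
# Then open in a browser:
# http://127.0.0.1:8001/api/v1/namespaces/matrixdb-ns/services/db0-ui:8240/proxy/

If the UI does not render correctly behind the API proxy, kubectl port-forward svc/db0-ui 8240:8240 --namespace matrixdb-ns is an alternative.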

6.4.3 Import Data Using Kafka in UI

See: Import Data Using Kafka

6.5 Connect Database Applications to the Cluster

Database applications connect through the same db0-pg service endpoint used by psql above, via standard PostgreSQL client drivers.
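
For example, a libpq- or JDBC-based application can use a standard PostgreSQL connection string pointing at the db0-pg service (the IP below is the sample ClusterIP from above; a port-forwarded address works equally well):

postgresql://mxadmin:<password>@172.20.216.189:5432/postgres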

7 Cluster Management

7.1 Obtain Shell Access to Master

Currently, matrixdb-operator deploys the YMatrix cluster as multiple StatefulSets.
For example, a cluster db0 may appear as:

$ kubectl get statefulset
NAME       READY   AGE
db0-0      2/2     9d
db0-1      2/2     9d
db0-2      2/2     9d
db0-gate   1/1     9d

Here, each db0-{StatefulSet index} StatefulSet is part of the database cluster.

The StatefulSet with index 0 deploys the master segment.
Its db0-0-{replica index} pods run the master (the pod with replica index 0) and the standby (the pod with replica index 1).

StatefulSets with non-zero indices deploy data segments.
Their db0-{StatefulSet index}-{replica index} pods host the segment's primary (replica index 0) and mirror (replica index 1).

Most gp* and mx* management tools must be run on the master.

Run the following command to get a shell on the pod running master (as user mxadmin):

kubectl exec -n matrixdb-ns -it db0-0-0 -- sudo -u mxadmin -i

7.2 Data PVs

Data stored in the database resides in Persistent Volumes (PVs).
These PVs are provisioned via PersistentVolumeClaims (PVCs) created by matrixdb-operator.

Example:

$ kubectl get pvc
NAME                  STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
db0-data-db0-0-0      Bound    pvc-7d7fdf0c-0922-4d2a-9cdd-f72ce9cd8441   1Gi        RWO            gp2            9d
db0-data-db0-0-1      Bound    pvc-966dbad2-f7b0-4af4-b7b1-62073492833d   1Gi        RWO            gp2            9d
db0-data-db0-1-0      Bound    pvc-eaa0bd6f-d9bc-4578-aeec-4375a86d6c21   1Gi        RWO            gp2            9d
db0-data-db0-1-1      Bound    pvc-fac53281-9ffd-423e-ba71-0e3d381b3cc8   1Gi        RWO            gp2            9d
db0-data-db0-2-0      Bound    pvc-a0ddc01a-6cc7-4640-8b8e-1471ccc5a8ab   1Gi        RWO            gp2            9d
db0-data-db0-2-1      Bound    pvc-b5ee24f9-d7e6-4448-8c54-ebdb60488bcb   1Gi        RWO            gp2            9d
db0-data-db0-gate-0   Bound    pvc-ad685e25-8f18-4d30-9227-a3868bb19f90   1Gi        RWO            gp2            9d

These PVs are mounted under the /mxdata directory in their respective pods.
Lifecycle management of these PVCs and PVs must be handled manually by the database administrator.
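
To see how much of a data PV is in use, you can inspect the /mxdata mount from inside the corresponding pod (assuming the matrixdb-ns namespace used above):

kubectl exec -n matrixdb-ns db0-0-0 -- df -h /mxdata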

8 Troubleshooting

8.1 How to Obtain the YMatrix Database Image?

Community edition images are hosted on DockerHub and can be pulled directly.
For enterprise edition images, contact us to obtain the download link.
You can then use docker or nerdctl to import the image into your private container registry.

Assume you have downloaded the image matrixdb-v5.0.0.enterprise-v0.12.0.tar.gz.

Run the following commands to load the image into your local Docker image store, retag it, and push it to the image registry used by your Kubernetes cluster:

# Load the image archive into the local Docker image store
gunzip -c matrixdb-v5.0.0.enterprise-v0.12.0.tar.gz | docker load
# ... output omitted ...
# Loaded image: matrixdb/matrixdb-enterprise:v5.0.0.enterprise-v0.12.0

# Retag the image
docker tag matrixdb/matrixdb-enterprise:v5.0.0.enterprise-v0.12.0 \
           <your-image-repo>/matrixdb-enterprise:v5.0.0.enterprise-v0.12.0

# Push the tagged image to your registry
docker push <your-image-repo>/matrixdb-enterprise:v5.0.0.enterprise-v0.12.0

# You can now reference this custom repo and tag in your CRD during deployment