File access refers to importing existing text data files into the target data table. Text files are usually in CSV format.
Create a CSV file:
$ vi rows.csv
The test CSV file has the following format. The first line is the header containing the field names, followed by the data rows. The three columns are a timestamp, an integer, and a string:
time,c1,c2
2021-01-01 00:00:00,1,a1
2021-01-01 00:00:00,2,a2
2021-01-01 00:00:00,3,a3
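As an alternative to typing the content in vi, the same file can be written non-interactively. This is a minimal sketch that assumes a POSIX shell and writes rows.csv to the current directory:

```shell
# Write the sample file without opening an editor.
# Quoting 'EOF' keeps the shell from expanding anything inside the heredoc.
cat > rows.csv <<'EOF'
time,c1,c2
2021-01-01 00:00:00,1,a1
2021-01-01 00:00:00,2,a2
2021-01-01 00:00:00,3,a3
EOF

# Print the file to confirm the header line and the three data rows.
cat rows.csv
```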
The target database is test, and the target table dest has the following schema, which matches the CSV file format:
=# CREATE TABLE dest(
time timestamp,
c1 int,
c2 text
)USING MARS3
DISTRIBUTED BY(c1)
ORDER BY(time,c1);
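Because the table has three columns, it can help to confirm before importing that every line of the file has exactly three comma-separated fields. The following is a minimal sketch: the helper name check_field_count is hypothetical (it is not part of YMatrix), and the check splits naively on commas, so it does not handle quoted fields that contain commas.

```shell
# check_field_count FILE N
# Succeeds only if every line of FILE has exactly N comma-separated fields.
# Hypothetical helper for illustration; naive split, no CSV quote handling.
check_field_count() {
  awk -F',' -v n="$2" '
    NF != n { printf "line %d: %d fields, expected %d\n", NR, NF, n; bad = 1 }
    END     { exit bad }
  ' "$1"
}

# Recreate the sample file so the snippet runs on its own.
printf '%s\n' \
  'time,c1,c2' \
  '2021-01-01 00:00:00,1,a1' \
  '2021-01-01 00:00:00,2,a2' \
  '2021-01-01 00:00:00,3,a3' > rows.csv

if check_field_count rows.csv 3; then
  echo "rows.csv: all lines have 3 fields"
fi
```

If any line has a different field count, the function prints the offending line number and returns a non-zero status, so it can gate an import step in a script.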
The following are several commonly used methods to import file content into the target table:
COPY is a SQL command built into YMatrix that imports data files located on the master node (Master) into the target table.
First use psql to connect to the target database, and then execute the COPY command.
[mxadmin@mdw ~]$ psql test
psql (12)
Type "help" for help.
test=# COPY dest FROM '/home/mxadmin/rows.csv' DELIMITER ',' HEADER;
COPY 3
The DELIMITER parameter specifies the field delimiter; HEADER indicates that the file has a header line, which is skipped when importing data.
The COPY method is relatively simple, but it has the following problems:
The MatrixGate method imports data with mxgate, the high-speed data ingestion tool provided by YMatrix.
Compared with COPY, mxgate performs better when the data volume is large, and the data files and the Master can be deployed on separate hosts.
$ tail -n +2 rows.csv | mxgate --source stdin --db-database test --db-master-host localhost --db-master-port 5432 --db-user mxadmin --time-format raw --target public.dest --parallel 2 --delimiter ','
The above command sends the data portion of rows.csv to mxgate through a pipe. mxgate reads the data from stdin and loads it into the dest table of the test database.
Notes!
Because mxgate does not expect a file header in the incoming data, use the tail -n +2 command to output the file starting from the second line.
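To see exactly what mxgate receives on stdin, the first half of the pipeline can be run on its own. This is a minimal sketch using the sample file from earlier, recreated here so that the snippet is self-contained; the variable name data_rows is only illustrative.

```shell
# Recreate the sample file so the snippet runs on its own.
printf '%s\n' \
  'time,c1,c2' \
  '2021-01-01 00:00:00,1,a1' \
  '2021-01-01 00:00:00,2,a2' \
  '2021-01-01 00:00:00,3,a3' > rows.csv

# tail -n +2 starts output at line 2, so the header line is dropped.
tail -n +2 rows.csv

# Count the data rows that would be piped to mxgate.
data_rows=$(tail -n +2 rows.csv | wc -l | tr -d ' ')
echo "data rows: $data_rows"
```

Comparing this count with the number of rows in the target table after the import is a simple way to confirm that nothing was dropped.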
For more information on using mxgate, please refer to the mxgate documentation.