YMatrix
Quick Start
Standard Cluster Deployment
Data Modeling
Connecting to The database
Data Migration
Data Query
Maintenance and Monitoring
Performance Tuning
Troubleshooting
Reference Guide
SQL Reference
FAQ
File access refers to importing existing text data files into the target data table. Text files are usually in CSV format.
The test CSV file format is as follows. The first behavior is HEADER, followed by the data row. The types of the 3 columns are timestamps, integers and strings:
time,c1,c2
2021-01-01 00:00:00,1,a1
2021-01-01 00:00:00,2,a2
2021-01-01 00:00:00,3,a3
The target library is test
, and the schema of the target table dest
is as follows, which is consistent with the CSV file format:
CREATE TABLE dest(
time timestamp,
c1 int,
c2 text
)DISTRIBUTED BY(c1);
The following are several commonly used methods to import file content into the target table:
COPY
is a SQL command that comes with YMatrix, which can import data files on the master node (Master) into the target table.
First use psql to connect to the target library, and then execute the COPY command.
[mxadmin@mdw ~]$ psql test
psql (12)
Type "help" for help.
test=# COPY dest FROM '/home/mxadmin/rows.csv' DELIMITER ',' HEADER;
COPY 3
The DELIMITER parameter specifies a delimiter; HEADER means filtering file headers when importing data.
The method of COPY
is relatively simple, but there are the following problems:
The MatrixGate method is to use the high-speed data access tool mxgate
provided by YMatrix for data import.
Compared with COPY
, mxgate
performs better when the data volume is large, and data files and master can also be deployed separately.
tail -n +2 rows.csv | mxgate --source stdin --db-database test --db-master-host localhost --db-master-port 5432 --db-user mxadmin --time-format raw --target public.dest --parallel 2 --delimiter ','
The above command redirects the data part of rows.csv to mxgate through pipeline. mxgate accesses data into the dest
table of the test
library in the test
method.
Notes!
Because mxgate does not require file headers when accessing data, use thetail -n +2
command to output from the second line.
For more usage methods of mxgate, please refer to Document