Quick onboard
Deployment
Data Modeling
Connecting
Migration
Query
Operations and Maintenance
Common Maintenance
Partition
Backup and Restore
Expansion
Mirroring
Resource Management
Security
Monitoring
Performance Tuning
Troubleshooting
Reference Guide
Tool guide
Data type
Storage Engine
Executor
Stream
DR (Disaster Recovery)
Configuration
Index
Extension
SQL Reference
File ingestion refers to importing existing text data files into a target table. Text files are typically in CSV format.
Create a CSV file:
$ vi rows.csv
The test CSV file has the following format. The first row contains column names, and subsequent rows contain data. The three columns are of timestamp, integer, and string types respectively:
time,c1,c2
2021-01-01 00:00:00,1,a1
2021-01-01 00:00:00,2,a2
2021-01-01 00:00:00,3,a3
The target database is test, and the schema of the target table dest matches the CSV file format:
=# CREATE TABLE dest(
time timestamp,
c1 int,
c2 text
)USING MARS3
DISTRIBUTED BY(c1)
ORDER BY(time,c1);
Below are several common methods for importing file contents into the target table:
COPY is a built-in SQL command in YMatrix that imports data files located on the master node (Master) into the target table.
First, connect to the target database using psql, then execute the COPY command:
[mxadmin@mdw ~]$ psql test
psql (12)
Type "help" for help.
test=# COPY dest FROM '/home/mxadmin/rows.csv' DELIMITER ',' HEADER;
COPY 3
The DELIMITER parameter specifies the field delimiter. The HEADER option indicates that the header line should be skipped during import.
The COPY method is simple to use, but has the following limitations:
MatrixGate uses YMatrix's high-speed data ingestion tool mxgate to import data.
Compared to COPY, mxgate offers better performance for large volumes of data and allows deployment of data files independent from the Master node.
$ tail -n +2 rows.csv | mxgate --source stdin --db-database test --db-master-host localhost --db-master-port 5432 --db-user mxadmin --time-format raw --target public.dest --parallel 2 --delimiter ','
The above command uses a pipe to redirect the data portion of rows.csv (starting from the second line) into mxgate. mxgate ingests the data from stdin into the dest table in the test database.
Note!
Sincemxgatedoes not require a file header during ingestion, thetail -n +2command is used to output the file content starting from the second line.
For more information about mxgate, refer to the documentation.