Values, strings, dates and time

This document mainly introduces common data types supported by YMatrix.

There are mainly three types:

  • Numerical
  • String type
  • Date and time type

1 Value

The numeric types supported by YMatrix are mainly divided into 4 categories:

  • Integer. Different integer data have different value ranges, all with positive and negative symbols:
    1. smallint
    2. int
    3. bigint
  • Fixed points. Data used to represent fixed decimal places:
    1. decimal
  • Floating point number. Data used to represent the number of decimal places that are not fixed:
    1. real
    2. double precision
  • Self-increasing type. Usually used for self-increment ID columns:
    1. serial
    2. bigserial

The specific information is as follows:

Type Alias *Size** Scope Description
smallint int2 2 bytes -32768 ~ 32767 Small range integer
int int4 4 bytes -2147483648 ~ 2147483647 Integer Type
bigint int8 8 bytes -9223372036854775808 ~ 9223372036854775807 Large range integer
serial serial4 4 bytes 1 ~ 2147483647 Auto-incremental integer
bigserial serial8 8 bytes 1 ~ 9223372036854775807 Large range self-increasing integer
real float4 4 bytes 6 bit precision variable precision
double precision float8 8 bytes 15 bit precision variable precision
decimal numeric variable unlimited fixed precision

2 strings

There are 2 main types of strings supported by YMatrix:

  • Fixed length: char
  • Get longer
    1. varchar: There is a length limit
    2. text: No length limit

The specific information is as follows:

Type Alias *Size** Scope Description
character [ (n) ] char [ (n) ] 1 byte + n Longest n characters string Fixed length string, unspecified characters are filled with spaces
character varying [ (n) ] varchar [ (n) ] 1 byte + string size string with a maximum n characters variable length string with length limit, upper limit 10485760 bytes
text 1 byte + string length any length string variable length string without length limit

3 Date and time

There are three main types of date and time in YMatrix:

  • date. It can only represent year-month-day, and cannot represent time:

    1. date
  • time. It can only represent time, but cannot represent date:

    1. time [without time zone] (without time zone)
    2. time with time zone (with time zone)
  • Date and time. It can represent both date and time:

    1. timestamp [without time zone] (without time zone)
    2. timestamp with time zone (with time zone)

The specific information is as follows:

Type Alias *Size** Scope Description
date 4 bytes 4713 BC ~ 294,277 AD Calendar date (year, month, day)
time [ (p) ] [ without time zone ] 8 bytes 00:00:00[.0000000] ~ 24:00:00[.0000000] Time in one day
time [ (p) ] with time zone timetz 12 bytes 00:00:00+1359 ~ 24:00:00-1359 Time within one day, with time zone
timestamp [ (p) ] [ without time zone ] 8 bytes 4713 BC ~ 294,277 AD Date and time
timestamp [ (p) ] with time zone timestamptz 8 bytes 4713 BC ~ 294,277 AD Date and time, with time zone

3.1 The impact of time zone on time type

You can see a special message from the time type: time zone, real-time zone. By default, without time zone, that is, no time zone. So what is the time zone and what impact does it have on the time type?
You will understand it through the process of creating a test table.

=# CREATE TABLE test (
    c1 int,
    c2 timestamp,
    c3 timestamp with time zone
   )
   USING MARS3
   DISTRIBUTED BY (c1)
   ORDER BY (c1,c2);

In addition to the distribution key, the test table contains two timestamp columns, one without time zone by default and the other with time zone. Insert test data and query:

=# INSERT INTO test VALUES(1, now(), now());

=# SELECT * FROM test;
 c1 |            c2            |             c3
----+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  1 | 2021-12-01 14:54:09.4783 | 2021-12-01 14:54:09.4783+08
(1 row)

=# SHOW timezone;
   TimeZone
------------------------
 Asia/Shanghai
(1 row)

From the query results, we can see that the time displayed by the two timestamp columns is the same. There is a +08 behind the column with time zone, indicating that the time zone to which the timestamp belongs is "East Eight District". Then, through the SHOW timezone command, I saw that the current time zone is 'Asia / Shanghai', that is, Beijing time.

Let's modify the time zone and view the results:

=# SET timezone TO 'Japan';

=# SELECT * FROM test;
 c1 |            c2            |             c3
----+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  1 | 2021-12-01 14:54:09.4783 | 2021-12-01 15:54:09.4783+09
(1 row)

It can be seen that after changing the database time zone to Japan, the time displayed in the c3 column with time zone is different from that of c2, and the subsequent time zone information becomes +09, that is, "East Jiu District".

In summary, the time stamp column with time zone stores time zone information and will display different times in different time zones; while the time stamp column without time zones displays the same time in any time zone. When the device is deployed in a region of different time zones, the timestamp must have time zone information. And whether the timestamp type has a time zone occupies the same storage space, why not?

3.2 Best practices for common data types in timing scenarios

Timing scene data is divided into the following categories:

  • Timestamp: Indicates the time of the metric collection.
  • Device ID: A unique identifier for each device.
  • Collection indicators: Time series data indicators collected from each device.

The specific information is as follows:

Type Description Best Practice
Timestamp indicates the time of the index collection. Including date and specific time timestamp with time zone
Device ID String ID varchar / text
Equipment ID Equipment number information of fixed length, such as serial number char
Device Identification Create a device table separately to uniquely identify the device with self-increment ID serial / bigserial
Acquisition index Acquisition result is an integer smallint/int/bigint
Acquisition index Collection results are decimal, and the number of digits with the exact decimal point is fixed, and the value range is relatively fixed, such as temperature decimal
Acquisition index The acquisition result has a decimal number, and the number of digits with the exact decimal point is not fixed, and the value range is large float / double precision

Notes!
In addition to commonly used data types, YMatrix also supports expandable data types JSON, JSONB, MXKV and Spatial Data Type.