Numeric, String, Date, and Time

This document describes the commonly used data types supported by YMatrix.

The main types covered are:

  • Numeric
  • String
  • Date and time

1 Numeric Types

YMatrix supports four categories of numeric types:

  • Integers: Integer types with different ranges, all signed (support positive and negative values):

    1. smallint
    2. int
    3. bigint
  • Fixed-point numbers: Used for values with a fixed number of decimal places:

    1. decimal
  • Floating-point numbers: Used for values with variable precision:

    1. real
    2. double precision
  • Auto-increment types: Typically used for auto-incrementing ID columns:

    1. serial
    2. bigserial

Detailed information is as follows:

Type Alias Size Range Description
smallint int2 2 bytes -32768 ~ 32767 Small-range integer
int int4 4 bytes -2147483648 ~ 2147483647 Standard integer
bigint int8 8 bytes -9223372036854775808 ~ 9223372036854775807 Large-range integer
serial serial4 4 bytes 1 ~ 2147483647 Auto-incrementing integer
bigserial serial8 8 bytes 1 ~ 9223372036854775807 Large-range auto-incrementing integer
real float4 4 bytes 6-digit precision Variable precision
double precision float8 8 bytes 15-digit precision Variable precision
decimal numeric Variable No limit Fixed precision

2 String Types

YMatrix supports two main string types:

  • Fixed-length: char
  • Variable-length:
    1. varchar: Length-limited
    2. text: Unlimited length

Detailed information is as follows:

Type Alias Size Range Description
character [ (n) ] char [ (n) ] 1 byte + n Up to n characters Fixed-length string; padded with spaces if not fully filled
character varying [ (n) ] varchar [ (n) ] 1 byte + string size Up to n characters Variable-length string with limit; maximum 10,485,760 bytes
text 1 byte + string length Any length Variable-length string with no limit

3 Date and Time Types

YMatrix supports three main date and time types:

  • Date: Represents only the calendar date (year, month, day), without time:

    1. date
  • Time: Represents only the time of day, without date:

    1. time [without time zone] (without time zone)
    2. time with time zone (with time zone)
  • Timestamp: Represents both date and time:

    1. timestamp [without time zone] (without time zone)
    2. timestamp with time zone (with time zone)

Detailed information is as follows:

Type Alias Size Range Description
date 4 bytes 4713 BC ~ 294,277 AD Calendar date (year, month, day)
time [ (p) ] [ without time zone ] 8 bytes 00:00:00[.000000] ~ 24:00:00[.000000] Time of day
time [ (p) ] with time zone timetz 12 bytes 00:00:00+1359 ~ 24:00:00-1359 Time of day with time zone
timestamp [ (p) ] [ without time zone ] 8 bytes 4713 BC ~ 294,277 AD Date and time
timestamp [ (p) ] with time zone timestamptz 8 bytes 4713 BC ~ 294,277 AD Date and time with time zone

3.1 Impact of Time Zone on Time Types

A notable feature in time types is time zone. By default, time types are defined as without time zone. What is a time zone, and how does it affect time data?

You can understand this by creating a test table.

=# CREATE TABLE test (
    c1 int,
    c2 timestamp,
    c3 timestamp with time zone
   )
   USING MARS3
   DISTRIBUTED BY (c1)
   ORDER BY (c1,c2);

The test table includes two timestamp columns: one without time zone (c2) and one with time zone (c3). Insert test data and query:

=# INSERT INTO test VALUES(1, now(), now());

=# SELECT * FROM test;
 c1 |            c2            |             c3
----+--------------------------+-----------------------------
  1 | 2021-12-01 14:54:09.4783 | 2021-12-01 14:54:09.4783+08
(1 row)

=# SHOW timezone;
   TimeZone
---------------
 Asia/Shanghai
(1 row)

The query result shows that both timestamps display the same time. However, the c3 column (with time zone) includes +08, indicating it belongs to UTC+8 (China Standard Time). The SHOW timezone command confirms the current session time zone is set to Asia/Shanghai (Beijing time).

Now, change the time zone and requery:

=# SET timezone TO 'Japan';

=# SELECT * FROM test;
 c1 |            c2            |             c3
----+--------------------------+-----------------------------
  1 | 2021-12-01 14:54:09.4783 | 2021-12-01 15:54:09.4783+09
(1 row)

After changing the time zone to Japan, the value in c3 (with time zone) is now displayed as 15:54:09 with +09, reflecting UTC+9. The c2 column (without time zone) remains unchanged.

In summary:

  • Columns of type timestamp with time zone store time zone-aware timestamps and adjust display based on the session time zone.
  • Columns of type timestamp without time zone store raw timestamps without time zone information and display the same value regardless of session time zone.
  • When systems are deployed across multiple time zones, using timestamp with time zone is recommended.
  • Both timestamp variants use the same storage size. Using time zone support is therefore advantageous with no additional cost.

3.2 Best Practices for Data Types in Time Series Scenarios

Time series data typically falls into the following categories:

  • Timestamp: The time when a metric is collected.
  • Device identifier: A unique identifier for each device.
  • Collected metrics: Time series data collected from devices.

Recommended data types:

Type Description Best Practice
Timestamp Collection time, including date and time timestamp with time zone
Device identifier String-based identifier varchar / text
Device identifier Fixed-length device ID, such as serial number char
Device identifier Use a separate device table with auto-incrementing ID serial / bigserial
Collected metric Integer-valued measurements smallint / int / bigint
Collected metric Decimal values with fixed precision and limited range (e.g., temperature) decimal
Collected metric Decimal values with variable precision and large range float / double precision

Note!
In addition to the common data types described above, YMatrix supports extensible data types such as JSON, JSONB, MXKV and spatial data types.