Time-Series Data Modeling Example for Connected Vehicles

This document is the third in the "Time-Series Data Modeling" section. YMatrix believes that data model design directly impacts the value of data consumption and usage. Therefore, beyond technical explanations, this entire section aims to provide you with a clear understanding of time-series data models—their concepts, applications, and evolution.

  • The first article, "What Is a Time-Series Data Model?", answers a series of progressively deeper questions to help you fully grasp the concept of time-series data modeling.
  • The second, "Approaches to Time-Series Modeling," provides theoretical guidance on YMatrix relational model design principles.
  • The third and fourth articles present data modeling examples for connected vehicles (this document) and smart home scenarios. These demonstrate best practices for different time-series use cases in YMatrix, guided by the principles from "Approaches to Time-Series Modeling."

Note!
This document serves only as a reference example. Before formal modeling design, we recommend thoroughly reading the YMatrix Architecture and component principles. Refer to the "Reference Guide" section for details on component mechanisms.

1 What Are Connected Vehicles?

Connected vehicles are a typical application of the Internet of Things (IoT). According to Hong Kong's Applied Science and Technology Research Institute, connected vehicle technology refers to low-latency communication systems between vehicles, pedestrians, road infrastructure, and the cloud. Through real-time information exchange within such systems, coordination among drivers, vehicles, and infrastructure is achieved. This enables timely traffic updates and warnings to road users, enhancing safety and supporting driving assistance. Additionally, connected vehicle technology supports real-time traffic monitoring, incident management, and route planning to improve traffic efficiency. In the long term, it can support autonomous driving by helping detect hidden risks and improving overall road safety.

The main categories of connected vehicle communication are shown in the table below:

Communication Endpoint Technical Implementation Application Scenarios
Vehicle-to-Vehicle (V2V) Information exchange and sharing between vehicles, including position, speed, and other status data Assessing traffic flow conditions
Vehicle-to-Pedestrian (V2P) Users communicate with vehicles via Wi-Fi, Bluetooth, or cellular networks Enables users to monitor and control vehicles using mobile devices
Vehicle-to-Infrastructure (V2I) Information exchange between vehicles and roads via fixed roadside communication facilities Monitoring road surface conditions and guiding optimal vehicle routes
Vehicle-to-Cloud (V2C) Vehicles transmit data to cloud platforms via satellite or cellular networks Receiving control commands from platforms; real-time data sharing in the cloud
In-Vehicle Devices Data transmission among internal vehicle components Real-time monitoring and control of in-vehicle devices, enabling digital in-car control systems

2 Best Practices for Connected Vehicle Data Modeling

Imagine you are a vehicle-side application developer responsible for managing your company’s vehicle metrics platform A. You want to design a data model for this platform in YMatrix. Your approach might follow these steps:

Step Description
1 Requirements Gathering
2 Modeling Design and Implementation
3 Model Testing

2.1 Requirements Gathering

  • Characteristics of vehicle metrics platform A:

    • Massive data volume: at the PB scale.
    • Large number of devices: potentially thousands of sensors simultaneously collecting and transmitting data.
    • Vehicle metrics: mostly fixed, but must allow dynamic expansion.
    • Metric data types:
      1. Integer (int) metrics: e.g., battery level represented as integers from 0 to 100.
      2. Floating-point (float) metrics: e.g., cumulative mileage.
      3. Boolean (boolean) metrics: e.g., left-turn signal on/off status.
      4. Text (text) metrics: e.g., unique device identifier.
  • Query patterns of vehicle metrics platform A:

    • Primarily point queries and detailed record lookups based on device.

Note!
In this scenario, "device" refers to individual sensors on each vehicle. Each sensor is considered one "device".

  • Complete data flow:

Vehicle-side precomputed sensor data / telemetry → Cloud → YMatrix → Monitoring dashboard

Conclusion
After thorough preliminary research, we conclude that the metrics for vehicle platform A are largely deterministic. However, a dynamic column should be included to accommodate future or changing metrics. Given the large data volume and high number of devices, a structured + semi-structured wide-table variant model is recommended. Use the MARS3 storage engine, which offers strong compression performance, as the foundation for table creation.

2.2 Modeling Design and Implementation

2.2.1 Table Structure

Based on the requirements analysis, the following table structure is proposed:

Use the MXKV data type for dynamic columns. Creating this type requires installing an extension first.

=# CREATE EXTENSION matrixts;
=# CREATE TABLE V2X ( 
    ts timestamp with time zone, 
    device_id text,
    vehicle_type text,
    longitude float,
    latitude float,
    altitude float,
    speed float,
    left_turn_signal boolean,
    right_turn_signal boolean,
    emergency_flashers boolean,
    power int,
    gas int,
    windshield_wiper boolean,
    mileage float,
    signal_strength text,
    power_mode text,
    control_mode text,
    charging_status text,
    mxkv mxkv_int4
)
USING MARS3
DISTRIBUTED BY (device_id,vehicle_type)
ORDER BY (device_id,ts,vehicle_type)
PARTITION BY range(ts)
(
 START ('2023-01-15') INCLUSIVE 
 END ('2023-01-30') EXCLUSIVE 
 EVERY (interval '1 day'), 
 DEFAULT PARTITION default_p
);

In this example:

  • Label columns: device_id, vehicle_type.
  • Metric columns: longitude, latitude, altitude, speed, left_turn_signal, right_turn_signal, emergency_flashers, power, gas, windshield_wiper, mileage, signal_strength, power_mode, control_mode, charging_status — fifteen in total.
  • Structured data columns listed above; semi-structured data stored in the mxkv column for dynamically extended metrics.
  • Supported metric types: int, float, text, boolean.
  • Distribution key: (device_id, vehicle_type).
  • Sort key: (device_id, ts, vehicle_type).
  • Partition key: ts.
  • One partition per day from 2023-01-15 to 2023-01-30.

2.2.2 Model Testing

Step Test Plan SQL Statement
1 View latest 10 records SELECT * FROM <表名> ORDER BY ts DESC LIMIT 10
2 Query total row count SELECT COUNT(*) FROM <表名>
3 Retrieve all latest metrics for a specific device SELECT * FROM <表名> WHERE <设备标签列名> = '<设备标签值>' ORDER BY <时间戳列名> DESC LIMIT 1
...

Insert 100 test rows for structured metrics.

=# INSERT INTO V2X (ts, device_id, vehicle_type, longitude, latitude, altitude, speed, left_turn_signal, right_turn_signal, emergency_flashers, power, gas, windshield_wiper, mileage, signal_strength, power_mode, control_mode, charging_status)
SELECT 
    timestamp '2023-01-15 00:00:00+00' + (random() * (timestamp '2023-01-30 00:00:00+00' - timestamp '2023-01-15 00:00:00+00')) AS ts, 
    'device_' || (random() * 100 + 1)::int AS device_id, 
    CASE (random() * 3)::int 
        WHEN 0 THEN 'car'
        WHEN 1 THEN 'truck'
        WHEN 2 THEN 'motorcycle'
        ELSE 'others'
    END AS vehicle_type, 
    -180 + random() * 360 AS longitude, 
    -90 + random() * 180 AS latitude, 
    random() * 500 AS altitude, 
    random() * 100 AS speed, 
    random() < 0.5 AS left_turn_signal, 
    random() < 0.5 AS right_turn_signal, 
    random() < 0.5 AS emergency_flashers, 
    random() * 100 AS power, 
    random() * 100 AS gas, 
    random() < 0.5 AS windshield_wiper, 
    random() * 10000 AS mileage, 
    CASE (random() * 3)::int 
        WHEN 0 THEN 'low'
        WHEN 1 THEN 'medium'
        WHEN 2 THEN 'high'
        ELSE 'others'
    END AS signal_strength, 
    CASE (random() * 3)::int 
        WHEN 0 THEN 'normal'
        WHEN 1 THEN 'eco'
        WHEN 2 THEN 'sport'
        ELSE 'others'
    END AS power_mode, 
    CASE (random() * 3)::int 
        WHEN 0 THEN 'auto'
        WHEN 1 THEN 'manual'
        ELSE 'others'
    END AS control_mode, 
    CASE (random() * 3)::int 
        WHEN 0 THEN 'not_charging'
        WHEN 1 THEN 'charging'
        WHEN 2 THEN 'charged'
        ELSE 'others'
    END AS charging_status
FROM generate_series(1, 100);

Test results:

  1. View latest 10 records
=# SELECT * FROM V2X ORDER BY ts DESC LIMIT 10;
              ts               | device_id  | vehicle_type |      longitude      |      latitude      |      altitude      |       speed        | left_turn_signal | right_turn_signal | emergency_flashers | power | gas | 
windshield_wiper |      mileage       | signal_strength | power_mode | control_mode | charging_status | mxkv 
-------------------------------+------------+--------------+---------------------+--------------------+--------------------+--------------------+------------------+-------------------+--------------------+-------+-----+-
-----------------+--------------------+-----------------+------------+--------------+-----------------+------
 2023-01-29 18:26:28.2555+00   | device_7   | others       |    77.7250493925452 | -71.44711665175421 |  470.3520368782552 |  10.08919814806859 | t                | f                 | f                  |    91 |  60 | 
f                | 3335.6115546297715 | medium          | others     | manual       | not_charging    | 
 2023-01-29 09:31:03.475343+00 | device_10  | truck        |  110.79930700971727 |  80.10310285177269 |  363.9362759505627 | 0.8470622257185312 | t                | t                 | f                  |    68 |  59 | 
f                |  8477.632404196207 | high            | normal     | others       | charged         | 
 2023-01-29 08:13:59.117597+00 | device_99  | truck        | -109.73747608981242 | -89.35249080260903 |  48.85333218970267 | 37.727205912382544 | t                | f                 | t                  |    31 |  36 | 
t                |  7473.970943404033 | high            | eco        | others       | charging        | 
 2023-01-29 05:58:26.571911+00 | device_46  | others       |    51.8998425853313 |  88.27679475399947 | 480.07287781433484 | 49.226744573126524 | f                | t                 | f                  |    42 |   4 | 
t                |  7143.216304908968 | high            | sport      | others       | not_charging    | 
 2023-01-29 04:15:40.989367+00 | device_40  | motorcycle   | -155.34683651071745 |  6.175953177055931 |  181.5449423808442 |  50.05742613215105 | f                | f                 | t                  |    85 |  87 | 
f                |  3206.801779860875 | low             | normal     | others       | charging        | 
 2023-01-29 00:00:25.771633+00 | device_100 | truck        |  151.84348070359067 |  61.94081118610583 |  79.71495904976855 |  37.50073341871314 | f                | f                 | f                  |    12 |  63 | 
f                | 3344.5220642645168 | others          | sport      | manual       | not_charging    | 
 2023-01-28 20:35:29.229556+00 | device_29  | motorcycle   |   -90.3624239076261 | -9.007833078214986 | 62.318822164742826 |  20.92006269193405 | t                | t                 | t                  |    27 |  97 | 
f                |  9388.986611159318 | others          | eco        | manual       | charging        | 
 2023-01-28 13:41:36.964772+00 | device_80  | car          | -18.009506474683548 |  64.13247721331942 |  452.8918234900896 | 19.506606969701323 | f                | t                 | t                  |    43 |  62 | 
t                |  9398.868263826578 | low             | normal     | manual       | charging        | 
 2023-01-28 11:12:06.491849+00 | device_55  | car          |     88.971600816884 | -10.29589404625213 |  372.6116487798965 |  68.00809686681362 | f                | f                 | t                  |    21 |  76 | 
f                |  430.6571575941476 | others          | others     | others       | charging        | 
 2023-01-28 09:24:02.536764+00 | device_5   | car          | -5.8755335883314785 | 14.557768126530704 | 351.21329889642715 |  9.202603983735003 | t                | t                 | f                  |     5 |  16 | 
t                | 1403.2798564383952 | others          | sport      | others       | others          | 
(10 rows)
  1. Query total row count
=# SELECT COUNT(*) FROM V2X;
 count
-------
   100
(1 row)
  1. Query latest non-null value of a specific metric for a given device

Retrieve the latest speed for device 5.

=# SELECT last_not_null(speed,ts) AS last_speed FROM V2X WHERE device_id = 'device_10' LIMIT 1;
                       last_speed                       
--------------------------------------------------------
 ["0.8470622257185312","2023-01-29 09:31:03.475343+00"]

Add a new MXKV-type metric car_length to the dynamic column mxkv, with data type mxkv_int4 and value 350 (unit: cm). Before insertion, to optimize compression and query performance, use the UDF mxkv_import_keys() to define the key in advance and declare the set of keys expected in the data.

=# SELECT mxkv_import_keys('{"car_length": 350}');
 mxkv_import_keys
------------------
 car_length
(1 row)

Insert the data.

=# INSERT INTO V2X(mxkv) values('{"car_length":350}');
INSERT 0 1

Query the inserted key-value pair.

=# SELECT mxkv->'car_length' as car_length FROM V2X;

Note!
For more details on MXKV usage, see Extended Data Types.

Note!
Only basic test examples are provided here. For more testing scenarios, refer to the Getting Started section in the graphical interface. In production environments, design test queries according to actual query requirements.