Example of Data Modeling in a Smart Home Scenario

This document is the fourth in the "Time-series Data Modeling" section. YMatrix believes that data model design directly impacts the value of data consumption and usage. Therefore, beyond technical explanations, this entire section aims to give you a clear understanding of time-series data models—their concepts, applications, and evolution.

  • The first article, "What Is a Time-series Data Model?", answers a series of progressively deeper questions to help you clearly understand the concept of time-series data modeling.
  • The second article, "Time-series Modeling Approach", provides theoretical guidance on YMatrix relational model design principles.
  • The third and fourth articles present data modeling examples for connected vehicle and smart home scenarios (this document), respectively. These demonstrate best practices for different time-series use cases in YMatrix, guided by the "Time-series Modeling Approach".

Note!
This article serves only as a reference example. Before formal modeling design, we recommend thoroughly reading the YMatrix Architecture and component principles. See the "Reference Guide" section for component details.

1 What Is a Smart Home?

A smart home is a typical application scenario of the Internet of Things (IoT). Smart Home, or Home Automation uses residential spaces as platforms, integrating technologies such as structured cabling, network communication, security, automatic control, and audio/video systems. It connects household facilities into an efficient management system for residential infrastructure and daily routines, enhancing safety, convenience, comfort, aesthetics, and enabling energy-efficient, eco-friendly living environments.

2 Best Practices for Smart Home Data Modeling

Imagine you are a developer or operations engineer for a smart air conditioning metrics platform B. You want to design a data model for this platform in YMatrix. A possible approach includes:

Step Action
1 Requirements gathering
2 Modeling design and implementation
3 Model testing

2.1 Requirements Gathering

  • Data structure of smart AC metrics platform B:

Here, category, product, device, and metric form the unique identifier for each metric data point. Category, product, and device are static tags (Tagset) in the AC metrics table, while metric represents the dynamic metric name.

  • Data characteristics of smart AC metrics platform B:

    • Data scale: Around 100,000 product types, each with varying numbers of shipped devices—typically between 1 million and 10 million. The number of metrics per device is generally under 100.
    • Product metrics: Metrics across products of the same category are similar but not identical.
    • Metric types: Not fixed or predefined.
  • Query patterns of smart AC metrics platform B:

    • Latest value query: Retrieve the most recent value of a specific metric from a specific device of a specific product.
    • Aggregation query per product: Sum a specific metric across all devices of a given product for a given day.
  • Complete data flow:

Sensor (product) → Pre-computed and raw data → Cloud → YMatrix → Smart Home Application

Conclusion: After thorough preliminary research, we conclude that for this smart home scenario in YMatrix, the set of metrics cannot be fully determined at design time. Given the large data volume and high number of devices, a narrow table model should be used. Base data types should be text and float, and the MARS3 storage engine—known for strong compression—should be selected when creating tables.

2.2 Modeling Design and Implementation

2.2.1 Table Structure

Create multiple narrow tables based on different metric data types.

=# CREATE TABLE public.iot_float (
    product_type text, 
    product text,
    device text,
    metric text,
    value float4,
    ts timestamp with time zone
)
USING MARS3
DISTRIBUTED BY (product_type,product,device)
ORDER BY (device,ts,product_type,product)
PARTITION BY range(ts)
(
 START ('2023-01-15') INCLUSIVE
 END ('2023-01-22') EXCLUSIVE
 EVERY (interval '1 hour'),
 DEFAULT PARTITION default_p
);
=# CREATE TABLE public.iot_text (
    product_type text, 
    product text,
    device text, 
    metric text,
    value text,
    ts timestamp with time zone
)
USING MARS3
DISTRIBUTED BY (product_type,product,device)
ORDER BY (device,ts,product_type,product)
PARTITION BY range(ts)
(
 START ('2023-01-15') INCLUSIVE
 END ('2023-01-22') EXCLUSIVE
 EVERY (interval '1 hour'),
 DEFAULT PARTITION default_p
);

In this example:

  • Tags are product_type, product, and device.
  • The metric name is stored in the metric column; values are in the value column.
  • Two metric types are supported: float and text.
  • Distribution key: (product_type, product, device).
  • Sort key: (device, ts, product_type, product).
  • Partition key: ts.
  • Partitions span hourly intervals from 2023-01-15 to 2023-01-22.

2.3 Model Testing

Step Test Plan SQL Statement
1 View latest 10 records SELECT * FROM <表名> ORDER BY ts DESC LIMIT 10
2 Query total row count SELECT COUNT(*) FROM <表名>
3 Query all latest metrics from a specific device SELECT * FROM <表名> WHERE <设备标签列名> = '<设备标签值>' ORDER BY <时间戳列名> DESC LIMIT 1
...

Insert 100 test records into the iot_float table.

=# INSERT INTO public.iot_float (product_type, product, device, metric, value, ts)
SELECT
    CASE (random() * 3)::int
        WHEN 0 THEN 'sensor'
        WHEN 1 THEN 'actuator'
        WHEN 2 THEN 'gateway'
        ELSE 'others'
    END AS product_type,
    'product_' || (random() * 10 + 1)::int AS product,
    'device_' || (random() * 100 + 1)::int AS device,
    CASE (random() * 5)::int
        WHEN 0 THEN 'temperature'
        WHEN 1 THEN 'humidity'
        WHEN 2 THEN 'pressure'
        WHEN 3 THEN 'voltage'
        WHEN 4 THEN 'current'
        ELSE 'others'
    END AS metric,
    random() * 100 AS value,
    timestamp '2023-01-15 00:00:00+00' + (random() * (timestamp '2023-01-22 00:00:00+00' - timestamp '2023-01-15 00:00:00+00')) AS ts
FROM generate_series(1, 100);

Insert 100 test records into the iot_text table.

=# INSERT INTO public.iot_text (product_type, product, device, metric, value, ts)
SELECT
    CASE (random() * 3)::int
        WHEN 0 THEN 'sensor'
        WHEN 1 THEN 'actuator'
        WHEN 2 THEN 'gateway'
        ELSE 'others'
    END AS product_type,
    'product_' || (random() * 10 + 1)::int AS product,
    'device_' || (random() * 100 + 1)::int AS device,
    CASE (random() * 5)::int
        WHEN 0 THEN 'status'
        WHEN 1 THEN 'message'
        WHEN 2 THEN 'error'
        WHEN 3 THEN 'warning'
        WHEN 4 THEN 'log'
        ELSE 'others'
    END AS metric,
    'value_' || (random() * 1000 + 1)::int AS value,
    timestamp '2023-01-15 00:00:00+00' + (random() * (timestamp '2023-01-22 00:00:00+00' - timestamp '2023-01-15 00:00:00+00')) AS ts
FROM generate_series(1, 100);

Test Results:

  1. View latest 10 rows from iot_float table
=# SELECT * FROM iot_float ORDER BY ts DESC LIMIT 10;
product_type |  product   |  device   |   metric    |   value    |              ts
--------------+------------+-----------+-------------+------------+-------------------------------
 actuator     | product_3  | device_24 | others     |  38.504875 | 2023-01-21 18:38:55.188439+08
 others      | product_10 | device_59 | voltage     |  35.519894 | 2023-01-21 18:11:00.799357+08
 sensor       | product_11 | device_15 | temperature |  74.955025 | 2023-01-21 17:04:56.706237+08
 others      | product_7  | device_93 | pressure    |  62.177837 | 2023-01-21 14:52:45.494772+08
 gateway      | product_5  | device_1  | humidity    | 0.23014386 | 2023-01-21 14:51:14.290224+08
 others      | product_6  | device_61 | voltage     |  46.473114 | 2023-01-21 13:55:51.427582+08
 gateway      | product_4  | device_50 | pressure    |  53.413925 | 2023-01-21 10:11:41.670681+08
 actuator     | product_9  | device_13 | pressure    |  24.377035 | 2023-01-21 09:09:19.720012+08
 gateway      | product_3  | device_26 | current     |  58.887447 | 2023-01-21 07:38:12.482037+08
 gateway      | product_8  | device_82 | pressure    |  76.386024 | 2023-01-21 06:14:41.306756+08
(10 rows)

View latest 10 rows from iot_text table

=# SELECT * FROM iot_text ORDER BY ts DESC LIMIT 10;
 product_type |  product   |  device   | metric  |   value   |              ts
--------------+------------+-----------+---------+-----------+-------------------------------
 gateway      | product_10 | device_70 | status  | value_933 | 2023-01-21 22:03:51.85526+08
 others      | product_10 | device_46 | status  | value_471 | 2023-01-21 21:27:29.235879+08
 actuator     | product_4  | device_65 | log     | value_72  | 2023-01-21 21:20:27.515338+08
 gateway      | product_1  | device_34 | error   | value_864 | 2023-01-21 20:17:05.031651+08
 gateway      | product_6  | device_79 | warning | value_681 | 2023-01-21 18:03:51.699731+08
 actuator     | product_6  | device_92 | message | value_463 | 2023-01-21 17:51:20.218522+08
 gateway      | product_8  | device_44 | log     | value_526 | 2023-01-21 17:28:13.374733+08
 actuator     | product_4  | device_2  | warning | value_78  | 2023-01-21 16:52:34.770392+08
 sensor       | product_7  | device_49 | log     | value_258 | 2023-01-21 10:40:49.467672+08
 actuator     | product_7  | device_36 | error   | value_981 | 2023-01-21 08:05:55.809313+08
(10 rows)
  1. Count total rows in iot_float table
=# SELECT COUNT(*) FROM iot_float;
count
-------
   100
(1 row)

Count total rows in iot_text table

=# SELECT COUNT(*) FROM iot_text;
count
-------
   100
(1 row)
  1. Query all latest metrics reported by a specific device in iot_float table
=# SELECT * FROM iot_float WHERE device = 'device_1' ORDER BY ts DESC LIMIT 1;
 product_type |  product  |  device  |  metric  |   value    |              ts
--------------+-----------+----------+----------+------------+-------------------------------
 gateway      | product_5 | device_1 | humidity | 0.23014386 | 2023-01-21 14:51:14.290224+08
(1 row)

Query all latest metrics reported by a specific device in iot_text table

=# SELECT * FROM iot_text WHERE device = 'device_66' ORDER BY ts DESC LIMIT 1;
product_type |  product  |  device   | metric |  value   |              ts
--------------+-----------+-----------+--------+----------+-------------------------------
 others      | product_5 | device_66 | error  | value_94 | 2023-01-20 20:00:48.991428+08
(1 row)

Note!
Only simple test cases are shown here. For more testing scenarios, refer to the Getting Started section in the graphical interface. In production environments, design test queries according to actual query requirements.