Quick onboard
Deployment
Data Modeling
Connecting
Migration
Query
Operations and Maintenance
Common Maintenance
Partition
Backup and Restore
Expansion
Mirroring
Resource Management
Security
Monitoring
Performance Tuning
Troubleshooting
Reference Guide
Tool guide
Data type
Storage Engine
Executor
Stream
DR (Disaster Recovery)
Configuration
Index
Extension
SQL Reference
This document provides examples of typical queries in OLAP and OLTP scenarios. Since most real-world query patterns are hybrid in nature, the classification here is not strictly exclusive.
Note!
Understanding the query requirements of a scenario is one of the key steps in building an accurate data model.
OLAP stands for Online Analytical Processing. It is commonly referred to as a data warehousing scenario and is primarily used for decision support. OLAP enables complex data analysis, allowing data scientists, business analysts, and knowledge workers to examine data from multiple perspectives. It supports business intelligence (BI), data mining, and other decision-support applications.
Example:
=# SELECT *,
row_number() OVER (ORDER BY c1 DESC),
rank() OVER (ORDER BY c1 DESC),
dense_rank() OVER (ORDER BY c1 DESC)
FROM t1;
Key Points:
row_number(), rank(), and dense_rank() to compute different types of rankings:row_number() function assigns a unique sequential rank to each row based on the descending order of column c1.rank() function assigns a rank to each distinct value in c1, ordered by c1 in descending order. Ties receive the same rank, and subsequent ranks are skipped accordingly.dense_rank() function assigns a rank to each distinct value in c1, ordered by c1 in descending order. Ties receive the same rank, but no ranks are skipped afterward.Example:
=# SELECT *,
first_value(c1) OVER (ORDER BY c1 DESC),
last_value(c1) OVER (ORDER BY c1 DESC)
FROM t1;
Example:
=# SELECT *,
SUM(c1) OVER (ORDER BY c1 DESC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS sum_c1 -- Frame declaration includes all rows from first to last
FROM t1;
Window framing can also control the scope of aggregate calculations. For example, computing monthly revenue and its average over the current month and two months before and after:
=# SELECT month,
revenue,
avg(revenue) OVER (ORDER BY month ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING)
FROM t;
Key Points:
Retrieve records of individuals whose income falls within the 40%–70% range (inclusive) in each city:
=# SELECT id,name,salary,city
FROM (
SELECT *,
ntile(10) OVER (
PARTITION BY city
ORDER BY salary DESC
) AS bucket
FROM city_salary
) AS t
WHERE t.bucket >= 4 AND t.bucket <= 7;
Key Points:
ntile() to divide records in the city_salary table into 10 equal-sized buckets. It then filters to return only those in buckets 4 through 7.Calculate the percentage change in monthly revenue compared to the previous month:
=# WITH detail(m, r, l) AS (
SELECT month,
revenue,
lag(revenue, 1) OVER (ORDER BY month) AS last_month_revenue
FROM sales
)
SELECT m, ((r-l)::float8/l * 100)::decimal(18,2) || '%'
FROM detail;
Key Points:
(r-l)::float8/l * 100)::decimal(18,2) || '%': Computes growth rate by subtracting last month's revenue from the current month's, dividing by last month's revenue. Type casting converts the result to a percentage with two decimal places.Calculate each month’s revenue as a percentage of its quarterly total:
=# SELECT *,
revenue::real / quarter_revenue * 100 AS percent_of_quarter
FROM (
SELECT quarter,
month,
revenue,
sum(revenue) OVER (PARTITION BY quarter) AS quarter_revenue
FROM sales2
) AS t;
Identify records where values differ from the previous record. This type of query is useful for detecting changes in otherwise stable datasets:
=# SELECT time, read FROM (
SELECT time,
read,
read - lag(read) OVER (ORDER BY TIME) AS diff
FROM disk
WHERE time BETWEEN '2021-04-10 21:00:00'::timestamp AND '2021-04-10 21:01:00'::timestamp
AND tag_id = 1 ) ht
WHERE diff IS NULL OR diff != 0
ORDER BY time;
Key Points:
lag() to compute the difference between the current and previous read values.WHERE diff IS NULL OR diff != 0 retains only records where the difference is NULL or non-zero.Compute the 20th percentile of read speed and 30th percentile of write speed, ordered by tag_id:
=# SELECT tag_id,
percentile_cont(0.2) WITHIN GROUP (ORDER BY read) AS read,
percentile_cont(0.3) WITHIN GROUP (ORDER BY write) AS write
FROM disk
GROUP BY tag_id
ORDER BY tag_id;
If the function parameter is 0.5, it computes the median.
For example, use multi-table joins and aggregation to compare retail store metrics across holidays:
=# WITH holiday_rst AS (
SELECT year,holiday,date,row_number() OVER(PARTITION BY year,holiday ORDER BY date)
AS the_day_of_the_holiday
FROM dim_holiday_info
)
SELECT t.year,t.holiday,t.date,t.the_day_of_the_holiday,b.date AS corresponding_date_of_previous_year
FROM holiday_rst t
LEFT JOIN holiday rst b
ON t.year = b.year + 1
AND t.holiday = b.holiday
AND t.the_day_of_the_holiday = b.the_day_of_the_holiday;
Key Points:
the_day_of_the_holiday) to each day within a holiday period per year.In the field of Business Intelligence (BI), multidimensional analysis allows users to examine datasets across various dimensions. Through data visualization tools, insights hidden in large datasets are presented to decision-makers in the form of charts, graphs, and dashboards, enabling interactive exploration and forecasting.
Perform aggregations across multiple dimensions and present results together:
=# SELECT city,
season,
category,
sum(value)
FROM bi_sales
GROUP BY GROUPING SETS (city, season, category)
ORDER by (city);
city | season | category | sum
----------+--------+-----------+------
Beijing | | | 1098
Shanghai | | | 1089
Guangzhou| | | 1080
| Q3 | | 1989
| | Groceries | 1089
| Q1 | | 189
| | Electrics | 999
| | Books | 1179
| Q2 | | 1089
(9 rows)
The aggregated results show:
value for each city.value for each season.value for each product category.OLTP stands for Online Transaction Processing. It refers to transactional processing scenarios typically used for daily operations. OLTP systems handle fast, accurate processing of transactions such as ATM withdrawals, online banking, point-of-sale purchases, e-commerce orders, and hotel or flight reservations. They support real-time execution of numerous database operations including inserts, updates, deletes, and queries.
Ad hoc queries allow users to dynamically specify filtering conditions based on their immediate needs. The system generates reports or results on demand. Unlike predefined application queries, ad hoc queries are user-defined and flexible.
=# SELECT * FROM sales WHERE customer_id=100;