Help Center/ GeminiDB/ GeminiDB Influx API/ Best Practices/ Multi-Level Downsampling

Updated on 2025-07-29 GMT+08:00

View PDF

Multi-Level Downsampling

Application Scenarios

In DevOps or IoT scenarios, users focus on statistical metrics (such as maximum, minimum, and average values) instead of historical data details.

A conventional approach involves storing all granular data, querying it, and performing calculations on demand. If historical data is retained only for feature calculation purposes, the conventional approach presents several drawbacks: the longer granular data is retained, the higher the storage costs incurred.

Introduction to Multi-Level Downsampling

Multi-level downsampling policies are available to data across time ranges. For instance, data from the last week to month is aggregated every 15 minutes (regardless of aggregation methods), while historical data in the last month is aggregated every hour.

Users may be sensitive to recent data but have little requirements for long-term data, so different multi-level downsampling policies need to be used. Multi-level downsampling not only meets query requirements for high-value data, but also ensures storage efficiency.

For example, raw data in the last 7 days is directly imported to a database. Data from the last 7 to 30 days is downsampled every 15 minutes and then stored in a database. Data from the last 30 days to 12 months is downsampled every hour and then stored in a database. As shown in Figure 1, assume that today is December 31, 2022. The dark blue area on the left displays data within the last 7 days. The gray blue area in the middle displays data from the last 7 to 30 days. The light blue area on the right displays data from the last 30 days to 12 months. After a period of time, data in the dark blue area is aggregated into the gray blue area every 15 minutes, and data in the gray blue area is aggregated into the light blue area every hour. There are four 15-minute segments in an hour, so aggregation is available as a convenience to avoid extra implementation overhead.

Figure 1 Multi-level downsampling grid
Click to enlarge

Raw data from the last 7 to 30 days as well as 30 days to 12 months will not be retained after downsampling.

Creating a Downsampling Task

Syntax:

Create DownSample [on <rp_name>| on <dbname>.<rp_name>|  ]((dataType(aggregators)...)) With Duration <timeDuration> SampleInterval(time Durations) TimeInterval(time Durations)

**Table 1** Parameters
Duration	SampleInterval	TimeInterval
Data retention period after downsampling	Next-level downsampling time	Sampling interval

Aggregation method definition:

dataType(aggfunctions...)

Example of an aggregation method:

integer(first,sum,count,last,min,max)
integer(min,max),float(sum)

For example:

Create a retention rule named rp1. Data is retained for seven days, and one shard is created every day.

create retention policy rp1 on mydb duration 7d replication 1 shard duration 1d

To create a downsampling task based on rp1, set the retention period of the sampled data to seven days. Details about data of the last day are stored. Data of the last one to two days is sampled every minute. Data generated two days ago is downsampled every 3 minutes.

Create DownSample on rp1 (float(sum,last),integer(max,min)) With Duration 7d sampleinterval(1d,2d) timeinterval(1m,3m)

The second values of sampleinterval and timeinterval must be integer multiples of their first values.
Valid example:
- sampleinterval(1d,2d)
- timeinterval(1m,3m)
Invalid example:
- sampleinterval(2d,3d)
- timeinterval(5m,6m)
The number of values of sampleinterval and timeinterval must be consistent.
Valid example:
- sampleinterval(1d,2d) and timeinterval(1m,3m)
Invalid example:
- sampleinterval(1d,2d) and timeinterval(3m)
- sampleinterval(1d) and timeinterval(1m,3m)
Duration controls the retention period of downsampled data. The value specified for rp1 is updated synchronously and can be the same as that specified for rp. Only first, last, sum, max, min, mean, and count are allowed.
float(sum,last) indicates that FLOAT data is downsampled using the sum() and last() functions.
The mechanism of integer(max,min) is similar to that of float(sum,last).

Showing Downsampling Tasks

Show all downsampling tasks of the default database.

SHOW DOWNSAMPLES

Show all downsampling tasks of a specified database.

SHOW DOWNSAMPLES ON <database name>

For example:

> show downsamples on ExampleDatabase
+--------+-------------------------+----------+----------------+--------------+
| rpName |     field_operator      | duration | sampleInterval | timeInterval |
+--------+-------------------------+----------+----------------+--------------+
| rp1    | float{max},integer{min} | 24h0m0s  | 1h0m0s,2h0m0s  | 1m0s,5m0s    |
+--------+-------------------------+----------+----------------+--------------+

Deleting Downsampling Tasks

Delete all downsampling tasks of a specified database.

Drop DownSamples 
Drop DownSamples on db0

Delete all downsampling tasks of a specified RP.

Drop DownSample on rp1 
Drop DownSample on db0.rp1

Parent Topic: Best Practices

Previous topic: GeminiDB's Functions for Efficient Data Analysis

Next topic: Suggestions on Alarm Rules of GeminiDB Influx Instance Metrics

Feedback

Was this page helpful?

Helpful Not helpful

Provide feedback

Thank you very much for your feedback. We will continue working to improve the documentation.See the reply and handling status in My Cloud VOC.

The system is busy. Please try again later.

Which of the following issues have you encountered?

Content is inconsistent with the product UI

Unclear descriptions

Lack of examples or code

Incorrect steps

Can't find what I need

Lack of best practices

Feedback (optional)

0/500

Select at least one type of issue, and enter your comments or suggestions.

Enter a maximum of 500 characters.

Submit Cancel

For any further questions, feel free to contact us through the chatbot.

Chatbot