Updated on 2024-05-30 GMT+08:00

Test Method

This section describes performance testing of GeminiDB Influx instances, including the test environment, procedure, and results.

Test Environment

  • Region: CN-Hong Kong
  • AZ: AZ1
  • Elastic Cloud Server (ECS): m6.2xlarge.8 with 8 vCPUs, 64 GB of memory, and CentOS 7.6 64-bit image
  • Nodes per instance: 3
  • Instance specifications: 4 vCPUs | 16 GB, 8 vCPUs | 32 GB, 16 vCPUs | 64 GB, and 32 vCPUs | 128 GB

Test Tool

Time Series Benchmark Suite (TSBS) of the open-source community is used.

Test Metrics

  • Write performance test: Data points per second
  • Query performance test: Latency and OPS (operations per second)

Test Procedure

  1. Run the following command to generate the data to be written:

    tsbs_generate_data --use-case="devops" --seed=123 --scale=10000 --timestamp-start="2016-01-01T00:00:00Z" --timestamp-end="2016-01-01T12:00:00Z" --log-interval="10s" --format="influx" | gzip > /tmp/influx-data.gz

    --scale indicates the number of time series to be generated.

    --log-interval indicates the interval for collecting data.

  2. Run the following command to test write performance and obtain the required data:

    NUM_WORKERS=${numWorkers} BATCH_SIZE=${batchSize} DATABASE_HOST=${influxIP} DATABASE_PORT=${influxPORT} BULK_DATA_DIR=/tmp scripts/load_influx.sh

  3. Run the following commands to generate query statements:

    tsbs_generate_queries --use-case="devops" --seed=123 --scale=10000 --timestamp-start="2016-01-01T00:00:00Z" --timestamp-end="2016-01-01T12:00:01Z" --queries=20 --query-type="high-cpu-all" --format="influx" | gzip > /tmp/influx-20queries-high-cpu-all-12h-frequency.gz

    tsbs_generate_queries --use-case="devops" --seed=123 --scale=10000 --timestamp-start="2016-01-01T00:00:00Z" --timestamp-end="2016-01-01T12:00:01Z" --queries=1000000 --query-type="single-groupby-1-8-1" --format="influx" | gzip > /tmp/influx-1000000queries-single-groupby-1-8-1-12h-frequency.gz

    tsbs_generate_queries --use-case="devops" --seed=123 --scale=10000 --timestamp-start="2016-01-01T00:00:00Z" --timestamp-end="2016-01-01T12:00:01Z" --queries=500 --query-type="double-groupby-1" --format="influx" | gzip > /tmp/influx-500queries-double-groupby-1-12h-frequency.gz

    tsbs_generate_queries --use-case="devops" --seed=123 --scale=10000 --timestamp-start="2016-01-01T00:00:00Z" --timestamp-end="2016-01-01T12:00:01Z" --queries=50 --query-type="double-groupby-all" --format="influx" | gzip > /tmp/influx-50queries-double-groupby-all-12h-frequency.gz

    tsbs_generate_queries --use-case="devops" --seed=123 --scale=10000 --timestamp-start="2016-01-01T00:00:00Z" --timestamp-end="2016-01-01T12:00:01Z" --queries=200 --query-type="lastpoint" --format="influx" | gzip > /tmp/influx-200queries-lastpoint-12h-frequency.gz

    tsbs_generate_queries --use-case="devops" --seed=123 --scale=10000 --timestamp-start="2016-01-01T00:00:00Z" --timestamp-end="2016-01-01T12:00:01Z" --queries=500 --query-type="groupby-orderby-limit" --format="influx" | gzip > /tmp/influx-500queries-groupby-orderby-limit-12h-frequency.gz

    Ensure that values of fields --use-case, -seed, --scale, and --timestamp-start must be the same as those values set when data is generated in 1.

    --timestamp-end indicates the second after data generation ends.

    --queries indicates the number of generated queries.

    --queries-type indicates the type of generated queries. For details, see Table 1.

  4. Run the following commands to query performance data:

    cat /tmp/influx-20queries-high-cpu-all-12h-frequency.gz | gunzip | tsbs_run_queries_influx --workers=${numWorkers} --print-interval 10 --urls=(http|https)://${influxIP}:${influxPORT}

    cat /tmp/influx-1000000queries-single-groupby-1-8-1-12h-frequency.gz | gunzip | tsbs_run_queries_influx --workers=${numWorkers} --print-interval 10000 --urls=(http|https)://${influxIP}:${influxPORT}

    cat /tmp/influx-500queries-double-groupby-1-12h-frequency.gz | gunzip | tsbs_run_queries_influx --workers=${numWorkers} --print-interval 50 -- urls=(http|https)://${influxIP}:${influxPORT}

    cat /tmp/influx-50queries-double-groupby-all-12h-frequency.gz | gunzip | tsbs_run_queries_influx --workers=${numWorkers} --print-interval 10 -- urls=(http|https)://${influxIP}:${influxPORT}

    cat /tmp/influx-200queries-lastpoint-12h-frequency.gz | gunzip | tsbs_run_queries_influx --workers=${numWorkers} --print-interval 10 -- urls=(http|https)://${influxIP}:${influxPORT}

    cat /tmp/influx-500queries-groupby-orderby-limit-12h-frequency.gz | gunzip | tsbs_run_queries_influx --workers=${numWorkers} --print-interval 50 -- urls=(http|https)://${influxIP}:${influxPORT}

Test Models

Table 1 Test models involved

Test Model

Description

Example Statement

load

100% insertion.

-

high-cpu-all

Queries all the readings where one metric is above a threshold across all hosts for a period of time.

SELECT * from cpu where usage_user > 90.0 and time >= '2020-11-01T05:24:55Z' and time < '2020-11-01T17:24:55Z'

single-groupby-1-8-1

Queries the maximum value of one metric for 8 hosts for a period of time.

SELECT max(usage_user) from cpu where (hostname = 'host_61885' or hostname = 'host_51710' or hostname = 'host_9380' or hostname = 'host_46446' or hostname = 'host_67623' or hostname = 'host_54344' or hostname = 'host_82215' or hostname = 'host_7458') and time >= '2020-11-01T19:38:15Z' and time < '2020-11-01T20:38:15Z' group by time(1m)

single-groupby-1-1-1

Queries the maximum value of one metric for 1 host for a period of time.

SELECT max(usage_user) from cpu where (hostname = 'host_6334') and time >= '2016-01-01T03:03:21Z' and time < '2016-01-01T04:03:21Z' group by time(1m)

cpu-max-all-1

Queries the maximum value of all metrics for 1 host for a period of time.

SELECT max(usage_user),max(usage_system),max(usage_idle),max(usage_nice),max(usage_iowait),max(usage_irq),max(usage_softirq),max(usage_steal),max(usage_guest),max(usage_guest_nice) from cpu where (hostname = 'host_1166') and time >= '2016-01-01T00:23:32Z' and time < '2016-01-01T08:23:32Z' group by time(1h)