OpenTSDB Overview

OpenTSDB is a distributed, scalable time series database based on HBase. It stores time series data, that is, data collected at successive points in time. This type of data shows how the state or magnitude of an object changes over time.

OpenTSDB Architecture

OpenTSDB consists of a Time Series Daemon (TSD) and a set of command line utilities. You interact with OpenTSDB primarily through one or more running TSDs. Each TSD is independent. There is no master server and no shared state, so you can run as many TSDs as required to handle the load. Each TSD uses HBase in a CloudTable cluster to store and retrieve time series data. The data schema is highly optimized for fast aggregation of similar time series and for minimal storage space. TSD users never need to access the underlying storage directly. You can communicate with the TSD through an HTTP API. All communications happen on the same port (the TSD determines the client's protocol by looking at the first few bytes it receives).
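As an illustration of the HTTP interaction described above, the following minimal sketch builds a write request for the `/api/put` endpoint of the OpenTSDB HTTP API without sending it. The host name, the port, the metric name, and the tag names are assumptions made for the example. Your CloudTable cluster may expose a different address and may require additional settings that are not covered here.

```python
import json
import urllib.request

# Hypothetical TSD address (assumption); substitute the endpoint of your own cluster.
TSD_URL = "http://tsd.example.com:4242"


def build_put_request(base_url, data_points):
    """Build (but do not send) an HTTP POST that writes data points to a TSD."""
    body = json.dumps(data_points).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/put",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# One data point: metric, timestamp (seconds), value, and tags.
# The metric and tag names are illustrative only.
point = {
    "metric": "sys.cpu.user",
    "timestamp": 1700000000,
    "value": 42.5,
    "tags": {"host": "web01", "pool": "frontend"},
}
request = build_put_request(TSD_URL, [point])

# Sending is left commented out because it needs a reachable TSD:
# with urllib.request.urlopen(request, timeout=10) as response:
#     print(response.status)
```

Because every TSD is independent and stateless, a client can send such a request to any TSD in the cluster.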

Figure 1 OpenTSDB architecture

Basic Concepts

  • data point: A time series data point consists of a metric, a timestamp, a value, and a set of tags. The data point indicates the value of a metric at a specific time point.
  • metric: The name of the quantity being measured. In system monitoring, for example, metrics include CPU usage, memory usage, and I/O.
  • timestamp: A UNIX timestamp (seconds or milliseconds since Epoch), that is, the time when the value is generated.
  • value: The value of the metric at the given timestamp, typically an integer or a floating-point number. It can potentially also be a JSON formatted event or a histogram/digest.
  • tag: A tag is a key-value pair consisting of a tag key (Tagk) and a tag value (Tagv). It describes the time series that the data point belongs to.

    Tags allow you to separate similar data points from different sources or related entities, so you can easily graph them individually or in groups. A common use of tags is to annotate a data point with the name of the machine that produced it and the name of the cluster or pool that the machine belongs to. This allows you to build dashboards that show the state of your service on a per-server basis, as well as dashboards that show an aggregated state across logical pools of servers.
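The concepts above can be sketched in a few lines of Python. This is a simplified in-memory model, not how OpenTSDB stores or queries data. The `DataPoint` class, the `average_by_tag` helper, and the `host` and `pool` tag names are hypothetical names introduced for the example.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class DataPoint:
    """Illustrative model of a data point: metric, timestamp, value, tags."""
    metric: str
    timestamp: int  # UNIX timestamp in seconds
    value: float
    tags: dict = field(default_factory=dict)


def average_by_tag(points, tag_key):
    """Average the values of the points, grouped by the value of one tag key."""
    groups = defaultdict(list)
    for p in points:
        groups[p.tags[tag_key]].append(p.value)
    return {tag_value: sum(vals) / len(vals) for tag_value, vals in groups.items()}


points = [
    DataPoint("sys.cpu.user", 1700000000, 40.0, {"host": "web01", "pool": "frontend"}),
    DataPoint("sys.cpu.user", 1700000000, 60.0, {"host": "web02", "pool": "frontend"}),
    DataPoint("sys.cpu.user", 1700000000, 20.0, {"host": "db01", "pool": "backend"}),
]

per_host = average_by_tag(points, "host")  # per-server view
per_pool = average_by_tag(points, "pool")  # aggregated view across pools
```

Grouping by `host` gives one series per server, while grouping by `pool` aggregates the same data points across each logical pool.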

Introduction to OpenTSDB System Tables

OpenTSDB stores time series data based on HBase. After OpenTSDB is enabled in a cluster, the system will create four HBase tables in the cluster. Table 1 describes the OpenTSDB system tables.

Do not modify the four HBase tables manually, because doing so may make OpenTSDB unavailable.

Table 1 OpenTSDB system tables

| Table Name | Description |
| --- | --- |
| OPENTSDB.DATA | Stores data points. All OpenTSDB data is stored in this table. The table is partitioned by salt into 20 regions by default. Currently, the number of regions cannot be configured. |
| OPENTSDB.UID | Stores unique identifier (UID) mappings. Each metric and each tag in a data point is mapped to a UID, and each UID is mapped back to its metric or tag. Both directions of the mapping are stored in this table. |
| OPENTSDB.TREE | Stores metric structure information. This feature is disabled by default. |
| OPENTSDB.META | Stores time series indexes and metadata. This feature is disabled by default. |
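To make the UID mapping and the salt-based partitioning more concrete, here is a toy sketch. It is not the actual OpenTSDB implementation: the `UidTable` class, the 3-byte UID width, the way the series key is assembled, and the use of CRC32 as the hash are all assumptions chosen for illustration. Only the two-way mapping and the default of 20 partitions come from the descriptions above.

```python
import zlib


class UidTable:
    """Toy two-way mapping between names and fixed-width UIDs."""

    def __init__(self, width=3):  # UID width in bytes (assumption)
        self.width = width
        self._name_to_uid = {}
        self._uid_to_name = {}

    def get_or_create(self, name):
        """Return the UID for a name, assigning the next free UID if needed."""
        uid = self._name_to_uid.get(name)
        if uid is None:
            uid = (len(self._name_to_uid) + 1).to_bytes(self.width, "big")
            self._name_to_uid[name] = uid
            self._uid_to_name[uid] = name
        return uid

    def name_of(self, uid):
        """Reverse lookup: return the name that a UID stands for."""
        return self._uid_to_name[uid]


SALT_BUCKETS = 20  # default number of regions stated in Table 1


def salt_bucket(series_key, buckets=SALT_BUCKETS):
    """Pick a partition for a series; CRC32 is an illustrative hash choice."""
    return zlib.crc32(series_key) % buckets


metrics, tag_keys, tag_values = UidTable(), UidTable(), UidTable()

metric_uid = metrics.get_or_create("sys.cpu.user")
series_key = (
    metric_uid
    + tag_keys.get_or_create("host")
    + tag_values.get_or_create("web01")
)
bucket = salt_bucket(series_key)
```

Replacing long metric and tag strings with short UIDs keeps stored keys compact, and spreading series over a fixed number of salt buckets distributes writes across regions.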