ClickHouse Application Development Overview

Introduction to ClickHouse

ClickHouse is a column-oriented database for online analytical processing (OLAP). It supports SQL queries and delivers high query performance, particularly for aggregation and analysis over large, wide tables, where it can be up to an order of magnitude faster than other analytical databases.

Advantages:

High data compression ratio
Multi-core parallel computing
Vectorized computing engine
Support for nested data structures
Support for sparse indexes
Support for INSERT and UPDATE

Application scenarios:

Real-time data warehouse
A stream computing engine (such as Flink) writes real-time data to ClickHouse. Thanks to the query performance of ClickHouse, multi-dimensional and multi-mode real-time query and analysis requests can be answered with sub-second latency.

Offline query
Large-scale service data is imported to ClickHouse to build a wide table with hundreds of millions to tens of billions of records and hundreds of dimensions. Such a table supports custom statistics collection and continuous exploratory query and analysis at any time, which assists business decision-making.

Introduction to the ClickHouse Development Interface

ClickHouse is written in C++ and positioned as a DBMS. It supports the HTTP and native TCP network interface protocols, as well as drivers such as JDBC and ODBC. The community edition of clickhouse-jdbc is recommended for application development.
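
As an illustration of the JDBC route, the following is a minimal sketch rather than a reference implementation. It assumes that the clickhouse-jdbc driver is on the classpath and that a node is reachable over HTTP on port 8123 (the ClickHouse default; your deployment may use a different port). The class name, method names, host, database, and credentials are hypothetical placeholders.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

/** Hypothetical helper; the class and method names are illustrative only. */
final class ClickHouseJdbcSketch {

    /** Builds a clickhouse-jdbc URL of the form jdbc:clickhouse://host:port/database. */
    static String jdbcUrl(String host, int port, String database) {
        return "jdbc:clickhouse://" + host + ":" + port + "/" + database;
    }

    /** Opens a connection, runs SELECT version(), and returns the first column of the first row. */
    static String serverVersion(String url, String user, String password) throws SQLException {
        try (Connection conn = DriverManager.getConnection(url, user, password);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT version()")) {
            return rs.next() ? rs.getString(1) : null;
        }
    }

    public static void main(String[] args) {
        // Assumed values: replace with a real node address and credentials.
        String url = jdbcUrl("localhost", 8123, "default");
        try {
            System.out.println("ClickHouse version: " + serverVersion(url, "default", ""));
        } catch (SQLException e) {
            // Raised when no suitable driver is on the classpath or the server is unreachable.
            System.err.println("Connection failed: " + e.getMessage());
        }
    }
}
```

Keeping URL construction separate from connection handling makes it easier to switch nodes or databases without touching the query code.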

Concepts

Cluster
A cluster is a logical concept in ClickHouse. Users define clusters as required, which differs from the general understanding of a cluster. The ClickHouse nodes in a cluster are loosely coupled and exist independently of one another.

Shard
A shard is a horizontal division of a cluster. A cluster can consist of multiple shards.

Replica
Multiple replicas can be created for one shard. Each replica holds a copy of the data in that shard.

Partition
Partitions apply to local replicas and can be regarded as a vertical division of the data that each replica holds.
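
The relationship between a cluster, its shards, and their replicas can be pictured with a small data model. The sketch below is a simplified illustration, not how ClickHouse is implemented: it assumes that all shards have equal weight and that a row is routed to a shard by taking its sharding key modulo the number of shards. All class names, method names, and host names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

/** Simplified, hypothetical model of a cluster made of shards and replicas. */
final class ClusterTopologySketch {

    /** A shard is one horizontal slice of the cluster; each replica host stores a copy of that slice. */
    static final class Shard {
        final int id;
        final List<String> replicaHosts;

        Shard(int id, List<String> replicaHosts) {
            this.id = id;
            this.replicaHosts = replicaHosts;
        }
    }

    /** Picks a shard as shardingKey modulo shard count (assumes equally weighted shards). */
    static Shard shardFor(long shardingKey, List<Shard> shards) {
        // floorMod keeps the index non-negative even for negative keys.
        int index = (int) Math.floorMod(shardingKey, (long) shards.size());
        return shards.get(index);
    }

    public static void main(String[] args) {
        // Two shards with two replicas each; host names are placeholders.
        List<Shard> cluster = Arrays.asList(
                new Shard(0, Arrays.asList("node1", "node2")),
                new Shard(1, Arrays.asList("node3", "node4")));

        Shard target = shardFor(7L, cluster);
        System.out.println("Key 7 is stored on shard " + target.id
                + ", replicas " + target.replicaHosts);
    }
}
```

With two shards, key 7 maps to shard 1 and key 4 maps to shard 0. Any replica of the chosen shard can serve reads for that row.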

MergeTree
ClickHouse has a large family of table engines. MergeTree, the base engine of that family, provides functions such as data partitioning, primary indexes, and secondary indexes. You must specify a table engine when creating a table. The engine determines the characteristics of the table, for example, which features it supports, the format in which data is stored, and how data is loaded.
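
To make the MergeTree features concrete, the sketch below assembles a CREATE TABLE statement that uses a partition key, a sorting key (which also acts as the primary index by default), and a data-skipping index, which is the ClickHouse form of a secondary index. The database, table, column, and index names are hypothetical, and the partitioning and index settings are example choices rather than recommendations.

```java
/** Hypothetical helper that builds an example MergeTree DDL statement. */
final class MergeTreeDdlSketch {

    /** Returns a CREATE TABLE statement for an illustrative events table. */
    static String createTableSql(String database, String table) {
        return String.join("\n",
                "CREATE TABLE IF NOT EXISTS " + database + "." + table,
                "(",
                "    event_date Date,",
                "    user_id UInt64,",
                "    url String,",
                // Data-skipping (secondary) index on the url column.
                "    INDEX url_idx url TYPE bloom_filter GRANULARITY 4",
                ")",
                // The table engine must be specified at creation time.
                "ENGINE = MergeTree",
                // One partition per calendar month.
                "PARTITION BY toYYYYMM(event_date)",
                // Sorting key; used as the primary index when PRIMARY KEY is omitted.
                "ORDER BY (event_date, user_id)");
    }

    public static void main(String[] args) {
        System.out.println(createTableSql("demo", "events"));
    }
}
```

The resulting string can be passed to a JDBC `Statement.execute` call or pasted into a ClickHouse client session.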
