ClickHouse Application Development Overview
Introduction
ClickHouse is a column-oriented database for online analytical processing (OLAP). It supports SQL queries and delivers strong query performance, particularly for aggregation and analysis over large, wide tables, where it can be up to an order of magnitude faster than other analytical databases.
Advantages:
- High data compression ratio
- Multi-core parallel computing
- Vectorized computing engine
- Nested data structure
- Sparse indexes
- INSERT and UPDATE
Application scenarios:
- Real-time data warehouse
A stream computing engine (such as Flink) writes real-time data to ClickHouse. Thanks to the query performance of ClickHouse, multi-dimensional and multi-mode real-time query and analysis requests can be answered in under a second.
- Offline query
Large-scale service data is imported to ClickHouse to build a large wide table with hundreds of millions to tens of billions of records and hundreds of dimensions. Users can then run custom statistics and continuous exploratory queries at any time, with responsive query performance, to support business decision-making.
ClickHouse Application Development APIs
ClickHouse is developed in C++ and positioned as a DBMS. It supports the HTTP and native TCP network interface protocols, as well as drivers such as JDBC and ODBC. The community edition of clickhouse-jdbc is recommended for application development.
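As a rough illustration of the HTTP interface mentioned above, the following Python sketch builds a request that carries one SQL statement. It is not a replacement for clickhouse-jdbc, which remains the recommended driver. The host, the port 8123 (the default HTTP port in open-source ClickHouse; your cluster may use a different port or require HTTPS and authentication), and the helper names build_query_request and run_query are assumptions made for this example.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_query_request(sql, host="localhost", port=8123, database="default"):
    """Build (but do not send) an HTTP POST request carrying one SQL statement."""
    # Settings such as the target database go in the query string;
    # the SQL text itself travels in the request body.
    url = f"http://{host}:{port}/?{urlencode({'database': database})}"
    return Request(url, data=sql.encode("utf-8"), method="POST")


def run_query(sql, **kwargs):
    """Send the request and return the response body as text.

    Requires a reachable ClickHouse server; it is not called in this sketch.
    """
    with urlopen(build_query_request(sql, **kwargs), timeout=10) as resp:
        return resp.read().decode("utf-8")


# Hypothetical usage: inspect the request without contacting any server.
request = build_query_request("SELECT 1 FORMAT TabSeparated")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Separating request construction from sending keeps the sketch inspectable without a running server. Call run_query only when a ClickHouse node is reachable at the assumed address.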
Concepts
- Cluster
A cluster is a logical concept in ClickHouse. Users define clusters as required, which differs from the usual understanding of a cluster: the ClickHouse nodes are loosely coupled and each is deployed independently.
- Shard
A shard is a horizontal division of a cluster. A cluster can consist of multiple shards.
- Replica
Multiple replicas can be created for one shard.
- Partition
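A partition divides the data in a local replica into separate parts according to the partition key defined when the table is created. Whereas shards split a cluster across nodes, partitions split the data within a single replica.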
- MergeTree
ClickHouse provides a large family of table engines. MergeTree is the basic engine of that family and provides functions such as data partitioning, primary indexes, and secondary indexes. You must specify a table engine when creating a table. The engine determines the characteristics of the table, for example, which features it supports, in what format data is stored, and how data is loaded.
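To make the concepts above more concrete, the following Python sketch assembles a CREATE TABLE statement for a MergeTree table. PARTITION BY sets the partition key, ORDER BY sets the sorting key that backs the primary index, and the optional ON CLUSTER clause runs the statement on the nodes of a user-defined cluster. The helper merge_tree_ddl, the table demo.page_views, its columns, and the cluster name demo_cluster are all hypothetical and chosen only for illustration.

```python
def merge_tree_ddl(table, columns, partition_by, order_by, cluster=None):
    """Return a CREATE TABLE statement for a MergeTree table as a string."""
    column_lines = ",\n".join(f"    {name} {type_}" for name, type_ in columns)
    # ON CLUSTER is optional; without it the table is created on one node only.
    on_cluster = f" ON CLUSTER {cluster}" if cluster else ""
    return (
        f"CREATE TABLE IF NOT EXISTS {table}{on_cluster}\n"
        f"(\n{column_lines}\n)\n"
        f"ENGINE = MergeTree()\n"
        f"PARTITION BY {partition_by}\n"
        f"ORDER BY ({', '.join(order_by)})"
    )


# Hypothetical table: one partition per month, rows sorted by date and user.
columns = [
    ("event_date", "Date"),
    ("user_id", "UInt64"),
    ("page", "String"),
    ("duration_ms", "UInt32"),
]
ddl = merge_tree_ddl(
    "demo.page_views",
    columns,
    partition_by="toYYYYMM(event_date)",
    order_by=["event_date", "user_id"],
    cluster="demo_cluster",
)
print(ddl)
```

The generated statement can be submitted through any of the supported interfaces. The sketch only produces the SQL text and does not check that the cluster or database exists.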