ClickHouse Result Table
Function
DLI exports Flink job data to ClickHouse result tables.
ClickHouse is a column-oriented database designed for online analytical processing (OLAP). It supports SQL queries and delivers strong query performance, particularly for aggregation and analysis over large, wide tables, where it can be about an order of magnitude faster than other analytical databases.
Prerequisites
- Ensure your jobs run on an exclusive queue (non-shared queue) of DLI.
- You have established an enhanced datasource connection to ClickHouse and set the port in the security group rule of the ClickHouse cluster as needed.
For details about how to set up an enhanced datasource connection, see "Enhanced Datasource Connection" in the Data Lake Insight User Guide.
Precautions
- When you create a ClickHouse cluster for MRS, set the cluster version to MRS 3.1.0 and do not enable Kerberos authentication.
- Do not define a primary key in Flink SQL statements, and do not use any syntax that generates a primary key, such as insert into clickhouseSink select id, count(*) from sourceName group by id.
- Flink supports the following data types: string, tinyint, smallint, int, bigint, float, double, date, timestamp, decimal, and array.
Array elements can only be of type int, bigint, string, float, or double.
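The primary key precaution above can be illustrated with a minimal sketch. It assumes a hypothetical source table sourceName with columns id and name, and a hypothetical ClickHouse result table clickhouseSink with the same two columns; neither column layout comes from this document.

```sql
-- Hypothetical tables: sourceName(id string, name string) and a ClickHouse
-- result table clickhouseSink with the same two columns.

-- Avoid: the group by clause makes id a generated primary key.
-- insert into clickhouseSink select id, count(*) from sourceName group by id;

-- Prefer: a plain projection that only appends rows.
insert into clickhouseSink select id, name from sourceName;
```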
Syntax
create table clickhouseSink (
  attr_name attr_type
  (',' attr_name attr_type)*
)
with (
  'connector.type' = 'clickhouse',
  'connector.url' = '',
  'connector.table' = ''
);
Parameters
Example
Read data from a DIS table and insert the data into the test table of ClickHouse database flinktest.
- Create a DIS source table disSource.
create table disSource(
  attr0 string,
  attr1 TINYINT,
  attr2 smallint,
  attr3 int,
  attr4 bigint,
  attr5 float,
  attr6 double,
  attr7 String,
  attr8 string,
  attr9 timestamp(3),
  attr10 timestamp(3),
  attr11 date,
  attr12 decimal(38, 18),
  attr13 decimal(38, 18)
) with (
  'connector.type' = 'dis',
  'connector.region' = 'cn-xxxx-x',
  'connector.channel' = 'xxxx',
  'format.type' = 'csv'
);
- Create the ClickHouse result table clickhouse and insert the data from the disSource table into it.
create table clickhouse(
  attr0 string,
  attr1 TINYINT,
  attr2 smallint,
  attr3 int,
  attr4 bigint,
  attr5 float,
  attr6 double,
  attr7 String,
  attr8 string,
  attr9 timestamp(3),
  attr10 timestamp(3),
  attr11 date,
  attr12 decimal(38, 18),
  attr13 decimal(38, 18),
  attr14 array<int>,
  attr15 array<bigint>,
  attr16 array<float>,
  attr17 array<double>,
  attr18 array<varchar>,
  attr19 array<String>
) with (
  'connector.type' = 'clickhouse',
  'connector.url' = 'jdbc:clickhouse://xx.xx.xx.xx:xx/flinktest',
  'connector.table' = 'test'
);

insert into clickhouse
select
  attr0,
  attr1,
  attr2,
  attr3,
  attr4,
  attr5,
  attr6,
  attr7,
  attr8,
  attr9,
  attr10,
  attr11,
  attr12,
  attr13,
  array [attr3, attr3+1],
  array [cast(attr4 as bigint), cast(attr4+1 as bigint)],
  array [cast(attr12 as float), cast(attr12+1 as float)],
  array [cast(attr13 as double), cast(attr13+1 as double)],
  array ['TEST1', 'TEST2'],
  array [attr7, attr7]
from disSource;
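The example writes to the test table in the flinktest database, so that table presumably needs to exist on the ClickHouse side before the job starts. The following is a minimal sketch of a matching ClickHouse definition. The Flink-to-ClickHouse type mapping, the MergeTree engine, and the ORDER BY key are assumptions made for illustration and are not taken from this document; adjust them to your cluster (for example, a replicated engine on a multi-node cluster).

```sql
-- Run in ClickHouse (for example, with clickhouse-client), not in Flink.
-- The column types below are an assumed mapping from the Flink types
-- used in the example; verify them against your connector version.
CREATE DATABASE IF NOT EXISTS flinktest;

CREATE TABLE IF NOT EXISTS flinktest.test
(
    attr0  String,
    attr1  Int8,
    attr2  Int16,
    attr3  Int32,
    attr4  Int64,
    attr5  Float32,
    attr6  Float64,
    attr7  String,
    attr8  String,
    attr9  DateTime64(3),
    attr10 DateTime64(3),
    attr11 Date,
    attr12 Decimal(38, 18),
    attr13 Decimal(38, 18),
    attr14 Array(Int32),
    attr15 Array(Int64),
    attr16 Array(Float32),
    attr17 Array(Float64),
    attr18 Array(String),
    attr19 Array(String)
)
ENGINE = MergeTree()
ORDER BY attr0;  -- assumed sort key, chosen only for illustration
```

After the Flink job has run, a query such as SELECT count() FROM flinktest.test in ClickHouse is one way to check that rows are arriving.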