JDBC Source Table
Function
The JDBC connector is Flink's built-in connector for reading data from a database.
Prerequisites
- An enhanced datasource connection to the database instances has been established, so that you can configure security group rules as required.
- For details about how to set up an enhanced datasource connection, see Enhanced Datasource Connections in the Data Lake Insight User Guide.
- For details about how to configure security group rules, see Security Group Overview in the Virtual Private Cloud User Guide.
- In Flink cross-source development scenarios, directly configuring datasource authentication information poses a risk of password leakage. You are advised to use the datasource authentication provided by DLI, as shown in the sketch below.
For details about datasource authentication, see Introduction to Datasource Authentication.
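A minimal sketch of a source table that uses DLI datasource authentication instead of plaintext credentials. The authentication name my_rds_auth is a hypothetical placeholder for one you create on the DLI console first (see the pwd_auth_name parameter below):

CREATE TABLE jdbcSource (
  order_id string
) WITH (
  'connector' = 'jdbc',
  'url' = 'jdbc:mysql://MySQLAddress:MySQLPort/flink',
  'table-name' = 'orders',
  'pwd_auth_name' = 'my_rds_auth' -- hypothetical authentication name; replaces username and password
);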
Precautions
When creating a Flink OpenSource SQL job, you need to set Flink Version to 1.12 on the Running Parameters tab of the job editing page, select Save Job Log, and set the OBS bucket for saving job logs.
Syntax
create table jdbcSource (
  attr_name attr_type
  (',' attr_name attr_type)*
  (',' PRIMARY KEY (attr_name, ...) NOT ENFORCED)
  (',' WATERMARK FOR rowtime_column_name AS watermark-strategy_expression)
) with (
  'connector' = 'jdbc',
  'url' = '',
  'table-name' = '',
  'username' = '',
  'password' = ''
);
Parameters
Parameter | Mandatory | Default Value | Data Type | Description
---|---|---|---|---
connector | Yes | None | String | Connector to be used. Set this parameter to jdbc.
url | Yes | None | String | Database URL.
table-name | Yes | None | String | Name of the table in the database from which data is read.
driver | No | None | String | Driver required for connecting to the database. If you do not set this parameter, it is automatically derived from the URL.
username | No | None | String | Database authentication username. This parameter must be configured in pair with password.
password | No | None | String | Database authentication password. This parameter must be configured in pair with username.
scan.partition.column | No | None | String | Name of the column used to partition the input. For details, see Partitioned Scan.
scan.partition.num | No | None | Integer | Number of partitions to be created. For details, see Partitioned Scan.
scan.partition.lower-bound | No | None | Integer | Lower bound of values to be fetched for the first partition. For details, see Partitioned Scan.
scan.partition.upper-bound | No | None | Integer | Upper bound of values to be fetched for the last partition. For details, see Partitioned Scan.
scan.fetch-size | No | 0 | Integer | Number of rows fetched from the database each time. If this parameter is set to 0, the fetch-size hint is ignored.
scan.auto-commit | No | true | Boolean | Whether each statement is automatically committed in a transaction.
pwd_auth_name | No | None | String | Name of the password-type datasource authentication created on DLI. If this parameter is set, you do not need to set username and password in SQL statements.
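For reference, a sketch that sets the optional driver and scan tuning options alongside the required ones. The driver class com.mysql.cj.jdbc.Driver assumes a MySQL Connector/J 8.x driver; in most cases you can omit driver and let it be derived from the URL:

CREATE TABLE jdbcSource (
  order_id string,
  user_name string
) WITH (
  'connector' = 'jdbc',
  'url' = 'jdbc:mysql://MySQLAddress:MySQLPort/flink',
  'table-name' = 'orders',
  'driver' = 'com.mysql.cj.jdbc.Driver', -- optional; assumed Connector/J 8.x class, derived from the URL if omitted
  'username' = 'MySQLUsername',
  'password' = 'MySQLPassword',
  'scan.fetch-size' = '100',   -- fetch 100 rows per round trip; 0 means the hint is ignored
  'scan.auto-commit' = 'false' -- disable per-statement auto-commit
);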
Partitioned Scan
To accelerate reading data in parallel across Source task instances, Flink provides the partitioned scan feature for JDBC tables. The following parameters describe how to partition the table when it is read in parallel from multiple tasks.
- scan.partition.column: name of the column used to partition the input. The data type of the column must be number, date, or timestamp.
- scan.partition.num: number of partitions.
- scan.partition.lower-bound: minimum value of the first partition.
- scan.partition.upper-bound: maximum value of the last partition.
- When a table is created, the preceding partitioned scan parameters must all be specified if any of them is specified.
- The scan.partition.lower-bound and scan.partition.upper-bound parameters decide the partition stride; they do not filter rows in the table. All rows in the table are partitioned and returned, as the sketch after this list illustrates.
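A minimal sketch that splits the read into four parallel ranges. The table user_info and its BIGINT column id are hypothetical placeholders; all four scan.partition.* options must be set together, and the column must be of a number, date, or timestamp type:

CREATE TABLE jdbcSource (
  id bigint,
  user_name string
) WITH (
  'connector' = 'jdbc',
  'url' = 'jdbc:mysql://MySQLAddress:MySQLPort/flink',
  'table-name' = 'user_info',             -- hypothetical table with a numeric id column
  'username' = 'MySQLUsername',
  'password' = 'MySQLPassword',
  'scan.partition.column' = 'id',         -- column used to split the read
  'scan.partition.num' = '4',             -- four partitions, read in parallel
  'scan.partition.lower-bound' = '1',     -- first partition starts here
  'scan.partition.upper-bound' = '10000'  -- last partition ends here; rows outside the range are still returned
);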
Data Type Mapping
MySQL Type | PostgreSQL Type | Flink SQL Type
---|---|---
TINYINT | - | TINYINT
SMALLINT, TINYINT UNSIGNED | SMALLINT, INT2, SMALLSERIAL, SERIAL2 | SMALLINT
INT, MEDIUMINT, SMALLINT UNSIGNED | INTEGER, SERIAL | INT
BIGINT, INT UNSIGNED | BIGINT, BIGSERIAL | BIGINT
BIGINT UNSIGNED | - | DECIMAL(20, 0)
BIGINT | BIGINT | BIGINT
FLOAT | REAL, FLOAT4 | FLOAT
DOUBLE, DOUBLE PRECISION | FLOAT8, DOUBLE PRECISION | DOUBLE
NUMERIC(p, s), DECIMAL(p, s) | NUMERIC(p, s), DECIMAL(p, s) | DECIMAL(p, s)
BOOLEAN, TINYINT(1) | BOOLEAN | BOOLEAN
DATE | DATE | DATE
TIME [(p)] | TIME [(p)] [WITHOUT TIMEZONE] | TIME [(p)] [WITHOUT TIMEZONE]
DATETIME [(p)] | TIMESTAMP [(p)] [WITHOUT TIMEZONE] | TIMESTAMP [(p)] [WITHOUT TIMEZONE]
CHAR(n), VARCHAR(n), TEXT | CHAR(n), CHARACTER(n), VARCHAR(n), CHARACTER VARYING(n), TEXT | STRING
BINARY, VARBINARY, BLOB | BYTEA | BYTES
- | ARRAY | ARRAY
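To illustrate the mapping, the following sketch pairs a hypothetical MySQL table with the Flink SQL declaration the table above implies (the demo table and its columns are assumptions for this sketch):

-- Hypothetical MySQL table:
--   CREATE TABLE demo (
--     id      BIGINT,
--     price   DECIMAL(10, 2),
--     note    TEXT,
--     updated DATETIME(3)
--   );
-- Matching Flink SQL declaration per the mapping table:
CREATE TABLE jdbcSource (
  id BIGINT,            -- MySQL BIGINT -> Flink BIGINT
  price DECIMAL(10, 2), -- MySQL DECIMAL(p, s) -> Flink DECIMAL(p, s)
  note STRING,          -- MySQL TEXT -> Flink STRING
  updated TIMESTAMP(3)  -- MySQL DATETIME(p) -> Flink TIMESTAMP(p)
) WITH (
  'connector' = 'jdbc',
  'url' = 'jdbc:mysql://MySQLAddress:MySQLPort/flink',
  'table-name' = 'demo',
  'username' = 'MySQLUsername',
  'password' = 'MySQLPassword'
);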
Example
This example uses JDBC as the data source and Print as the sink to read data from the RDS MySQL database and write the data to the Print result table.
- Create an enhanced datasource connection in the VPC and subnet where the RDS MySQL instance is located, and bind the connection to the required Flink elastic resource pool. For details, see Enhanced Datasource Connections.
- Set RDS MySQL security groups and add inbound rules to allow access from the Flink queue. Test the connectivity using the RDS address by referring to Testing Address Connectivity. If the connection is successful, the datasource is bound to the queue. Otherwise, the binding fails.
- Log in to the RDS MySQL database, create table orders in the Flink database, and insert data.
Create table orders in the Flink database.
CREATE TABLE `flink`.`orders` (
  `order_id` VARCHAR(32) NOT NULL,
  `order_channel` VARCHAR(32) NULL,
  `order_time` VARCHAR(32) NULL,
  `pay_amount` DOUBLE UNSIGNED NOT NULL,
  `real_pay` DOUBLE UNSIGNED NULL,
  `pay_time` VARCHAR(32) NULL,
  `user_id` VARCHAR(32) NULL,
  `user_name` VARCHAR(32) NULL,
  `area_id` VARCHAR(32) NULL,
  PRIMARY KEY (`order_id`)
) ENGINE = InnoDB
  DEFAULT CHARACTER SET = utf8mb4
  COLLATE = utf8mb4_general_ci;
Insert data into the table.

insert into orders(
  order_id, order_channel, order_time, pay_amount, real_pay,
  pay_time, user_id, user_name, area_id
) values
  ('202103241000000001', 'webShop', '2021-03-24 10:00:00', '100.00', '100.00', '2021-03-24 10:02:03', '0001', 'Alice', '330106'),
  ('202103251202020001', 'miniAppShop', '2021-03-25 12:02:02', '60.00', '60.00', '2021-03-25 12:03:00', '0002', 'Bob', '330110');
- Create a Flink OpenSource SQL job. Enter the following job script and submit the job.
When you create a job, set Flink Version to 1.12 on the Running Parameters tab. Select Save Job Log, and specify the OBS bucket for saving job logs. Change the values of the parameters in bold as needed in the following script.
CREATE TABLE jdbcSource (
  order_id string,
  order_channel string,
  order_time string,
  pay_amount double,
  real_pay double,
  pay_time string,
  user_id string,
  user_name string,
  area_id string
) WITH (
  'connector' = 'jdbc',
  'url' = 'jdbc:mysql://MySQLAddress:MySQLPort/flink', -- flink is the database name created in RDS MySQL.
  'table-name' = 'orders',
  'username' = 'MySQLUsername',
  'password' = 'MySQLPassword'
);

CREATE TABLE printSink (
  order_id string,
  order_channel string,
  order_time string,
  pay_amount double,
  real_pay double,
  pay_time string,
  user_id string,
  user_name string,
  area_id string
) WITH (
  'connector' = 'print'
);

insert into printSink select * from jdbcSource;
- Perform the following operations to view the data result in the taskmanager.out file:
- Log in to the DLI console. In the navigation pane, choose Job Management > Flink Jobs.
- Click the name of the corresponding Flink job, choose Run Log, click OBS Bucket, and locate the folder of the log you want to view according to the date.
- Go to the folder of the date, find the folder whose name contains taskmanager, download the taskmanager.out file, and view result logs.
The data result is as follows (the +I prefix indicates an insert record in the Flink changelog output):

+I(202103241000000001,webShop,2021-03-24 10:00:00,100.0,100.0,2021-03-24 10:02:03,0001,Alice,330106)
+I(202103251202020001,miniAppShop,2021-03-25 12:02:02,60.0,60.0,2021-03-25 12:03:00,0002,Bob,330110)
FAQ
None