Raw
Function
The raw format reads and writes raw (byte-based) values as a single column.
- The raw format encodes a null value as a null of type byte[]. This limits its use with upsert-kafka, because upsert-kafka treats a null value as a tombstone message (DELETE on the key). Avoid combining the upsert-kafka connector with the raw format as value.format if the field can be null.
- The raw format is built in, so no additional dependencies are required. For details, see Raw Format.
Supported Connectors
- Kafka
- Upsert Kafka
- FileSystem
Parameter Description
Parameter | Mandatory | Default Value | Type | Description |
---|---|---|---|---|
format | Yes | None | String | Format to be used. Set this parameter to raw. |
raw.charset | No | UTF-8 | String | Charset to encode the text string. |
raw.endianness | No | big-endian | String | Endianness to encode the bytes of numeric value. Valid values are big-endian and little-endian. You can search for endianness for more details. |
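The effect of raw.charset and raw.endianness on the bytes can be illustrated outside Flink. The following is a minimal Python sketch, not Flink code: it only mimics the byte layouts the two parameters select. The sample values and the ISO-8859-1 charset name are assumptions chosen for illustration.

```python
import struct

# Illustration only: reproduces the byte layouts selected by the two
# parameters. It does not call Flink or Kafka.

text = "h\u00e9llo"  # "héllo", contains one non-ASCII character

# raw.charset: the same string yields different bytes per charset.
utf8_bytes = text.encode("utf-8")          # default charset (UTF-8)
latin1_bytes = text.encode("iso-8859-1")   # assumed alternative charset

# raw.endianness: the same 4-byte INT value in both byte orders.
value = 258  # 0x00000102
big_endian = struct.pack(">i", value)      # default (big-endian)
little_endian = struct.pack("<i", value)   # little-endian

print(utf8_bytes.hex(), latin1_bytes.hex())
print(big_endian.hex(), little_endian.hex())
```

A producer and the Flink table must agree on both settings. Otherwise a numeric value is read with its bytes reversed, or a string is decoded with the wrong charset.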
Data Type Mapping
The table below lists the SQL types the raw format supports and describes how a value of each type is encoded to and decoded from bytes.
Flink SQL Type | Value |
---|---|
CHAR / VARCHAR / STRING | A UTF-8 (by default) encoded text string. The encoding charset can be configured by raw.charset. |
BINARY / VARBINARY / BYTES | The sequence of bytes itself. |
BOOLEAN | A single byte to indicate boolean value, 0 means false, 1 means true. |
TINYINT | A single byte of the signed number value. |
SMALLINT | Two bytes with big-endian (by default) encoding. The endianness can be configured by raw.endianness. |
INT | Four bytes with big-endian (by default) encoding. The endianness can be configured by raw.endianness. |
BIGINT | Eight bytes with big-endian (by default) encoding. The endianness can be configured by raw.endianness. |
FLOAT | Four bytes with IEEE 754 format and big-endian (by default) encoding. The endianness can be configured by raw.endianness. |
DOUBLE | Eight bytes with IEEE 754 format and big-endian (by default) encoding. The endianness can be configured by raw.endianness. |
RAW | The sequence of bytes serialized by the underlying TypeSerializer of the RAW type. |
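The mapping above can be sketched as a pair of helper functions. This is a hedged Python illustration of the documented byte layouts, not Flink's serializer. The names encode_raw and decode_raw are hypothetical, and the RAW type is left out because its bytes depend on the underlying TypeSerializer.

```python
import struct

# Hypothetical helpers, not part of Flink: they reproduce the byte layouts
# listed in the mapping table so that test payloads can be built by hand.

_TEXT = ("CHAR", "VARCHAR", "STRING")
_BINARY = ("BINARY", "VARBINARY", "BYTES")
_CODES = {
    "BOOLEAN": "?",   # one byte, 0 = false, 1 = true
    "TINYINT": "b",   # one signed byte
    "SMALLINT": "h",  # two bytes
    "INT": "i",       # four bytes
    "BIGINT": "q",    # eight bytes
    "FLOAT": "f",     # four bytes, IEEE 754
    "DOUBLE": "d",    # eight bytes, IEEE 754
}


def _prefix(endianness):
    """Translate a raw.endianness value into a struct byte-order prefix."""
    if endianness == "big-endian":
        return ">"
    if endianness == "little-endian":
        return "<"
    raise ValueError(f"unsupported endianness: {endianness}")


def encode_raw(sql_type, value, endianness="big-endian", charset="utf-8"):
    """Encode one value into the byte layout described for sql_type."""
    if sql_type in _TEXT:
        return value.encode(charset)
    if sql_type in _BINARY:
        return bytes(value)
    return struct.pack(_prefix(endianness) + _CODES[sql_type], value)


def decode_raw(sql_type, data, endianness="big-endian", charset="utf-8"):
    """Decode bytes produced by encode_raw back into a Python value."""
    if sql_type in _TEXT:
        return data.decode(charset)
    if sql_type in _BINARY:
        return bytes(data)
    return struct.unpack(_prefix(endianness) + _CODES[sql_type], data)[0]


print(encode_raw("SMALLINT", 1).hex())
print(encode_raw("INT", 258, "little-endian").hex())
print(decode_raw("BIGINT", b"\x00" * 7 + b"\x01"))
```

Because each message holds exactly one value, a payload whose length does not match the column type (for example, three bytes for an INT) cannot be decoded. In this sketch, struct.unpack raises an error in that case.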
Example
Use Kafka as the data source and output the data to Print.
- Create a datasource connection to the VPC and subnet where Kafka is located, and bind the connection to the queue. Configure a security group and an inbound rule that allow access from the queue, then test the connectivity from the queue using the Kafka IP address. For example, locate the general-purpose queue where the job runs and choose More > Test Address Connectivity in the Operation column. If the connection is successful, the datasource connection is bound to the queue. Otherwise, the binding has failed.
- Create a Flink OpenSource SQL job and select Flink 1.15. Copy the following statement and submit the job:
CREATE TABLE kafkaSource (
  log string
) WITH (
  'connector' = 'kafka',
  'topic' = 'kafkaTopic',
  'properties.bootstrap.servers' = 'KafkaAddress1:KafkaPort,KafkaAddress2:KafkaPort',
  'properties.group.id' = 'GroupId',
  'scan.startup.mode' = 'latest-offset',
  'format' = 'raw'
);

CREATE TABLE printSink (
  log string
) WITH (
  'connector' = 'print'
);

insert into printSink select * from kafkaSource;
- Insert the following data into the corresponding topic in Kafka:
47.29.201.179 - - [28/Feb/2019:13:17:10 +0000] "GET /?p=1 HTTP/2.0" 200 5316 "https://domain.com/?p=1" "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.119 Safari/537.36" "2.75"
- Perform the following operations to view the data result in the taskmanager.out file:
- Log in to the DLI console. In the navigation pane, choose Job Management > Flink Jobs.
- Click the name of the corresponding Flink job, choose Run Log, click OBS Bucket, and locate the folder of the log you want to view according to the date.
- Go to the folder of the date, find the folder whose name contains taskmanager, download the .out file, and view result logs.
+I[47.29.201.179 - - [28/Feb/2019:13:17:10 +0000] "GET /?p=1 HTTP/2.0" 200 5316 "https://domain.com/?p=1" "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.119 Safari/537.36" "2.75"]