Exporting the Doris Query Result Set
This section describes how to use the SELECT INTO OUTFILE command to export the Doris query result set to a specified storage system in a specified file format.
Notes and Constraints
- The export command does not check whether a file and the file path exist, whether a path is automatically created, or whether an existing file is overwritten. These items are determined by the semantics of the remote storage system.
- If an error occurs during the export, some exported files may remain on the remote storage system. Doris does not clear these files. You need to manually clear them.
- The timeout of the export command is the same as the timeout of the query. It can be set by SET query_timeout=xxx.
- For empty result query, there will be an empty file.
- File splitting ensures that a row of data is completely stored in a single file. Therefore, the file size is not exactly the same as the value of max_file_size.
- For functions that output invisible characters, such as BITMAP and HLL types, the result is displayed as \N, that is, NULL.
- Currently, the output type of some geographic information functions, such as ST_Point, is VARCHAR. These functions output garbled characters because the actual output values are encoded binary characters. For geographic functions, use ST_AsText for output.
Syntax
<query_stmt>
INTO OUTFILE "file_path"
[format_as]
[properties]
In the preceding command:
- query_stmt indicates the specific query statement.
- file_path indicates the path for storing the exported file.
- format_as indicates the format in which result is exported. The value can be CSV, PARQUET, CSV_WITH_NAMES, CSV_WITH_NAMES_AND_TYPES, or ORC. The default value is CSV.
For more information, see OUTFILE.
Example of Exporting Query Results
- Export to HDFS
Export the simple query result to the hdfs://path/tmp/result.txt file in CSV format using Broker.
- Kerberos authentication is enabled for the cluster (in security mode)
SELECT * FROM example_db.test_export_tbl INTO OUTFILE "hdfs://192.168.67.78:25000/tmp/result_" FORMAT AS CSV PROPERTIES ( "broker.name" = "broker_192_168_67_78", "column_separator" = ",", "line_delimiter" = "\n", "max_file_size" = "100MB", "broker.hadoop.security.authentication" = "kerberos", "broker.kerberos_principal" = "doris/hadoop.hadoop.com@HADOOP.COM", "broker.kerberos_keytab" = "/home/omm/doris_keytab/doris.keytab", "broker.dfs.nameservices" = "hacluster", "broker.dfs.ha.namenodes.hacluster" = "37,36", "broker.dfs.namenode.rpc-address.hacluster.37" = "IP address of the active NameNode instance:RPC port number", "broker.dfs.namenode.rpc-address.hacluster.36" = "IP address of the standby NameNode instance:RPC port number", "broker.dfs.client.failover.proxy.provider.hacluster" = "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider" );
- Kerberos authentication is disabled for the cluster (in normal mode)
SELECT * FROM example_db.test_export_tbl INTO OUTFILE "hdfs://192.168.67.78:25000/tmp/result_" FORMAT AS CSV PROPERTIES ( "broker.name" = "broker_192_168_67_78", "column_separator" = ",", "line_delimiter" = "\n", "max_file_size" = "100MB", "broker.username"="hdfs", "broker.password"="", "broker.dfs.nameservices" = "hacluster", "broker.dfs.ha.namenodes.hacluster" = "37,36", "broker.dfs.namenode.rpc-address.hacluster.37" = "IP address of the active NameNode instance:RPC port number", "broker.dfs.namenode.rpc-address.hacluster.36" = "IP address of the standby NameNode instance:RPC port number", "broker.dfs.client.failover.proxy.provider.hacluster" = "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider" );
- Kerberos authentication is enabled for the cluster (in security mode)
- Export to Local File
Before exporting data to a local file, you need to configure enable_outfile_to_local=true in the fe.conf file.
select * from tbl1 limit 10 INTO OUTFILE "file:///home/work/path/result_";
Feedback
Was this page helpful?
Provide feedbackThank you very much for your feedback. We will continue working to improve the documentation.See the reply and handling status in My Cloud VOC.
For any further questions, feel free to contact us through the chatbot.
Chatbot