Updated on 2024-10-09 GMT+08:00

Exporting the Query Result Set

This section describes how to use the SELECT INTO OUTFILE command to export the Doris query result set to a specified storage system in a specified file format.

  • The export command does not check whether the file and file path exist. The semantics of the remote storage system determines whether to automatically create a path or overwrite an existing file.
  • If an error occurs during the export, some exported files may remain on the remote storage system. Doris does not clear these files. You need to manually clear them.
  • The timeout period of the export command is the same as that of the query command. You can set the timeout period through SET query_timeout=xxx.
  • For a query whose result set is empty, a file whose size is 0 is still generated.
  • File splitting ensures that a row of data is completely stored in a single file. Therefore, the file size is not strictly equal to the value of max_file_size.
  • For some functions whose output is invisible characters, such as BITMAP and HLL, the output is \N, that is, NULL.
  • Currently, the output type of some geographic information functions, such as ST_Point, is VARCHAR. However, the actual output value is encoded binary characters. These functions output garbled characters. For geographic functions, use ST_AsText for output.

Syntax

query_stmt

INTO OUTFILE "file_path"

[format_as]

[properties]

file_path

format_as

properties

format_as: indicates the export format. The value can be CSV, PARQUET, CSV_WITH_NAMES, CSV_WITH_NAMES_AND_TYPES or ORC. The default value is CSV.

Example

  • Export to HDFS

    Export the simple query result to the hdfs://path/to/result.txt file in CSV format.

    • Kerberos authentication is enabled for the cluster (the cluster is in security mode)

      SELECT * FROM example_db.test_export_tbl

      INTO OUTFILE "hdfs://192.168.67.78:25000/tmp/result_"

      FORMAT AS CSV

      PROPERTIES

      (

      "broker.name" = "broker_192_168_67_78",

      "column_separator" = ",",

      "line_delimiter" = "\n",

      "max_file_size" = "100MB",

      "broker.hadoop.security.authentication" = "kerberos",

      "broker.kerberos_principal" = "doris/hadoop.hadoop.com@HADOOP.COM",

      "broker.kerberos_keytab" = "${BIGDATA_HOME}/FusionInsight_Doris_8.3.0/install/FusionInsight-Doris-1.2.3/doris-fe/bin/doris.keytab"

      );

    • Kerberos authentication is disabled for the cluster (the cluster is in normal mode)

      SELECT * FROM example_db.test_export_tbl

      INTO OUTFILE "hdfs://192.168.67.78:25000/tmp/result_"

      FORMAT AS CSV

      PROPERTIES

      (

      "broker.name" = "broker_192_168_67_78",

      "column_separator" = ",",

      "line_delimiter" = "\n",

      "max_file_size" = "100MB",

      "broker.username"="hdfs",

      "broker.password"=""

      );

  • Export to Local File

    Before exporting data to a local file, you need to configure enable_outfile_to_local=true in the fe.conf file.

    select * from tbl1 limit 10

    INTO OUTFILE "file:///home/work/path/result_";