Updated on 2024-11-29 GMT+08:00

Exporting the Query Result Set

This topic describes how to use the SELECT INTO OUTFILE command to export query results.

  • The export command does not check whether the file and file path exist. Whether the path will be automatically created or whether the existing file will be overwritten is entirely determined by the semantics of the remote storage system.
  • If an error occurs during the export process, the exported file may remain on the remote storage system. Doris will not clean these files. You need to clean them up.
  • The timeout of the export command is the same as the timeout of the query. It can be set by SET query_timeout = xxx.
  • For empty result query, there will be an empty file.
  • File splitting will ensure that a row of data is stored in a single file. Therefore, the size of the file is not strictly equal to max_file_size.
  • For functions whose output is invisible characters, such as BITMAP and HLL types, the output is \N, which is NULL.
  • At present, the output type of some geographic functions, such as ST_Point is VARCHAR, but the actual output value is an encoded binary character. Currently these functions will output garbled characters. For geographic functions, use ST_AsText for output.

Syntax

query_stmt

INTO OUTFILE "file_path"

[format_as]

[properties]

file_path

format_as

properties

format_as indicates the export format. The value can be CSV, PARQUET, CSV_WITH_NAMES, CSV_WITH_NAMES_AND_TYPES or ORC. The default value is CSV.

Example

  • Export to HDFS

    Export simple query results to the hdfs://path/to/result.txt file in CSV format.

    • Kerberos authentication is enabled for the cluster (the cluster is in security mode)

      SELECT * FROM example_db.test_export_tbl

      INTO OUTFILE "hdfs://192.168.67.78:25000/tmp/result_"

      FORMAT AS CSV

      PROPERTIES

      (

      "broker.name" = "broker_192_168_67_78",

      "column_separator" = ",",

      "line_delimiter" = "\n",

      "max_file_size" = "100MB",

      "broker.hadoop.security.authentication" = "kerberos",

      "broker.kerberos_principal" = "doris/hadoop.hadoop.com@HADOOP.COM",

      "broker.kerberos_keytab" = "${BIGDATA_HOME}/FusionInsight_Doris_8.3.1/install/FusionInsight-Doris-2.0.3/doris-fe/bin/doris.keytab"

      );

    • Kerberos authentication is disabled for the cluster (the cluster is in normal mode)

      SELECT * FROM example_db.test_export_tbl

      INTO OUTFILE "hdfs://192.168.67.78:25000/tmp/result_"

      FORMAT AS CSV

      PROPERTIES

      (

      "broker.name" = "broker_192_168_67_78",

      "column_separator" = ",",

      "line_delimiter" = "\n",

      "max_file_size" = "100MB",

      "broker.username"="hdfs",

      "broker.password"=""

      );

  • Export to local file

    Before exporting data to a local file, you need to configure enable_outfile_to_local=true in the fe.conf file.

    select * from tbl1 limit 10

    INTO OUTFILE "file:///home/work/path/result_";