Updated on 2025-08-22 GMT+08:00

Using CarbonData for First Query

Tool Overview

The first query on a CarbonData table is slow, which may cause delays on nodes that have strict real-time performance requirements.

The tool provides the following function:

  • Preheat tables whose first query must return with low latency.

Tool Usage

  1. Install the Spark client.

    For details, see Installing a Client.

  2. Log in to the Spark client node as the client installation user.
  3. Go to the {Client installation directory}/Spark2x/spark/bin directory and run the following command:

    start-prequery.sh

    Before running the command, configure prequeryParams.properties by referring to Table 1.

    Table 1 Parameters

    | Parameter | Description | Example Value |
    |---|---|---|
    | spark.prequery.period.max.minute | Maximum preheating duration, in minutes. | 60 |
    | spark.prequery.tables | Tables to preheat, in the format database.table:int. The table name supports the wildcard (*). int specifies a duration in days: tables updated within that duration are preheated. | default.test*:10 |
    | spark.prequery.maxThreads | Maximum number of concurrent threads during preheating. | 50 |
    | spark.prequery.sslEnable | Set to true in security mode and false in non-security mode. | true |
    | spark.prequery.driver | IP address and port number of JDBCServer, in the format IP address:Port number. To preheat multiple servers, enter the IP address:Port number of each server, separated by commas (,). | 192.168.0.2:22550 |
    | spark.prequery.sql | SQL statements for preheating, separated by semicolons (;). Each statement is executed on every preheated table. Use %s in place of the table name. | SELECT COUNT(*) FROM %s;SELECT * FROM %s LIMIT 1 |
    | spark.security.url | URL required by JDBC in security mode. | ;saslQop=auth-conf;auth=KERBEROS;principal=spark2x/hadoop.hadoop.com@HADOOP.COM; |
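As a rough illustration of how the parameters in Table 1 fit together, the sketch below writes a sample prequeryParams.properties using only the example values from the table, then expands the SQL templates for one table name. Several details are assumptions rather than documented behavior: the file is written in the standard key=value properties layout, it is written to a scratch directory because the real location of the file is not stated here, the table name default.test1 is hypothetical, and the statements are split on semicolons as in the example value.

```shell
# Scratch location (assumption): the real path of prequeryParams.properties
# is not given in this document, so adjust it for your client.
conf_file="${TMPDIR:-/tmp}/prequeryParams.properties"

# Sample configuration built from the example values in Table 1.
cat > "$conf_file" <<'EOF'
spark.prequery.period.max.minute=60
spark.prequery.tables=default.test*:10
spark.prequery.maxThreads=50
spark.prequery.sslEnable=true
spark.prequery.driver=192.168.0.2:22550
spark.prequery.sql=SELECT COUNT(*) FROM %s;SELECT * FROM %s LIMIT 1
spark.security.url=;saslQop=auth-conf;auth=KERBEROS;principal=spark2x/hadoop.hadoop.com@HADOOP.COM;
EOF

# Illustration only: show what the %s templates look like for one table.
# default.test1 is a hypothetical table that matches default.test*.
table="default.test1"
sql_templates=$(sed -n 's/^spark\.prequery\.sql=//p' "$conf_file")
expanded_sql=$(printf '%s\n' "$sql_templates" | tr ';' '\n' | sed "s/%s/$table/g")
printf '%s\n' "$expanded_sql"
```

The expansion step only mimics the placeholder substitution described for spark.prequery.sql; the tool itself may build its statements differently.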

    Script Usage

    Command format: sh start-prequery.sh

    Before running this command, place krb5.conf (mandatory) and either user.keytab or jaas.conf in the conf directory.

    • Currently, this tool supports only Carbon tables.
    • This tool initializes the Carbon environment and pre-reads table metadata into JDBCServer. It is therefore better suited to multi-active instances and static allocation mode.
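The file requirement under Script Usage can be checked before the script is started. The following is a minimal sketch, not part of the tool: check_prequery_conf is a hypothetical helper, and the conf directory is passed as an argument because its full path is not stated here.

```shell
# Hypothetical pre-flight check: succeeds only when the given conf directory
# holds krb5.conf plus at least one of user.keytab or jaas.conf.
check_prequery_conf() {
    conf_dir="$1"
    if [ ! -f "$conf_dir/krb5.conf" ]; then
        echo "missing krb5.conf in $conf_dir" >&2
        return 1
    fi
    if [ ! -f "$conf_dir/user.keytab" ] && [ ! -f "$conf_dir/jaas.conf" ]; then
        echo "need user.keytab or jaas.conf in $conf_dir" >&2
        return 1
    fi
    return 0
}

# Usage (replace the placeholder with your client's conf directory):
# check_prequery_conf /path/to/conf && sh start-prequery.sh
```

Running the check first gives a clear message when a file is missing, instead of relying on whatever error the script reports at login time.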