Using CarbonData for First Query
Tool Overview
The first query on a CarbonData table is slow, which can cause delays on nodes that have strict real-time performance requirements.
The tool provides the following function:
- Preheat the tables whose first query must meet strict latency requirements.
Tool Usage
- Install the Spark client.
- Log in to the Spark client node as the client installation user.
- Go to the {Client installation directory}/Spark2x/spark/bin directory and run the following command:
start-prequery.sh
Configure prequeryParams.properties by referring to Table 1.
Table 1 Parameters

| Parameter | Description | Example Value |
| --- | --- | --- |
| spark.prequery.period.max.minute | Maximum preheating duration, in minutes. | 60 |
| spark.prequery.tables | Tables to preheat, in the format database.table:int. The table name supports the wildcard (*). int is the period, in days, within which the table must have been updated for it to be preheated. | default.test*:10 |
| spark.prequery.maxThreads | Maximum number of concurrent threads during preheating. | 50 |
| spark.prequery.sslEnable | Set to true in security mode and to false in non-security mode. | true |
| spark.prequery.driver | IP address and port number of JDBCServer, in the format IP address:Port number. To preheat multiple servers, enter the IP address:Port number of each server and separate the entries with commas (,). | 192.168.0.2:22550 |
| spark.prequery.sql | SQL statements for preheating. Separate multiple statements with semicolons (;). The statements are executed on each preheated table, with %s used as the placeholder for the table name. | SELECT COUNT(*) FROM %s;SELECT * FROM %s LIMIT 1 |
| spark.security.url | URL parameters required by JDBC in security mode. | ;saslQop=auth-conf;auth=KERBEROS;principal=spark2x/hadoop.hadoop.com@HADOOP.COM; |
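The value formats in Table 1 can be easier to follow when broken apart in code. The following Python sketch is illustrative only and is not the tool's implementation: the helper names (parse_table_spec, expand_sql, parse_drivers) and the candidate table names are hypothetical, the wildcard is assumed to behave like a shell-style glob, and the statement list is split on semicolons because that is what the example value uses.

```python
from fnmatch import fnmatchcase

# Example values copied from Table 1.
TABLES_SPEC = "default.test*:10"
SQL_SPEC = "SELECT COUNT(*) FROM %s;SELECT * FROM %s LIMIT 1"
DRIVER_SPEC = "192.168.0.2:22550"


def parse_table_spec(spec):
    """Split 'database.table:int' into (name pattern, number of days)."""
    pattern, _, days = spec.rpartition(":")
    return pattern, int(days)


def expand_sql(sql_spec, table):
    """Split the statement list on ';' and put the table name in place of each %s."""
    statements = [s.strip() for s in sql_spec.split(";") if s.strip()]
    return [s.replace("%s", table) for s in statements]


def parse_drivers(driver_spec):
    """Split a comma-separated 'IP address:Port number' list into (host, port) pairs."""
    pairs = []
    for entry in driver_spec.split(","):
        host, _, port = entry.strip().rpartition(":")
        pairs.append((host, int(port)))
    return pairs


if __name__ == "__main__":
    pattern, days = parse_table_spec(TABLES_SPEC)
    # Hypothetical table names, used only to show how the wildcard selects tables.
    candidates = ["default.test1", "default.test_orders", "default.sales"]
    matched = [t for t in candidates if fnmatchcase(t, pattern)]
    print("pattern:", pattern, "days:", days)
    for table in matched:
        for statement in expand_sql(SQL_SPEC, table):
            print(statement)
    print("servers:", parse_drivers(DRIVER_SPEC))
```

With the example values, each matched table receives two preheating statements, one COUNT(*) query and one LIMIT 1 query.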
Script Usage
Command format: sh start-prequery.sh
Before running this command, place krb5.conf (mandatory) and either user.keytab or jaas.conf in the conf directory.
- Currently, this tool supports only Carbon tables.
- This tool initializes the Carbon environment and pre-reads table metadata into JDBCServer. It is therefore better suited to multi-active instances and static allocation mode.
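The file requirement for start-prequery.sh (krb5.conf plus either user.keytab or jaas.conf) can be verified before the script is launched. The following Python sketch is one possible pre-flight check and is not part of the tool: the function name missing_security_files is hypothetical, and the relative path "conf" is an assumption that should be replaced with the actual conf directory.

```python
from pathlib import Path

# Assumption: path of the conf directory; adjust to the real location.
CONF_DIR = "conf"


def missing_security_files(conf_dir):
    """Return the required files that are absent from conf_dir."""
    conf = Path(conf_dir)
    missing = []
    # krb5.conf is mandatory.
    if not (conf / "krb5.conf").is_file():
        missing.append("krb5.conf")
    # Either user.keytab or jaas.conf is sufficient.
    if not any((conf / name).is_file() for name in ("user.keytab", "jaas.conf")):
        missing.append("user.keytab or jaas.conf")
    return missing


if __name__ == "__main__":
    problems = missing_security_files(CONF_DIR)
    if problems:
        print("Missing in", CONF_DIR + ":", ", ".join(problems))
    else:
        print("All required files are present in", CONF_DIR)
```

An empty list means that the files named above are all in place.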