Using CarbonData for First Query
Tool Overview
The first query on a CarbonData table is slow, which can cause delays on nodes with strict real-time performance requirements.
The tool provides the following function:
- Preheats tables whose first query must return with low latency.
Tool Usage
- Install the Spark client.
For details, see Installing a Client.
- Log in to the Spark client node as the client installation user.
- Go to the {Client installation directory}/Spark2x/spark/bin directory and run the following command:
start-prequery.sh
Configure prequeryParams.properties by referring to Table 1.
Table 1 Parameters

| Parameter | Description | Example Value |
|---|---|---|
| spark.prequery.period.max.minute | Maximum preheating duration, in minutes. | 60 |
| spark.prequery.tables | Tables to preheat, in the format database.table:int. The table name supports the wildcard (*). int is a number of days: a table is preheated only if it was updated within that many days. | default.test*:10 |
| spark.prequery.maxThreads | Maximum number of concurrent threads during preheating. | 50 |
| spark.prequery.sslEnable | Set to true in security mode and false in non-security mode. | true |
| spark.prequery.driver | IP address and port number of JDBCServer, in the format IP address:Port number. To preheat multiple servers, enter the IP address:Port number of each server, separated by commas (,). | 192.168.0.2:22550 |
| spark.prequery.sql | SQL statements for preheating. Separate multiple statements with semicolons (;), as in the example value. Each statement is executed on every preheated table; use %s in the statement as the placeholder for the table name. | SELECT COUNT(*) FROM %s;SELECT * FROM %s LIMIT 1 |
| spark.security.url | URL parameters required by JDBC in security mode. | ;saslQop=auth-conf;auth=KERBEROS;principal=spark2x/hadoop.hadoop.com@HADOOP.COM; |
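Putting the example values from Table 1 together, a prequeryParams.properties file might look like the sketch below. Several things here are assumptions rather than documented behavior: the one-entry-per-line key=value layout is inferred from the .properties extension, the scratch file name prequeryParams.properties.example is chosen only so that nothing in the client directory is overwritten, and default.test1 is a hypothetical table that would match default.test*. The second half only illustrates how the spark.prequery.sql value reads once %s is replaced, assuming statements are split on the semicolon as in the example value; it is not the tool's own code.

```shell
# Write a sample configuration to a scratch file (hypothetical file name),
# using only the example values from Table 1.
cat > prequeryParams.properties.example <<'EOF'
spark.prequery.period.max.minute=60
spark.prequery.tables=default.test*:10
spark.prequery.maxThreads=50
spark.prequery.sslEnable=true
spark.prequery.driver=192.168.0.2:22550
spark.prequery.sql=SELECT COUNT(*) FROM %s;SELECT * FROM %s LIMIT 1
spark.security.url=;saslQop=auth-conf;auth=KERBEROS;principal=spark2x/hadoop.hadoop.com@HADOOP.COM;
EOF

# Read the SQL template back from the sample file.
sql_template=$(grep '^spark.prequery.sql=' prequeryParams.properties.example | cut -d= -f2-)

# Illustration only: split on ';' and replace %s with a table name
# (default.test1 is a hypothetical table matching default.test*).
table="default.test1"
expanded=$(printf '%s\n' "$sql_template" | tr ';' '\n' | sed "s/%s/$table/g")
printf '%s\n' "$expanded"
```

With these values, the last command prints one statement per line: SELECT COUNT(*) FROM default.test1, followed by SELECT * FROM default.test1 LIMIT 1.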
Script Usage
Command format: sh start-prequery.sh
Before running this command, place krb5.conf (mandatory) and either user.keytab or jaas.conf in the conf directory.
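The file requirement above can be checked before the script is started. The following is a minimal sketch, not part of the Spark client: check_prequery_conf is a hypothetical helper, and the conf directory is passed in as an argument because its full path depends on the installation.

```shell
# check_prequery_conf DIR
# Hypothetical helper: returns 0 if DIR contains krb5.conf and at least one of
# user.keytab or jaas.conf; otherwise prints what is missing and returns 1.
check_prequery_conf() {
  conf_dir=$1
  if [ ! -f "$conf_dir/krb5.conf" ]; then
    echo "missing: $conf_dir/krb5.conf" >&2
    return 1
  fi
  if [ ! -f "$conf_dir/user.keytab" ] && [ ! -f "$conf_dir/jaas.conf" ]; then
    echo "missing: $conf_dir/user.keytab or $conf_dir/jaas.conf" >&2
    return 1
  fi
  return 0
}

# Example use (CONF_DIR is a placeholder for the client's conf directory):
#   check_prequery_conf "$CONF_DIR" && sh start-prequery.sh
```

Chaining the check with && means start-prequery.sh runs only when the required files are present.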
- Currently, this tool supports only Carbon tables.
- This tool initializes the Carbon environment and preloads table metadata into JDBCServer. It is therefore better suited to multi-active instance mode and static allocation mode.