Updated on 2024-12-13 GMT+08:00

Using Hue to Execute SparkSQL

Scenario

You can use the Hue web UI to execute SparkSQL statements in a cluster.

Configuring Spark2x

Before using the SparkSql editor, you need to modify the Spark2x configuration.

  1. Go to the Spark2x configuration page. For details, see Modifying Cluster Service Configuration Parameters.
  2. Set the Spark2x multi-instance mode. Search for and modify the following parameters of the Spark2x service:

    Parameter                          Value
    spark.thriftserver.proxy.enabled   false
    spark.scheduler.allocation.file    #{conf_dir}/fairscheduler.xml

  3. Go to the JDBCServer2x customization page and add the following customized items to the spark.core-site.customized.configs parameter:

    Table 1 Custom parameters

    Parameter                      Value
    hadoop.proxyuser.hue.groups    *
    hadoop.proxyuser.hue.hosts     *

  4. Save the configuration and restart the meta and Spark2x services.
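
The two custom items in Table 1 are standard Hadoop proxy-user settings: they let the hue user impersonate users from any group and connect from any host. As a rough sketch only, assuming the items end up as ordinary name/value properties in a core-site-style XML file, the Python snippet below renders them in that form. The function name render_core_site is hypothetical, and the output is an illustration, not the file that the cluster manager actually writes.

```python
import xml.etree.ElementTree as ET

# Custom items from Table 1; "*" means any group / any host.
PROXYUSER_ITEMS = {
    "hadoop.proxyuser.hue.groups": "*",
    "hadoop.proxyuser.hue.hosts": "*",
}


def render_core_site(items):
    """Render name/value pairs as Hadoop-style <property> elements.

    Hypothetical helper for illustration; it is not part of Hue or Spark2x.
    """
    root = ET.Element("configuration")
    for name, value in items.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    # ET.indent exists only on Python 3.9+; skip pretty-printing otherwise.
    if hasattr(ET, "indent"):
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(render_core_site(PROXYUSER_ITEMS))
```

In a production cluster, consider replacing the wildcard values with specific groups and hosts if your security policy requires it.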

Accessing the Editor

  1. Access the Hue web UI. For details, see Accessing the Hue Web UI.
  2. In the navigation tree on the left, click the editor icon and choose SparkSql. The SparkSql page is displayed.

    On the SparkSql page, you can:

    • Execute and manage SparkSql statements.
    • View the SparkSql statements saved by the current user in Saved Queries.
    • Query the SparkSql statements executed by the current user in Query History.

Executing SparkSql Statements

  1. Select a SparkSql database from the Database drop-down list. The default database is default.

    The system displays all available tables. To find a specific table, enter a keyword from its name.

    Figure 1 Selecting a database

  2. Click the desired table name. All columns in the table are displayed.

    Move the cursor over the row of the table and click the icon that appears. Column details are displayed.

  3. In the SparkSql statement editing area, enter the query statement.

    Click the triangle next to the execute button and select Explain. The editor checks the syntax and execution plan of the entered statements. If the statements contain syntax errors, the editor reports Error while compiling statement.

  4. Click the execute button to run the SparkSql statement.

    Figure 2 Executing a statement
    • To reuse the entered SparkSql statements later, click the save button to save them.
    • Advanced query configuration:

      Click the settings icon in the upper right corner to configure files, functions, and settings.

    • Viewing shortcut key information:

      Click the help icon in the upper right corner to view the syntax and keyboard shortcut information.

    • To format the SparkSql statement, click the triangle next to the execute button and select Format.
    • To delete an entered SparkSql statement, click the triangle next to the execute button and select Clear.
    • Viewing historical records:

      Click Query History to view the SparkSql running status. You can view the history of all the statements or only the saved statements. If many historical records exist, you can enter keywords in the text box to search for desired records.
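
As a small illustration of the kind of statement you might type into the editing area, the Python sketch below builds a simple query and its EXPLAIN form. EXPLAIN is a standard Spark SQL keyword that returns the plan for a statement; whether the Explain option in the editor issues exactly this statement is an assumption. The table name default.sample_table and the helper explain are hypothetical.

```python
# Hypothetical table name, used only for illustration.
TABLE = "default.sample_table"


def explain(statement: str) -> str:
    """Prefix a statement with Spark SQL's EXPLAIN keyword.

    A trailing semicolon and trailing whitespace are dropped first.
    """
    return "EXPLAIN " + statement.rstrip().rstrip(";")


# A simple query that could be pasted into the editing area.
query = f"SELECT * FROM {TABLE} LIMIT 10"

if __name__ == "__main__":
    print(query)
    print(explain(query))
```

Either printed statement can be pasted into the SparkSql statement editing area after you replace the table name with one that exists in your database.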

Viewing Execution Results

  1. View the execution results below the statement editing area on the SparkSql page. The Query History tab page is displayed by default.
  2. Click a record to view the result of the executed statement.

Managing Query Statements

  1. Click Saved Queries.
  2. Click a saved statement. The system automatically adds the statement to the editing area.