Updated on 2022-11-18 GMT+08:00

Configuring Whether to Display Spark SQL Statements Containing Sensitive Words

Scenario

SQL statements executed by users may contain sensitive information (such as passwords), and disclosing it may pose security risks. You can configure the spark.sql.redaction.string.regex parameter to mask sensitive words in SQL statements shown in logs and on the web UI.

Spark SQL statements consist of two parts:

  1. Spark SQL statements in logs:
    • Driver log: In the JDBCServer service, each time an SQL statement is executed through Beeline, the statement is added to the Driver log, for example: Running query 'show tables' with 0f8fee16-4291-4854-a7b4-b87a162f7cbb.
    • eventLog log: If event logging is enabled for a Spark application (spark.eventLog.enabled is set to true), event logs are written. SQL statements executed through JDBCServer and Spark SQL are also recorded in the eventLog file.
  2. Spark SQL statements on the web UI:
    • SparkUI: While SQL statements are being executed, you can view them on the Jobs and Stages tabs of the Spark UI.
    • HistoryServer: HistoryServer reads the eventLog file and displays app information on the page. Therefore, if the eventLog file contains SQL statement records, you can view the corresponding SQL statements on the HistoryServer page.
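As noted above, whether SQL statements reach the eventLog file (and therefore the HistoryServer page) is controlled by spark.eventLog.enabled. A minimal spark-defaults.conf sketch; the log directory path shown is illustrative, not a cluster default:

```properties
# Enable event logging so executed SQL statements are recorded in the eventLog file
spark.eventLog.enabled   true
# Illustrative HDFS path; use your cluster's actual event log directory
spark.eventLog.dir       hdfs:///spark2x/eventLogs
```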

Configuration

Table 1 Parameter description

Parameter: spark.sql.redaction.string.regex

Description: Regular expression that determines which parts of strings generated by Spark contain sensitive information. Substrings that match the regular expression are replaced with *********(redacted).

NOTE: The value must be a valid regular expression.

Default Value: pwd|password
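A sketch of setting this parameter in spark-defaults.conf. The extended pattern shown is an illustrative example, not the default; verify it against the syntax of your SQL statements before use:

```properties
# The default pwd|password masks only the keywords themselves; a broader
# pattern (illustrative) can also mask the value that follows the keyword
spark.sql.redaction.string.regex   (?i)password[=\s]*\S+
```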

  • If Spark Beeline is used, restart JDBCServer for the configuration to take effect. If spark-sql is used, restart spark-sql.
  • The preceding parameter takes effect only for SQL statements executed after the configuration change; statements logged earlier are not redacted.
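The redaction behavior described above, where substrings matching the regular expression are replaced with *********(redacted), can be sketched in Python. The redact helper and the sample SQL statement are illustrative, not Spark internals:

```python
import re

REDACTION_TEXT = "*********(redacted)"

def redact(regex: str, text: str) -> str:
    """Replace every substring matching the redaction regex (illustrative)."""
    return re.sub(regex, REDACTION_TEXT, text)

# Default pattern masks only the matched keywords themselves
print(redact(r"pwd|password", "CREATE TABLE t OPTIONS(password '123456')"))
# → CREATE TABLE t OPTIONS(*********(redacted) '123456')

# A broader pattern also masks the quoted value that follows the keyword
print(redact(r"password\s*'[^']*'", "CREATE TABLE t OPTIONS(password '123456')"))
# → CREATE TABLE t OPTIONS(*********(redacted))
```

This illustrates why the choice of regular expression matters: the default pwd|password hides the keyword but not the value, so a stricter pattern may be needed depending on how credentials appear in your statements.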