Updated on 2024-04-11 GMT+08:00

Adding an SQL Inspection Rule

Scenario

You can add rules for specified tenants and SQL engines on FusionInsight Manager. The system will display hints on, intercept, or block SQL requests matched by the rules.

Exercise caution when you add, modify, or enable an SQL inspection rule for a cluster, or when you set its threshold. An improper rule may interrupt upper-layer services.

Adding a Rule

  1. Log in to FusionInsight Manager as a user with Manager administrator rights.
  2. Click Cluster and choose SQL Inspector. The SQL Inspector page is displayed.

    You can click View Supported Rules to view all SQL inspection rules supported by the current cluster.

  3. Click Add Rule. After the password of the current user is verified, the Add Rule page is displayed.
  4. Set the required parameters and click OK.

    Parameter

    Description

    Name

    Name of the SQL inspection rule

    ID

    Rule ID

    For details about the meaning of the rule that corresponds to each ID, see Table 1.

    Tenant

    Click Add to select the tenant with which the current rule will be associated.

    If you need to add a new tenant, plan and create a cluster tenant by referring to Tenant Resources.

    Services and Actions

    Click Add to specify the SQL engine with which this rule will be associated and to set the threshold parameters of the rule.

    Each rule can be associated with one SQL engine. To apply the same check to another SQL engine, add a separate rule for that engine.

    • Service: Select the SQL engine associated with the current rule.
    • If an SQL request matches the rule, the system performs the following operations:
      • Hint: Record logs and display a hint for handling the SQL request. If the rule has parameters, you need to configure the threshold.
      • Intercept: Intercept the SQL request that matches the rule. If the rule has parameters, you need to configure the threshold.
      • Block: Block the SQL request that matches the rule. If the rule has parameters, you need to configure the threshold.
        NOTE:

        For static and dynamic interception rules, Hint and Block operations are supported. For blocking rules, only the Block operation is supported.

  5. View the added inspection rule on the SQL Inspector page. The rule takes effect dynamically.

    To adjust a rule, click Modify in the Operation column of the row that contains the rule. After the user password is verified, you can modify the rule parameters.

    Figure 1 Viewing SQL inspection rules
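
Before entering a rule on the Add Rule page, it can help to check the plan against Table 1, because each rule ID applies only to certain SQL engines. The following is a minimal sketch of such a check, not a FusionInsight Manager API. The names PlannedRule, SUPPORTED_ENGINES, and validate, as well as the tenant name tenant_a, are hypothetical, and only a few rule IDs from Table 1 are transcribed.

```python
from dataclasses import dataclass, field

# Engines per rule ID, transcribed from Table 1 (subset only, for illustration).
SUPPORTED_ENGINES = {
    "static_0001": {"Hive", "Spark", "HetuEngine"},
    "static_0008": {"ClickHouse"},
    "dynamic_0002": {"Hive", "Spark", "HetuEngine", "ClickHouse"},
    "running_0001": {"Hive", "Spark", "HetuEngine", "ClickHouse"},
}

# Actions offered on the Add Rule page.
ACTIONS = ("Hint", "Intercept", "Block")


@dataclass
class PlannedRule:
    """Hypothetical record mirroring the fields of the Add Rule page."""
    name: str
    rule_id: str
    tenants: list
    service: str  # one SQL engine per rule
    thresholds: dict = field(default_factory=dict)  # action -> threshold (None if the rule has no parameter)


def validate(rule: PlannedRule) -> list:
    """Return a list of problems; an empty list means the plan looks consistent."""
    problems = []
    engines = SUPPORTED_ENGINES.get(rule.rule_id)
    if engines is None:
        problems.append(f"unknown rule ID: {rule.rule_id}")
    elif rule.service not in engines:
        problems.append(f"{rule.rule_id} does not apply to {rule.service}")
    if not rule.tenants:
        problems.append("no tenant selected")
    for action in rule.thresholds:
        if action not in ACTIONS:
            problems.append(f"unknown action: {action}")
    return problems


if __name__ == "__main__":
    plan = PlannedRule("ck-update", "static_0008", ["tenant_a"], "Hive", {"Block": None})
    print(validate(plan))
```

In this sketch, a ClickHouse-only rule such as static_0008 planned for Hive is reported as a problem, so the mismatch is caught before the rule is configured on the cluster.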

MRS SQL Inspection Rules

Table 1 MRS SQL inspection rules

ID

Description

Engine

Threshold

Example SQL Statement

static_0001

Check whether the number of occurrences of count(distinct) in the SQL statement exceeds the limit.

  • Hive
  • Spark
  • HetuEngine

Number of occurrences of count(distinct)

Recommended value: 10

SELECT COUNT(DISTINCT deviceId), COUNT(DISTINCT collDeviceId)

FROM table

GROUP BY deviceName, collDeviceName, collCurrentVersion;

static_0002

Check whether a not in <subquery> clause is used in the SQL statement.

  • Hive
  • Spark
  • HetuEngine

N/A

SELECT *

FROM Orders o

WHERE o.Order_ID NOT IN (SELECT h.Order_ID

FROM HeldOrders h

WHERE h.order_id = o.order_id);

static_0003

Check whether the number of joins in the SQL statement exceeds the limit.

  • Hive
  • Spark
  • HetuEngine

Number of joins

Recommended value: 20

N/A

static_0004

Check whether the number of union all operations in the SQL statement exceeds the limit.

  • Hive
  • Spark
  • HetuEngine

Number of union all operations

Recommended value: 20

select * from tables t1

union all select * from tables t2

union all select * from tables t3

union all select * from tables t4

union all select * from tables t5

union all select * from tables t6

union all select * from tables t7

union all select * from tables t8

union all select * from tables t9;

static_0005

Check whether the number of subquery nesting layers in the SQL statement exceeds the limit.

  • Hive
  • Spark
  • HetuEngine

Maximum number of nested subqueries

Recommended value: 20

select * from (

with temp1 as (select * from tables)

select * from temp1);

static_0006

Check whether the length of the SQL statement string exceeds the upper limit.

  • Hive
  • Spark
  • HetuEngine

Length of the SQL string, in KB

Recommended value: 10

N/A

static_0007

Check whether a Cartesian product is generated when multiple tables are joined.

  • Hive
  • Spark
  • HetuEngine

N/A

select * from A,B;

static_0008

Check whether the alter table update operation is performed at the cluster level (on cluster).

ClickHouse

N/A

alter table testtb1 on cluster default_cluster update price=10.0 where id='100'

static_0009

Check whether the alter table delete operation is performed at the cluster level (on cluster).

ClickHouse

N/A

alter table testtb1 on cluster default_cluster delete where id ='10'

static_0010

Check whether the alter table add column operation is performed at the cluster level (on cluster).

ClickHouse

N/A

alter table testtb1 on cluster default_cluster add column testc String

static_0011

Check whether the alter table drop column operation is performed at the cluster level (on cluster).

ClickHouse

N/A

alter table testtb1 on cluster default_cluster drop column testc

static_0012

Check whether the optimize final operation is performed at the cluster level (on cluster).

ClickHouse

N/A

optimize table testtb1 on cluster default_cluster final

static_0013

Check whether the drop table operation is performed at the cluster level (on cluster).

ClickHouse

N/A

drop table testtb1 on cluster default_cluster;

static_0014

Check whether the truncate table operation is performed at the cluster level (on cluster).

ClickHouse

N/A

truncate table testtb1 on cluster default_cluster;

dynamic_0001

Check whether the number of scanned files exceeds the limit.

  • Hive
  • Spark
  • HetuEngine

Number of files that will be scanned or have been scanned

Recommended value: 100,000

SELECT ss_ticket_number FROM store_sales WHERE ss_ticket_number=72291252 LIMIT 10;

dynamic_0002

Check whether the number of partitions involved in operations (select, delete, update, and alter) on a table exceeds the upper limit.

  • Hive
  • Spark
  • HetuEngine
  • ClickHouse

Number of partitions involved in the delete or alter operation

Recommended value: 10,000

DELETE FROM table_name WHERE column_name = value

dynamic_0003

When the right table of a join is a distributed table, check whether the data volume of the right table exceeds the upper limit.

ClickHouse

Number of rows in the right table when the join operation is performed

Recommended value: 100,000,000

SELECT name, text FROM table_1 JOIN table_2 ON table_1.Id = table_2.Id

running_0001

Check whether the number of result rows returned by the SELECT statement to the client exceeds the upper limit.

  • Hive
  • Spark
  • HetuEngine
  • ClickHouse

Number of rows in the query result

Recommended value: 100,000

select * from table

running_0002

Check whether the peak memory usage of the SQL statement exceeds the absolute value limit.

  • Hive
  • Spark
  • HetuEngine
  • ClickHouse

Memory used by the running SQL statement, in MB

N/A

running_0003

Check whether the running duration of the SQL statement exceeds the upper limit.

  • Hive
  • Spark
  • HetuEngine
  • ClickHouse

SQL running duration threshold, in seconds

N/A

running_0004

Check whether the amount of data scanned by the SQL statement exceeds the upper limit.

  • Hive
  • Spark
  • HetuEngine
  • ClickHouse

Amount of data scanned by the SQL statement, in GB

Recommended value: 10,240

N/A
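
To make the static rules in Table 1 more concrete, the sketch below approximates three of them (static_0001, static_0004, and static_0006) with simple text checks. It is only an illustration and does not reflect how MRS implements the checks. The regular expressions ignore comments and string literals, measuring the length in UTF-8 bytes is an assumption, and treating "exceeds" as strictly greater than the threshold is also an assumption. All function names are hypothetical.

```python
import re

# Naive patterns; a real engine would inspect the parsed statement instead.
COUNT_DISTINCT = re.compile(r"\bcount\s*\(\s*distinct\b", re.IGNORECASE)
UNION_ALL = re.compile(r"\bunion\s+all\b", re.IGNORECASE)


def count_distinct_occurrences(sql: str) -> int:
    """Approximate the static_0001 metric: occurrences of count(distinct)."""
    return len(COUNT_DISTINCT.findall(sql))


def union_all_occurrences(sql: str) -> int:
    """Approximate the static_0004 metric: occurrences of union all."""
    return len(UNION_ALL.findall(sql))


def sql_length_kb(sql: str) -> float:
    """Approximate the static_0006 metric: statement length in KB (UTF-8 bytes assumed)."""
    return len(sql.encode("utf-8")) / 1024


def exceeds(value: float, threshold: float) -> bool:
    """Assumed semantics: a rule matches when the value is strictly above the threshold."""
    return value > threshold


if __name__ == "__main__":
    sql = (
        "SELECT COUNT(DISTINCT deviceId), COUNT(DISTINCT collDeviceId) "
        "FROM table GROUP BY deviceName, collDeviceName, collCurrentVersion;"
    )
    n = count_distinct_occurrences(sql)
    # Compare against the recommended value of 10 from Table 1.
    print(n, exceeds(n, 10))
```

Running such checks against representative workloads before enabling a rule gives a rough idea of how many existing statements a given threshold would affect.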