Configuring the Spark Multi-Tenant Mode
Scenarios
In multi-active instance mode, JDBCServer operates in YARN-client mode, but by default, only one YARN resource queue is available. This creates a resource bottleneck. To address this limitation, multi-tenant mode is introduced.
In multi-tenant mode, JDBCServers are bound to tenants. Each tenant corresponds to one or more JDBCServers, and a JDBCServer provides services for only one tenant. Different tenants can be configured with different YARN queues to implement resource isolation. In addition, JDBCServer instances can start dynamically as required to avoid wasting resources.
Configuration Description
- Log in to FusionInsight Manager.
For details, see Accessing FusionInsight Manager.
- Choose Cluster > Services > Spark2x or Spark, click Configurations and then All Configurations, search for the following parameters, and adjust their values.
Table 1 Parameter description
Parameter
Description
Example Value
spark.proxyserver.hash.enabled
Specifies whether to connect to ProxyServer using the Hash algorithm.
- true indicates using the Hash algorithm. In multi-tenant mode, this parameter must be configured to true.
- false indicates using random connection. In multi-active instance mode, this parameter must be configured to false.
After this parameter is modified, you need to download the client again.
true
spark.thriftserver.proxy.enabled
Specifies whether to use the multi-tenant mode.
- false: The multi-active instance mode is used.
- true: The multi-tenant mode is used.
true
spark.thriftserver.proxy.maxThriftServerPerTenancy
Specifies the maximum number of JDBCServer instances that can be started by a tenant in multi-tenant mode.
Value range: 1 to 2147483647
2
spark.thriftserver.proxy.maxSessionPerThriftServer
Specifies the maximum number of sessions in a single JDBCServer instance in multi-tenant mode. If the number of sessions exceeds this value and the number of JDBCServer instances does not exceed the upper limit, a new JDBCServer instance is started. Otherwise, an alarm log is output.
Value range: 1 to 2147483647
200
spark.thriftserver.proxy.sessionWaitTime
Specifies the wait time, in milliseconds, before a JDBCServer instance is stopped when it has no session connections in multi-tenant mode.
Value range: 1 to 2147483647
180000
spark.thriftserver.proxy.sessionThreshold
In multi-tenant mode, when the session usage of the JDBCServer instances reaches this threshold, a new JDBCServer instance is automatically added. Session usage = Number of current sessions/(spark.thriftserver.proxy.maxSessionPerThriftServer x Number of current JDBCServer instances).
100
spark.thriftserver.proxy.healthcheck.period
Specifies the interval, in milliseconds, at which the JDBCServer proxy checks the health status of the JDBCServer in multi-tenant mode.
Value range: 1 to 2147483647
120000
spark.thriftserver.proxy.healthcheck.recheckTimes
Specifies the number of JDBCServer health check retries conducted by the JDBCServer proxy in multi-tenant mode.
Value range: 1 to 2147483647
6
spark.thriftserver.proxy.healthcheck.waitTime
Specifies the wait time, in milliseconds, for JDBCServer to respond to a health check request sent by the JDBCServer proxy.
Value range: 1 to 2147483647
10000
spark.thriftserver.proxy.session.check.interval
Specifies the interval at which JDBCServer proxy sessions are checked in multi-tenant mode.
6h
spark.thriftserver.proxy.idle.session.timeout
Specifies the idle time interval of a JDBCServer proxy session in multi-tenant mode. If no operation is performed within this period, the session is closed.
7d
spark.thriftserver.proxy.idle.session.check.operation
Specifies whether to check that operations still exist on a JDBCServer proxy session when the session is checked for expiration in multi-tenant mode.
- true: Check whether operations still exist on a JDBCServer proxy session when the session is checked for expiration.
- false: Do not check whether operations still exist on a JDBCServer proxy session when the session is checked for expiration.
true
spark.thriftserver.proxy.idle.operation.timeout
Specifies the timeout interval of an operation in multi-tenant mode. An operation that times out is closed.
5d
hive.spark.client.server.connect.timeout
Specifies the timeout duration for client connections in multi-tenant mode.
5min
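The scale-out behavior described for spark.thriftserver.proxy.maxThriftServerPerTenancy, spark.thriftserver.proxy.maxSessionPerThriftServer, and spark.thriftserver.proxy.sessionThreshold can be illustrated with a minimal Python sketch. It is not the ProxyServer implementation: the class and function names are hypothetical, the threshold is assumed to be a percentage (suggested by the example value 100), and the decision logic is a simplified reading of the descriptions in Table 1.

```python
from dataclasses import dataclass


@dataclass
class TenantProxyConfig:
    """Hypothetical container for three Table 1 parameters (example values as defaults)."""
    max_servers_per_tenant: int = 2     # spark.thriftserver.proxy.maxThriftServerPerTenancy
    max_sessions_per_server: int = 200  # spark.thriftserver.proxy.maxSessionPerThriftServer
    session_threshold: int = 100        # spark.thriftserver.proxy.sessionThreshold (assumed to be a percentage)


def session_usage_percent(current_sessions: int, current_servers: int,
                          cfg: TenantProxyConfig) -> float:
    """Session usage = sessions / (max sessions per server x running servers), as a percentage."""
    capacity = cfg.max_sessions_per_server * current_servers
    if capacity <= 0:
        raise ValueError("at least one running JDBCServer instance is required")
    return 100.0 * current_sessions / capacity


def should_start_new_server(current_sessions: int, current_servers: int,
                            cfg: TenantProxyConfig) -> bool:
    """Return True if, under the assumptions above, another instance would be added."""
    if current_servers >= cfg.max_servers_per_tenant:
        # Instance limit reached: per Table 1, an alarm log is output instead.
        return False
    return session_usage_percent(current_sessions, current_servers, cfg) >= cfg.session_threshold


cfg = TenantProxyConfig()
print(session_usage_percent(100, 1, cfg))    # usage with 100 sessions on one instance
print(should_start_new_server(200, 1, cfg))  # one instance at full session capacity
print(should_start_new_server(400, 2, cfg))  # tenant already at the instance limit
```

With the example values, one instance holding 200 sessions is at the threshold and a second instance would be added; once two instances are running, the per-tenant limit prevents further scale-out.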
- After the parameter settings are modified, click Save, perform operations as prompted, and wait until the settings are saved successfully.
- After the Spark server configurations are updated, if Configure Status is Expired, restart the component for the configurations to take effect.
Figure 1 Modifying Spark configurations
On the Spark dashboard page, choose More > Restart Service or Service Rolling Restart, enter the administrator password, and wait until the service restarts.
Components are unavailable during the restart, affecting upper-layer services in the cluster. To minimize the impact, perform this operation during off-peak hours or after confirming that the operation does not have adverse impact.
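Because a restart is disruptive, it can help to sanity-check the planned values before saving them. The following Python sketch checks only the constraints stated in Table 1: the hash setting must match the selected mode, and the integer parameters must be within 1 to 2147483647. The function name and the dictionary input format are assumptions for illustration; this is not a FusionInsight Manager API.

```python
INT_MAX = 2147483647

# Parameters that Table 1 documents with the value range 1 to 2147483647.
RANGED_KEYS = [
    "spark.thriftserver.proxy.maxThriftServerPerTenancy",
    "spark.thriftserver.proxy.maxSessionPerThriftServer",
    "spark.thriftserver.proxy.sessionWaitTime",
    "spark.thriftserver.proxy.healthcheck.period",
    "spark.thriftserver.proxy.healthcheck.recheckTimes",
    "spark.thriftserver.proxy.healthcheck.waitTime",
]


def check_multi_tenant_settings(settings: dict) -> list:
    """Hypothetical helper: return a list of problems found in the planned settings."""
    problems = []

    proxy = settings.get("spark.thriftserver.proxy.enabled", "").lower()
    hashed = settings.get("spark.proxyserver.hash.enabled", "").lower()
    # Multi-tenant mode (true) requires hash connection (true);
    # multi-active instance mode (false) requires random connection (false).
    if proxy in ("true", "false") and hashed != proxy:
        problems.append(
            "spark.proxyserver.hash.enabled must be %s when "
            "spark.thriftserver.proxy.enabled is %s" % (proxy, proxy)
        )

    for key in RANGED_KEYS:
        if key not in settings:
            continue  # only check values that are being set
        try:
            value = int(settings[key])
        except ValueError:
            problems.append("%s: not an integer" % key)
            continue
        if not 1 <= value <= INT_MAX:
            problems.append("%s: outside 1 to %d" % (key, INT_MAX))

    return problems


planned = {
    "spark.proxyserver.hash.enabled": "true",
    "spark.thriftserver.proxy.enabled": "true",
    "spark.thriftserver.proxy.maxThriftServerPerTenancy": "2",
    "spark.thriftserver.proxy.maxSessionPerThriftServer": "200",
}
print(check_multi_tenant_settings(planned))
```

An empty list means none of the checked constraints is violated; it does not confirm that the values suit a particular cluster.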
Helpful Links
For more information about Spark multi-tenancy, see Spark2x Multi-tenant.