Configuring Structured Streaming to Use RocksDB for State Store

This section applies only to MRS 3.3.0 or later.
Scenarios
If a large amount of state information is stored in the default HDFSBackedStateStoreProvider and JVM garbage collection (GC) takes a long time, you can select RocksDB as the state backend by setting the parameter described below.
Parameters
Configure the following parameters in the spark-defaults.conf file of the Spark client.
Parameter | Description | Default Value |
---|---|---|
spark.sql.streaming.stateStore.providerClass | Class that manages state data for stateful stream queries. This class must be a subclass of StateStoreProvider and must have a zero argument constructor. Set this parameter to org.apache.spark.sql.execution.streaming.state.RocksDBStateStoreProvider to select RocksDB as the state backend. | org.apache.spark.sql.execution.streaming.state.HDFSBackedStateStoreProvider |
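
As a minimal sketch, the entry in spark-defaults.conf might look like the following. The key and value come from the parameter table; the comment line is illustrative only, and the location of the file depends on where the Spark client is installed.

```
# Use RocksDB instead of the default HDFS-backed provider for streaming state
spark.sql.streaming.stateStore.providerClass org.apache.spark.sql.execution.streaming.state.RocksDBStateStoreProvider
```

To apply the setting to a single application rather than to every job submitted from the client, the same key and value can usually be passed with the `--conf` option of spark-submit.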