Configuring Structured Streaming to Use RocksDB for State Store
Scenario
If the default HDFSBackedStateStore holds a large amount of state data and JVM garbage collection (GC) takes a long time as a result, you can select RocksDB as the state backend by setting the parameter described below.
Parameters
Set the following parameters in the spark-defaults.conf file of the Spark client.
Parameter | Description | Default Value |
---|---|---|
spark.sql.streaming.stateStore.providerClass | Class that manages state data for stateful streaming queries. The class must be a subclass of StateStoreProvider and must have a zero-argument constructor. Set this parameter to org.apache.spark.sql.execution.streaming.state.RocksDBStateStoreProvider to select RocksDB as the state backend. | org.apache.spark.sql.execution.streaming.state.HDFSBackedStateStoreProvider |
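As an illustration, the entry in spark-defaults.conf might look like the following. This is a minimal sketch that assumes the usual spark-defaults.conf layout of one key and one value per line, separated by whitespace; the comment line is illustrative only.

```properties
# Use RocksDB instead of the default HDFS-backed provider for streaming state
spark.sql.streaming.stateStore.providerClass org.apache.spark.sql.execution.streaming.state.RocksDBStateStoreProvider
```

Removing the line, or setting the value back to org.apache.spark.sql.execution.streaming.state.HDFSBackedStateStoreProvider, restores the default state backend.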