Updated on 2022-09-14 GMT+08:00

Application Scenarios

Spark is a distributed batch processing framework. It provides data analysis, data mining, and iterative in-memory computing capabilities, and supports application development in multiple programming languages, including Scala, Java, and Python. Spark is suited to the following scenarios:

  • Data processing: Spark processes data quickly and is fault-tolerant and scalable.
  • Iterative computation: Spark supports iterative computation, which suits data processing logic made up of multiple steps.
  • Data mining: Spark supports complex mining and analysis of massive data sets and provides multiple data mining and machine learning algorithms.
  • Stream processing: Spark supports stream processing with latency on the order of seconds and can read from multiple external data sources.
  • Query analysis: Spark supports standard SQL queries, provides a DataFrame DSL, and accepts multiple external input types.