Updated on 2023-11-30 GMT+08:00

Processing Stream Files

This section describes how to use FunctionGraph to process large stream files. Create an express flow that meets your service requirements.

Background and Value

Serverless workflows feature orchestration, state management, persistence, visualized monitoring, error handling, and cloud service integration. They are suitable for many scenarios, including:

  • Complex, abstract services, such as order management and CRM
  • Services that must pause automatically and resume after manual intervention between tasks, such as manual review and pipeline deployment
  • Services that require manual interruption and recovery, such as data backup and restoration
  • Task status monitoring
  • Stream processing, such as log analysis and image/video processing

Most serverless workflow platforms today focus on orchestrating control flow rather than on orchestrating and transmitting data flow. In scenarios similar to Creating a Flow Trigger, the data flow is simple and well supported by these platforms. However, they offer no solution for processing ultra-large data streams, such as file transcoding. For these scenarios, Huawei Cloud FunctionGraph provides a serverless streaming solution that responds within milliseconds when processing files.

Principles

Huawei Cloud FunctionGraph provides a serverless streaming solution that processes files through orchestration. Steps are driven by data flows, which makes the workflow easier to understand. This section uses image processing as an example to describe how the solution works.

A workflow system handles two kinds of flow:

  • Control flow: Controls the flow between steps and the execution of serverless functions in the steps.
  • Data flow: Controls the data that moves through the entire workflow. Generally, the output of one step serves as the input of the next step. For example, in an image processing workflow, the image compression result is the input of the watermarking step.

In common service orchestration, the execution sequence of each service must be precisely controlled, so the control flow is the core of a workflow. However, stream processing scenarios such as file processing place fewer demands on the control flow. For example, in the image processing scenario, a large image can be processed block by block, and image compression and watermarking do not have to be completed in a strict sequence.

The following figure shows the architecture of Huawei Cloud FunctionGraph's serverless streaming solution.

Serverless streaming allows steps to run in parallel rather than in a fixed sequence. Steps interact with each other through data flows, which are controlled by the Stream Bridge component. The function SDKs include a streaming data API that writes data to Stream Bridge over a gRPC stream. Stream Bridge then distributes the data flow to the function pod of the next step.
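As a rough local analogy, two steps connected by a data flow can be modeled with an in-memory pipe. The sketch below assumes nothing about how Stream Bridge is implemented: io.Pipe merely plays its role, stage and runPipeline are hypothetical names, and the two transforms are placeholders for compression and watermarking. What it shows is that the second step starts consuming blocks while the first step is still producing them.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
)

// stage is a hypothetical stand-in for one workflow step. It reads blocks of
// up to blockSize bytes from r, transforms each block, and writes the result
// to w as soon as it is ready.
func stage(r io.Reader, w io.Writer, blockSize int, transform func([]byte) []byte) error {
	buf := make([]byte, blockSize)
	for {
		n, err := io.ReadFull(r, buf)
		if n > 0 {
			if _, werr := w.Write(transform(buf[:n])); werr != nil {
				return werr
			}
		}
		if err == io.EOF || err == io.ErrUnexpectedEOF {
			return nil // input exhausted; the last block may be short
		}
		if err != nil {
			return err
		}
	}
}

// runPipeline connects two stages with io.Pipe, which plays the role of the
// data flow between steps in this local sketch.
func runPipeline(input string, blockSize int) (string, error) {
	pr, pw := io.Pipe()

	// Step 1 (placeholder for compression): upper-case each block.
	go func() {
		err := stage(strings.NewReader(input), pw, blockSize, bytes.ToUpper)
		pw.CloseWithError(err) // a nil error makes the reader see io.EOF
	}()

	// Step 2 (placeholder for watermarking): append a marker to each block.
	var out bytes.Buffer
	err := stage(pr, &out, blockSize, func(b []byte) []byte {
		return append(append([]byte{}, b...), '|')
	})
	return out.String(), err
}

func main() {
	out, err := runPipeline("abcdefgh", 3)
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // prints "ABC|DEF|GH|"
}
```

The marker after every block makes the block boundaries visible in the output, which confirms that data moved between the steps piece by piece rather than as one complete file.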

Procedure

  1. Create an image compression function, which uses ctx.Write() to return results as streaming data.

    Currently, only Go functions are supported.

    FunctionGraph supports streaming responses through ctx.Write(). You do not need to handle network transmission yourself. You only need to write the final results to the stream.

  2. Create a workflow on the FunctionGraph console.

  3. Invoke the synchronous flow execution API to obtain the file stream. The data is returned to the client in chunked streaming mode.
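Steps 1 and 3 can be tried out locally with the sketch below. It rests on several assumptions. StreamContext is a hypothetical interface that models only the Write method mentioned above, and the real SDK context type and handler signature may differ. gzip is a placeholder for image compression. A local httptest server replaces the real synchronous flow execution API, whose URL, authentication, and request body are not shown here.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// StreamContext is a hypothetical stand-in for the function context.
// Only a Write method is assumed; the real SDK type may differ.
type StreamContext interface {
	Write(p []byte) (int, error)
}

// compressStep plays the role of the function in step 1. It gzips src block
// by block and pushes every compressed piece to ctx as soon as it is ready.
func compressStep(ctx StreamContext, src io.Reader, blockSize int) error {
	zw := gzip.NewWriter(ctx)
	buf := make([]byte, blockSize)
	for {
		n, err := src.Read(buf)
		if n > 0 {
			if _, werr := zw.Write(buf[:n]); werr != nil {
				return werr
			}
			// Flush so this block reaches ctx.Write now, not at the end.
			if ferr := zw.Flush(); ferr != nil {
				return ferr
			}
		}
		if err == io.EOF {
			return zw.Close()
		}
		if err != nil {
			return err
		}
	}
}

// flushingContext emulates the platform side locally: every Write is sent to
// the HTTP client immediately, which yields a chunked response.
type flushingContext struct{ w http.ResponseWriter }

func (c flushingContext) Write(p []byte) (int, error) {
	n, err := c.w.Write(p)
	if f, ok := c.w.(http.Flusher); ok {
		f.Flush()
	}
	return n, err
}

// fetchFlowResult plays the role of the client in step 3. It posts the input
// and decompresses the chunked response body while it is still arriving.
func fetchFlowResult(url, input string) (string, error) {
	resp, err := http.Post(url, "application/octet-stream", strings.NewReader(input))
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("unexpected status: %s", resp.Status)
	}
	zr, err := gzip.NewReader(resp.Body)
	if err != nil {
		return "", err
	}
	defer zr.Close()
	out, err := io.ReadAll(zr)
	return string(out), err
}

// runDemo starts a local test server in place of the real endpoint.
func runDemo(input string) (string, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		data, err := io.ReadAll(r.Body) // read the request before responding
		if err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		_ = compressStep(flushingContext{w}, bytes.NewReader(data), 4)
	}))
	defer srv.Close()
	return fetchFlowResult(srv.URL, input)
}

func main() {
	out, err := runDemo("large file contents, processed block by block")
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

The client never has to decode chunks itself, because Go's HTTP client handles chunked transfer encoding transparently. Reading from resp.Body returns data as each chunk arrives.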