Updated on 2024-05-11 GMT+08:00

UDF Development

This section describes the rules and suggestions for developing Doris UDF programs.

Doris UDF Development Rules

  • Method invocations in the UDF must be thread-safe.
  • Do not read large external files into memory in the UDF implementation. If a file is too large, memory may be exhausted.
  • Avoid a large number of recursive calls. Otherwise, a stack overflow or an out-of-memory (OOM) error may occur.
  • Do not continuously create objects or arrays. Otherwise, memory may be exhausted.
  • A Java UDF should catch and handle possible exceptions itself. Do not propagate exceptions to the service for handling, because this can cause unknown exceptions in the program. Use a try-catch block to handle exceptions, and log the exception information if necessary.
  • In the UDF, do not define static collections to store temporary data, and do not query large objects from external data sources. Otherwise, memory usage will be high.
  • Ensure that the packages imported by the class do not conflict with packages on the server. You can run the grep -lr "Fully qualified class name" command to find the JAR packages that contain conflicting classes. If a class name conflict occurs, refer to the class by its fully qualified name to avoid the conflict.
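
Several of the rules above (thread safety, handling exceptions inside the UDF, and avoiding static collections) can be illustrated with a minimal sketch. The class name SafeParseIntUDF, the fallback value 0, and the use of java.util.logging are assumptions made for illustration and are not part of Doris. The sketch also assumes that the UDF exposes its logic through a public evaluate method.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical UDF that parses a string into an integer.
// Declared package-private so that the sketch compiles in any file;
// a deployed UDF class would normally be public.
class SafeParseIntUDF {
    // The only static members are an immutable constant and a thread-safe logger.
    // No static collection is used to hold temporary data.
    private static final Logger LOG = Logger.getLogger(SafeParseIntUDF.class.getName());
    private static final int FALLBACK = 0; // assumed default value for bad input

    // The method uses only its argument and local variables,
    // so concurrent invocations do not share mutable state.
    public Integer evaluate(String input) {
        if (input == null) {
            return FALLBACK;
        }
        try {
            return Integer.parseInt(input.trim());
        } catch (NumberFormatException e) {
            // Handle the exception inside the UDF and record it
            // instead of propagating it to the caller.
            LOG.log(Level.WARNING, "Cannot parse input as an integer: " + input, e);
            return FALLBACK;
        }
    }
}
```

Because the method keeps no state between calls, the same instance can be invoked from several threads without synchronization.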

Doris UDF Development Suggestions

  • Avoid copying large amounts of data. Otherwise, memory overflow may occur.
  • Avoid concatenating a large number of strings. Otherwise, memory usage will be high.
  • Give Java UDFs meaningful names so that other developers can easily understand their purpose. You are advised to use camel-case names that end with UDF, for example, MyFunctionUDF.
  • A Java UDF should declare the data type of its return value and must always return a value. Do not return NULL by default or when an exception occurs. You are advised to use basic data types or Java classes as return value types.
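
The suggestions above can be sketched as follows. The class name RepeatTextUDF, the 10,000-character output cap, and the choice of an empty string as the default return value are assumptions made for illustration. The sketch also assumes that the UDF exposes its logic through a public evaluate method.

```java
// Hypothetical UDF that repeats a string a given number of times.
// The name is camel case and ends with UDF, as suggested above.
// Declared package-private so that the sketch compiles in any file;
// a deployed UDF class would normally be public.
class RepeatTextUDF {
    // Assumed upper bound on the output size, to keep memory usage predictable.
    private static final int MAX_OUTPUT_CHARS = 10_000;

    // The return type is declared explicitly, and every path returns a non-null value.
    public String evaluate(String text, Integer times) {
        if (text == null || text.isEmpty() || times == null || times <= 0) {
            return "";
        }
        // Compute the requested size as a long to avoid integer overflow.
        long requested = (long) text.length() * times;
        if (requested > MAX_OUTPUT_CHARS) {
            return "";
        }
        // A single StringBuilder avoids creating a new String on every iteration.
        StringBuilder result = new StringBuilder((int) requested);
        for (int i = 0; i < times; i++) {
            result.append(text);
        }
        return result.toString();
    }
}
```

Under these assumptions, evaluate("ab", 3) returns "ababab", while null, empty, or oversized requests return an empty string instead of NULL.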