Example of Directly Using DataFrame with Scalar UDFs
Scenario
In big data processing, users who work with DataFrames often rely on user-defined functions (UDFs) to implement complex computation logic. In the current system, however, UDF registration and invocation are tightly coupled, so a UDF cannot be viewed or deleted independently once it has been registered. This is inconvenient when teams develop collaboratively or need to manage UDFs dynamically. To address this, the Backend.udf series of APIs lets users view, call, and delete UDFs at runtime, which makes UDF management more flexible and development more efficient.
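The idea of decoupling registration from lookup, invocation, and deletion can be illustrated with a small in-memory sketch. The class `InMemoryUdfRegistry` and its `register` method are hypothetical names chosen for illustration; only the method names `names`, `get`, and `unregister` mirror the Backend.udf calls used later on this page. This is not the Fabric implementation, just one possible way to picture the behavior.

```python
from typing import Callable, Dict, List, Tuple


class InMemoryUdfRegistry:
    """Hypothetical stand-in for a per-database UDF registry (illustration only)."""

    def __init__(self) -> None:
        # Keyed by (database, name) so the same name can exist in several databases.
        self._udfs: Dict[Tuple[str, str], Callable] = {}

    def register(self, fn: Callable, name: str, database: str) -> None:
        # Registration only stores the function; it does not invoke it.
        self._udfs[(database, name)] = fn

    def names(self, database: str) -> List[str]:
        # List the UDF names registered in one database.
        return sorted(n for (db, n) in self._udfs if db == database)

    def get(self, name: str, database: str) -> Callable:
        # Look up a previously registered UDF so it can be called later.
        try:
            return self._udfs[(database, name)]
        except KeyError:
            raise KeyError(f"UDF {name!r} not found in database {database!r}") from None

    def unregister(self, name: str, database: str) -> None:
        # Deleting a missing UDF raises KeyError, the same as looking it up.
        self.get(name, database)
        del self._udfs[(database, name)]


registry = InMemoryUdfRegistry()
registry.register(str.upper, name="to_upper", database="demo")
to_upper = registry.get(name="to_upper", database="demo")
print(registry.names(database="demo"), to_upper("abc"))
registry.unregister("to_upper", database="demo")
```

Because each operation is a separate call, one team member can register a function while another lists, fetches, or removes it later without access to the original registration code.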
Constraints
Constraints on directly calling, viewing, and deleting UDFs are as follows:
Users must establish a Backend (Fabric) connection before calling the Backend UDF Registry API.
Which data types are supported depends on the DataArts Fabric kernel's support for complex types.
import ibis
import ibis_fabric as fabric

con = ibis.fabric.connect(...)

# t below is assumed to be an existing table expression on this connection.

# View the list of existing UDFs in the database.
udfs = con.udf.names(database="your-database")

if "transform_json" in udfs:
    # The transform_json function already exists in the database, so fetch it directly.
    transform_json_udf = con.udf.get(name="transform_json", database="your-database")
    # Use transform_json with the SELECT method of DataFrame.
    expression = t.select(transform_json_udf(t.ts, t.msg).name("json column"))
    df = expression.execute()
    # Delete the UDF.
    con.udf.unregister("transform_json", database="your-database")

if "SPManager" in udfs:
    # The SPManager class already exists in the database, so fetch it directly.
    sentencepiece_udf = con.udf.get(name="SPManager", database="your-database")
    # Use SPManager with the SELECT method of DataFrame.
    expression = t.select(
        sentencepiece_udf(t.data)
        .with_arguments(model_file="test_model.model", bos=True, eos=True)
        .name("pieces column")
    )
    df = expression.execute()
    # Delete the UDF.
    con.udf.unregister("SPManager", database="your-database")
For details about the complete Scalar UDF operation syntax, see Scalar UDF Direct Operation Syntax.