Updated on 2025-12-19 GMT+08:00

Vectorized UDF

Vectorized UDFs process data in batches rather than one row at a time, which overcomes the performance limitations of traditional row-based UDFs. They typically accept and return data in PyArrow or pandas formats.
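To illustrate the difference, the sketch below computes the same product row by row and in a single batch operation, using plain pandas outside Fabric. The function names and sample data are hypothetical and exist only for illustration; no Fabric API is involved.

```python
import pandas as pd

def product_row(price: float, quantity: int) -> float:
    # Row-based style: this function is called once per row.
    return price * quantity

def product_batch(prices: pd.Series, quantities: pd.Series) -> pd.Series:
    # Vectorized style: this function is called once per batch of rows.
    return prices * quantities

# Hypothetical sample data standing in for two table columns.
prices = pd.Series([9.5, 20.0, 3.25])
quantities = pd.Series([2, 1, 4])

# One Python call per row.
row_result = [product_row(float(p), int(q)) for p, q in zip(prices, quantities)]

# One Python call for the whole batch.
batch_result = product_batch(prices, quantities)

print(row_result)
print(batch_result.tolist())
```

Both approaches produce the same values. The batch version makes a single Python-level call and leaves the element-wise work to the library.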

Examples

The two examples below demonstrate how to use vectorized UDFs.

  • Example 1: Use PyArrow for vectorized computation.
    import ibis  # Needed for ibis.fabric.connect() below.
    import fabric_data as fabric
    from fabric_data.udf import RegisterType
    
    import pyarrow.compute as pc
    
    # Implicitly register a UDF.
    @fabric.udf.pyarrow(database="your-database", register_type=RegisterType.STAGED)
    def calculate_product(prices: fabric.PyarrowVector[float], quantities: fabric.PyarrowVector[int]) -> fabric.PyarrowVector[float]:
        return fabric.PyarrowVector[float](pc.multiply(prices, quantities))
    
    # Use the UDF.
    con = ibis.fabric.connect(...)
    t = con.table("your-table", database="your-database")
    expression = t.select(calculate_product(t.price, t.quantity).name("product column"))
    print(expression.execute())
  • Example 2: Use pandas for vectorized computation.
    import ibis  # Needed for ibis.fabric.connect() below.
    import fabric_data as fabric
    from fabric_data.udf import RegisterType
    
    import pandas as pd
    
    # Implicitly register a UDF.
    @fabric.udf.pandas(database="your-database", register_type=RegisterType.STAGED)
    def calculate_product(prices: fabric.PandasVector[float], quantities: fabric.PandasVector[int]) -> fabric.PandasVector[float]:
        return fabric.PandasVector[float](prices * quantities, dtype=pd.Float64Dtype())
    
    # Use the UDF.
    con = ibis.fabric.connect(...)
    t = con.table("your-table", database="your-database")
    expression = t.select(calculate_product(t.price, t.quantity).name("product column"))
    print(expression.execute())