Vectorized UDF

Vectorized UDFs process data in batches instead of one row at a time, which avoids the per-row call overhead that limits the performance of traditional row-based UDFs. They typically accept and return data in PyArrow or pandas formats.
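To make the difference in calling convention concrete, here is a minimal sketch in plain Python with no dependency on fabric_data. The function names and the call counter are hypothetical and exist only for illustration: a row-based function is invoked once per row, while a batch function is invoked once per batch of column values.

```python
# Illustrative only: plain-Python stand-ins, not the fabric_data API.
calls = {"row": 0, "batch": 0}


def product_row(price: float, quantity: int) -> float:
    """Row-based style: invoked once for every row."""
    calls["row"] += 1
    return price * quantity


def product_batch(prices: list[float], quantities: list[int]) -> list[float]:
    """Vectorized style: invoked once for a whole batch of rows."""
    calls["batch"] += 1
    return [p * q for p, q in zip(prices, quantities)]


prices = [1.5, 2.0, 4.0]
quantities = [2, 3, 1]

row_result = [product_row(p, q) for p, q in zip(prices, quantities)]
batch_result = product_batch(prices, quantities)

print(row_result == batch_result)  # both styles produce the same values
print(calls)                       # three row calls versus one batch call
```

In an actual vectorized UDF, the list comprehension would be replaced by a PyArrow or pandas operation that runs in native code over the entire batch.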

Example

The following two examples show how to define and call a vectorized UDF.

Example 1: Use PyArrow for vectorized computation.

import fabric_data as fabric
from fabric_data.udf import RegisterType

import ibis  # needed for ibis.fabric.connect below
import pyarrow.compute as pc

# Implicitly register a UDF.
@fabric.udf.pyarrow(database="your-database", register_type=RegisterType.STAGED)
def calculate_product(
    prices: fabric.PyarrowVector[float],
    quantities: fabric.PyarrowVector[int],
) -> fabric.PyarrowVector[float]:
    # Multiply the two columns element-wise over the whole batch.
    return fabric.PyarrowVector[float](pc.multiply(prices, quantities))

# Use the UDF.
con = ibis.fabric.connect(...)
t = con.table("your-table", database="your-database")
expression = t.select(calculate_product(t.price, t.quantity).name("product column"))
print(expression.execute())

Example 2: Use pandas for vectorized computation.

import fabric_data as fabric
from fabric_data.udf import RegisterType

import ibis  # needed for ibis.fabric.connect below
import pandas as pd

# Implicitly register a UDF.
@fabric.udf.pandas(database="your-database", register_type=RegisterType.STAGED)
def calculate_product(
    prices: fabric.PandasVector[float],
    quantities: fabric.PandasVector[int],
) -> fabric.PandasVector[float]:
    # Multiply the two columns element-wise; Float64Dtype keeps missing values as <NA>.
    return fabric.PandasVector[float](prices * quantities, dtype=pd.Float64Dtype())

# Use the UDF.
con = ibis.fabric.connect(...)
t = con.table("your-table", database="your-database")
expression = t.select(calculate_product(t.price, t.quantity).name("product column"))
print(expression.execute())
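The pandas arithmetic can likewise be tried on its own. This is a small sketch that assumes only pandas (1.2 or later, for Float64Dtype) is installed; the sample values are made up, and plain pd.Series objects stand in for the PandasVector arguments.

```python
import pandas as pd

# Made-up sample batches using pandas nullable dtypes.
prices = pd.Series([1.5, 2.0, 4.0], dtype=pd.Float64Dtype())
quantities = pd.Series([2, 3, None], dtype=pd.Int64Dtype())

# Same arithmetic as the UDF body, with the result kept as nullable Float64.
product = (prices * quantities).astype(pd.Float64Dtype())

print(product.dtype)
print(product.isna().tolist())
```

With the nullable Float64 dtype, a missing quantity stays missing (<NA>) in the result rather than becoming NaN.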
