Updated on 2025-08-25 GMT+08:00

Type Inference of the signature Parameter

You can either pass the signature parameter to specify the parameter and return value types explicitly, or omit it and let the types be inferred.

  • If you provide the signature parameter, the original Python function does not need to use type hinting syntax, and the UDF can be registered immediately in a REPL session.
  • If you omit the signature parameter, you are advised to use type hinting syntax in the original Python function. In this case, the UDF cannot be registered immediately in a REPL session.

A comparison of these approaches is summarized below.

Table 1 signature parameter descriptions

signature Parameter | Description | Require Original Python Function with Type Hint Syntax | Support Immediate REPL Operation
User omits passing value | Auto-deduction (recommended) | No, yet recommended usage | No
User specifies value | Specified value | No | Yes

Here, immediate operation refers to running code in a read-eval-print loop (REPL), such as Python's interactive terminal.

Type hinting syntax was introduced in Python 3.5 by PEP 484. A parameter's type is written after the parameter name, separated by a colon (:), and the return type is written after the parameter list, preceded by an arrow (->). For example:

def greet(name: str) -> str:
    return f"Hello, {name}"

from typing import List, Dict, Optional

def process_data(data: List[int]) -> Dict[str, Optional[int]]:
    return {"max": max(data) if data else None}
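As a minimal sketch of how such annotations can be read back at run time, the standard-library inspect.signature call exposes each parameter's annotation and the return annotation. The function greet is the example from above. Nothing in this snippet is specific to DataArts Fabric, and it is only meant to illustrate what type hints make available to an inference step.

```python
import inspect

def greet(name: str) -> str:
    return f"Hello, {name}"

sig = inspect.signature(greet)

# Each parameter carries its annotation; a parameter without a type hint
# would hold inspect.Parameter.empty instead.
param_types = {name: p.annotation for name, p in sig.parameters.items()}
return_type = sig.return_annotation

print(param_types)  # {'name': <class 'str'>}
print(return_type)  # <class 'str'>
```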

Scalar Python UDFs are strictly typed at registration: every parameter type and the return value type must be specified explicitly. If the original Python function does not define them through type hints, you must pass the signature parameter to specify the Ibis DataType.

In contrast, Builtin Python UDFs are not strictly typed at registration, because the UDF is already registered in the database. If you cannot annotate the original Python function, you are advised to write only the parameter names without types. If the return value of the Builtin Python UDF is used later (that is, the UDF is not a Top SELECT UDF), the return type must be specified, using the signature parameter to define the Ibis DataType where necessary. For a Top SELECT UDF, the return type may be omitted.

When you omit the signature parameter and rely on auto-deduction, the following requirements apply:

Table 2 Auto-deduction of signature

Registered UDF Type | Parameter Type | Return Value Type
Scalar Python UDF | Must be specified with type hinting syntax. | Must be specified with type hinting syntax.
Builtin Python UDF | Parameter names may be written without types. | Must be specified with type hinting syntax if the return value is used later; otherwise optional.

When the signature parameter is omitted, the types are inferred with inspect.signature. The following parameter and return value types are currently accepted:

Table 3 Accepted parameter/return value types

Python | Ibis DataType | DataArts Fabric SQL
DataType | DataType | -
type(None) | null | NULL
bool | Boolean | BOOLEAN
bytes | Binary | BYTEA
str | String | TEXT
numbers.Integral | Int64 | BIGINT
numbers.Real | Float64 | DOUBLE PRECISION
decimal.Decimal | Decimal | DECIMAL
datetime.datetime | Timestamp | TIMESTAMP/TIMESTAMPTZ
datetime.date | Date | TIMESTAMP
datetime.time | Time | TIME
datetime.timedelta | Interval | INTERVAL
uuid.UUID | UUID | UUID
class | Struct | STRUCT
typing.Sequence, typing.Array | Array | ARRAY
typing.Mapping, typing.Map | Map | HSTORE

Notes:

  • Python's built-in int type is a subclass of numbers.Integral.
  • Python's built-in float type is a subclass of numbers.Real.

Python types that are not listed in the preceding table are not currently supported for automatic conversion.
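The following is a minimal sketch of how a lookup based on Table 3 could work for simple scalar types, assuming an issubclass check against each Python type in order. The names _TYPE_NAMES, ibis_type_name, infer_names, and add_tax are hypothetical and exist only for this illustration. The sketch returns the Ibis DataType names as plain strings, skips the Struct, Array, and Map rows, and is not the actual implementation.

```python
import datetime
import decimal
import inspect
import numbers
import uuid

# Hypothetical lookup table: Python type -> Ibis DataType name from Table 3.
# Order matters: bool is checked before numbers.Integral (bool subclasses int),
# numbers.Integral before numbers.Real (every Integral is also a Real), and
# datetime.datetime before datetime.date (datetime subclasses date).
_TYPE_NAMES = [
    (type(None), "null"),
    (bool, "Boolean"),
    (bytes, "Binary"),
    (str, "String"),
    (numbers.Integral, "Int64"),
    (numbers.Real, "Float64"),
    (decimal.Decimal, "Decimal"),
    (datetime.datetime, "Timestamp"),
    (datetime.date, "Date"),
    (datetime.time, "Time"),
    (datetime.timedelta, "Interval"),
    (uuid.UUID, "UUID"),
]

def ibis_type_name(py_type):
    """Return the Table 3 name for a simple (non-generic) Python class."""
    for base, name in _TYPE_NAMES:
        if issubclass(py_type, base):
            return name
    raise TypeError(f"no automatic conversion for {py_type!r}")

def infer_names(func):
    """Map every annotated parameter and the return annotation of func."""
    sig = inspect.signature(func)
    params = {n: ibis_type_name(p.annotation) for n, p in sig.parameters.items()}
    return params, ibis_type_name(sig.return_annotation)

def add_tax(price: float, quantity: int) -> float:
    return price * quantity * 1.1

print(infer_names(add_tax))
# ({'price': 'Float64', 'quantity': 'Int64'}, 'Float64')
```

Because int and float are matched through numbers.Integral and numbers.Real, the sketch resolves them to Int64 and Float64 without listing the built-in types explicitly.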

When you neither pass the signature parameter nor annotate a parameter or return value with a Python type hint, automatic inference currently handles it as follows:

Table 4 Special parameter type handling

Parameter Type | Generated Matching Pattern | Pattern Effect
POSITIONAL_ONLY, KEYWORD_ONLY, POSITIONAL_OR_KEYWORD | ValueOf(None) | Exempted from __signature__.validate.
VAR_POSITIONAL | TupleOf(pattern=pattern) | Executes pattern in a for-loop.
VAR_KEYWORD | DictOf(key_pattern=InstanceOf(str), value_pattern=pattern) | Executes pattern in a for-loop.
Return | ValueOf(Unknown) | Provides UnknownScalar, UnknownColumn as UDF return values passed upward.
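The cases covered by Table 4 arise when a parameter or the return value has no type hint. As a rough sketch, such cases can be recognised with the standard library by comparing annotations against the empty sentinels of inspect. The helper find_unannotated and the function scale are hypothetical names used only here, and the matching patterns in the table (ValueOf, TupleOf, DictOf) are not reproduced.

```python
import inspect

def find_unannotated(func):
    """Return (names of parameters without a type hint, True if the return hint is missing)."""
    sig = inspect.signature(func)
    missing = [
        name
        for name, p in sig.parameters.items()
        if p.annotation is inspect.Parameter.empty
    ]
    return missing, sig.return_annotation is inspect.Signature.empty

def scale(x: float, factor, *extras, **options):
    return x * factor

print(find_unannotated(scale))  # (['factor', 'extras', 'options'], True)
```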

The classification of parameter types (Parameter.kind) by inspect.signature is as follows:

Table 5 inspect.signature parameter types

Parameter Type | Description | Sample Code | Matching Parameters
POSITIONAL_ONLY | Positional-only parameter. | def func(a, /, b): pass | a
KEYWORD_ONLY | Keyword-only parameter. | def func(a, *, b): pass | b
POSITIONAL_OR_KEYWORD | Positional or keyword parameter. | def func(a, b): pass | a, b
VAR_POSITIONAL | Variable positional parameter. | def func(*args): pass | args
VAR_KEYWORD | Variable keyword parameter. | def func(**kwargs): pass | kwargs
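The classification in Table 5 can be checked directly with inspect.signature. The sketch below combines all five kinds in one hypothetical function, func, and prints the kind reported for each parameter. The positional-only marker (/) requires Python 3.8 or later.

```python
import inspect

def func(a, /, b, *args, c, **kwargs):
    pass

# Parameter.kind is an enum member; .name gives the label used in Table 5.
kinds = {name: p.kind.name for name, p in inspect.signature(func).parameters.items()}

for name, kind in kinds.items():
    print(f"{name}: {kind}")
# a: POSITIONAL_ONLY
# b: POSITIONAL_OR_KEYWORD
# args: VAR_POSITIONAL
# c: KEYWORD_ONLY
# kwargs: VAR_KEYWORD
```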