Type Inference of the signature Parameter
For the signature parameter, you may choose to provide the parameter/return value types or omit them entirely.
- If you supply the signature parameter, the original Python function does not need to use type hint syntax, and the UDF can be registered immediately in a REPL session.
- If you omit the signature parameter, you are advised to use type hint syntax in the original Python function. In this case, the UDF cannot be registered immediately in a REPL session.
A comparison of these approaches is summarized below.
signature Parameter | Description | Type Hints Required in the Original Python Function | Immediate REPL Operation Supported |
---|---|---|---|
Omitted by the user | Auto-deduction (recommended) | No, but recommended | No |
Specified by the user | Specified value | No | Yes |
Here, immediate operation refers to the read-eval-print loop (REPL), as found in Python's interactive terminal.
Type hint syntax was introduced in Python 3.5 by PEP 484. A parameter's type is written after the parameter name, separated by a colon (:), and the return type is written after the parameter list, preceded by an arrow (->). For example:
    def greet(name: str) -> str:
        return f"Hello, {name}"

    from typing import List, Dict, Optional

    def process_data(data: List[int]) -> Dict[str, Optional[int]]:
        return {"max": max(data) if data else None}
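Type hints are stored on the function object and can be read back at run time, which is what makes automatic deduction possible. The following is a minimal sketch that uses only the Python standard library; it illustrates the mechanism and is not the DataArts Fabric implementation.

```python
import inspect
from typing import get_type_hints


def greet(name: str) -> str:
    return f"Hello, {name}"


# get_type_hints returns a name -> type mapping; the return type is
# stored under the key "return".
hints = get_type_hints(greet)
print(hints)

# inspect.signature exposes the same information per parameter.
sig = inspect.signature(greet)
print(sig.parameters["name"].annotation)
print(sig.return_annotation)
```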
Scalar Python UDFs are strictly typed at registration: the types of all parameters and of the return value must be specified explicitly. If you do not define them through type hints in the original Python function, you must use the signature parameter to specify the Ibis DataType.
In contrast, Builtin Python UDFs are not strictly typed at registration, because the UDF is already registered in the database. If you cannot annotate the original Python function, you are advised to write only the parameter names, without types. If you later use the return value of the Builtin Python UDF (that is, it is not a Top SELECT UDF), the return type must be specified; where necessary, use the signature parameter to define the Ibis DataType. If the return value is not used later (a Top SELECT UDF), you may omit the return type.
When you omit the signature parameter and rely on auto-deduction, the following requirements apply:
Registered UDF Type | Parameter Type | Return Value Type |
---|---|---|
Scalar Python UDF | Must be specified with type hint syntax. | Must be specified with type hint syntax. |
Builtin Python UDF | Parameter names may be written without types. | Must be specified with type hint syntax if the return value is used subsequently. Otherwise, not mandatory. |
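The strict-typing requirement for Scalar Python UDFs can be checked before registration. The sketch below is a hypothetical helper (`missing_annotations` is not part of any DataArts Fabric or Ibis API) that lists which parameters, and whether the return value, lack a type hint. How the product itself reports missing hints is not described here.

```python
import inspect


def missing_annotations(func):
    """Return the names of parameters (and "return") that lack a type hint."""
    sig = inspect.signature(func)
    missing = [
        name
        for name, param in sig.parameters.items()
        if param.annotation is inspect.Parameter.empty
    ]
    if sig.return_annotation is inspect.Signature.empty:
        missing.append("return")
    return missing


def fully_typed(a: int, b: int) -> int:
    return a + b


def partially_typed(a, b: int):
    return a + b


print(missing_annotations(fully_typed))      # []
print(missing_annotations(partially_typed))  # ['a', 'return']
```

An empty result corresponds to a function that meets the Scalar Python UDF requirement through type hints alone. A non-empty result means the listed items would need to be supplied through the signature parameter.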
When you omit the signature parameter, the types are inferred automatically by means of inspect.signature. The following parameter and return value types are currently accepted:
Python | Ibis DataType | DataArts Fabric SQL |
---|---|---|
DataType | DataType | - |
type(None) | null | NULL |
bool | Boolean | BOOLEAN |
bytes | Binary | BYTEA |
str | String | TEXT |
numbers.Integral | Int64 | BIGINT |
numbers.Real | Float64 | DOUBLE PRECISION |
decimal.Decimal | Decimal | DECIMAL |
datetime.datetime | Timestamp | TIMESTAMP/TIMESTAMPTZ |
datetime.date | Date | TIMESTAMP |
datetime.time | Time | TIME |
datetime.timedelta | Interval | INTERVAL |
uuid.UUID | UUID | UUID |
class | Struct | STRUCT |
typing.Sequence, typing.Array | Array | ARRAY |
typing.Mapping, typing.Map | Map | HSTORE |
Notes:
- Python's built-in int type is a subclass of numbers.Integral.
- Python's built-in float type is a subclass of numbers.Real.
Automatic conversion is not currently supported for Python types that are not listed in the preceding table.
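To make the lookup concrete, the sketch below maps type hints to the Ibis DataType names from the table above using `inspect.signature` and `issubclass`. It is an illustration under stated assumptions: `ibis_type_name` and `deduce` are hypothetical helpers, the result is a plain string rather than a real Ibis object, and only the scalar rows of the table are covered (Struct, Array, and Map are left out).

```python
import datetime
import decimal
import inspect
import numbers
import uuid

# Ordered (Python type, Ibis DataType name) pairs copied from the table.
# bool precedes numbers.Integral because bool is a subclass of int, and
# datetime.datetime precedes datetime.date because datetime subclasses date.
_TYPE_TABLE = [
    (type(None), "null"),
    (bool, "Boolean"),
    (bytes, "Binary"),
    (str, "String"),
    (numbers.Integral, "Int64"),
    (numbers.Real, "Float64"),
    (decimal.Decimal, "Decimal"),
    (datetime.datetime, "Timestamp"),
    (datetime.date, "Date"),
    (datetime.time, "Time"),
    (datetime.timedelta, "Interval"),
    (uuid.UUID, "UUID"),
]


def ibis_type_name(hint):
    """Hypothetical helper: look up the Ibis DataType name for a type hint."""
    if hint is None:  # a bare `None` annotation stands for type(None)
        hint = type(None)
    if isinstance(hint, type):
        for py_type, name in _TYPE_TABLE:
            if issubclass(hint, py_type):
                return name
    raise TypeError(f"no automatic conversion for {hint!r}")


def deduce(func):
    """Hypothetical helper: deduce parameter and return type names."""
    sig = inspect.signature(func)
    params = {n: ibis_type_name(p.annotation) for n, p in sig.parameters.items()}
    return params, ibis_type_name(sig.return_annotation)


def scale(x: int, factor: float, label: str) -> float:
    return x * factor


print(deduce(scale))
# ({'x': 'Int64', 'factor': 'Float64', 'label': 'String'}, 'Float64')
```

Because the lookup uses `issubclass`, `int` resolves through numbers.Integral and `float` through numbers.Real, matching the notes above. A type that is not in the table raises `TypeError` in this sketch.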
For parameters or return values that are covered neither by the signature parameter nor by Python type hints, automatic inference currently generates the following patterns:
Parameter Type | Generated Matching Pattern | Pattern Effect |
---|---|---|
POSITIONAL_ONLY, KEYWORD_ONLY, POSITIONAL_OR_KEYWORD | ValueOf(None) | Exempt from __signature__.validate. |
VAR_POSITIONAL | TupleOf(pattern=pattern) | Executes pattern in a for loop. |
VAR_KEYWORD | DictOf(key_pattern=InstanceOf(str), value_pattern=pattern) | Executes pattern in a for loop. |
Return | ValueOf(Unknown) | Provides UnknownScalar or UnknownColumn as the UDF return value passed upward. |
The classification of parameter types (Parameter.kind) by inspect.signature is as follows:
Parameter Type | Description | Sample Code | Parameters Meeting Conditions |
---|---|---|---|
POSITIONAL_ONLY | Positional-only parameter. | def func(a, /, b): pass | a |
KEYWORD_ONLY | Keyword-only parameter. | def func(a, *, b): pass | b |
POSITIONAL_OR_KEYWORD | Positional or keyword parameter. | def func(a, b): pass | a, b |
VAR_POSITIONAL | Variable positional parameter. | def func(*args): pass | args |
VAR_KEYWORD | Variable keyword parameter. | def func(**kwargs): pass | kwargs |
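The classification can be observed directly with the standard library. The sketch below defines one function that contains all five kinds and prints each parameter's kind. The positional-only marker (/) requires Python 3.8 or later.

```python
import inspect


def func(a, /, b, *args, c, **kwargs):
    pass


# Collect the kind of every parameter, in declaration order.
kinds = {
    name: param.kind.name
    for name, param in inspect.signature(func).parameters.items()
}

for name, kind in kinds.items():
    print(name, kind)
# a POSITIONAL_ONLY
# b POSITIONAL_OR_KEYWORD
# args VAR_POSITIONAL
# c KEYWORD_ONLY
# kwargs VAR_KEYWORD
```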