Updated on 2025-12-19 GMT+08:00

Util APIs

create_agg_function(udf, /, input_type=InputType.PYTHON, *, database, name=None, signature=None, imports=None, packages=None, comment=None, **kwargs)

Description: Registers a UDAF.

Input parameters:

  • udf (Callable): UDAF to register.
  • input_type (InputType): Input type of the function. Options: PYTHON, BUILTIN, PYARROW, and PANDAS.
  • database (str): Name of the database the function belongs to.
  • name (Optional[str]): Function name. The default value is None.
  • signature (Optional[Signature]): Optional definition of the function's signature.
  • imports (Optional[Union[str, Iterable[str]]]): Modules required by the function, which can be a single string or a list of strings.
  • packages (Optional[Union[str, Iterable[str]]]): Dependencies needed for the function, which can be a single package name or a list of package names.
  • comment (Optional[str]): Optional comment of the function.
  • **kwargs: Additional optional keyword arguments.

Return type: Callable, the registered function.

create_scalar_function(udf, /, input_type=InputType.PYTHON, *, database, name=None, imports=None, packages=None, comment=None, **kwargs)

Description: Registers a new scalar UDF.

Input parameters:

  • udf (Callable): Scalar UDF to register.
  • input_type (InputType): Input type of the function. Options: PYTHON, BUILTIN, PYARROW, and PANDAS.
  • database (str): Name of the database the function belongs to.
  • name (Optional[str]): Function name. The default value is None.
  • imports (Optional[Union[str, Iterable[str]]]): Modules required by the function, which can be a single string or a list of strings.
  • packages (Optional[Union[str, Iterable[str]]]): Dependencies needed for the function, which can be a single package name or a list of package names.
  • comment (Optional[str]): Optional comment of the function.
  • **kwargs: Additional optional keyword arguments.

Return type: tuple[str, ...].
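The call pattern above can be sketched as follows. This is a minimal illustration, not the SDK's reference usage: `conn` is a hypothetical session object that exposes create_scalar_function, and the function name, database name, and comment text are made up for the example. It relies on the default input_type (InputType.PYTHON) and shows that udf is passed positionally while database, name, and comment must be passed by keyword.

```python
def celsius_to_fahrenheit(celsius: float) -> float:
    """Plain Python function used as the body of the scalar UDF."""
    return celsius * 9.0 / 5.0 + 32.0


def register_temperature_udf(conn, database: str):
    """Register celsius_to_fahrenheit as a scalar UDF in `database`.

    `conn` is an assumed object exposing create_scalar_function with the
    signature documented above; it is not defined in this sketch.
    """
    return conn.create_scalar_function(
        celsius_to_fahrenheit,          # udf: positional-only
        database=database,              # required, keyword-only
        name="celsius_to_fahrenheit",   # hypothetical function name
        comment="Converts degrees Celsius to degrees Fahrenheit.",
    )
```

Keeping the UDF body as an ordinary module-level function makes it easy to unit test locally before it is registered.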

create_table(name, /, *, schema, table_format, database=None, temp=False, external=False, overwrite=False, if_not_exists=False, partition_by=None, table_properties=None, location=None)

Description: Creates a table in a specified database.

Input parameters:

  • name (str): Name of the table.
  • schema (ibis.Schema): Schema defining the table's structure.
  • table_format (str): Storage format for the table.
  • database (Optional[str]): Name of the database where the table will be created. If not specified, the default database is used.
  • temp (bool): If True, creates a temporary table.
  • external (bool): If True, creates an external table.
  • overwrite (bool): If True, overwrites the existing table if it already exists.
  • if_not_exists (bool): If True, only creates the table if it does not already exist.
  • partition_by (Optional[ibis.Schema]): Schema used to partition the table.
  • table_properties (Optional[dict]): Additional properties for the table.
  • location (Optional[str]): Storage location for the table data.

Return type: dataset.
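A minimal sketch of a create_table call, under stated assumptions: `conn` is a hypothetical session object exposing create_table, the table name "events" is invented, and "parquet" is only an assumed value for table_format (check which formats your deployment accepts). The schema is supplied by the caller, for example an ibis.Schema built with ibis.schema({...}).

```python
def create_events_table(conn, schema, database=None):
    """Create a table named "events" unless it already exists.

    `conn` is an assumed object exposing create_table with the signature
    documented above. `schema` should be an ibis.Schema.
    """
    return conn.create_table(
        "events",                 # name: positional-only, hypothetical
        schema=schema,            # required, keyword-only
        table_format="parquet",   # required, keyword-only; assumed value
        database=database,        # None selects the default database
        if_not_exists=True,       # skip creation if the table exists
    )
```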

create_table_function(udf, /, input_type=InputType.PYTHON, *, database, name=None, signature=None, imports=None, packages=None, comment=None, **kwargs)

Description: Registers a UDTF.

Input parameters:

  • udf (Callable): UDTF to register.
  • input_type (InputType): Input type of the function. Options: PYTHON and BUILTIN.
  • database (str): Name of the database the function belongs to.
  • name (Optional[str]): Function name. The default value is None.
  • signature (Optional[Signature]): Optional definition of the function's signature.
  • imports (Optional[Union[str, Iterable[str]]]): Modules required by the function, which can be a single string or a list of strings.
  • packages (Optional[Union[str, Iterable[str]]]): Dependencies needed for the function, which can be a single package name or a list of package names.
  • comment (Optional[str]): Optional comment of the function.
  • **kwargs: Additional optional keyword arguments.

Return type: Callable, the registered function.

delete_function(name, /, *, database=None, if_it_exists=False)

Description: Deletes a UDF from a specified database.

Input parameters:

  • name (str): Name of the function to delete.
  • database (Optional[str]): Name of the database containing the function to delete. If not specified, the default database is used.
  • if_it_exists (bool): If True, deletes the function only if it exists, preventing errors when the function is missing.

Return type: str.

delete_table(name, /, *, database=None, if_it_exists=False)

Description: Removes a specified dataset from the database.

Input parameters:

  • name (str): Name of the table to delete.
  • database (Optional[str]): Name of the database containing the table to delete. If not specified, the default database is used.
  • if_it_exists (bool): If True, deletes the table only if it exists, avoiding errors when the table is missing.

Return type: str.
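The if_it_exists flag makes cleanup code safe to run repeatedly. The helper below is a sketch only: `conn` is a hypothetical session object exposing delete_table, and the helper name is invented for illustration.

```python
def delete_tables_if_present(conn, names, database=None):
    """Delete each named table, ignoring tables that do not exist.

    `conn` is an assumed object exposing delete_table with the signature
    documented above. Returns the string returned for each deletion.
    """
    results = []
    for name in names:
        # name is positional-only; the other arguments are keyword-only.
        results.append(
            conn.delete_table(name, database=database, if_it_exists=True)
        )
    return results
```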

describe_function(name, /, *, database=None)

Description: Describes a function in a specified database.

Input parameters:

  • name (str): Function name.
  • database (Optional[str]): Name of the database containing the function.

Return type: str, metadata information about the function.

describe_table(name, /, *, database=None)

Description: Describes a table in a specified database.

Input parameters:

  • name (str): Table name.
  • database (Optional[str]): Name of the database containing the table.

Return type: str, metadata information about the table.

drop_table(name, *, database=None, force=False)

Description: Deletes a table from a specified database.

Input parameters:

  • name (str): Name of the table to delete.
  • database (Optional[str]): Name of the database containing the table to delete. If not specified, the default database is used.
  • force (bool): If True, the table will be deleted even if it contains data. The default value is False.

Return type: None.

get_function(name, *, database=None)

Description: Retrieves a UDF from a specified or default database.

Input parameters:

  • name (str): Name of the UDF to retrieve.
  • database (Optional[str]): Name of the database containing the UDF to retrieve. If not specified, the default database is used.

Return type: Callable, the retrieved UDF.

list_functions(database=None)

Description: Lists all UDFs in a specified or default database.

Input parameters:

database (Optional[str]): Name of the database to list UDFs from. If not specified, the default database is used.

Return type: pandas.DataFrame.

list_tables(database=None)

Description: Lists all tables in a specified or default database.

Input parameters:

database (Optional[str]): Name of the database to list tables from. If not specified, the default database is used.

Return type: list[tuple], where each tuple contains table names.
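Because the result is a list of tuples rather than a flat list of strings, callers usually need to unpack it. The sketch below assumes that the first element of each tuple is the table name, which the description above does not spell out, and that `conn` is a hypothetical session object exposing list_tables.

```python
def table_names(conn, database=None):
    """Return the table names in `database` as a sorted list of strings.

    Assumption: the first element of every returned tuple is the table
    name. Adjust the index if your deployment returns a different layout.
    """
    rows = conn.list_tables(database=database)
    return sorted(row[0] for row in rows)
```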

load_dataset(name, *, database=None)

Description: Loads a dataset from a specified database.

Input parameters:

  • name (str): Name of the dataset to load.
  • database (Optional[str]): Name of the database to load the dataset from. If not specified, the default database is used.

Return type: dataset.

open_table(name, *, database=None)

Description: Opens a table in a specified database.

Input parameters:

  • name (str): Name of the table to open.
  • database (Optional[str]): Name of the database containing the table. If not specified, the default database is used.

Return type: table.

set_function_staging_workspace(*, obs_server, obs_bucket_name, obs_directory_base, access_key, secret_key, security_token=None)

Description: Configures the temporary workspace required for function deployment by setting necessary credentials and storage details.

Input parameters:

  • obs_server (str): OBS address.
  • obs_bucket_name (str): Name of the OBS bucket to use.
  • obs_directory_base (str): Base directory path in OBS for storing workspace data.
  • access_key (str): Access key for authentication.
  • secret_key (str): Secret key for authentication.
  • security_token (Optional[str]): Security token for authentication. The default value is None.

Return type: None.
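One way to avoid hard-coding credentials is to read them from the environment before calling set_function_staging_workspace. This is a sketch under assumptions: `conn` is a hypothetical session object exposing the method, and the environment variable names (OBS_SERVER, OBS_BUCKET_NAME, and so on) are invented for this example rather than defined by the SDK.

```python
import os


def configure_staging_from_env(conn, environ=None):
    """Configure the function staging workspace from environment variables.

    `conn` is an assumed object exposing set_function_staging_workspace
    with the signature documented above. The variable names are
    hypothetical. A missing required variable raises KeyError.
    """
    env = os.environ if environ is None else environ
    conn.set_function_staging_workspace(
        obs_server=env["OBS_SERVER"],
        obs_bucket_name=env["OBS_BUCKET_NAME"],
        obs_directory_base=env["OBS_DIRECTORY_BASE"],
        access_key=env["OBS_ACCESS_KEY"],
        secret_key=env["OBS_SECRET_KEY"],
        # Optional: None is passed when the variable is not set.
        security_token=env.get("OBS_SECURITY_TOKEN"),
    )
```

All arguments are keyword-only, so each value must be passed by name as shown.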

sql(query, **kwargs)

Description: Executes a SQL query and returns the result.

Input parameters:

  • query (str): SQL query to execute.
  • **kwargs (Any): Additional keyword arguments passed along with the query.

Return type: Any, the result of the SQL query.

explain_plan(verbose=True)

Description: Generates the execution plan for a job.

Input parameters:

verbose (bool): If True, provides a detailed explanation of the query plan. If False, returns a simplified version.

Return type: str.

explain_performance()

Description: Executes a query and generates a performance report with detailed metrics from the execution process.

Input parameters: None.

Return type: str, detailed execution plan.

stats()

Description: Retrieves the execution history of all jobs in the current session.

Input parameters: None.

Return type: list.