Updated: 2023-05-05 GMT+08:00

Isolation Forest

Overview

A wrapper around the scikit-learn Isolation Forest algorithm (sklearn.ensemble.IsolationForest).
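For orientation, here is a minimal sketch of the underlying scikit-learn estimator, called directly with the same values that the sample on this page uses. It is not the MLS component itself. The toy data, the fixed random_state, and the assumption that the wrapper passes these parameters through unchanged are illustrative and not confirmed by this page.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Toy data (illustrative only): 200 points near the origin plus two far-away points.
rng = np.random.RandomState(0)
inliers = 0.3 * rng.randn(200, 2)
outliers = np.array([[6.0, 6.0], [-7.0, 5.0]])
X = np.vstack([inliers, outliers])

# Same values as in the sample on this page.
model = IsolationForest(
    n_estimators=100,
    max_samples="auto",
    contamination="auto",
    max_features=1.0,
    bootstrap=False,
    random_state=0,  # assumption: fixed only to make the sketch reproducible
)

labels = model.fit_predict(X)    # 1 = inlier, -1 = outlier
scores = model.score_samples(X)  # lower score = more abnormal
```

In scikit-learn, contamination is the expected proportion of outliers in the data set and only affects the threshold used to turn scores into labels.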

Input

| Parameter | Sub-parameter | Description |
| --- | --- | --- |
| inputs | dataframe | inputs is a dict; dataframe is a pyspark DataFrame object. |

Output

Dataset

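Parameter description

| Parameter | Sub-parameter | Description |
| --- | --- | --- |
| select_columns_str | - | Formatted string of column names, for example "column_a" or "column_a,column_b". |
| n_estimators | - | Number of base learners. Default: 100. |
| max_samples | - | Number of samples drawn from the dataset to train each base learner. Accepts "auto", an int, or a float. |
| contamination | - | - |
| max_features | - | Number of features drawn from the dataset to train each base learner. |
| bootstrap | - | Whether to sample with replacement when building each tree. True: sample with replacement. False: sample without replacement. |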
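The two string-or-number parameters above can be easier to reason about with a small sketch. The helpers below are hypothetical and not part of MLS: parse_select_columns shows one way to split a select_columns_str value, and resolve_max_samples mirrors the rule scikit-learn documents for max_samples ("auto" means min(256, n_samples), an int is a sample count, a float is a fraction of the rows). Whether the component applies exactly the same rules, and how it treats an empty select_columns_str, is not stated on this page.

```python
def parse_select_columns(select_columns_str):
    """Split a string such as "column_a,column_b" into a list of column names."""
    return [name.strip() for name in select_columns_str.split(",") if name.strip()]


def resolve_max_samples(max_samples, n_rows):
    """Return the per-tree sample count, following scikit-learn's documented rule."""
    if max_samples == "auto":
        return min(256, n_rows)
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(max_samples, int) and not isinstance(max_samples, bool):
        return min(max_samples, n_rows)  # cannot draw more rows than exist
    if isinstance(max_samples, float):
        return int(max_samples * n_rows)  # fraction of the rows
    raise ValueError("max_samples must be 'auto', an int, or a float")


columns = parse_select_columns("column_a,column_b")
per_tree = resolve_max_samples("auto", 1000)
```

Here columns is the list of the two column names and per_tree is 256, because the data set has more than 256 rows.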

Example

inputs = {
    "dataframe": None  # @input {"label":"dataframe","type":"DataFrame"}
}
params = {
    "inputs": inputs,
    "select_columns_str": "",  # @param {"label":"select_columns_str","type":"string","required":"false","helpTip":""}
    "n_estimators": 100,  # @param {"label":"n_estimators","type":"integer","required":"false","helpTip":""}
    "max_samples": "auto",  # @param {"label":"max_samples","type":"string","required":"false","helpTip":""}
    "contamination": "auto",  # @param {"label":"contamination","type":"string","required":"false","helpTip":""}
    "max_features": 1.0,  # @param {"label":"max_features","type":"number","required":"false","helpTip":""}
    "bootstrap": False  # @param {"label":"bootstrap","type":"boolean","required":"false","helpTip":""}
}
isolation_forest____id___ = MLSIsolationForest(**params)
isolation_forest____id___.run()
# @output {"label":"dataframe","name":"isolation_forest____id___.get_outputs()['output_port_1']","type":"DataFrame"}
