Updated: 2024-04-29 GMT+08:00
Data Compression

Overview

Compresses the data and saves it to the local file system.

Input

| Parameter | Sub-parameter | Description |
| --- | --- | --- |
| inputs | dataframe | inputs is a dictionary; dataframe is a pyspark DataFrame object. |

Output

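Parameter Description

| Parameter | Sub-parameter | Description |
| --- | --- | --- |
| data_delimeter | - | Delimiter used to separate data fields. |
| compression_type | - | Compression format. Three formats are currently supported: Bzip2, deflate, and Gzip. |
| data_partition | - | Partition in which the data is saved. |
| data_path | - | Path where the data is saved. |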

Example

inputs = {
    "dataframe": None,  # @input {"label":"dataDF","type":"DataFrame"}
}
params = {
    "inputs": inputs,
    "data_delimeter": ",",  # @param {"label":"data_delimeter","type":"string","required":"false","helpTip":""}
    "compression_type": "Bzip2",  # @param {"label":"task","type":"enum", "options":"Bzip2,deflate,Gzip","required":"true","helpTip":""}
    "data_partition": 1,  # @param {"label":"data_partition","type":"int","required":"false","helpTip":""}
    "data_path": "",  # @param {"label":"data_path","type":"path","required":"true","helpTip":""}
}
execute_compress____id___ = MLSSaveWithCompression(**params)
execute_compress____id___.run()
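
The internals of MLSSaveWithCompression are not described on this page. As a rough illustration only, the sketch below shows how the four parameters could map onto a plain PySpark write. It rests on several assumptions: that the output is delimited text (CSV), that data_partition is the number of output partitions, and that the delimiter defaults to a comma. The names CODECS, build_writer_options, and save_with_compression are hypothetical and are not part of the MLS SDK.

```python
# Hypothetical helpers for illustration; not part of the MLS SDK.
# Maps the documented compression_type options to Spark CSV codec names.
CODECS = {"Bzip2": "bzip2", "deflate": "deflate", "Gzip": "gzip"}


def build_writer_options(params):
    """Validate the documented parameters and turn them into CSV writer options."""
    compression_type = params.get("compression_type")
    if compression_type not in CODECS:
        raise ValueError(
            f"compression_type must be one of {sorted(CODECS)}, got {compression_type!r}"
        )
    if not params.get("data_path"):
        raise ValueError("data_path is required")
    return {
        # Assumption: fall back to a comma when no delimiter is given.
        "sep": params.get("data_delimeter") or ",",
        "compression": CODECS[compression_type],
    }


def save_with_compression(dataframe, params):
    """Write a pyspark DataFrame as compressed, delimited part files."""
    options = build_writer_options(params)
    # Assumption: data_partition is the number of output partitions.
    partitions = int(params.get("data_partition") or 1)
    (
        dataframe.repartition(partitions)
        .write.options(**options)
        .csv(params["data_path"])
    )
```

With a real pyspark DataFrame df, calling save_with_compression(df, params) would be expected to write one compressed part file per partition under data_path. Validating compression_type and data_path up front mirrors the "required" flags in the sample annotations and fails before any Spark job is started.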