Updated: 2024-11-08 GMT+08:00
How Do I Use Spark to Write Data to a DLI Table?
To write data to a DLI table using Spark, set the following parameters:
- fs.obs.access.key
- fs.obs.secret.key
- fs.obs.impl
- fs.obs.endpoint
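The same four settings can also be supplied at submit time through Spark's `spark.hadoop.*` configuration prefix instead of being set in code. A sketch (the key values shown, and the script name `wordcount.py`, are placeholders):

```shell
# Submit with the OBS settings passed as Spark configuration.
# Spark copies any property prefixed with "spark.hadoop." into the
# Hadoop configuration seen by the job.
spark-submit \
  --conf spark.hadoop.fs.obs.access.key=myak \
  --conf spark.hadoop.fs.obs.secret.key=mysk \
  --conf spark.hadoop.fs.obs.impl=org.apache.hadoop.fs.obs.OBSFileSystem \
  --conf spark.hadoop.fs.obs.endpoint=myendpoint \
  wordcount.py
```

This puts the values in the same place as the `hadoopConfiguration().set(...)` calls in the example below, while keeping credentials out of the source code.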
Example:
import logging
from pyspark import SparkContext

logging.basicConfig(format='%(message)s', level=logging.INFO)

# Local input and output paths
test_file_name = "D://test-data_1.txt"
out_file_name = "D://test-data_result_1"

sc = SparkContext("local", "wordcount app")
# OBS access credentials and connection settings
sc._jsc.hadoopConfiguration().set("fs.obs.access.key", "myak")
sc._jsc.hadoopConfiguration().set("fs.obs.secret.key", "mysk")
sc._jsc.hadoopConfiguration().set("fs.obs.impl", "org.apache.hadoop.fs.obs.OBSFileSystem")
sc._jsc.hadoopConfiguration().set("fs.obs.endpoint", "myendpoint")

# Read the input file as an RDD of lines
text_file = sc.textFile(test_file_name)
# Count word occurrences
counts = text_file.flatMap(lambda line: line.split(" ")) \
                  .map(lambda word: (word, 1)) \
                  .reduceByKey(lambda a, b: a + b)
# Write the result
counts.saveAsTextFile(out_file_name)
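What the `flatMap` / `map` / `reduceByKey` chain computes can be sketched in plain Python, without Spark, to make the transformation concrete (illustrative only; the helper name `word_count` is not part of the original example):

```python
from collections import Counter

def word_count(lines):
    # flatMap: split every line into words
    words = (word for line in lines for word in line.split(" "))
    # map to (word, 1) then reduceByKey with addition == counting
    return dict(Counter(words))

print(word_count(["a b a", "b c"]))  # {'a': 2, 'b': 2, 'c': 1}
```

Spark performs the same aggregation, but distributed across partitions, with the pairs sharing a key reduced together.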
Parent topic: Spark Job Development