Help Center/
MapReduce Service/
Developer Guide (Normal_3.x)/
Spark2x Development Guide (Normal Mode)/
Developing the Project/
Spark Core Project/
Python Example Code
Updated on 2022-08-16 GMT+08:00
Python Example Code
Function
Collects the information of female netizens who spend more than 2 hours in online shopping on the weekend from the log files.
Example Code
The following code segment is only an example. For details, see collectFemaleInfo.py.
def contains(str, substr): if substr in str: return True return False if __name__ == "__main__": if len(sys.argv) < 2: print "Usage: CollectFemaleInfo <file>" exit(-1) spark = SparkSession \ .builder \ .appName("CollectFemaleInfo") \ .getOrCreate() """ The following programs are used to implement the following functions: 1. Read data. This code indicates the data path that the input parameter argv[1] specifies. - text 2. Filter data about the time that female netizens spend online. - filter 3. Aggregate the total time that each female netizen spends online. - map/map/reduceByKey 4. Filter information about female netizens who spend more than 2 hours online. - filter """ inputPath = sys.argv[1] result = spark.read.text(inputPath).rdd.map(lambda r: r[0])\ .filter(lambda line: contains(line, "female")) \ .map(lambda line: line.split(',')) \ .map(lambda dataArr: (dataArr[0], int(dataArr[2]))) \ .reduceByKey(lambda v1, v2: v1 + v2) \ .filter(lambda tupleVal: tupleVal[1] > 120) \ .collect() for (k, v) in result: print k + "," + str(v) # Stop SparkContext. spark.stop()
Parent topic: Spark Core Project
Feedback
Was this page helpful?
Provide feedbackThank you very much for your feedback. We will continue working to improve the documentation.See the reply and handling status in My Cloud VOC.
The system is busy. Please try again later.
For any further questions, feel free to contact us through the chatbot.
Chatbot