Python Sample Code
Function Description
Collect statistics on female netizens who spend more than 2 hours shopping online over the weekend.
Sample Code
The following code snippet is an example. For the complete code, see collectFemaleInfo.py.
import sys

from pyspark import SparkContext


def contains(text, substr):
    # Return True if substr occurs anywhere in text.
    return substr in text


if __name__ == "__main__":
    if len(sys.argv) < 2:
        print("Usage: CollectFemaleInfo <file>")
        sys.exit(-1)

    # Create SparkContext and set AppName.
    sc = SparkContext(appName = "CollectFemaleInfo")

    """
    The following code implements these steps:
    1. Read data. The input parameter argv[1] specifies the data path. - textFile
    2. Filter the online-time records of female netizens. - filter
    3. Sum the total time that each female netizen spends online. - map/map/reduceByKey
    4. Keep the female netizens who spend more than 2 hours (120 minutes) online. - filter
    """
    inputPath = sys.argv[1]
    # Read each line as text (the default for textFile).
    result = sc.textFile(name = inputPath) \
        .filter(lambda line: contains(line, "female")) \
        .map(lambda line: line.split(',')) \
        .map(lambda dataArr: (dataArr[0], int(dataArr[2]))) \
        .reduceByKey(lambda v1, v2: v1 + v2) \
        .filter(lambda tupleVal: tupleVal[1] > 120) \
        .collect()

    for (k, v) in result:
        print(k + "," + str(v))

    # Stop SparkContext.
    sc.stop()
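To check the logic without a Spark cluster, the same four steps can be sketched in plain Python. This is a minimal illustration, not part of collectFemaleInfo.py: the function name collect_female_info, the record layout name,gender,minutes (inferred from the use of dataArr[0] and dataArr[2] above), and the sample records are all assumptions made for this example. Unlike the Spark sample, which keeps any line containing the substring "female", the sketch compares the second field exactly.

```python
from collections import defaultdict


def collect_female_info(lines, threshold_minutes=120):
    """Return sorted (name, total_minutes) pairs for female users above the threshold.

    Hypothetical helper; assumes each record looks like "name,gender,minutes".
    """
    totals = defaultdict(int)
    for line in lines:
        fields = line.strip().split(",")
        # Skip malformed records and records that are not for female users.
        if len(fields) < 3 or fields[1] != "female":
            continue
        # Accumulate minutes per name (the role reduceByKey plays in the Spark job).
        totals[fields[0]] += int(fields[2])
    # Keep only totals strictly greater than the threshold, as the Spark filter does.
    return sorted(
        (name, total) for name, total in totals.items() if total > threshold_minutes
    )


if __name__ == "__main__":
    # Invented sample records, for illustration only.
    sample = [
        "alice,female,20",
        "bob,male,300",
        "alice,female,110",
        "carol,female,100",
    ]
    for name, minutes in collect_female_info(sample):
        print(name + "," + str(minutes))  # prints "alice,130"
```

With these sample records only alice is reported: her two visits add up to 130 minutes, carol stays below the 120-minute threshold, and bob is filtered out by gender.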