Python Sample Code

Updated on 2022-09-14 GMT+08:00

Function Description

Collect statistics on female netizens who spend more than 2 hours shopping online over the weekend.

Sample Code

The following code snippet is an example. For the complete code, see collectFemaleInfo.py.

import sys

from pyspark import SparkContext

def contains(text, substr):
  # Return True if substr occurs anywhere in text.
  return substr in text

if __name__ == "__main__":
  if len(sys.argv) < 2:
    print("Usage: CollectFemaleInfo <file>")
    sys.exit(-1)

  # Create SparkContext and set AppName.
  sc = SparkContext(appName="CollectFemaleInfo")

  # The pipeline below performs the following steps:
  #   1. Read data. The input parameter argv[1] specifies the data path. - textFile
  #   2. Keep only the online-time records of female netizens. - filter
  #   3. Sum the total time that each female netizen spends online. - map/map/reduceByKey
  #   4. Keep only the female netizens who spend more than 2 hours (120 minutes) online. - filter
  inputPath = sys.argv[1]
  # Lines are read as Unicode strings (the PySpark default), so the string
  # operations below behave the same under Python 2 and Python 3.
  result = sc.textFile(inputPath) \
    .filter(lambda line: contains(line, "female")) \
    .map(lambda line: line.split(',')) \
    .map(lambda dataArr: (dataArr[0], int(dataArr[2]))) \
    .reduceByKey(lambda v1, v2: v1 + v2) \
    .filter(lambda tupleVal: tupleVal[1] > 120) \
    .collect()
  for (k, v) in result:
    print(k + "," + str(v))

  # Stop SparkContext.
  sc.stop()
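
To see what the four steps compute without a cluster, the same logic can be sketched in plain Python. This is an illustration only, not part of collectFemaleInfo.py: the function name collect_female_info, the sample records, and the assumed line format (name,gender,minutes) are hypothetical, inferred from the fields the sample reads (dataArr[0] and dataArr[2]).

```python
from collections import defaultdict


def collect_female_info(lines, threshold_minutes=120):
    """Return {name: total_minutes} for female users above the threshold.

    Assumes each line looks like "name,gender,minutes" (an assumption, not
    confirmed by the original sample).
    """
    totals = defaultdict(int)
    for line in lines:
        # Step 2: same substring test as the filter in the Spark sample.
        if "female" not in line:
            continue
        # Step 3: split the record and accumulate minutes per name,
        # which is what map/map/reduceByKey does in the Spark version.
        fields = line.split(",")
        totals[fields[0]] += int(fields[2])
    # Step 4: keep only totals strictly greater than the threshold.
    return {name: total for name, total in totals.items()
            if total > threshold_minutes}


# Hypothetical sample records for illustration.
sample = [
    "Alice,female,20",
    "Bob,male,200",
    "Alice,female,110",
    "Carol,female,50",
    "Carol,female,50",
]

result = collect_female_info(sample)
for name, total in sorted(result.items()):
    print(name + "," + str(total))
```

With these records, Alice totals 130 minutes and is kept, Carol totals 100 minutes and is dropped, and Bob is excluded by the gender filter. Note that the substring test also matches a record whose name or other field happens to contain "female"; comparing the gender field exactly would be stricter, if the data format allows it.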