Building a Program
Download and use the information filtering program package provided in this example.
Creating a Deployment Package
This example uses a Python function to filter sensitive information. For details about function development, see Developing Functions in Python. Figure 1 shows the sample code directory. The service code is not described.
In this directory, sensitivewords.txt is the sensitive word library. Define your own sensitive words in it, one word per line.
This tutorial uses the Jieba word segmentation module (a widely used Chinese word segmentation module for Python) and the sensitive word library. Both must be placed in the fss_examples_message_filtering directory.
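If the package has to be rebuilt, for example after editing sensitivewords.txt, the directory can be zipped again. The following is a minimal sketch rather than the official packaging procedure. build_package is a hypothetical helper name, and the sketch assumes that files should be stored relative to the directory root so that index.py sits at the top level of the archive.

```python
import os
import zipfile


def build_package(src_dir, zip_path):
    """Zip every file under src_dir, storing paths relative to src_dir."""
    src_dir = os.path.abspath(src_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(src_dir):
            for name in files:
                full_path = os.path.join(root, name)
                # Assumption: index.py must end up at the top level of the archive.
                archive.write(full_path, os.path.relpath(full_path, src_dir))
    return zip_path
```

Write the archive to a location outside src_dir, for example build_package("fss_examples_message_filtering", "../fss_examples_message_filtering.zip"), so that the zip file does not try to include itself.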
index.py is a handler file used for executing the function. A code snippet is as follows:
# -*- coding:utf-8 -*-
import json
import jieba
import os
import sys

reload(sys)
sys.setdefaultencoding('utf-8')


def is_ok_publish(body, over=0.1):
    '''
    If at least 10% of the distinct words in a message body are sensitive words, the message is not suitable for publishing.
    '''
    # Use the Jieba word segmentation module to segment the incoming message.
    # For example, if the body is "I love China", the words are ["I", "love", "China"].
    words = list(jieba.cut(body))
    # Read the sensitive word library from the code directory. The sensitivewords.txt file is uploaded together with the code.
    filename = os.environ.get('RUNTIME_CODE_ROOT') + '/sensitivewords.txt'
    with open(filename) as file:
        # The newline character varies with the operating system (OS). '\r\n' has been verified for a file created on Windows.
        sensitive_words = file.read().decode('gbk').split('\r\n')
    # Count the distinct words that appear in both the message and the sensitive word library.
    num = 0
    for each in (set(words) & set(sensitive_words)):
        num = num + 1
    length = len(set(words))
    rate = float(num) / length
    # If sensitive words make up at least 10% of the distinct words, the message is sensitive and will not be published.
    if rate >= over:
        return False
    return True  # The message can be published.


def handler(event, context):
    msg = event['Messages']  # Messages pulled from the specified DMS queue.
    body = msg[0]["body"]  # Read the body of the first message.
    flag = is_ok_publish(body)  # Determine whether the message body is sensitive.
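To see how the 10% threshold behaves without deploying the function, the same set arithmetic can be run locally. This is a sketch under stated assumptions, not the sample code itself: it targets Python 3, splitting on whitespace stands in for jieba.cut, and the word list, the sample event, and the names parse_word_list and sensitive_rate are invented for illustration. Only the Messages and body keys come from the handler in index.py.

```python
def parse_word_list(text):
    """Split a word list into entries, accepting Unix and Windows line endings."""
    return {line.strip() for line in text.splitlines() if line.strip()}


def sensitive_rate(body, sensitive_words, tokenize=str.split):
    """Return the share of distinct words in body that are sensitive."""
    words = set(tokenize(body))  # Stand-in tokenizer; the sample uses jieba.cut.
    if not words:
        return 0.0  # An empty body has no words, so nothing is sensitive.
    return len(words & sensitive_words) / len(words)


def is_ok_publish(body, sensitive_words, over=0.1):
    """Mirror the threshold check: reject when the rate reaches `over`."""
    return sensitive_rate(body, sensitive_words) < over


# Hypothetical word list and event, for illustration only.
SENSITIVE = parse_word_list("badword\r\nworse\n")
EVENT = {"Messages": [{"body": "this is a badword in a short message"}]}

body = EVENT["Messages"][0]["body"]
rate = sensitive_rate(body, SENSITIVE)
print(rate, is_ok_publish(body, SENSITIVE))
```

The sample body contains seven distinct words, one of which is in the list, so the rate is 1/7 (about 14%) and the message is rejected. Using splitlines() instead of split on a fixed newline sequence keeps the word list readable regardless of the operating system it was edited on.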
Uploading Code to the OBS Bucket
Log in to the OBS console, go to the Objects page of the obs-mycode bucket, and upload the sample program package to this bucket, as shown in Figure 2.
The file link https://obs-mycode.obs.myhuaweicloud.com/fss_examples_message_filtering.zip is displayed on the fss_examples_message_filtering page.
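The link in this example places the bucket name in front of the host name and the object name in the path. As a small Python 3 sketch, assuming other object links follow the same layout (this is inferred from the one link shown here, not taken from the OBS documentation), the two parts can be separated with the standard library. split_obs_link is a hypothetical helper name.

```python
from urllib.parse import urlparse


def split_obs_link(link):
    """Return (bucket, object_key) from a link shaped like the one in this example."""
    parsed = urlparse(link)
    # Assumption: the first label of the host name is the bucket name.
    bucket = parsed.hostname.split(".", 1)[0]
    object_key = parsed.path.lstrip("/")
    return bucket, object_key


link = "https://obs-mycode.obs.myhuaweicloud.com/fss_examples_message_filtering.zip"
print(split_obs_link(link))
```

A check like this can confirm that the link pasted during function creation points at the intended bucket and package.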
Creating a Function
When creating a function, specify an agency with DMS access permissions so that FunctionGraph can invoke the DMS service.
- Log in to the FunctionGraph console, and choose Functions > Function List in the navigation pane.
- Click Create Function.
- Set the function information.
- Set the basic information, as shown in Figure 3.
For Function Name, enter fss_examples_message_filtering.
For App, select default.
For Description, enter Sensitive information filtering.
For Agency, select serverless_dms created in Creating an Agency.
- Set the code information, as shown in Figure 4.
For Runtime, select Python 2.7.
For Handler, enter index.handler.
For Code Entry Mode, select Upload file from OBS, and paste the OBS link URL (https://obs-mycode.obs.myhuaweicloud.com/fss_examples_message_filtering.zip) obtained in Uploading Code to the OBS Bucket.
- Click Create Function.
- On the fss_examples_message_filtering page, select the Configuration tab and set the environment information, as shown in Figure 5.
For Memory, select 512 (MB).
For Timeout, enter 40 (seconds).
- Click Save.
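The Handler value index.handler ties the console setting to the code: index names the file index.py, and handler names the function defined in it. The sketch below illustrates that naming convention only. split_handler is a hypothetical helper, and the sketch says nothing about how FunctionGraph resolves the handler internally.

```python
def split_handler(handler):
    """Split '<file name>.<function name>' into a module file and a function name."""
    module, _, function = handler.rpartition(".")
    if not module or not function:
        raise ValueError("expected '<file name>.<function name>'")
    return module + ".py", function


print(split_handler("index.handler"))
```

If the handler file or function is renamed in the package, the Handler field needs the matching change.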