Building a Program

Download and use the information filtering program package provided in this example.

Creating a Deployment Package

This example uses a Python function to filter sensitive information. For details about function development, see Developing Functions in Python. Figure 1 shows the sample code directory. The service code itself is not described in detail here.

Figure 1 Sample code directory

In this directory, the sensitivewords.txt file is the sensitive word library. Define your sensitive words in this file, one word per line.
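As a rough illustration of the one-word-per-line format, the following sketch loads such a file into a set. It is not part of the sample package: the helper name load_sensitive_words and the UTF-8 default encoding are assumptions made for illustration (the sample handler decodes the file as GBK), and the sketch is written for Python 3.

```python
import io


def load_sensitive_words(path, encoding="utf-8"):
    """Read a word list with one word per line and return the words as a set.

    Hypothetical helper, not part of the sample package.
    """
    with io.open(path, encoding=encoding) as f:
        text = f.read()
    # splitlines() accepts both Windows (\r\n) and Unix (\n) line endings.
    # Blank lines and surrounding whitespace are dropped.
    return {line.strip() for line in text.splitlines() if line.strip()}
```

Returning a set keeps the later intersection with the segmented message words cheap and ignores duplicate entries in the file.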

This tutorial uses the Jieba word segmentation module, a widely used Chinese word segmentation library for Python, together with the sensitive word library. Both must be placed in the fss_examples_message_filtering directory.
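One possible way to assemble the deployment package is sketched below with Python's standard zipfile module. It assumes that index.py, sensitivewords.txt, and the jieba directory all sit in a local folder named fss_examples_message_filtering, and that index.py should end up at the root of the archive so that the handler can be found. The helper name build_package is hypothetical.

```python
import os
import zipfile


def build_package(src_dir, zip_path):
    """Zip everything under src_dir with archive paths relative to src_dir.

    Hypothetical helper; returns the sorted archive member names.
    """
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(src_dir):
            for name in files:
                full = os.path.join(root, name)
                # Store index.py as "index.py", jieba/__init__.py as
                # "jieba/__init__.py", and so on.
                arcname = os.path.relpath(full, src_dir).replace(os.sep, "/")
                zf.write(full, arcname)
        return sorted(zf.namelist())
```

For example, build_package("fss_examples_message_filtering", "fss_examples_message_filtering.zip") would write the archive next to the source folder. Write the archive outside src_dir so that it does not try to include itself.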

index.py is a handler file used for executing the function. A code snippet is as follows:

# -*- coding:utf-8 -*-
import json
import jieba
import os
import sys

# Python 2 only: make UTF-8 the default encoding for implicit str/unicode conversions.
reload(sys)
sys.setdefaultencoding('utf-8')

def is_ok_publish(body, over=0.1):
    '''
    Returns False if at least 10% (over) of the distinct words in a message body
    are sensitive words, meaning the message is not suitable for publishing.
    '''
    # Uses the Jieba word segmentation module to segment the incoming message.
    # For example, if the body is "I love China", the words are ["I", "love", "China"].
    words = list(jieba.cut(body))

    # Reads the sensitive word library, which is uploaded together with the code.
    filename = os.environ.get('RUNTIME_CODE_ROOT') + '/sensitivewords.txt'
    with open(filename) as file:
        # The newline character varies with the operating system (OS).
        # '\r\n' has been verified for a word library created in Windows.
        sensitive_words = file.read().decode('gbk').split('\r\n')

    # Counts the distinct words that appear in both the message and the sensitive word library.
    num = 0
    for each in (set(words) & set(sensitive_words)):
        num = num + 1
    length = len(set(words))
    if length == 0:     # An empty message contains no sensitive words.
        return True
    rate = float(num) / length
    if rate >= over:    # If sensitive words make up 10% or more of the distinct words, the message is sensitive and will not be published.
        return False
    return True         # The message can be published.

def handler(event, context):

    msg = event['Messages']      # Messages pulled from the specified DMS queue.
    body = msg[0]["body"]        # Reads the body of the first message.

    flag = is_ok_publish(body)   # Determines whether the body of the message is sensitive.
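
To make the threshold check concrete, here is a small standalone sketch of the same ratio calculation. It deliberately skips Jieba and the word file and works on already-segmented words, so the function name sensitive_rate, the sample event, and the one-word sensitive list are illustrative assumptions rather than part of the sample package. Only the event['Messages'][0]['body'] layout is taken from the handler above.

```python
def sensitive_rate(words, sensitive_words):
    """Return the share of distinct words that also appear in the sensitive word list."""
    distinct = set(words)
    if not distinct:
        return 0.0  # An empty message contains no sensitive words.
    return float(len(distinct & set(sensitive_words))) / len(distinct)


# Hypothetical event in the layout the handler reads.
event = {"Messages": [{"body": "I love China"}]}
body = event["Messages"][0]["body"]

# A whitespace split stands in for jieba.cut() in this sketch.
words = body.split()

# Pretend "love" is listed as sensitive: 1 of 3 distinct words matches.
rate = sensitive_rate(words, ["love"])
print(rate >= 0.1)  # True: the rate is about 0.33, so the message would be rejected
```

Because the calculation works on distinct words, repeating a sensitive word many times in one message does not raise the rate.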

Uploading Code to the OBS Bucket

Log in to the OBS console, go to the Objects page of the obs-mycode bucket, and upload the sample program package to this bucket, as shown in Figure 2.

Figure 2 Uploading the code package

After the upload, the fss_examples_message_filtering page displays the file link https://obs-mycode.obs.myhuaweicloud.com/fss_examples_message_filtering.zip.

Creating a Function

When creating a function, specify an agency with DMS access permissions so that FunctionGraph can invoke the DMS service.

Log in to the FunctionGraph console, and choose Functions > Function List in the navigation pane.
Click Create Function.
Set the function information.
1. Set the basic information, as shown in Figure 3.
  For Function Name, enter fss_examples_message_filtering.
  
  For App, select default.
  
  For Description, enter Sensitive information filtering.
  
  For Agency, select serverless_dms created in Creating an Agency.
  
  Figure 3 Basic information
2. Set the code information, as shown in Figure 4.
  For Runtime, select Python 2.7.
  
  For Handler, enter index.handler.
  
  For Code Entry Mode, select Upload file from OBS, and paste the OBS link URL (https://obs-mycode.obs.myhuaweicloud.com/fss_examples_message_filtering.zip) obtained in Uploading Code to the OBS Bucket.
  
  Figure 4 Code information
3. Click Create Function.
On the fss_examples_message_filtering page, select the Configuration tab and set the environment information, as shown in Figure 5.

For Memory, select 512 (MB).

For Timeout, enter 40 (seconds).

Figure 5 Environment information
Click Save.

Parent topic: Filtering Sensitive Information

Last Article: Preparation

Next Article: Adding an Event Source
