Analyzing E-commerce BI Reports
Application Scenarios
As a self-operated e-commerce company in China, the X mall has developed hundreds of millions of loyal users and accumulated massive amounts of authentic data while maintaining high-speed development. How to use the BI tool to find business opportunities from historical data is a key issue in the precision marketing of big data applications. It is also the core technology required for intelligent upgrade of all e-commerce platforms.
This case uses HUAWEI CLOUD DLI, GaussDB(DWS), and Yonghong BI to analyze data features of users and offerings based on the real user, product, and comment data (anonymized) of the mall, providing high-quality information for marketing decision-making, advertising recommendation, credit rating, brand monitoring, and user behavior prediction.
Process
To use DLI to analyze e-commerce data, perform the following steps:
Step 1: Uploading Data. Upload the data to OBS for data analysis using DLI.
Step 2: Analyzing Data. Use DLI to query the data for analysis.
Data Types
To protect user privacy and data security, all sampled data is anonymized.
- User data
Table 1 User data Field
Data Type
Description
Value
user_id
int
User ID
Anonymized
age
int
Age group
-1 indicates that the user age is unknown.
gender
int
Gender
- 0: Male
- 1: Female
- 2: Confidential
rank
Int
User level
Sequenced list of user level. The higher the user level, the larger the number.
register_time
string
User registration date
Unit: day
- Product data
Table 2 Product data Field
Data Type
Description
Value
product_id
int
Product No.
Anonymized
a1
int
Attribute 1
Enumerated value. The value -1 indicates unknown.
a2
int
Attribute 2
Enumerated value. The value -1 indicates unknown.
a3
int
Attribute 3
Enumerated value. The value -1 indicates unknown.
category
int
Category ID
Anonymized
brand
int
Brand ID
Anonymized
- Comment data
Table 3 Comment data Field
Data Type
Description
Value
deadline
string
End time
Unit: day
product_id
int
Product No.
Anonymized
comment_num
int
Segments of accumulated comment count
- 0: No comment
- 1: One comment
- 2: 2 to 10 comments
- 3: 11-50 comments
- 4: More than 50 comments
has_bad_comment
int
Whether there is negative feedback.
0: No; 1: Yes.
bad_comment_rate
float
Dissatisfaction rate
Proportion of the negative feedback.
- Action data
Table 4 Action data Field
Data Type
Description
Value
user_id
int
User ID
Anonymized
product_id
int
Product No.
Anonymized
time
string
Time of action
-
model_id
string
Module ID
Anonymized
type
string
- Browse (refers to the offering details page)
- Add to cart
- Remove from cart
- Place an order
- Follow
- Click
-
Step 1: Uploading Data
- Download OBS Browser+. For details about the download address, see Object Storage Service Tool Guide.
- Install OBS Browser+. For details about the installation procedure, see Object Storage Service Tool Guide.
- Log in to OBS Browser+. OBS Browser+ supports two login modes: AK login (using access keys) or authorization code login. For details about the login procedure, see Object Storage Service Tool Guide.
- Upload data using the OBS Browser+.
On the OBS Browser+ page, click Create Bucket. Select a region and enter a bucket name (for example, DLI-demo). After the bucket is created, return to the bucket list and click DLI-demo. OBS Browser+ supports upload by dragging. You can drag one or more files or folders from a local path to the object list of a bucket or a parallel file system on OBS Browser+. You can even drag a file or folder directly to a specified folder on OBS Browser+.
Obtain the test data by downloading the Best_Practice_04.zip file, decompressing it, and uploading the Data folder to the root directory of the OBS bucket. The test data directory is as follows:
- data/JData_User: Data in the user table
- data/JData_Product:Data in the product table
- data/JData_Product/JData_Comment: Data in the comment table
- data/JData_Action: Data the action table
Step 2: Analyzing Data
- Creating a Database and a Table
- On the top menu bar of the portal page, choose Products > Analytics > Data Lake Insight (DLI).
- Create a demo database. On the DLI console, choose Job Management >SQL Jobs. Click the created job on the displayed page to go to the SQL Editor page.
- In the left pane of the SQL Editor, select the Databases tab and click to create the demo database. For details, see Figure 1.
The default database is a built-in database. You cannot create a database named default.
- Choose the demo database, and enter the following SQL statement in the editing box:
create table user( user_id int, age int, gender int, rank int, register_time string ) USING csv OPTIONS (path "obs://DLI-demo/data/JData_User")
The file path in the preceding SQL statement is the actual OBS path for storing data.
- Click Execute to create the user information table user.
- Create the product, comment, and action tables in the same way.
- Product data
create table product( product_id int, a1 int, a2 int, a3 int, category int, brand int ) USING csv OPTIONS (path "obs://DLI-demo/data/JData_Product")
- Comment table
create table comment( deadline string, product_id int, comment_num int, has_bad_comment int, bad_comment_rate float ) USING csv OPTIONS (path "obs://DLI-demo/data/JData_Comment")
- Action table
create table action( user_id int, product_id int, time string, model_id string, type string ) USING csv OPTIONS (path "obs://DLI-demo/data/JData_Action");
- Product data
- Querying Data
You can save common query statements as templates on the Template Management page for later use. For details, see SQL Template Management in Data Lake Insight User Guide.
- Top 10 products with the most likes
- Run the following SQL statement to analyze the top 10 products with the most likes.
SELECT product.brand as brand, COUNT(product.brand) as like_count from action JOIN product ON (action.product_id = product.product_id) WHERE action.type = 'like' group by brand ORDER BY like_count desc limit 10
- Click Execute. The execution results are displayed, as shown in Figure 2.
- Click to view the result in a chart.
Figure 3 Graphical results
- Run the following SQL statement to analyze the top 10 products with the most likes.
- Top 10 worst-rated products
- Run the following SQL statement to analyze the top 10 worst-rated products:
SELECT DISTINCT product_id, comment_num, bad_comment_rate from comment where comment_num > 3 order by bad_comment_rate desc limit 10
- Click Execute. The execution results are displayed, as shown in Figure 4.
- Click to view the result in a chart.
Figure 5 Graphical result
- Run the following SQL statement to analyze the top 10 worst-rated products:
You can also analyze data for age distribution, gender ratio, offering evaluation, purchase number, and browsing statistics of users.
- Top 10 products with the most likes
Feedback
Was this page helpful?
Provide feedbackThank you very much for your feedback. We will continue working to improve the documentation.