Updated on 2022-09-15 GMT+08:00

Scenario

Consulting company H uses CDM to import its local trade statistics to OBS and uses Data Lake Insight (DLI) to analyze them. In this way, company H builds a big data analytics platform at very low cost and gains more time to focus on its business and on continuous innovation.

Background

Company H is a commercial organization in China that collects trade statistics of major trading nations as well as buyer data. It maintains a large-scale trade statistics database, and the collected data is widely used in industry research, international trade promotion, and other fields.

In the past, company H ran its own big data cluster, maintained by dedicated personnel. Each year, it purchased dedicated bandwidth from China Telecom and China Unicom and invested heavily in equipment rooms, electric power, private networks, servers, and O&M. Even so, limited manpower and the capability restrictions of the cluster meant that the company could not keep up with customers' ever-changing service requirements. As a result, only 4% of its 100 TB of inventory data was put to use.

After migrating its local trade statistics to HUAWEI CLOUD, company H can make full use of the 100 TB of inventory data and maximize asset monetization. It no longer needs to build or maintain infrastructure and instead relies on HUAWEI CLOUD's big data analysis capabilities.

CDM and DLI are billed on a pay-per-use basis, so dedicated maintenance personnel are not required and the dedicated bandwidth cost is reduced. Compared with the offline data center, CDM and DLI reduce maintenance costs by 70%. In addition, they demand less specialized skill from personnel and allow existing services to be migrated smoothly, shortening the service rollout period by 50%.

Task

Use CDM, OBS, and DLI to analyze trade statistics based on the existing data (for example, trade detail records and basic information) in company H's customer data collection and processing system.

Figure 1 Scenario scheme

When creating an OBS foreign table on DLI, the data storage format of the OBS table must meet the following requirements:
  • When you use the DataSource syntax to create an OBS table, the ORC, Parquet, JSON, CSV, Carbon, and Avro formats are supported.
  • When you use the Hive syntax to create an OBS table, the Text file, Avro, ORC, SequenceFile, RCFile, Parquet, and Carbon formats are supported.

If the storage format of the raw data table does not meet the requirements, you can use CDM to import the raw data to DLI for analysis without uploading the data to OBS.
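
The format rules above can be sketched as a small lookup. This is only an illustration of the decision described in the text, not part of CDM or DLI: the function name `choose_import_path` and the return values `"obs"` and `"dli"` are hypothetical, and the format sets are copied from the two bullets above.

```python
# Formats listed in the text above for each DLI table-creation syntax.
SUPPORTED_FORMATS = {
    "datasource": {"orc", "parquet", "json", "csv", "carbon", "avro"},
    "hive": {"textfile", "avro", "orc", "sequencefile", "rcfile", "parquet", "carbon"},
}


def choose_import_path(storage_format: str, syntax: str) -> str:
    """Return "obs" if an OBS table can use this format, else "dli".

    "dli" stands for importing the raw data directly to DLI with CDM,
    as the text suggests for unsupported formats. Both labels are
    hypothetical names used only in this sketch.
    """
    # Normalize "Text file", "TextFile", " csv " and similar spellings.
    fmt = storage_format.strip().lower().replace(" ", "")
    try:
        supported = SUPPORTED_FORMATS[syntax.strip().lower()]
    except KeyError:
        raise ValueError(f"unknown syntax: {syntax!r}") from None
    return "obs" if fmt in supported else "dli"


if __name__ == "__main__":
    print(choose_import_path("CSV", "datasource"))   # obs
    print(choose_import_path("CSV", "hive"))         # dli
    print(choose_import_path("Text file", "hive"))   # obs
```

Keeping the two format sets in one table makes it easy to see that CSV and JSON are available only with the DataSource syntax, while Text file, SequenceFile, and RCFile are available only with the Hive syntax.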

Data Types

  • Trade detail records

    Trade detail records include trade statistics of major trading nations.

Table 1 Trade detail records

Field Name   | Field Type | Field Description
hs_code      | string     | List of import and export offering code
country      | smallint   | Basic information about countries
dollar_value | double     | Transaction amount
quantity     | double     | Transaction volume
unit         | smallint   | Measurement unit
b_country    | smallint   | Basic information about the target country
imex         | smallint   | Import or export
y_year       | smallint   | Year
m_month      | smallint   | Month

  • Basic information

    The basic information indicates the dictionary data corresponding to the fields in the trade detail records.

Table 2 Basic information about countries (description of country)

Field Name | Field Type | Field Description
countryid  | smallint   | Country code
country_en | string     | English name of a country
country_cn | string     | Chinese name of a country

Table 3 Information about the update time (description of updatetime)

Field Name   | Field Type | Field Description
countryid    | smallint   | Country code
imex         | smallint   | Import or export
hs_len       | smallint   | Length of the offering code
minstartdate | string     | Minimum start time
startdate    | string     | Start time
newdate      | string     | Update time
minnewdate   | string     | Last update time

Table 4 Information about import and export offering code (description of hs246)

Field Name | Field Type | Field Description
id         | bigint     | ID
hs         | string     | Offering code
hs_cn      | string     | Chinese name of an offering
hs_en      | string     | English name of an offering

Table 5 Information about units (description of unit_general)

Field Name | Field Type | Field Description
id         | smallint   | Measurement unit code
unit_en    | string     | English name of a measurement unit
unit_cn    | string     | Chinese name of a measurement unit