Updated on 2024-11-12 GMT+08:00

Scenario

Consulting company H uses Cloud Data Migration (CDM) to import local trade statistics to Object Storage Service (OBS), and Data Lake Insight (DLI) to analyze them. This lets company H build its big data analytics platform at a very low cost and spend more time on its business and on continuous innovation.

Background

Company H is a commercial organization in China that collects trade statistics of major trading nations as well as buyer data. It maintains a large trade statistics database, and the collected data is widely used in industry research, international trade promotion, and other fields.

In the past, company H ran its own big data cluster, maintained by dedicated personnel. Each year, it purchased dedicated bandwidth from China Telecom and China Unicom and invested heavily in equipment rooms, electric power, private networks, servers, and O&M. Even so, it could not keep up with customers' ever-changing service requirements because of its limited workforce and the limited capabilities of the cluster. As a result, only 4% of its 100 TB of inventory data was put to use.

After migrating its local trade statistics to Huawei Cloud, company H can make full use of the 100 TB of inventory data and maximize asset monetization. It no longer needs to build and maintain infrastructure and relies on Huawei Cloud's big data analysis capabilities instead.

CDM and DLI are billed on a pay-per-use basis and require no dedicated maintenance personnel, and the cost of dedicated bandwidth is reduced. Compared with the on-premises data center, CDM and DLI reduce maintenance costs by 70%. In addition, they place low skill demands on personnel and allow existing services to be migrated smoothly, shortening the service rollout period by 50%.

Task

Use CDM, OBS, and DLI to analyze trade statistics based on the existing data (for example, trade detail records and basic information) in company H's customer data collection and processing system.

Figure 1 Scenario scheme

When you create an OBS foreign table on DLI, the storage format of the OBS table data must meet the following requirements:
  • If you use the DataSource syntax to create the OBS table, the ORC, Parquet, JSON, CSV, Carbon, and Avro formats are supported.
  • If you use the Hive syntax to create the OBS table, the Text file, Avro, ORC, SequenceFile, RCFile, Parquet, and Carbon formats are supported.

If the storage format of the raw data table does not meet these requirements, you can use CDM to import the raw data directly to DLI for analysis, without first uploading it to OBS.
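
The format rules above can be sketched as a small Python helper that decides where CDM should land the raw data and, for the DataSource case, assembles a table-creation statement. This is a minimal sketch under stated assumptions: the function names, the table name, and the obs://example-bucket path are hypothetical, and the statement follows the general Spark-style CREATE TABLE ... USING ... OPTIONS (path ...) form, which should be checked against the DLI SQL syntax reference before use.

```python
# Formats listed above for each table-creation syntax (normalized to lowercase, no spaces).
SUPPORTED_FORMATS = {
    "datasource": {"orc", "parquet", "json", "csv", "carbon", "avro"},
    "hive": {"textfile", "avro", "orc", "sequencefile", "rcfile", "parquet", "carbon"},
}


def _normalize(fmt: str) -> str:
    """Lowercase a format name and drop spaces, so 'Text file' becomes 'textfile'."""
    return fmt.replace(" ", "").lower()


def is_supported(fmt: str, syntax: str) -> bool:
    """Return True if `fmt` is listed for the given syntax ('datasource' or 'hive')."""
    return _normalize(fmt) in SUPPORTED_FORMATS[syntax.lower()]


def import_target(fmt: str, syntax: str = "datasource") -> str:
    """Return 'OBS' if an OBS table can read the format, otherwise 'DLI' (direct import)."""
    return "OBS" if is_supported(fmt, syntax) else "DLI"


def datasource_ddl(table: str, columns: list, fmt: str, obs_path: str) -> str:
    """Build a DataSource-style CREATE TABLE statement (assumed Spark-style syntax)."""
    if not is_supported(fmt, "datasource"):
        raise ValueError(f"{fmt!r} is not supported by the DataSource syntax; import to DLI instead")
    cols = ", ".join(f"{name} {ctype}" for name, ctype in columns)
    return (
        f"CREATE TABLE IF NOT EXISTS {table} ({cols}) "
        f"USING {_normalize(fmt)} OPTIONS (path '{obs_path}')"
    )


if __name__ == "__main__":
    print(import_target("csv"))   # format is listed for DataSource
    print(import_target("xlsx"))  # format is not listed
    # Hypothetical table name and bucket path; two of the trade detail fields.
    print(datasource_ddl(
        "trade_detail",
        [("hs_code", "string"), ("country", "smallint")],
        "CSV",
        "obs://example-bucket/trade/",
    ))
```

Keeping the two format sets in one dictionary makes it easy to update the check if the supported formats change.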

Data Types

  • Trade detail records

    Trade detail records include trade statistics of major trading nations.

    Table 1 Trade detail records

    Field Name    Field Type  Field Description
    hs_code       string      List of import and export offering code
    country       smallint    Basic information about countries
    dollar_value  double      Transaction amount
    quantity      double      Transaction volume
    unit          smallint    Measurement unit
    b_country     smallint    Basic information about the target country
    imex          smallint    Import or export
    y_year        smallint    Year
    m_month       smallint    Month

  • Basic information

    Basic information is the dictionary data that the coded fields in the trade detail records refer to.

    Table 2 Basic information about countries (description of country)

    Field Name  Field Type  Field Description
    countryid   smallint    Country code
    country_en  string      English name of a country
    country_cn  string      Chinese name of a country

    Table 3 Information about the update time (description of updatetime)

    Field Name    Field Type  Field Description
    countryid     smallint    Country code
    imex          smallint    Import or export
    hs_len        smallint    Length of the offering code
    minstartdate  string      Minimum start time
    startdate     string      Start time
    newdate       string      Update time
    minnewdate    string      Last update time

    Table 4 Information about import and export offering code (description of hs246)

    Field Name  Field Type  Field Description
    id          bigint      ID
    hs          string      Offering code
    hs_cn       string      Chinese name of an offering
    hs_en       string      English name of an offering

    Table 5 Information about units (description of unit_general)

    Field Name  Field Type  Field Description
    id          smallint    Measurement unit code
    unit_en     string      English name of a measurement unit
    unit_cn     string      Chinese name of a measurement unit
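
To illustrate how the dictionary tables relate to the trade detail records, the following Python sketch resolves the coded fields of one detail record into English names. It is only an illustration under assumptions: it assumes that hs_code matches the hs field of hs246 directly, that country and b_country both use countryid values, and that unit uses the id of unit_general. The class and function names and all sample values are hypothetical placeholders, not real trade data.

```python
from dataclasses import dataclass


@dataclass
class TradeRecord:
    """One row of the trade detail records (fields from Table 1)."""
    hs_code: str
    country: int
    dollar_value: float
    quantity: float
    unit: int
    b_country: int
    imex: int
    y_year: int
    m_month: int


def resolve(record, countries, offerings, units):
    """Replace coded fields with English names from the dictionary tables.

    countries: countryid -> country_en  (Table 2)
    offerings: hs        -> hs_en       (Table 4)
    units:     id        -> unit_en     (Table 5)
    Codes without a dictionary entry resolve to None.
    """
    return {
        "offering": offerings.get(record.hs_code),
        "country": countries.get(record.country),
        "target_country": countries.get(record.b_country),
        "unit": units.get(record.unit),
        "dollar_value": record.dollar_value,
        "quantity": record.quantity,
        "period": f"{record.y_year}-{record.m_month:02d}",
    }


# Hypothetical placeholder data, for illustration only.
countries = {1: "Country A", 2: "Country B"}
offerings = {"000000": "Sample offering"}
units = {1: "Sample unit"}
record = TradeRecord("000000", 1, 1500.0, 30.0, 1, 2, 0, 2020, 3)

row = resolve(record, countries, offerings, units)
print(row)
```

In DLI itself, the same lookups would typically be written as SQL joins between the detail table and the dictionary tables. The in-memory dictionaries here only show which key each lookup uses.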