Analysis of Passed Vehicles at Traffic Checkpoints

This practice demonstrates the analysis of passed vehicles at traffic checkpoints. In this practice, 890 million pieces of data from traffic checkpoints are loaded to a single database table on GaussDB(DWS) for performing accurate query and fuzzy query. It is an example of high-performance query of historical data on GaussDB(DWS).

The sample data has been uploaded to the traffic-data folder in an OBS bucket, and all HUAWEI CLOUD accounts have been granted the read-only permission to access the OBS bucket.

This practice takes about 40 minutes. The basic process is as follows:

  1. Making Preparations
  2. Step 1: Creating a Cluster
  3. Step 2: Using Data Studio to Connect to a Cluster
  4. Step 3: Importing Sample Data
  5. Step 4: Performing Vehicle Analysis

Making Preparations

  • Register a HUAWEI CLOUD account and check the account status before using GaussDB (DWS). The account cannot be in arrears or frozen.
  • Obtain the AK/SK of the account. For details, see Creating Access Keys (AK and SK).

Step 1: Creating a Cluster

  1. Log in to the HUAWEI CLOUD management console.
  2. Choose Service List > EI Enterprise Intelligence > Data Warehouse Service.
  3. In the navigation pane on the left, choose Cluster Management. On the displayed page, click Buy DWS Cluster in the upper right corner.
  4. Configure the parameters according to Table 1.

    Table 1 Software configuration

    Parameter

    Configuration Method

    Region

    Select the CN North-Beijing4.

    NOTE:

    This section uses CN North-Beijing4 as an example. If you want to perform operations in other regions, ensure that all operations are performed in the same region.

    AZ

    AZ2

    Cluster Type

    Standard

    CPU Architecture

    x86

    Node Flavor

    dws2.m6.4xlarge.8 (16 vCPU | 128 GB | 2000 GB SSD)

    Nodes

    3

    Cluster Name

    dws-demo

    Administrator Account

    dbadmin

    Administrator Password

    Demo@123456

    Confirm Password

    Demo@123456

    Database Port

    8000

    VPC

    vpc-default

    Subnet

    subnet-default(192.168.0.0/24)

    Security Group

    Automatic creation

    EIP

    Buy now

    Bandwidth

    1 Mbit/s

    Advanced Settings

    Default

  5. Confirm the information, click Buy Now, and then click Submit.
  6. Wait about 6 minutes. After the cluster is created, click next to the cluster name. On the displayed cluster information page, record the Public Network Address, for example, dws-demov.dws.huaweicloud.com.

Step 2: Using Data Studio to Connect to a Cluster

  1. Ensure that JDK 1.8.0 or later has been installed on the client host. Choose PC > Properties > Advanced System Settings > Environment Variables and set JAVA_HOME (for example, C:\Program Files\Java\jdk1.8.0_191). Add ;%JAVA_HOME%\bin to the variable path.
  2. On the Connection Management page of the GaussDB(DWS) console, download the Data Studio GUI client.
  3. Decompress the downloaded Data Studio software package, go to the decompressed directory, and double-click Data Studio.exe to start the client.
  4. On the Data Studio main menu, choose File > New Connection. In the dialog box that is displayed, configure the connection based on Table 2.

    Table 2 Data Studio software configuration

    Parameter

    Configuration Method

    Database Type

    GaussDB(DWS)

    Column

    dws-demo

    Host

    dws-demov.dws.huaweicloud.com

    The value of this parameter must be the same as the value of Public Network Address queried in Step 1: Creating a Cluster.

    Host Port

    8000

    Database Name

    postgres

    User Name

    dbadmin

    Password

    Demo@123456

    Enable SSL

    Disable

  5. Click OK.

Step 3: Importing Sample Data

After connecting to the cluster using the SQL client tool, perform the following operations in the SQL client tool to import the sample data from traffic checkpoints and perform data queries.

  1. Execute the following statement to create the traffic database:

    1
    create database traffic encoding 'utf8' template template0; 
    

  2. Perform the following steps to switch to the new database:

    1. In the Object Browser window of the Data Studio client, right-click the database connection and choose Refresh from the shortcut menu. Then, the new database is displayed.
    2. Right-click the name of the new database traffic and choose Connect to DB from the shortcut menu.
    3. Right-click the name of the new database traffic and choose Open Terminal from the shortcut menu. The SQL command window for connecting to the specified database is displayed. Perform the following steps in the window.

  3. Execute the following statements to create a database table for storing vehicle information from traffic checkpoints:

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    create schema traffic_data;
    set current_schema= traffic_data;
    drop table if exists GCJL;
    CREATE TABLE GCJL
    (
            kkbh   VARCHAR(20), 
            hphm   VARCHAR(20),
            gcsj   DATE ,
            cplx   VARCHAR(8),
            cllx   VARCHAR(8),
            csys   VARCHAR(8)
    )
    with (orientation = column, COMPRESSION=MIDDLE)
    distribute by hash(hphm);
    

  4. Create a foreign table, which is used to identify and associate the source data on OBS. Replace <Access_Key_Id> and <Secret_Access_Key> with the actual values obtained in Making Preparations.

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    create schema tpchobs;
    set current_schema = 'tpchobs';
    drop FOREIGN table if exists GCJL_OBS;
    CREATE FOREIGN TABLE GCJL_OBS
    (
            like traffic_data.GCJL
    )
    SERVER gsmpp_server 
    OPTIONS (
            encoding 'utf8',
            location 'obs://dws-demo-cn-north-4/traffic-data/gcxx',
            format 'text',
            delimiter ',',
            access_key '<Access_Key_Id>',
            secret_access_key '<Secret_Access_Key>',
            chunksize '64',
            IGNORE_EXTRA_DATA 'on'
    );
    

  5. Execute the following statement to import data from the foreign table to the database table:

    1
    insert into traffic_data.GCJL select * from tpchobs.GCJL_OBS;
    

    It takes some time to import data.

Step 4: Performing Vehicle Analysis

  1. Executing Analyze

    This statement collects statistics related to common table content in databases. The statistics are saved under the system directory PG_STATISTIC. The statistics are useful when you run the planner, which provides you with an efficient query execution plan.

    Execute the following statement to generate the statistics related to the tables:

    1
    Analyze;
    
  1. Querying the data volume of the data table

    Execute the following statement to query the number of loaded data records:

    1
    2
    set current_schema= traffic_data;
    Select count(*) from traffic_data.gcjl;
    
  1. Accurate vehicle query

    Execute the following statement to query driving route by license plate number and time segment: GaussDB(DWS) responds to the request in seconds.

    1
    2
    3
    4
    5
    6
    set current_schema= traffic_data;
    select hphm, kkbh, gcsj
    from traffic_data.gcjl
    where hphm = 'YueD38641' 
    and gcsj between '2016-01-06' and '2016-01-07'
    order by gcsj desc;
    
  1. Fuzzy vehicle query

    Execute the following statement to query driving route by license plate number and time segment. GaussDB(DWS) responds to the request in seconds.

    1
    2
    3
    4
    5
    6
    7
    set current_schema= traffic_data;
    select hphm, kkbh, gcsj 
    from traffic_data.gcjl
    where hphm like '%A23F%' 
    and kkbh in('508', '1125', '2120') 
    and gcsj between '2016-01-01' and '2016-01-07'  
    order by hphm,gcsj desc;