Checkpoint Vehicle Analysis
This practice shows you how to analyze passing vehicles at checkpoints. In this practice, 890 million data records from checkpoints are loaded to a single database table on GaussDB(DWS) for accurate and fuzzy query, demonstrating the ability of GaussDB(DWS) to perform high-performance query for historical data.
The sample data has been uploaded to the traffic-data folder in an OBS bucket, and all Huawei Cloud accounts have been granted the read-only permission for accessing the OBS bucket.
General Procedure
This practice takes about 40 minutes. The basic process is as follows:
Supported Regions
Region |
OBS Bucket |
---|---|
EU-Dublin |
dws-demo-eu-west-101 |
Making Preparations
- You have registered a GaussDB(DWS) account and checked the account status before using GaussDB(DWS). The account cannot be in arrears or frozen.
- You have obtained the AK and SK of the account.
Step 1: Creating a Cluster
- Log in to the management console.
- Click Service List and choose Analytics > GaussDB(DWS).
- In the navigation pane on the left, choose Clusters. On the displayed page, click Create Cluster in the upper right corner.
- Configure the parameters according to Table 2.
Table 2 Basic configurations Parameter
Configuration
Region
Select CN North-Beijing4 or CN-Hong KongEU-Dublin.
NOTE:EU-Dublin is used as an example. You can select other regions as required. Ensure that all operations are performed in the same region.
AZ
AZ2
Resource
Standard Warehouse
Compute Resource
ECS
Storage type
Cloud SSD
CPU Architecture
X86
Node Flavor
dws2.m6.4xlarge.8 (16 vCPUs | 128 GB | 2000 GB SSD)
NOTE:If this flavor is sold out, select other AZs or flavors.
Hot Storage
100 GB/node
Nodes
3
- Verify that the information is correct and click Next: Configure Network. Configure the network by referring to Table 3.
- Verify that the information is correct and click Next: Configure Advanced Settings. Configure the network by referring to Table 4.
- Click Next: Confirm, confirm the configuration, and click Next.
- Wait about 6 minutes. After the cluster is created, click next to the cluster name. On the displayed cluster information page, record the value of Public Network Address.
Step 2: Using Data Studio to Connect to a Cluster
- Ensure that JDK 1.8.0 or later has been installed on the client host. Choose PC > Properties > Advanced System Settings > Environment Variables and set JAVA_HOME (for example, C:\Program Files\Java\jdk1.8.0_191). Add ;%JAVA_HOME%\bin to the variable path.
- On the Connections page of the GaussDB(DWS) console, download the Data Studio GUI client.
- Decompress the downloaded Data Studio software package, go to the decompressed directory, and double-click Data Studio.exe to start the client.
- On the Data Studio main menu, choose File > New Connection. In the dialog box that is displayed, configure the connection based on Table 5.
Table 5 Data Studio software configuration Parameter
Configuration
Database Type
GaussDB(DWS)
Connection Name
dws-demo
Host
dws-demov.dws.huaweicloud.com
The value of this parameter must be the same as the value of Public Network Address queried in Step 1: Creating a Cluster.
Host Port
8000
Database Name
gaussdb
User Name
dbadmin
Password
-
Enable SSL
Disable
- Click OK.
Step 3: Importing Sample Data
After connecting to the cluster using the SQL client tool, perform the following operations on the SQL client tool to import the sample data from traffic checkpoints and perform data queries.
- Execute the following statement to create the traffic database:
1
CREATE DATABASE traffic encoding 'utf8' template template0;
- Perform the following steps to switch to the new database:
- In the Object Browser window of the Data Studio client, right-click the database connection and choose Refresh from the shortcut menu. Then, the new database is displayed.
- Right-click the name of the new database traffic and choose Connect to DB from the shortcut menu.
- Right-click the name of the new database traffic and choose Open Terminal from the shortcut menu. The SQL command window for connecting to the specified database is displayed. Perform the following steps in the window.
- Execute the following statements to create a database table for storing vehicle information from traffic checkpoints:
1 2 3 4 5 6 7 8 9 10 11 12 13 14
CREATE SCHEMA traffic_data; SET current_schema= traffic_data; DROP TABLE if exists GCJL; CREATE TABLE GCJL ( kkbh VARCHAR(20), hphm VARCHAR(20), gcsj DATE , cplx VARCHAR(8), cllx VARCHAR(8), csys VARCHAR(8) ) with (orientation = column, COMPRESSION=MIDDLE) distribute by hash(hphm);
- Create a foreign table, which is used to identify and associate the source data on OBS.
- <obs_bucket_name> indicates the OBS bucket name. Only some regions are supported. For details about the supported regions and OBS bucket names, see Supported Regions. GaussDB(DWS) clusters do not support cross-region access to OBS bucket data.
- , and replace <Access_Key_Id> and <Secret_Access_Key> with the value obtained in Making Preparations.
- // Hard-coded or plaintext AK and SK are risky. For security purposes, encrypt your AK and SK and store them in the configuration file or environment variables.
- If the message "ERROR: schema "xxx" does not exist Position" is displayed when you create a foreign table, the schema does not exist. Perform the previous step to create a schema.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
CREATE SCHEMA tpchobs; SET current_schema = 'tpchobs'; DROP FOREIGN table if exists GCJL_OBS; CREATE FOREIGN TABLE GCJL_OBS ( like traffic_data.GCJL ) SERVER gsmpp_server OPTIONS ( encoding 'utf8', location 'obs://<obs_bucket_name>/traffic-data/gcxx', format 'text', delimiter ',', access_key '<Access_Key_Id>', secret_access_key '<Secret_Access_Key>', chunksize '64', IGNORE_EXTRA_DATA 'on' );
- Execute the following statement to import data from the foreign table to the database table:
1
INSERT INTO traffic_data.GCJL SELECT * FROM tpchobs.GCJL_OBS;
It takes some time to import data.
Step 4: Performing Vehicle Analysis
- Performing ANALYZE
This statement collects statistics related to ordinary tables in databases. The statistics are saved to the system catalog PG_STATISTIC. When you run the planner, the statistics help you develop an efficient query execution plan.
Execute the following statement to generate the table statistics:
1
ANALYZE;
- Querying the data volume of the data table
Execute the following statement to query the number of loaded data records:
1 2
SET current_schema= traffic_data; SELECT count(*) FROM traffic_data.gcjl;
- Accurate vehicle query
Run the following statements to query the driving route of a vehicle by the license plate number and time segment. GaussDB(DWS) responds to the request in seconds.
1 2 3 4 5 6
SET current_schema= traffic_data; SELECT hphm, kkbh, gcsj FROM traffic_data.gcjl where hphm = 'YD38641' and gcsj between '2016-01-06' and '2016-01-07' order by gcsj desc;
- Fuzzy vehicle query
Run the following statements to query the driving route of a vehicle by the license plate number and time segment. GaussDB(DWS) responds to the request in seconds.
1 2 3 4 5 6 7
SET current_schema= traffic_data; SELECT hphm, kkbh, gcsj FROM traffic_data.gcjl where hphm like 'YA23F%' and kkbh in('508', '1125', '2120') and gcsj between '2016-01-01' and '2016-01-07' order by hphm,gcsj desc;
Feedback
Was this page helpful?
Provide feedbackThank you very much for your feedback. We will continue working to improve the documentation.