Getting Started with DWS
After creating a DWS cluster, you can try some best practices provided by DWS to meet your workload requirements.
| Practice | Description |
| --- | --- |
| Data Import and Export | This practice demonstrates how to upload sample data to OBS and import it into a target table on DWS, helping you quickly learn how to import data from OBS to a DWS cluster. You can import data in TXT, CSV, ORC, PARQUET, CARBONDATA, or JSON format from OBS into a DWS cluster for query. |
| | This practice demonstrates how to use General Data Service (GDS) to import data from a remote server to DWS. DWS allows you to import data in TXT, CSV, or FIXED format. |
| | DWS allows you to export ORC data to MRS using an HDFS foreign table. You can specify the export mode and export data format in the foreign table. Data is exported from DWS in parallel through multiple DNs and stored in HDFS, which improves overall export performance. |
| Data Migration | This practice demonstrates how to migrate Oracle data to DWS. |
| | Using a Flink Job of DLI to Synchronize Kafka Data to a DWS Cluster in Real Time: this practice demonstrates how to use DLI Flink jobs to synchronize Kafka consumption data to DWS in real time. It takes about 90 minutes. The cloud services used include Virtual Private Cloud (VPC) and subnets, Elastic Cloud Server (ECS), Object Storage Service (OBS), Distributed Message Service (DMS) for Kafka, Data Lake Insight (DLI), and DWS. |
| Table Optimization | In this practice, you will learn how to optimize table design. You will first create tables without specifying a storage mode, distribution key, distribution mode, or compression mode, load test data, and test system performance. Then you will recreate the tables following best practices, with new storage modes, distribution keys, distribution modes, and compression modes, reload the test data, and test performance again. Comparing the two results shows how table design affects storage space as well as loading and query performance. Estimated time: 60 minutes. |
| Advanced Features | As data volumes grow in big data scenarios, data storage and consumption increase rapidly, while the need for particular data may vary across time periods. Managing data in tiers improves data analysis performance and reduces service costs. In some scenarios, data can be classified into hot data and cold data by access frequency. |
| | Time-related data, such as e-commerce order information and real-time IoT data, is usually stored in partitioned tables that use the time column as the partition key, which facilitates query and maintenance. When such data is imported, the table must already contain partitions covering the corresponding time ranges, but ordinary partitioned tables neither create new partitions nor drop expired ones automatically, so maintenance personnel must do this periodically, increasing O&M costs. To address this, DWS provides automatic partition management: set the table-level parameters period and ttl to automatically create new partitions and delete expired ones, reducing partitioned table maintenance costs and improving query performance. |
| Database Management | This practice demonstrates how to use DWS resource management to help enterprises eliminate bottlenecks in concurrent queries, so that SQL jobs run smoothly without affecting each other while consuming fewer resources than before. |
| | Based on extensive experience with SQL execution mechanisms and practices, SQL statements can be optimized by following certain rules so that they execute faster while still returning correct results. |
| | This practice walks through several typical storage skew cases. |
| | DWS cluster users include system administrators and common users. This practice describes the permissions of each and explains how to create users and query user information. |
| | This practice demonstrates some basic database query cases. |
| Sample Data Analysis | This practice demonstrates how to analyze vehicles that have passed through traffic checkpoints by loading 890 million checkpoint records into a single database table on DWS for exact and fuzzy queries. It is a good example of how DWS handles high-performance queries over historical data. |
| | This practice demonstrates how to load a sample data set from OBS to a DWS cluster and perform data queries, covering multi-table analysis and theme-based analysis scenarios. |
| | In this practice, the daily business data of each retail store is loaded from OBS into the corresponding table in the data warehouse cluster for summarizing and querying KPIs, including store turnover, customer flow, monthly sales ranking, monthly customer flow conversion rate, monthly price-rent ratio, and sales per unit area. This practice demonstrates multidimensional query and analysis with DWS in retail scenarios. |
| Data Security | Data encryption is widely used in information systems to prevent unauthorized access and data leakage. As the core of an information system, a DWS data warehouse provides both transparent encryption and encryption using SQL functions; this practice describes SQL function encryption. |

The sections below sketch hedged, illustrative examples for several of these practices; refer to each linked practice for the authoritative steps.
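A minimal sketch of the OBS import pattern: define a read-only foreign table over the bucket, then insert from it into a local table. The bucket path, columns, and the product_info/product_info_ext names are illustrative, and the AK/SK placeholders must be replaced with real credentials.

```sql
-- Illustrative read-only foreign table over CSV files in an OBS bucket.
CREATE FOREIGN TABLE product_info_ext
(
    product_id    char(30) NOT NULL,
    product_name  varchar(200),
    product_price integer
)
SERVER gsmpp_server
OPTIONS (
    LOCATION 'obs://mybucket/input_data/product_info',  -- source files on OBS
    FORMAT 'CSV',
    ENCODING 'utf8',
    DELIMITER ',',
    ACCESS_KEY 'access_key_to_be_replaced',
    SECRET_ACCESS_KEY 'secret_access_key_to_be_replaced'
)
READ ONLY
LOG INTO product_info_err                -- collects rows that fail to parse
PER NODE REJECT LIMIT 'unlimited';

-- Pull the OBS data into the local target table in parallel.
INSERT INTO product_info SELECT * FROM product_info_ext;
```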
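The GDS import follows the same foreign-table pattern, but the location uses the gsfs:// protocol and points at a GDS process started on the remote server. The address, file pattern, and table names below are illustrative.

```sql
-- Assumes a GDS process is serving /input_data/ on the remote server, e.g.:
--   gds -d /input_data/ -p 192.168.0.90:5000 -H 10.10.0.1/24 -D
CREATE FOREIGN TABLE foreign_tpcds_reasons
(
    r_reason_sk   integer  NOT NULL,
    r_reason_id   char(16) NOT NULL,
    r_reason_desc char(100)
)
SERVER gsmpp_server
OPTIONS (
    LOCATION 'gsfs://192.168.0.90:5000/*',  -- GDS endpoint plus file pattern
    FORMAT 'CSV',
    DELIMITER ','
)
READ ONLY;

INSERT INTO tpcds.reason SELECT * FROM foreign_tpcds_reasons;
```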
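For the ORC export to MRS, the foreign table is declared WRITE ONLY over an HDFS server object. The server definition, HDFS path, and columns here are assumptions; check the HDFS server configured for your cluster.

```sql
-- Assumes an HDFS server object was created by the administrator, e.g.:
--   CREATE SERVER hdfs_server FOREIGN DATA WRAPPER HDFS_FDW
--     OPTIONS (address '192.168.1.245:25000', hdfscfgpath '/opt/hadoop_client', type 'HDFS');
CREATE FOREIGN TABLE product_info_orc_out
(
    product_id   char(30),
    product_name varchar(200)
)
SERVER hdfs_server
OPTIONS (
    format 'orc',                                        -- export data format
    foldername '/user/hive/warehouse/product_info_orc'   -- target HDFS directory
)
WRITE ONLY;

-- Each DN writes its share of the ORC files in parallel.
INSERT INTO product_info_orc_out
SELECT product_id, product_name FROM product_info;
```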
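A sketch of the Flink SQL a DLI job might run for the Kafka synchronization practice: a Kafka source table and a JDBC sink pointing at DWS (which is PostgreSQL-compatible). The topic, addresses, schema, and credentials are illustrative; DLI also provides a DWS-specific sink connector, so check the DLI connector reference for your Flink version.

```sql
-- Kafka source: consumes JSON order events.
CREATE TABLE kafka_source (
    order_id     STRING,
    order_amount DOUBLE,
    order_time   TIMESTAMP(3)
) WITH (
    'connector' = 'kafka',
    'topic' = 'order_topic',
    'properties.bootstrap.servers' = '192.168.0.10:9092',
    'properties.group.id' = 'dws-sync',
    'scan.startup.mode' = 'latest-offset',
    'format' = 'json'
);

-- DWS sink via the generic JDBC connector.
CREATE TABLE dws_sink (
    order_id     STRING,
    order_amount DOUBLE,
    order_time   TIMESTAMP(3)
) WITH (
    'connector' = 'jdbc',
    'url' = 'jdbc:postgresql://dws-endpoint:8000/gaussdb',
    'table-name' = 'public.orders',
    'username' = 'dbadmin',
    'password' = '********'
);

-- Continuous synchronization from Kafka to DWS.
INSERT INTO dws_sink SELECT * FROM kafka_source;
```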
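The before/after contrast in the table optimization practice comes down to DDL like the following sketch; the orders schema is illustrative.

```sql
-- Baseline: no storage mode, distribution key, or compression specified.
CREATE TABLE orders_v1
(
    o_orderkey   bigint,
    o_custkey    bigint,
    o_totalprice decimal(15,2),
    o_orderdate  date
);

-- Tuned: column storage with compression, hash-distributed on the join key.
CREATE TABLE orders_v2
(
    o_orderkey   bigint,
    o_custkey    bigint,
    o_totalprice decimal(15,2),
    o_orderdate  date
)
WITH (ORIENTATION = COLUMN, COMPRESSION = MIDDLE)
DISTRIBUTE BY HASH (o_orderkey);
```

Loading the same test data into both tables and comparing table size and query latency makes the effect of each design choice concrete.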
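A sketch of a hot/cold table, assuming the column-store storage_policy parameter, where 'LMT:n' switches partitions not modified in the last n days to cold (OBS) storage; the exact policy and partition syntax may vary by DWS version.

```sql
-- Partitions not modified within 30 days are switched to cold storage.
CREATE TABLE sensor_data
(
    sensor_id   int,
    reading     numeric,
    record_time timestamp
)
WITH (ORIENTATION = COLUMN, storage_policy = 'LMT:30')
DISTRIBUTE BY HASH (sensor_id)
PARTITION BY RANGE (record_time)
(
    PARTITION p2024 START ('2024-01-01') END ('2024-07-01') EVERY (interval '1 month')
);
```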
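Since the automatic partition management practice names the table-level parameters period and ttl, a sketch might look like the following; the columns and values are illustrative, and the exact DDL placement can vary by version.

```sql
-- period: width of each automatically created partition; ttl: retention
-- window after which expired partitions are dropped automatically.
CREATE TABLE iot_event
(
    device_id  int,
    event_time timestamp,
    payload    text
)
WITH (PERIOD = '1 day', TTL = '7 days')
DISTRIBUTE BY HASH (device_id)
PARTITION BY RANGE (event_time)
(
    PARTITION p_init VALUES LESS THAN ('2024-01-02')
);
```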
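For resource management, concurrency and memory can be capped per workload with a resource pool; the parameter names follow the DWS resource-pool DDL, while the pool name, limits, and user are illustrative.

```sql
-- Cap concurrency and memory for a reporting workload, then bind a user.
CREATE RESOURCE POOL report_pool WITH (ACTIVE_STATEMENTS = 10, MEM_PERCENT = 20);
ALTER USER report_user RESOURCE POOL 'report_pool';
```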
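A common starting point for SQL optimization is inspecting the plan before and after a rewrite; the query below is illustrative.

```sql
-- EXPLAIN PERFORMANCE (a DWS extension of EXPLAIN) reports actual
-- execution-time details for each plan operator.
EXPLAIN PERFORMANCE
SELECT o_custkey, sum(o_totalprice) AS total
FROM orders_v2
GROUP BY o_custkey
ORDER BY total DESC
LIMIT 10;
```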
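Storage skew analysis typically starts by checking how a table's rows spread across DNs; this sketch assumes the table_skewness() helper and the PGXC_GET_TABLE_SKEWNESS view available in recent DWS versions.

```sql
-- Per-DN row distribution of one table; a large spread across DNs
-- indicates a poorly chosen distribution key.
SELECT table_skewness('public.orders_v2');

-- Cluster-wide skew overview, largest tables first.
SELECT * FROM pgxc_get_table_skewness ORDER BY totalsize DESC LIMIT 10;
```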
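User management in the practice boils down to standard statements like the following; the user and table names are illustrative.

```sql
-- Create a common user and grant read access to one table.
CREATE USER jim PASSWORD '********';
GRANT SELECT ON public.orders_v2 TO jim;

-- Query user information from the catalog.
SELECT usename, usesuper, usecreatedb FROM pg_user;
```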
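The exact-versus-fuzzy distinction in the traffic checkpoint practice corresponds to equality and LIKE predicates; the table and column names here are illustrative.

```sql
-- Exact match on a license plate vs. a fuzzy prefix match.
SELECT * FROM traffic_data WHERE license_no = 'YD38641';  -- exact query
SELECT * FROM traffic_data WHERE license_no LIKE 'YD%';   -- fuzzy query
```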
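Finally, a sketch of SQL function encryption, assuming the gs_encrypt_aes128/gs_decrypt_aes128 functions provided by DWS; the value and key are illustrative.

```sql
-- Encrypt a value with a key, then decrypt it back.
SELECT gs_encrypt_aes128('my sensitive value', 'key_salt_1234');
SELECT gs_decrypt_aes128(
           gs_encrypt_aes128('my sensitive value', 'key_salt_1234'),
           'key_salt_1234');
```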