GaussDB(DWS) Overall Development and Design Specifications
To improve data development efficiency, reduce resource consumption, and minimize business disruptions, developers should follow the development and design specifications listed in Table 1 before using GaussDB(DWS) for data development.
These design specifications are divided into three levels of importance: Caution, Warning, and Danger. Each level has the following meanings:
- Caution: Recommended design standards that users should follow. Failure to comply may affect business performance and increase O&M difficulties.
- Warning: Compliance with these rules ensures efficient business operations. Violation of these rules may result in business disruptions and errors.
- Danger: Violation of these rules may result in data deletion, significant system changes, or even system failures.
Scenario |
No. |
Specifications |
Level |
Impact of Non-Compliance |
---|---|---|---|---|
User and connection management |
1 |
Avoid having all business operations run under a single database user. Instead, plan different database users according to the business modules. |
Caution |
When abnormal business or user operations cause problems in the entire cluster, it is difficult to quickly isolate and control them. |
2 |
For better access control of different business modules, it is better to use multiple users and permissions instead of depending on the system administrator user to run business operations. For details, see GaussDB(DWS) User and Permissions Management. |
Caution |
Administrators with excessive user privileges are difficult to manage. |
|
3 |
You are not advised to connect services to a single CN. Instead, configure load balancing to ensure that connections to each CN are balanced. |
Warning |
Connections to CNs are unbalanced, load skew slows down the overall performance, and services cannot be automatically recovered when CNs are faulty. |
|
4 |
After connecting to the database and completing required operations, close the database connection in a timely manner to prevent idle connections from continuously occupying resources and consuming connections and public resources. |
Warning |
Having too many idle connections consumes a lot of public resources, making it impossible for normal business operations to establish connections and run. |
|
5 |
In the scenario where the database connection pool is used, after the database GUC parameters are set using the SET statement in the service, the parameters must be restored using the RESET statement before the connection pool is returned. |
Warning |
Jobs that use connection pools interfere with each other. |
|
Object design |
6 |
You are not advised to specify a user-defined tablespace when creating an ordinary table. |
Warning |
Custom table spaces occupy storage space that is not controlled, which can lead to space and performance issues. |
7 |
Do not specify the COMPRESS compression attribute when creating a row-store table. |
Warning |
The row-store compression feature is unstable and has been deprecated. |
|
8 |
You are not advised to frequently perform small batch real-time data insertion for ordinary column-store tables. |
Warning |
Real-time data insertion for column-store causes severe small CU expansion, triggering continuous space, resource, and performance issues. |
|
9 |
When creating a hash-distributed table object, ensure that data is evenly distributed. (For a table with more than 10 GB data, the skew rate must be less than 10%.) |
Warning |
Data distribution skewness leads to calculation skewness, resulting in space and performance issues. |
|
10 |
When creating a table object for REPLICATION distribution, ensure that the number of rows in the table is less than 1 million. |
Caution |
Replicating large amounts of table data can increase storage space risks. |
|
11 |
When creating an HStore table, ensure that the database GUC parameter settings meet the following requirements:
|
Warning |
The delta table of HStore expands rapidly, causing sustained performance degradation. |
|
12 |
For a large table (with more than 50 million rows of data) that contains the time field, the table must be designed as a partition table and the partition interval must be properly designed based on the query characteristics. |
Warning |
Querying and cleaning specific time periods for partitioned tables is much more efficient than for non-partitioned tables. |
|
13 |
For a table where a large amount of data needs to be added, deleted, or modified, it is recommended that the number of indexes be less than or equal to three. The maximum number of indexes is five. |
Suggestion |
Excessive indexing can slow down data management tasks like adding, deleting, and modifying data. It can even cause xlog buildup and cluster malfunctions in extreme cases. |
|
Operation regulations |
14 |
In scenarios where the client explicitly starts a transaction or manually disables "autocommit", you must manually execute a commit command to commit the transaction. |
Warning |
Long transactions caused by uncommitted transactions block other locked businesses and recovery operations. |
15 |
You are advised to optimize the statements that take longer than 30 minutes to execute. |
Caution |
Issues with business efficiency and performance. |
|
16 |
Avoid business operations that take longer than 2 hours to execute to prevent issues such as long transactions and long locks. |
Warning |
Long transactions block other locked businesses and recovery operations, affecting business performance. |
|
17 |
Be cautious when using DROP OBJECT (e.g., DATABASE, USER/ROLE, SCHEMA, TABLE, VIEW) as it may cause data loss, especially with CASCADE deletions. Always back up data before proceeding. |
Danger |
The data is deleted abnormally and cannot be recovered quickly. |
|
18 |
Avoid executing ALTER (add, delete, modify columns, DROP PARTITION, etc.) and TRUNCATE operations during business peak hours to prevent long SQL blocking ALTER, TRUNCATE operations, or SQL business being blocked by ALTER or TRUNCATE. |
Warning |
ALTER and TRUNCATE have high lock levels, which can block other concurrent businesses and be blocked by other businesses, causing business performance issues. |
|
19 |
Do not perform the CREATE INDEX and REINDEX operations on large tables during peak hours. |
Warning |
CREATE INDEX and REINDEX operations can block data import and take a long time to execute on large tables, causing prolonged data import. |
|
20 |
Avoid SQL statements that cannot be pushed down (indicated by REMOTE_XXX_QUERY in the EXPLAIN output). |
Caution |
CN bottleneck, causing business performance issues. |
|
21 |
Refrain from frequent COUNT queries on large row-store tables (over 10 million rows). |
Caution |
High I/O usage, causing business performance issues. |
|
22 |
Limit the number of UNION/UNION ALL branches in a single SQL statement to 50, and associate no more than 25 non-replicated tables. |
Caution |
High resource consumption, causing business performance issues. |
|
23 |
Prevent Cartesian products by ensuring proper association conditions during large table joins. |
Caution |
High resource consumption, causing business performance issues. |
|
24 |
In multi-table association queries, keep the number of streams in a single statement below 100. |
Caution |
High CPU usage, causing business performance issues. |
|
25 |
Use the WITH RECURSIVE statement cautiously. Define data repetition limits and termination conditions to ensure recursion ends as expected. |
Caution |
Infinite loops, causing business performance issues. |
|
26 |
Avoid using UPDATE or DELETE to modify or delete a large volume of data and use TRUNCATE PARTITION or DROP PARTITION instead. |
Caution |
Dirty pages, causing business performance issues. |
|
27 |
Do not concurrently update the same column-store table using the UPDATE and UPSERT statements. |
Caution |
Lock issues, causing business performance issues. |
|
28 |
Avoid using stored procedures, especially those with complex structures and multiple layers of nesting. |
Caution |
Low maintenance efficiency and high fault locating costs. |
|
O&M management |
29 |
If you need to adjust database parameters at the cluster level, contact Huawei for a risk assessment before making any changes. |
Danger |
Adjusting parameters at the cluster level can impact customer services, so it is important to conduct a risk assessment before making any changes. |
30 |
If more than 10% of a table's records are added or modified at once, explicitly execute the ANALYZE operation. |
Warning |
Business performance stability issues. |
|
31 |
Execute VACUUM FULL on tables with a high dirty page rate and small CU ratio exceeding 25% at regular intervals. Perform VACUUM FULL on common tables during low-traffic hours and on system tables when they are offline. |
Warning |
Object expansion, causing business performance issues. |
|
32 |
Alarms of "major" level or above in the database must be promptly handled and cleared. |
Warning |
Long-term cluster stability. |
|
33 |
Before batch business rollout, it is necessary to conduct large-scale testing and validation on the test cluster and notify technical support personnel to ensure business security. |
Caution |
Business launch evaluation and assurance. |
Feedback
Was this page helpful?
Provide feedbackThank you very much for your feedback. We will continue working to improve the documentation.See the reply and handling status in My Cloud VOC.
For any further questions, feel free to contact us through the chatbot.
Chatbot