Selecting Distribution Keys

Updated on 2024-10-14 GMT+08:00


Selecting an appropriate distribution key is essential for a hash table. Follow these principles:

Ensure that the column values are discrete, that is, the column has many distinct values, so that data can be evenly distributed to each DN. The primary key of the table is usually a good choice. For example, for a person information table, choose the ID card number column as the distribution key.
Once the above principle is met, prefer columns used in join conditions as distribution keys so that join tasks can be pushed down to DNs, reducing the amount of data transferred between DNs.
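The two principles above can be sketched as follows. This is a minimal illustration, not a recommended schema: the table and column names (person_info, customers, orders, customer_id, and so on) are hypothetical, and the DISTRIBUTE BY HASH clause is assumed to be available in your distributed GaussDB deployment.

```sql
-- Principle 1: a highly distinct column (here the primary key) as the distribution key.
-- person_info and its columns are hypothetical names used only for illustration.
create table person_info (
    id_card_no varchar(18) primary key,
    full_name  varchar(64),
    birth_date date
) distribute by hash (id_card_no);

-- Principle 2: both tables are distributed on the join column, so rows with the
-- same customer_id are stored on the same DN and the join can run locally on each DN.
create table customers (
    customer_id   bigint,
    customer_name varchar(64)
) distribute by hash (customer_id);

create table orders (
    order_id    bigint,
    customer_id bigint,
    amount      numeric(12, 2)
) distribute by hash (customer_id);

-- A join on the shared distribution key.
select c.customer_name, sum(o.amount) as total_amount
from customers c
join orders o on o.customer_id = c.customer_id
group by c.customer_name;
```

Running EXPLAIN on the join is one way to check whether data is still being redistributed between DNs for your own tables.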

For a hash table, an improper distribution key may cause data skew or poor I/O performance on certain DNs. Therefore, you need to check the table to ensure that data is evenly distributed on each DN. You can run the following SQL statement to check for data skew:

   
      select 
xc_node_id, count(1) 
from tablename 
group by xc_node_id 
order by xc_node_id desc;

Each xc_node_id value corresponds to a DN. Generally, a difference of more than 5% in the amount of data between DNs is regarded as data skew. If the difference exceeds 10%, choose another distribution key.
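To compare the per-DN counts against the 5% and 10% thresholds more directly, the check can be extended to report each DN's deviation from the average. This is only a sketch: the text does not define exactly how the difference is measured, so deviation from the mean row count is an assumption made here, and tablename remains a placeholder for your own table.

```sql
-- Per-DN row count and its percentage deviation from the average across DNs.
-- Assumption: "difference" is interpreted as deviation from the mean row count.
select xc_node_id,
       row_count,
       round(100.0 * (row_count - avg(row_count) over ()) / avg(row_count) over (), 2) as deviation_pct
from (
    select xc_node_id, count(1) as row_count
    from tablename
    group by xc_node_id
) as per_dn
order by row_count desc;
```

Under this interpretation, a DN whose deviation_pct is well above the others is the one receiving a disproportionate share of rows.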

In GaussDB, you can select multiple columns as the distribution key to distribute data more evenly.
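A multi-column distribution key might look like the following sketch. The table and column names are hypothetical; the example assumes that no single column is distinct enough on its own, while the combination of the two columns is.

```sql
-- Hypothetical table: neither store_id nor sale_date alone has many distinct
-- values, but the combination is spread widely enough to hash evenly.
create table daily_store_sales (
    store_id  integer,
    sale_date date,
    revenue   numeric(14, 2)
) distribute by hash (store_id, sale_date);
```

After loading data, the skew check on xc_node_id described above still applies to confirm that the combined key distributes rows evenly.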

For a range or list distribution table, select the distribution key as required. In addition to selecting a proper distribution key, pay attention to how the distribution rules themselves affect data distribution.
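As a rough illustration of how distribution rules affect placement, the following sketch defines a range distribution table. The table name, slice names, and boundary values are hypothetical, and the SLICE syntax should be checked against the CREATE TABLE reference for your GaussDB version before use.

```sql
-- Hypothetical range-distributed table. The slice boundaries, not a hash
-- function, decide where each row is stored, so uneven boundaries relative
-- to the actual data produce skew even if the key column is highly distinct.
create table sensor_readings (
    sensor_id integer,
    reading   numeric(10, 2)
) distribute by range (sensor_id) (
    slice s1 values less than (1000),
    slice s2 values less than (2000),
    slice s3 values less than (maxvalue)
);
```

If most sensor_id values in this example fell below 1000, nearly all rows would land in slice s1, which is why the boundaries need to reflect the real value distribution.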

Parent topic: Best Practices of Table Design
