Column Design Specifications
- [Rule] Use recommended data types for column design.
Use the recommended data types when designing columns. Some data types are not recommended because they suit only a limited set of service scenarios and have not been used commercially at scale.
Table 1 Best practices of database data types

| Data Type | Description | Recommended or Not |
|---|---|---|
| UUID | Different clusters may generate the same UUID. | Prohibited |
| Serial integer | Auto-increment column, including SMALLSERIAL, SERIAL, and BIGSERIAL. | Prohibited |
| Integer | TINYINT, SMALLINT, INTEGER, and BIGINT | Recommended |
| Arbitrary-precision | NUMERIC/DECIMAL | Recommended |
| Floating-point | REAL/FLOAT4, DOUBLE PRECISION/FLOAT8, and FLOAT | Recommended |
| Boolean | Boolean | Recommended |
| Fixed-length character | CHAR(n) | Recommended |
| Variable-length character | VARCHAR(n) and NVARCHAR2(n) | Recommended |
| Variable-length character | TEXT and CLOB (character large object) | Not recommended |
| Time | DATE, TIME, TIMESTAMP, SMALLDATETIME, INTERVAL, and REALTIME | Recommended |
| Time | TIMETZ and TIMESTAMPTZ | Not recommended |
| Binary | BYTEA (variable-length binary) | Recommended |
| Binary | BLOB (binary large object) and RAW (variable-length hexadecimal string) | Not recommended |
| Bit string | BIT(n) and VARBIT(n) | Recommended |
| Special character | NAME and "CHAR" are usually used within the database system. | Not recommended |
| JSON | JSON data does not support operators. | Not recommended |
| HLL | You are advised to use the HLL functions to reduce the impact on performance. | Not recommended |
| Currency | The MONEY type stores a currency amount with fixed fractional precision. | Not recommended |
| Geometric | POINT, LSEG, BOX, PATH, POLYGON, and CIRCLE | Not recommended |
| Network address | Stores IPv4 and MAC addresses. | Not recommended |
- [Rule] Use the most specific numeric data types. If integer, floating-point, and NUMERIC types can all provide the precision the service requires, prefer them in that order: integer first, then floating-point, then NUMERIC.
- [Rule] Set the data type of a numeric column based on its value range, and use NUMERIC or DECIMAL as little as possible.
NUMERIC and DECIMAL are equivalent. Operations on NUMERIC or DECIMAL data are CPU-intensive.
Table 2 Storage space and value range of numeric data types

| Type | Storage Size (Bytes) | Minimum Value | Maximum Value |
|---|---|---|---|
| TINYINT | 1 | 0 | 255 |
| SMALLINT | 2 | -32,768 | 32,767 |
| INTEGER | 4 | -2,147,483,648 | 2,147,483,647 |
| BIGINT | 8 | -9,223,372,036,854,775,808 | 9,223,372,036,854,775,807 |
| REAL/FLOAT4 | 4 | 6 decimal digits of precision | |
| DOUBLE PRECISION/FLOAT8 | 8 | 15 decimal digits of precision | |
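As an illustration of choosing a type from the value range, the following minimal Python sketch returns the narrowest integer type in Table 2 that covers a given range. The ranges are copied from the table; the function name `smallest_integer_type` and the NUMERIC fallback are assumptions made for this example and are not part of GaussDB.

```python
# Ranges copied from Table 2, ordered from narrowest to widest.
INTEGER_TYPES = [
    ("TINYINT", 0, 255),
    ("SMALLINT", -32768, 32767),
    ("INTEGER", -2147483648, 2147483647),
    ("BIGINT", -9223372036854775808, 9223372036854775807),
]


def smallest_integer_type(min_value: int, max_value: int) -> str:
    """Return the narrowest type whose range covers [min_value, max_value].

    Hypothetical helper for illustration only.
    """
    if min_value > max_value:
        raise ValueError("min_value must not exceed max_value")
    for name, low, high in INTEGER_TYPES:
        if low <= min_value and max_value <= high:
            return name
    # No integer type is wide enough, so fall back to arbitrary precision.
    return "NUMERIC"


if __name__ == "__main__":
    print(smallest_integer_type(0, 150))      # an age column
    print(smallest_integer_type(-1, 150))     # the same column with a -1 sentinel
    print(smallest_integer_type(0, 10**19))   # wider than BIGINT
```

Note that TINYINT is unsigned in this table, so a range that includes any negative value needs at least SMALLINT.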
- [Rule] Select a proper string type. If the values of a column always have the same length, use a fixed-length character type, which pads shorter values with spaces automatically. Otherwise, use the variable-length character type VARCHAR.
For a typical fixed-length column such as gender, which accepts only f or m and occupies one byte, you are advised to use a fixed-length data type (for example, CHAR(n)).
If there is no such requirement, or if longer values may be needed in the future, prefer variable-length character types (such as VARCHAR and TEXT). You are advised not to specify the length of variable-length characters.
The reasons are as follows:
- For fixed-length columns, input data shorter than the fixed length is padded with space characters before it is saved, which wastes storage space.
- For fixed-length character types, the entire table needs to be scanned and rewritten if the length needs to be extended later. This causes high performance overhead and affects online services.
- For a variable-length column with a specified maximum length, the system checks whether the length exceeds the limit each time data is inserted. This causes performance overhead.
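The padding cost described in the first reason can be estimated with a small Python sketch. It models only the payload bytes and assumes single-byte (ASCII) data; per-row length headers and any storage compression in a real database are ignored, and both function names are hypothetical.

```python
def fixed_length_bytes(values, n):
    """Payload bytes when every value is space-padded to n characters."""
    for value in values:
        if len(value) > n:
            raise ValueError(f"{value!r} does not fit in {n} characters")
    return sum(len(value.ljust(n).encode("utf-8")) for value in values)


def variable_length_bytes(values):
    """Payload bytes when only the actual characters are stored."""
    return sum(len(value.encode("utf-8")) for value in values)


if __name__ == "__main__":
    genders = ["f", "m", "f"]
    cities = ["Oslo", "Rio", "Johannesburg"]
    # A one-character code wastes nothing when stored at fixed length 1.
    print(fixed_length_bytes(genders, 1), variable_length_bytes(genders))
    # Values of varying length pay for the padding.
    print(fixed_length_bytes(cities, 20), variable_length_bytes(cities))
```

Under these assumptions a fixed length of 1 costs nothing extra for the gender codes, while padding the three city names to 20 characters roughly triples the payload.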
- [Rule] Do not store data of the numeric type in columns of the character type.
If numeric calculation or comparison (for example, a filter condition) is performed on data stored in character columns, the required data type conversion adds unnecessary overhead, and indexes on the column may not be usable, which degrades query performance.
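The following Python sketch shows why numbers kept as text need a conversion on every row before they can be compared correctly. It illustrates text ordering in general, not GaussDB's implicit cast rules; the sample values and function names are invented for the example.

```python
amounts_as_text = ["9", "10", "100", "25"]


def filter_as_text(values, threshold):
    """Compare character by character, as plain text comparison does."""
    return [v for v in values if v > threshold]


def filter_as_numbers(values, threshold):
    """Convert every value first, then compare numerically."""
    return [int(v) for v in values if int(v) > threshold]


if __name__ == "__main__":
    # Text comparison looks at the first differing character, so "9" > "50".
    print(filter_as_text(amounts_as_text, "50"))
    # The numeric comparison is correct but pays for one conversion per row.
    print(filter_as_numbers(amounts_as_text, 50))
    # Sorting the text values does not give numeric order either.
    print(sorted(amounts_as_text))
```

Storing the values in a numeric column removes both the conversion and the risk of text-order surprises.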
- [Rule] Do not store data of the time or date type in columns of the character type.
If data stored in character columns is calculated with or compared against time or date values (for example, in a filter condition), the required data type conversion adds unnecessary overhead, and indexes on the column may not be usable, which degrades query performance.
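A short Python sketch of the same problem for dates: text values compare character by character, so an unpadded month sorts incorrectly until each value is parsed. The sample dates and the `parse_date` helper are assumptions made for this illustration.

```python
from datetime import date

dates_as_text = ["2024-10-01", "2024-2-15", "2023-12-31"]


def parse_date(text: str) -> date:
    """Parse a year-month-day string, with or without zero padding."""
    year, month, day = (int(part) for part in text.split("-"))
    return date(year, month, day)


def latest_as_text(values):
    """Pick the 'latest' value using plain text comparison."""
    return max(values)


def latest_as_date(values):
    """Parse every value, then pick the latest real date."""
    return max(parse_date(v) for v in values)


if __name__ == "__main__":
    print(latest_as_text(dates_as_text))   # text order picks February
    print(latest_as_date(dates_as_text))   # date order picks October
```

A DATE or TIMESTAMP column avoids both the per-row parsing and the dependence on a consistent text format.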
- [Rule] Add NOT NULL constraints to columns that never have NULL values.
In certain scenarios, the optimizer may specially optimize NOT NULL columns to improve query performance.
- [Rule] Use the same data type for joined columns.
If the column types are inconsistent during a join operation, overhead will be caused by data type conversion.
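As a rough model of this overhead, the Python sketch below implements a tiny hash join. When the two key columns have different types, nothing matches until every probed key is converted. This is an illustration only: a database would apply an implicit cast rather than return no rows, and the table contents and function name are invented.

```python
def hash_join(left_rows, right_rows, left_key, right_key, convert=None):
    """Join two lists of dicts on a key, optionally converting each left key."""
    index = {}
    for row in right_rows:
        index.setdefault(row[right_key], []).append(row)

    joined = []
    for row in left_rows:
        key = row[left_key]
        if convert is not None:
            key = convert(key)  # extra work performed once per probed row
        for match in index.get(key, []):
            joined.append({**row, **match})
    return joined


customers = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bo"}]
orders = [{"customer_id": "1", "total": 30}, {"customer_id": "2", "total": 12}]

if __name__ == "__main__":
    # Integer keys never equal string keys, so nothing joins.
    print(len(hash_join(customers, orders, "id", "customer_id")))
    # Converting each key makes the join work, at a per-row cost.
    print(len(hash_join(customers, orders, "id", "customer_id", convert=str)))
```

Declaring both columns with the same type removes the conversion step entirely.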
- [Rule] Do not define more than eight large fields (such as VARCHAR(1000) and VARCHAR(4000)) in a table.
- [Recommendation] When defining a column, you are advised to create a comment for the column to facilitate subsequent maintenance.
For details about the description, value range, and usage of different types of fields, see Data Type.
- [Recommendation] In tables that are logically related, columns having the same meaning should use the same data type.
- [Recommendation] For string data, you are advised to use variable-length strings and specify the maximum length. To avoid truncation, ensure that the specified maximum length is greater than the maximum number of characters to be stored. You are advised not to use CHAR(n), BPCHAR(n), NCHAR(n), or CHARACTER(n), unless you know that the string length is fixed.
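Before fixing a maximum length, it can help to check sample data against the candidate limit. The Python sketch below is one way to do that. It counts characters by default, as the recommendation does, and can count encoded bytes instead in case the database measures the limit in bytes; that option, the UTF-8 default, and the function name are assumptions for this example.

```python
def values_that_would_not_fit(values, max_length, count_bytes=False,
                              encoding="utf-8"):
    """Return the values longer than max_length (in characters or bytes)."""
    too_long = []
    for value in values:
        size = len(value.encode(encoding)) if count_bytes else len(value)
        if size > max_length:
            too_long.append(value)
    return too_long


if __name__ == "__main__":
    names = ["Ann", "Zo\u00eb", "Bartholomew"]
    # Counted in characters, only the longest name exceeds a limit of 3.
    print(values_that_would_not_fit(names, 3))
    # Counted in UTF-8 bytes, the accented name exceeds it as well.
    print(values_that_would_not_fit(names, 3, count_bytes=True))
```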
- [Recommendation] Add NOT NULL constraints to columns that are used for WHERE filtering and join operations.
In certain scenarios, the optimizer may specially optimize NOT NULL columns to greatly improve query performance.
- [Recommendation] Do not reserve columns for a table. In most cases, you can quickly add or delete table columns, or change the default values of columns.
An added column must meet the following requirements. Otherwise, the entire table is updated, which causes additional overhead and affects online services.
- The data type is BOOLEAN, BYTEA, SMALLINT, BIGINT, INTEGER, NUMERIC, FLOAT, DOUBLE PRECISION, CHAR, VARCHAR, TEXT, TIMESTAMPTZ, TIMESTAMP, DATE, TIME, TIMETZ, or INTERVAL.
- The length of the default value cannot exceed 128 bytes.
- The default value of the added column does not contain a volatile function.
- The default value is required and cannot be NULL.
If you are not sure whether the third condition is met, contact GaussDB technical support for evaluation.
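The four requirements can be turned into a rough pre-check, sketched here in Python. Several parts are assumptions and not GaussDB behavior: the length is measured on the default expression text in UTF-8, and `VOLATILE_FUNCTIONS` is a small, incomplete sample of functions that are volatile in PostgreSQL-family databases. A clean result from this helper does not replace the evaluation by technical support.

```python
import re

# Types listed in the first requirement.
FAST_ADD_TYPES = {
    "BOOLEAN", "BYTEA", "SMALLINT", "BIGINT", "INTEGER", "NUMERIC", "FLOAT",
    "DOUBLE PRECISION", "CHAR", "VARCHAR", "TEXT", "TIMESTAMPTZ", "TIMESTAMP",
    "DATE", "TIME", "TIMETZ", "INTERVAL",
}

# Assumption: an illustrative, incomplete sample of volatile functions.
VOLATILE_FUNCTIONS = ("random", "clock_timestamp", "timeofday", "nextval")


def added_column_problems(data_type, default_expr):
    """Return the requirements that a planned column appears to violate."""
    problems = []
    # Drop a length or precision suffix such as "(20)" before the lookup.
    base_type = re.sub(r"\(.*\)", "", data_type).strip().upper()
    if base_type not in FAST_ADD_TYPES:
        problems.append("data type is not in the listed set")
    if default_expr is None or default_expr.strip().upper() == "NULL":
        problems.append("default value is required and cannot be NULL")
        return problems
    if len(default_expr.encode("utf-8")) > 128:
        problems.append("default value exceeds 128 bytes")
    lowered = default_expr.lower()
    if any(re.search(rf"\b{name}\s*\(", lowered) for name in VOLATILE_FUNCTIONS):
        problems.append("default value may contain a volatile function")
    return problems


if __name__ == "__main__":
    print(added_column_problems("VARCHAR(20)", "'n/a'"))
    print(added_column_problems("NUMERIC", "random()"))
```

An empty list means none of the modeled checks failed.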