Updated on 2024-09-06 GMT+08:00

Table Design

  • All created MySQL tables must use the InnoDB engine.
  • The decimal type must be DECIMAL. Do not use FLOAT or DOUBLE.

    FLOAT and DOUBLE have lower precision than DECIMAL and may cause rounding errors. If a value to be stored is beyond the range of DECIMAL, split the value into INTEGER and DECIMAL parts and store them separately.

  • The following reserved words cannot be used: DESC, RANGE, MATCH, and DELAYED.

    For details about the keywords and reserved words of MySQL 8.0 Community Edition, see Keywords and Reserved Words.

    In addition to the keywords and reserved words of MySQL 8.0 Community Edition, some other keywords and reserved words are added to GaussDB(for MySQL). Do not use these keywords and reserved words when naming objects.

    Table 1 lists the new keywords and reserved words in GaussDB(for MySQL).
    Table 1 GaussDB(for MySQL) new keywords and reserved words

    Reserved Word

    Related Scenario

    EXTRA_HEALTH

    High availability

    PBS

    Backup and restoration

    REDO

    Primary/standby replication

    SLICEID

    Shared storage

    SLOWIO

    Shared storage

    SPACEUSAGE

    Shared storage

    RDS_INSTANT

    Recycle bin

    RECYCLE_BIN

    Recycle bin

    RDS_RECYCLE

    Recycle bin

    RDS_TAC

    Recycle bin

    RDS_GDB_CTRL

    RegionlessDB

  • Every data table must have a primary key, which can be either an ordered and unique field related to business or an auto-increment field unrelated to business.
  • Each table field must have a default value and NOT NULL. If the field is the numeric type, use 0 as its default value. If the field is the character type (such as VARCHAR), use an empty string (").

    The absence of a primary key may cause slow execution of the primary database and replication delay.

  • You are not advised to use partitioned tables. If necessary, use multiple independent tables.

    Disadvantages of partitioned tables:

    • All partitions will be locked during DDL operations. As a result, operations on the partitions will be blocked.
    • When a partitioned table contains a large amount of data, it is difficult and risky to perform DDL or other O&M operations on the table.
    • Partition tables are seldom used, which may cause unknown risks.
    • When a single server is poor in performance, splitting a partitioned table is expensive.
    • When all partitions are accessed due to improper operations on a partitioned table, severe performance problems may occur.
  • Each table contains two DATETIME fields: CREATE_TIME and UPDATE_TIME.

    You can obtain the required data from a data warehouse based on these two fields without consulting services.

    When an exception occurs in the database, you can use the two fields to determine the time when the data is inserted and updated. In extreme cases, you can determine whether to restore data based on the fields.

  • VARCHAR is a variable-length character data type. The length of VARCHAR cannot exceed 2,048.

    If the length of a field exceeds 2,048, define the field type as TEXT or create an independent table and use a primary key to associate the related tables. In this way, the index efficiency of other fields is not affected.

  • The length of a single row in a table cannot exceed 1,024 bytes.
  • The maximum number of fields in a single table is 50.
  • If the lengths of strings are almost the same, use the fixed-length character strings.
  • On the premise of ensuring data consistency, cross-table redundant fields are allowed to avoid join queries and improve query performance.

    Redundant fields must comply with the following rules:

    • Fields are not frequently modified.
    • Fields are not larger VARCHAR and TEXT.
  • The data types with proper storage size can save database tablespace and index storage space while improving the search speed. LONG TEXT and BLOB are not recommended.
  • Ensure that all characters are stored and represented in UTF-8 or utf8mb4 encoding. Comments must be provided for tables and fields.
  • Avoid using large transactions.

    For example, if multiple SELECT and UPDATE statements are executed in a high-frequency transaction, the database concurrency capability is severely affected because resources such as locks held by the transaction can be released only when the transaction is rolled back or committed. In this case, data write consistency must also be considered.

  • Full-text indexes are not recommended because there are many limitations on them.
  • For ultra-large tables, you also need to comply with the following rules:
    • Use TINYINT, SMALLINT, and MEDIUM_INT as integer types instead of INT. If a value is non-negative, add UNSIGNED. Keep the field type as short as possible while meeting service evolution requirements.
    • Allocate only the necessary space for VARCHAR length.

      Example:

      CREATE TABLE T1 (A VARCHAR(255));

      After optimization:

      CREATE TABLE T1 (A VARCHAR(Length that meets service requirements));

    • Use enumerations or integers instead of string types.
    • Use TIMESTAMP instead of DATETIME.
    • Keep the number of fields in a single table below 20.
    • Try not to use UNIQUE. The program enforces the constraint.
    • Store IP addresses as integers.
    • Partition fields with strong sequence and add range conditions during queries to improve efficiency.
    • If there is obvious hot data and other data is seldom accessed, place the hot data in a separate partition.
    • Use a database proxy to connect to a database. In scenarios that do not require high consistency, distribute read workloads to read replicas. When dealing with a large volume of queries, add more read replicas to improve query performance.