Updated on 2024-09-04 GMT+08:00

What Should I Pay Attention to When Creating a GeminiDB Cassandra Table?

When you create a table in a GeminiDB Cassandra database, memory is pre-allocated for the table to guarantee database performance. As a result, the number of tables that an instance can hold is limited.

Precautions

  • Half of node memory is allocated to the storage engine.
  • A cluster with an odd number of nodes can tolerate N/2-1 faulty nodes, and a cluster with an even number of nodes can tolerate N/2 faulty nodes, where N is the number of nodes.
  • GeminiDB Cassandra API utilizes a table-level hash ring, with the tokens parameter indicating the number of data shards for a table. This parameter differs from the num_tokens used in open-source Cassandra.

Calculating the Number of Tables

The memory required by each table depends on the instance specifications. The calculations in this section assume that each node has 4 vCPUs and 16 GB of memory and that a single table requires 768 MB of memory.

Maximum number of tables that can be created = Total available memory of the cluster / Memory required by a single table, rounded down to an integer

In the following formulas, N is the number of nodes, and N/2 is rounded down to an integer.

  • Cluster with an odd number of nodes

    Available cluster memory = Node memory/2 x (N/2 + 1)

  • Cluster with an even number of nodes

    Available cluster memory = Node memory/2 x (N/2)

For example:
  • Available memory of an instance with 3 nodes, 4 vCPUs, and 16 GB memory = 16/2 x (3/2 + 1) = 8 x 2 = 16 GB

    Maximum number of created tables = 16 x 1024 MB/768 MB = 21 (21.33 rounded down)

  • Available memory of an instance with 4 nodes, 4 vCPUs, and 16 GB memory = 16/2 x (4/2) = 8 x 2 = 16 GB

    Maximum number of created tables = 16 x 1024 MB/768 MB = 21 (21.33 rounded down)

  • Available memory of an instance with 5 nodes, 4 vCPUs, and 16 GB memory = 16/2 x (5/2 + 1) = 8 x 3 = 24 GB

    Maximum number of created tables = 24 x 1024 MB/768 MB = 32
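
The arithmetic in these examples can be sketched in a few lines of Python. This is only an illustration of the formulas above, not an official sizing tool: the function names are hypothetical, and the sketch assumes that N/2 is rounded down and that node memory is an even number of GB.

```python
def available_memory_gb(node_memory_gb: int, nodes: int) -> int:
    """Available cluster memory in GB, following the two formulas above."""
    # Half of the node memory is allocated to the storage engine.
    half_node_memory = node_memory_gb // 2
    if nodes % 2 == 1:
        # Odd number of nodes: Node memory/2 x (N/2 + 1), with N/2 rounded down.
        return half_node_memory * (nodes // 2 + 1)
    # Even number of nodes: Node memory/2 x (N/2).
    return half_node_memory * (nodes // 2)


def max_tables(node_memory_gb: int, nodes: int, table_memory_mb: int) -> int:
    """Maximum number of tables, rounded down to a whole table."""
    return available_memory_gb(node_memory_gb, nodes) * 1024 // table_memory_mb


for n in (3, 4, 5):
    print(n, available_memory_gb(16, n), max_tables(16, n, 768))
# 3 16 21
# 4 16 21
# 5 24 32
```

The printed values match the three worked examples for nodes with 16 GB of memory and 768 MB per table.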

For details about the mapping between the number of nodes (4 vCPUs, 16 GB) and the number of tables, see Table 1.
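Table 1 Upper limit on the number of tables

Instance Class       Number of Nodes    Number of Tables
4 vCPUs | 16 GB      3                  21
4 vCPUs | 16 GB      4                  21
4 vCPUs | 16 GB      5                  32
4 vCPUs | 16 GB      6                  32
4 vCPUs | 16 GB      7                  42
4 vCPUs | 16 GB      8                  42
4 vCPUs | 16 GB      9                  53
4 vCPUs | 16 GB      10                 53
4 vCPUs | 16 GB      11                 64
4 vCPUs | 16 GB      12                 64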

  • A single table occupies 768 MB of memory with the default number of table tokens, which is 12. If you set a different number of tokens, calculate the memory required by a single table (in MB) using the following formula: (768/12) x Number of tokens.
  • The preceding figures apply to common tables. If stream table is enabled, one stream table consumes 2.5 times the resources of a common table.
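
As a rough illustration of how a custom token count changes the limit, the sketch below scales the per-table memory by the number of tokens and then divides the available cluster memory by the result. The function names are hypothetical, and the sketch assumes that every table in the instance uses the same number of tokens and that all tables are common tables.

```python
DEFAULT_TOKENS = 12  # default number of table tokens stated above


def table_memory_mb(default_table_memory_mb: float, tokens: int) -> float:
    """Memory for one table: (default memory / 12) x number of tokens."""
    return default_table_memory_mb / DEFAULT_TOKENS * tokens


def max_tables_for_tokens(available_memory_gb: int,
                          default_table_memory_mb: float,
                          tokens: int) -> int:
    """Table limit when every table uses the given token count (assumption)."""
    per_table = table_memory_mb(default_table_memory_mb, tokens)
    return int(available_memory_gb * 1024 // per_table)


# 3 nodes with 4 vCPUs and 16 GB each give 16 GB of available cluster memory.
print(table_memory_mb(768, 24))            # 1536.0
print(max_tables_for_tokens(16, 768, 24))  # 10
print(max_tables_for_tokens(16, 768, 12))  # 21
```

With the default of 12 tokens the result agrees with the 3-node row of Table 1.
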
For details about the mapping between the number of nodes (8 vCPUs, 32 GB) and the number of tables, see Table 2.
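Table 2 Upper limit on the number of tables

Instance Class       Number of Nodes    Number of Tables
8 vCPUs | 32 GB      3                  22
8 vCPUs | 32 GB      4                  22
8 vCPUs | 32 GB      5                  34
8 vCPUs | 32 GB      6                  34
8 vCPUs | 32 GB      7                  45
8 vCPUs | 32 GB      8                  45
8 vCPUs | 32 GB      9                  56
8 vCPUs | 32 GB      10                 56
8 vCPUs | 32 GB      11                 68
8 vCPUs | 32 GB      12                 68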

  • A single table occupies 1440 MB of memory with the default number of table tokens, which is 12. If you set a different number of tokens, calculate the memory required by a single table (in MB) using the following formula: (1440/12) x Number of tokens.
  • The preceding figures apply to common tables. If stream table is enabled, one stream table consumes 2.5 times the resources of a common table.
For details about the mapping between the number of nodes (16 vCPUs, 64 GB) and the number of tables, see Table 3.
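Table 3 Upper limit on the number of tables

Instance Class       Number of Nodes    Number of Tables
16 vCPUs | 64 GB     3                  45
16 vCPUs | 64 GB     4                  45
16 vCPUs | 64 GB     5                  68
16 vCPUs | 64 GB     6                  68
16 vCPUs | 64 GB     7                  91
16 vCPUs | 64 GB     8                  91
16 vCPUs | 64 GB     9                  113
16 vCPUs | 64 GB     10                 113
16 vCPUs | 64 GB     11                 136
16 vCPUs | 64 GB     12                 136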

  • A single table occupies 1440 MB of memory with the default number of table tokens, which is 12. If you set a different number of tokens, calculate the memory required by a single table (in MB) using the following formula: (1440/12) x Number of tokens.
  • The preceding figures apply to common tables. If stream table is enabled, one stream table consumes 2.5 times the resources of a common table.
For details about the mapping between the number of nodes (32 vCPUs, 128 GB) and the number of tables, see Table 4.
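Table 4 Mapping between the number of nodes (32 vCPUs, 128 GB) and the number of tables

Instance Class       Number of Nodes    Number of Tables
32 vCPUs | 128 GB    3                  68
32 vCPUs | 128 GB    4                  68
32 vCPUs | 128 GB    5                  102
32 vCPUs | 128 GB    6                  102
32 vCPUs | 128 GB    7                  136
32 vCPUs | 128 GB    8                  136
32 vCPUs | 128 GB    9                  170
32 vCPUs | 128 GB    10                 170
32 vCPUs | 128 GB    11                 204
32 vCPUs | 128 GB    12                 204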

  • A single table occupies 1920 MB of memory with the default number of table tokens, which is 12. If you set a different number of tokens, calculate the memory required by a single table (in MB) using the following formula: (1920/12) x Number of tokens.
  • The preceding figures apply to common tables. If stream table is enabled, one stream table consumes 2.5 times the resources of a common table.
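
The four tables follow from the same formula with a different per-table memory for each instance class. The sketch below is a hypothetical lookup helper (the dictionary and function names are not part of the product) that reproduces the table values under the assumptions that N/2 is rounded down and that every table uses the default 12 tokens.

```python
# Per-table memory in MB for each instance class, keyed by
# (vCPUs, node memory in GB), taken from the notes under Tables 1 to 4.
TABLE_MEMORY_MB = {
    (4, 16): 768,
    (8, 32): 1440,
    (16, 64): 1440,
    (32, 128): 1920,
}


def table_limit(vcpus: int, memory_gb: int, nodes: int) -> int:
    """Upper limit on the number of common tables for an instance."""
    table_mb = TABLE_MEMORY_MB[(vcpus, memory_gb)]
    # (nodes + 1) // 2 equals N/2 + 1 for odd N and N/2 for even N.
    counted_nodes = (nodes + 1) // 2
    available_mb = memory_gb * 1024 // 2 * counted_nodes
    return available_mb // table_mb


for instance_class in TABLE_MEMORY_MB:
    limits = [table_limit(*instance_class, n) for n in range(3, 13)]
    print(instance_class, limits)
```

Each printed list holds the limits for 3 to 12 nodes, in the same order as the rows of the corresponding table.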

Parameters for Creating a Table

  1. Z00_THROUGHPUT (throughput parameter) determines the upper limit of a table's write performance. The default value is big, which corresponds to the standard upper limit of write performance.
    • Low throughput
      CREATE TABLE test1 (k int, p int, s int static, v int, PRIMARY KEY (k, p)) WITH Z00_THROUGHPUT = 'small';
    • Medium throughput
      CREATE TABLE test2 (k int, p int, s int static, v int, PRIMARY KEY (k, p)) WITH Z00_THROUGHPUT = 'medium';
    • High throughput
      CREATE TABLE test3 (k int, p int, s int static, v int, PRIMARY KEY (k, p)) WITH Z00_THROUGHPUT = 'big';
  2. Z01_TABLE_TOKENS (number of table tokens) specifies the number of tokens, that is, the number of data shards, for a table when the table is created. The value must be greater than 1.
    CREATE TABLE test4 (k int, p int, s int static, v int, PRIMARY KEY (k, p)) WITH Z01_TABLE_TOKENS = 24;
  3. Z00_BUFFER_SIZE and Z00_BUFFER_NUMBER (table parameters, not recommended).

    When creating a table, you can specify the number of memtables in the storage layer and the size of each memtable.

    • Z00_BUFFER_SIZE is of the map type. Each entry maps a CF name to the memtable size. The value ranges from 2 to 32.
      CREATE TABLE test6 (k int, p int, s int static, v int, PRIMARY KEY (k, p)) WITH Z00_BUFFER_SIZE = {'default': 16};
    • Z00_BUFFER_NUMBER is of the map type. Each entry maps a CF name to the number of memtables. The value ranges from 2 to 8.
      CREATE TABLE test5 (k int, p int, s int static, v int, PRIMARY KEY (k, p)) WITH Z00_BUFFER_NUMBER = {'default': 3};
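
The value constraints listed above can be checked before a statement is sent to the database. The following sketch is a hypothetical client-side helper, not part of GeminiDB or of any driver. It assumes that the stated ranges are inclusive and that the only valid throughput values are the three shown in the examples.

```python
def check_table_options(throughput="big", tokens=12,
                        buffer_size=None, buffer_number=None):
    """Return a list of problems; an empty list means no problem was found."""
    problems = []
    if throughput not in ("small", "medium", "big"):
        problems.append("Z00_THROUGHPUT must be 'small', 'medium', or 'big'")
    if tokens <= 1:
        problems.append("Z01_TABLE_TOKENS must be greater than 1")
    # Both buffer parameters are maps from a CF name to a value.
    for cf, value in (buffer_size or {}).items():
        if not 2 <= value <= 32:
            problems.append(f"Z00_BUFFER_SIZE[{cf!r}] must be between 2 and 32")
    for cf, value in (buffer_number or {}).items():
        if not 2 <= value <= 8:
            problems.append(f"Z00_BUFFER_NUMBER[{cf!r}] must be between 2 and 8")
    return problems


print(check_table_options(tokens=24, buffer_size={"default": 16}))
for problem in check_table_options(tokens=1, buffer_number={"default": 9}):
    print(problem)
```

The first call uses values from the examples above and reports no problems. The second call reports one problem for the token count and one for the number of memtables.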

You can still adjust the specifications of a table after it is created. For example, when the maximum number of tables is reached, you can lower the throughput of the existing tables so that more tables can be created:

  • If you set the throughput of all created tables to medium, the number of tables can be doubled.
    ALTER TABLE keyspace_name.table_name WITH Z00_THROUGHPUT = 'medium';
  • If you set the throughput of all created tables to small, the number of tables can be tripled.
    ALTER TABLE keyspace_name.table_name WITH Z00_THROUGHPUT = 'small';
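
Because the larger limit applies only when all created tables use the lower throughput, it can help to generate one ALTER TABLE statement per table. The sketch below is a hypothetical helper that builds the statements as strings and estimates the new limit. The keyspace and table names are placeholders, and the multipliers 2 and 3 are taken from the two bullets above.

```python
# Multiplier applied to the default table limit for each throughput value.
THROUGHPUT_FACTOR = {"big": 1, "medium": 2, "small": 3}


def alter_throughput_statements(tables, throughput):
    """Build one ALTER TABLE statement per (keyspace, table) pair."""
    if throughput not in THROUGHPUT_FACTOR:
        raise ValueError(f"unknown throughput: {throughput!r}")
    return [
        f"ALTER TABLE {keyspace}.{table} WITH Z00_THROUGHPUT = '{throughput}';"
        for keyspace, table in tables
    ]


def table_limit_after_change(default_limit, throughput):
    """Estimated limit once ALL tables use the given throughput."""
    return default_limit * THROUGHPUT_FACTOR[throughput]


# Placeholder names, for illustration only.
tables = [("ks1", "orders"), ("ks1", "events")]
for statement in alter_throughput_statements(tables, "medium"):
    print(statement)
print(table_limit_after_change(21, "medium"))  # 42
print(table_limit_after_change(21, "small"))   # 63
```

The value 21 is the default limit for a 3-node instance with 4 vCPUs and 16 GB of memory per node. Run the generated statements with your usual CQL client.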