Help Center/ GaussDB(DWS)/ Best Practices/ Advanced Features/ Best Practices of GIN Index
Updated on 2024-06-07 GMT+08:00

Best Practices of GIN Index

A GIN index is a data structure that pairs a key with its posting list. The key indicates a specific value, and the posting list tracks all the locations that this key occurs. For example, 'hello', '14:2 23:4' indicates that hello is found at the locations 14:2 and 23:4. A GIN index efficiently locates tuples with specific keywords, making it ideal for searching elements within multi-valued fields. This section describes how to use GIN indexes to search through array and JSONB types, as well as how to conduct full-text searches.

Using a GIN Index to Search Through the Array Type

Create a GIN index to speed up tag searches.

  1. Create a cluster on the management console. For details, see section Creating a Cluster.
  2. Use DAS to connect to a cluster. Locate the required cluster in the cluster list and click Log In in the Operation column. On the DAS page that is displayed, enter the username, database name, and password, and test the connection. If the connection is successful, log in to the cluster. For details, see Using DAS to Connect to a Cluster.

  3. Create the books table. The tags column stores the tag information of books using the array type.

    1
    CREATE TABLE books (id SERIAL PRIMARY KEY, title VARCHAR(100), tags TEXT[]);
    

  4. Insert data.

    1
    2
    3
    4
    5
    INSERT INTO books (title, tags) 
    VALUES   ('Book 1', ARRAY['fiction', 'adventure']),
       ('Book 2', ARRAY['science', 'fiction']),
       ('Book 3', ARRAY['romance', 'fantasy']),
       ('Book 4', ARRAY['adventure']);
    

  5. Create a GIN index.

    1
    CREATE INDEX idx_books_tags_gin ON books USING GIN (tags);
    

  6. Use the GIN index to perform a search query to find books that contain a specific tag in the tags column. Search for books containing the tag "fiction":

    1
    2
    3
    4
    5
    6
    SELECT * FROM books WHERE tags @> ARRAY['fiction'];
     id | title  |        tags
    ----+--------+---------------------
      1 | Book 1 | {fiction,adventure}
      2 | Book 2 | {science,fiction}
    (2 rows)
    

  7. Use the GIN index to search for books that contain both the "fiction" and "adventure" tags.

    1
    2
    3
    4
    5
    SELECT * FROM books WHERE tags @> ARRAY['fiction', 'adventure'];
     id | title  |        tags
    ----+--------+---------------------
      1 | Book 1 | {fiction,adventure}
    (1 row)
    

Using a GIN Index to Search Through the JSONB Type

When using the JSONB type to store and query JSON data, you can use GIN indexes to improve query performance. GIN indexes are suitable for querying JSONB columns that contain a large number of different key-value pairs.

  1. Create the my_table table. The data column stores information about each person using the JSONB type.

    1
    CREATE TABLE my_table (id SERIAL PRIMARY KEY, data JSONB);  
    

  2. Insert data.

    1
    2
    3
    4
    INSERT INTO my_table (data)
     VALUES ('{"name": "John", "age": 30, "address": {"career": "announcer", "state": "NY"}}'),
          ('{"name": "Alice", "age": 25, "address": {"career": "architect", "state": "CA"}}'),
          ('{"name": "Bob", "age": 35, "address": {"career": "dentist", "state": "WA"}}');
    

  3. Create a GIN index to accelerate the query of JSONB columns.

    1
    CREATE INDEX my_table_data_gin_index ON my_table USING GIN (data);
    

  4. Use the GIN index to perform queries on JSONB columns. For example, search for a person whose occupation is dentist::

    1
    2
    3
    4
    5
    SELECT * FROM my_table WHERE data @> '{"address": {"career": "dentist"}}';
     id |                                    data
    ----+-----------------------------------------------------------------------------
      3 | {"age": 35, "name": "Bob", "address": {"state": "WA", "career": "dentist"}}
    (1 row)
    

  5. GIN indexes can also be queried on keys of JSONB columns. For example, search for people who are 30 years old or older:

    1
    2
    3
    4
    5
    6
    SELECT * FROM my_table WHERE data ->> 'age' >= '30';
     id |                                      data
    ----+--------------------------------------------------------------------------------
      3 | {"age": 35, "name": "Bob", "address": {"state": "WA", "career": "dentist"}}
      1 | {"age": 30, "name": "John", "address": {"state": "NY", "career": "announcer"}}
    (2 rows)
    

Using a GIN Index for Full-Text Search

When using GIN indexes for full-text search, you can use the tsvector and tsquery data types and related functions.

To build a tsquery object, you need to use the to_tsquery function and provide the search criteria and the corresponding text search configuration (english in this case). Other text search functions and operators can also be used for more complex full-text searches, such as plainto_tsquery and ts_rank. The specific usage depends on your needs.

  1. Create an articles table in which the content column stores the article content.

    1
    CREATE TABLE articles (id SERIAL PRIMARY KEY,title VARCHAR(100),content TEXT);
    

  2. Insert data.

    1
    2
    3
    4
    5
    INSERT INTO articles (title, content) 
     VALUES ('Article 1', 'This is the content of article 1.'),
       ('Article 2', 'Here is the content for article 2.'),
       ('Article 3', 'This article discusses various topics.'),
       ('Article 4', 'The content of the fourth article is different.');
    

  3. Creates an auxiliary column tsvector for the content column that stores the processed text indexes.

    1
    ALTER TABLE articles ADD COLUMN content_vector tsvector;
    

  4. Update the value in the content_vector column and convert the text in the content column to the tsvector type.

    1
    UPDATE articles SET content_vector = to_tsvector('english', content);
    

  5. Create a GIN index.

    1
    CREATE INDEX idx_articles_content_gin ON articles USING GIN (content_vector);
    

  6. Perform a full-text search, using the tsquery type to specify the search criteria. For example, search for an article that contains the word "content":

    1
    SELECT * FROM articles WHERE content_vector @@ to_tsquery('english', 'content');