FAQ
Low Connection Performance
- Problem: log_hostname is enabled, but DNS is incorrect.
    Solution: Connect to the database, and run show log_hostname to check whether log_hostname is enabled in the database. If it is enabled, the database kernel will use DNS to check the name of the host where the client is deployed. If the host where the CN resides is configured with an incorrect or unreachable DNS, the database connection will take a long time to set up. For more details about log_hostname, see "GUC Parameters". 
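To confirm that name resolution is the slow step, a reverse lookup can be timed on the host where the CN resides. The following is a minimal sketch, not part of the product tooling: the function name and the injectable resolver parameter are illustrative assumptions, and it only uses the standard Python socket module.

```python
import socket
import time


def time_reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Return (hostname or None, elapsed seconds) for a reverse DNS lookup of ip.

    `resolver` is an illustrative hook so the lookup can be replaced in tests;
    by default the system resolver is used.
    """
    start = time.monotonic()
    try:
        hostname = resolver(ip)[0]
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        hostname = None
    elapsed = time.monotonic() - start
    return hostname, elapsed
```

Calling time_reverse_lookup with the client's IP address on the CN host gives a rough idea of how long the kernel may wait for DNS. A lookup that takes several seconds or returns None suggests that the DNS configuration should be checked.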
- Problem: The database kernel runs the initialization statement slowly.
    
    It is difficult to locate faults in this scenario. Try using the strace Linux command.
      strace gsql -U MyUserName -d gaussdb -h 127.0.0.1 -p 23508 -r -c '\q'
      Password for MyUserName:
    The database connection process will be printed on the screen. For example, if the following operations are suspended for a long time, the database executes the SELECT VERSION() statement slowly.
      sendto(3, "Q\0\0\0\25SELECT VERSION()\0", 22, MSG_NOSIGNAL, NULL, 0) = 22
      poll([{fd=3, events=POLLIN|POLLERR}], 1, -1) = 1 ([{fd=3, revents=POLLIN}])
    After the database is connected, you can run the explain performance select version() statement to find the reason why the initialization statement was run slowly. For more information, see "SQL Optimization > Introduction to the SQL Execution Plan" in Developer Guide.
    An uncommon scenario is that the disk of the machine where the database CN resides is full or faulty, affecting queries and leading to user authentication failures. As a result, the connection process is suspended. To solve this problem, clear the data disk space of the database CN.
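Watching the screen for a stalled call is error-prone. If strace is run with its -T option (an addition to the command shown above), each line ends with the time spent in that system call, and the slowest calls can be picked out automatically. The sketch below assumes that output format; the function name is illustrative.

```python
import re

# strace -T appends the time spent inside each system call as "<seconds>".
_DURATION = re.compile(r"<(\d+\.\d+)>\s*$")


def slowest_syscalls(strace_lines, top=3):
    """Return the `top` slowest calls as (seconds, call text) pairs, slowest first."""
    timed = []
    for line in strace_lines:
        match = _DURATION.search(line)
        if match:
            call_text = line[:match.start()].rstrip()
            timed.append((float(match.group(1)), call_text))
    # Lines without a duration (signals, exit notices) are ignored.
    return sorted(timed, key=lambda item: item[0], reverse=True)[:top]
```

If the slowest entry is the poll call that follows the SELECT VERSION() sendto, the time is being spent inside the database rather than on the client.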
- Problem: The TCP connection is set up slowly.
    
    Run strace to check whether the initialization statement is run slowly. If the following information is displayed for a long time:
      connect(3, {sa_family=AF_FILE, path="/home/test/tmp/gaussdb_llt1/.s.PGSQL.61052"}, 110) = 0
    Or:
      connect(3, {sa_family=AF_INET, sin_port=htons(61052), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
    It indicates that the physical connection between the client and the database is set up slowly. In this case, check whether the network is unstable or has high throughput.
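The TCP setup time can also be measured directly, without gsql in the picture. The following sketch times a plain TCP connection to a host and port using the standard socket module. The function name and the 5-second default timeout are illustrative choices, not values taken from the product.

```python
import socket
import time


def time_tcp_connect(host, port, timeout=5.0):
    """Return the seconds taken to open a TCP connection, or None on failure."""
    start = time.monotonic()
    try:
        # create_connection performs the TCP handshake; the socket is closed
        # again as soon as the with block exits.
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return None
```

Running this a few times against the database address shows whether the handshake itself is slow or unstable, independent of authentication and initialization statements.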
- Problem: The connection is slow due to full resource load.
    
    When the CPU, memory, or I/O usage is close to 100%, the gsql connection is slow.
    - Run the top command to check the CPU usage. Run the free command to check the memory usage. Run the iostat command to check the I/O load. You can also check the monitor logs in the CM Agent and the monitoring records on the database O&M platform.
    - For peak load scenarios caused by a large number of slow queries in a short period of time, you can use the port specified by (Port number of the database server + 1) to query the pg_stat_activity view. For slow queries, you can use the system function pg_terminate_backend to kill sessions.
    - If service overloading exists for a long time (that is, there is no obvious slow query, or new queries still become slow after slow queries are killed), reduce the service load and increase database resources.
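When many slow queries need to be killed, the pg_terminate_backend calls can be generated from rows read from pg_stat_activity. The sketch below is an illustration under stated assumptions: the rows are plain dictionaries with pid, state, and query_start keys (assumed to mirror columns of the view), and the 5-minute threshold is an arbitrary example. It only builds the SQL text and does not connect to a database.

```python
from datetime import datetime, timedelta


def terminate_statements(sessions, now, max_age=timedelta(minutes=5)):
    """Build pg_terminate_backend calls for active sessions older than max_age.

    `sessions` is an iterable of dicts with the keys pid, state, and
    query_start (a datetime). These keys are assumptions for this sketch.
    """
    statements = []
    for row in sessions:
        running_for = now - row["query_start"]
        if row["state"] == "active" and running_for > max_age:
            statements.append(f"SELECT pg_terminate_backend({int(row['pid'])});")
    return statements
```

Review the generated statements before running them, because terminating a session rolls back its work in progress.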
 
Problems in Setting Up Connections
- Problem: The following error message is displayed: "gsql: could not connect to server: No route to host."
    
    This problem occurs because an unreachable address or port is specified. Check whether the values of the -h and -p parameters are correct. 
- Problem: The following error message is displayed: "gsql: FATAL: Invalid username/password,login denied."
    
    This problem generally occurs because an incorrect username or password was entered. Contact the database administrator to check whether the username and password are correct. 
- Problem: The following error message is displayed: "gsql: FATAL: Forbid remote connection with trust method!"
    
    For security purposes, remote database login in trust mode is forbidden. In this case, you need to modify the connection authentication information in the gs_hba.conf file. Contact an administrator.
    Do not modify the configurations of database cluster hosts in the gs_hba.conf file. Otherwise, the database may become faulty. It is recommended that service applications be deployed outside the database cluster.
- Problem: The database can be connected from the host where the CN resides by adding -h 127.0.0.1, but the connection fails if -h 127.0.0.1 is removed.
    Solution: Run the SQL statement show unix_socket_directory to check whether the Unix socket directory used by the database CN is the same as that specified by the environment variable $PGHOST in the shell. If they are different, set $PGHOST to the directory specified by unix_socket_directory. For more details, see the description of the unix_socket_directory parameter. 
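The comparison between $PGHOST and unix_socket_directory can be sketched as a small check. This is an illustration only: the function name is hypothetical, the server directory is assumed to have been copied from the output of show unix_socket_directory, and an unset $PGHOST is reported as a mismatch on the assumption that the client's default directory differs from the server's.

```python
import os


def socket_dir_mismatch(server_dir, environ=os.environ):
    """Return True if $PGHOST does not point at the server's socket directory."""
    client_dir = environ.get("PGHOST", "")
    if not client_dir:
        # Assumption for this sketch: an unset PGHOST counts as a mismatch.
        return True
    # normpath removes trailing slashes and redundant separators, so
    # "/a/b/" and "/a/b" compare as equal.
    return os.path.normpath(client_dir) != os.path.normpath(server_dir)
```

If the function returns True, set $PGHOST to the server's directory and retry the connection without -h.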
- Problem: The following error message is displayed: "The "libpq.so" loaded mismatch the version of gsql, please check it."
    
    This problem occurs because the version of libpq.so used in the environment does not match that of gsql. Run the ldd gsql command to check the version of the loaded libpq.so, and then load the correct libpq.so by modifying the environment variable LD_LIBRARY_PATH. 
- Problem: The following error message is displayed: "gsql: symbol lookup error: xxx/gsql: undefined symbol: libpqVersionString."
    
    This problem occurs because the version of libpq.so used in the environment does not match that of gsql (or the PG libpq.so exists in the environment). Run the ldd gsql command to check the version of the loaded libpq.so, and then load the correct libpq.so by modifying the environment variable LD_LIBRARY_PATH. 
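For both libpq.so errors above, the useful part of the ldd gsql output is the line that shows which libpq.so file was resolved. The following sketch extracts that path from ldd output passed in as text. The function name is illustrative, and the sketch assumes the usual "name => path (address)" line layout printed by ldd on Linux.

```python
def libpq_paths(ldd_output):
    """Return the resolved target of every libpq.so entry in `ldd` output."""
    paths = []
    for line in ldd_output.splitlines():
        name, arrow, target = line.strip().partition(" => ")
        if arrow and name.startswith("libpq.so"):
            # target looks like "/path/libpq.so.5 (0x00007f...)" or "not found".
            paths.append(target.split(" (")[0])
    return paths
```

If the returned path lies outside the gsql installation directory, adjust LD_LIBRARY_PATH so that the directory containing the matching libpq.so comes first.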
- Problem: The following error message is displayed: "gsql: connect to server failed: Connection timed out."
    Is the server running on host "xx.xxx.xxx.xxx" and accepting TCP/IP connections on port xxxx?
    Solution: This problem is caused by network connection faults. Check the network connection between the client and the database server. If you cannot ping from the client to the database server, the network connection is abnormal. Contact network management personnel for troubleshooting.
      ping -c 4 10.10.10.1
      PING 10.10.10.1 (10.10.10.1) 56(84) bytes of data.
      From 10.10.10.1: icmp_seq=2 Destination Host Unreachable
      From 10.10.10.1 icmp_seq=2 Destination Host Unreachable
      From 10.10.10.1 icmp_seq=3 Destination Host Unreachable
      From 10.10.10.1 icmp_seq=4 Destination Host Unreachable
      --- 10.10.10.1 ping statistics ---
      4 packets transmitted, 0 received, +4 errors, 100% packet loss, time 2999ms
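When the ping check is scripted, the summary line is the part worth reading. The sketch below extracts the transmitted count, received count, and loss percentage from ping output in the format shown above. The function name is illustrative, and other ping implementations may word the summary differently.

```python
import re

# Matches a summary such as:
# "4 packets transmitted, 0 received, +4 errors, 100% packet loss, time 2999ms"
_SUMMARY = re.compile(
    r"(\d+) packets transmitted, (\d+) received.*?(\d+(?:\.\d+)?)% packet loss"
)


def packet_loss(ping_output):
    """Return (transmitted, received, loss_percent) from ping output, or None."""
    match = _SUMMARY.search(ping_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2)), float(match.group(3))
```

A loss of 100% matches the situation described above, where the network connection between the client and the database server is abnormal.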
- Problem: The following error message is displayed: "gsql: FATAL: permission denied for database "gaussdb"."
    DETAIL: User does not have CONNECT privilege.
    Solution: This problem occurs because the user does not have the permission to access the database. To solve this problem, perform the following steps:
    - Connect to the database as a database administrator.
      gsql -d gaussdb -U dbadmin -p 8000
    - Grant the user the permission to access the database.
      GRANT CONNECT ON DATABASE gaussdb TO user1;
    In addition, many common misoperations may cause users to fail to connect to the database, for example, entering an incorrect database name, username, or password. In this case, the client tool will display the corresponding error messages.
      gsql -d gaussdb -p 8000
      gsql: FATAL: database "gaussdb" does not exist
      gsql -d gaussdb -U user1 -p 8000
      Password for user user1:
      gsql: FATAL: Invalid username/password,login denied.
- Problem: The following error message is displayed: "gsql: FATAL: sorry, too many clients already, active/non-active: 197/3."
    
    This problem occurs because the number of system connections exceeds the allowed maximum. Contact the database administrator to release unnecessary sessions.
    You can check the number of connections of user sessions as described in Table 1.
    You can view the session status in the PG_STAT_ACTIVITY view. To release unnecessary sessions, use the pg_terminate_backend function.
      SELECT datid,pid,state FROM pg_stat_activity;
       datid |       pid       | state
      -------+-----------------+--------
       13205 | 139834762094352 | active
       13205 | 139834759993104 | idle
      (2 rows)
    The value of pid is the thread ID of the session. Terminate the session using its thread ID.
      SELECT PG_TERMINATE_BACKEND(139834759993104);
    If a command output similar to the following is displayed, the session is successfully terminated:
       PG_TERMINATE_BACKEND
      ----------------------
       t
      (1 row)
| Description | Command |
|---|---|
| View the maximum number of session connections allowed for a specific user. | Run the following command to view the upper limit of user1's connections: gaussdb=# SELECT ROLNAME,ROLCONNLIMIT FROM PG_ROLES WHERE ROLNAME='user1'; A rolconnlimit value of -1 in the result indicates that no upper limit is set for user1's session connections. |
| View the number of session connections that have been used by a specified user. | Run the following command to view the number of connections that have been used by user1: gaussdb=# SELECT COUNT(*) FROM dv_sessions WHERE USERNAME='user1'; The returned count (1 in this example) is the number of connections that have been used by user1. |
| View the maximum number of session connections of a specific database. | Run the following command to view the upper limit of gaussdb's session connections: gaussdb=# SELECT DATNAME,DATCONNLIMIT FROM PG_DATABASE WHERE DATNAME='gaussdb'; A datconnlimit value of -1 in the result indicates that no upper limit is set for gaussdb's session connections. |
| View the number of session connections that have been used by a specified database. | Run the following command to view the number of session connections that have been used by gaussdb: gaussdb=# SELECT COUNT(*) FROM PG_STAT_ACTIVITY WHERE DATNAME='gaussdb'; The returned count (1 in this example) is the number of session connections that have been used by gaussdb. |
| View the number of session connections that have been used by all users. | Run the following command to view the number of session connections that have been used by all users: gaussdb=# SELECT COUNT(*) FROM dv_sessions; The returned count is 10 in this example. |
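The numbers in the "too many clients" error and the limits returned by the queries in the table can be combined to see how much headroom remains. The following sketch is illustrative only: both function names are hypothetical, and the message format is assumed to be the one quoted in this problem.

```python
import re


def parse_too_many_clients(message):
    """Extract (active, non_active) from a 'too many clients' error, or None."""
    match = re.search(r"active/non-active:\s*(\d+)/(\d+)", message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def remaining_connections(limit, used):
    """Return the free slots under a connection limit.

    A limit of -1 means that no upper limit is set, so None is returned.
    """
    if limit == -1:
        return None
    return max(limit - used, 0)
```

For example, a rolconnlimit of -1 yields None (no per-user limit), so the limit that was hit is likely elsewhere, such as the database-level or system-level maximum.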
- Problem: The following error message is displayed: "gsql: wait xxx.xxx.xxx.xxx:xxxx timeout expired."
    
    When gsql initiates a connection request to the database, a 5-minute timeout period is used. If the database cannot correctly authenticate the client request and client identity within this period, gsql will exit the connection process for the current session, and will report the above error. Generally, this problem is caused by the incorrect host and port (that is, the xxx part in the error information) specified by the -h and -p parameters. As a result, the communication fails. Occasionally, this problem is caused by network faults. To resolve this problem, check whether the host name and port number of the database are correct. 
- Problem: The following error message is displayed: "gsql: could not receive data from server: Connection reset by peer."
    
    Check whether CN logs contain information similar to "FATAL: cipher file "/data/coordinator/server.key.cipher" has group or world access." This error is usually caused by incorrectly modified permissions on data directories or some key files. To correct the permissions, refer to the permissions of the same files on other normal instances. 
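The "has group or world access" condition can be checked before comparing against another instance. The following minimal sketch tests whether any group or other permission bit is set on a file. The function name is illustrative, and the exact permissions the database requires should still be taken from a normal instance.

```python
import os
import stat


def has_group_or_world_access(path):
    """Return True if any group or other permission bit is set on path."""
    mode = os.stat(path).st_mode
    # S_IRWXG and S_IRWXO cover read, write, and execute for group and others.
    return bool(mode & (stat.S_IRWXG | stat.S_IRWXO))
```

A file with mode 0600 returns False, while a file with mode 0644 returns True.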
- Problem: The following error message is displayed: "gsql: FATAL: GSS authentication method is not allowed because XXXX user password is not disabled."
    
    In gs_hba.conf of the target CN, the authentication mode is set to gss for authenticating the IP address of the current client. However, this authentication algorithm cannot authenticate clients. Change the authentication algorithm to sha256 and try again. For details, contact the administrator.
    - Do not modify the configurations of database cluster hosts in the gs_hba.conf file. Otherwise, the database may become faulty.
    - It is recommended that service applications be deployed outside the database cluster.
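The error messages in this section can be turned into a small lookup for scripts that wrap gsql. The sketch below maps substrings of the messages quoted above to the first thing to check. The hint wording is a paraphrase of the solutions in this section, and the table and function names are hypothetical.

```python
# Substrings of the gsql errors quoted in this section, paired with a
# paraphrased first check. Order matters only if messages overlap.
_HINTS = [
    ("No route to host", "check the values of -h and -p"),
    ("Invalid username/password", "check the username and password"),
    ("Forbid remote connection with trust method", "review gs_hba.conf with an administrator"),
    ("libpq", "check the loaded libpq.so with ldd gsql"),
    ("Connection timed out", "check the network between client and server"),
    ("permission denied for database", "grant the CONNECT privilege"),
    ("too many clients already", "release unnecessary sessions"),
    ("timeout expired", "check the host name and port number"),
    ("Connection reset by peer", "check CN logs for key file permission errors"),
    ("GSS authentication method is not allowed", "change the authentication method to sha256"),
]


def first_check(error_text):
    """Return a hint for a known gsql connection error, or None if unknown."""
    for needle, hint in _HINTS:
        if needle in error_text:
            return hint
    return None
```

Messages that return None are not covered by this section and need to be investigated separately.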
 
Other Faults
- Problem: A core dump or abnormal exit occurs due to a bus error.
    
    Generally, this problem is caused by changes to a shared dynamic library (.so file in Linux) that is loaded while the process is running. Alternatively, if the process binary file changes, the executable code that the OS loads, or the entry for loading a dependent library, changes accordingly. In this case, the OS terminates the process for protection purposes, generating a core dump file. To resolve this problem, try again. In addition, do not run service programs in a cluster during O&M operations such as an upgrade, to prevent such problems caused by file replacement during the upgrade.
    The stack in the core dump file may contain dl_main and the functions it calls. dl_main is used by the OS to initialize a process and load the shared dynamic library. If the process has been initialized but the shared dynamic library has not been loaded, the process cannot be considered completely started. 