Troubleshooting
Low Connection Performance
- log_hostname is enabled, but DNS is incorrect.
Connect to the database and run SHOW log_hostname; to check whether log_hostname is enabled.
If it is enabled, the database kernel uses DNS to look up the name of the host where the client is deployed. If the host where the database CN resides is configured with an incorrect or unreachable DNS server, setting up the database connection takes a long time.
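To confirm whether name resolution is the bottleneck, one option is to time a reverse DNS lookup of the client's IP address from the CN host. The following is a minimal Python sketch and is not part of gsql or the database. The function name time_reverse_lookup and its resolver parameter are assumptions introduced for illustration.

```python
import socket
import time


def time_reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Return (hostname_or_None, elapsed_seconds) for a reverse DNS lookup.

    `resolver` defaults to socket.gethostbyaddr. It is a parameter only so
    that the function can be exercised without a real DNS server.
    """
    start = time.monotonic()
    try:
        hostname = resolver(ip)[0]
    except OSError:
        # socket.herror and socket.gaierror are subclasses of OSError.
        hostname = None
    elapsed = time.monotonic() - start
    return hostname, elapsed
```

Calling time_reverse_lookup with the client's IP address on the CN host returns the resolved name and the time taken. A lookup that takes seconds rather than milliseconds, or that returns None after a long wait, suggests the DNS configuration should be checked.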
- The database kernel runs the initialization statement slowly.
Problems are difficult to locate in this scenario. Try using the strace Linux trace command.
strace gsql -U MyUserName -d postgres -h 127.0.0.1 -p 23508 -r -c '\q'
Password for MyUserName:
The database connection process will be printed on the screen. If the following statement takes a long time to run:
sendto(3, "Q\0\0\0\25SELECT VERSION()\0", 22, MSG_NOSIGNAL, NULL, 0) = 22
poll([{fd=3, events=POLLIN|POLLERR}], 1, -1) = 1 ([{fd=3, revents=POLLIN}])
It indicates that the SELECT VERSION() statement was run slowly.
After the database is connected, you can run the explain performance select version() statement to find the reason why the initialization statement was run slowly.
An uncommon scenario is that the disk of the machine where the database CN resides is full or faulty, affecting queries and leading to user authentication failures. As a result, the connection process is suspended. To solve this problem, contact customer service to clear the data disk of the database CN.
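Spotting the slow call in raw strace output by eye is error-prone. When strace is run with its -T option, each line ends with the time spent in that system call in angle brackets, for example <0.000123>. The sketch below filters such output for calls that took longer than a threshold. The function name, the 0.5-second threshold, and the timings in the sample lines are illustrative assumptions, not output from a real system.

```python
import re

# With `strace -T`, each completed call ends with "<seconds>".
_DURATION = re.compile(r"<(\d+\.\d+)>\s*$")


def slow_syscalls(trace_lines, threshold=0.5):
    """Return (seconds, line) pairs for calls that took at least `threshold`."""
    slow = []
    for line in trace_lines:
        match = _DURATION.search(line)
        if match and float(match.group(1)) >= threshold:
            slow.append((float(match.group(1)), line.rstrip()))
    return slow


# Hypothetical sample: the calls come from the trace above, the timings are invented.
sample = [
    'sendto(3, "Q\\0\\0\\0\\25SELECT VERSION()\\0", 22, MSG_NOSIGNAL, NULL, 0) = 22 <0.000041>',
    'poll([{fd=3, events=POLLIN|POLLERR}], 1, -1) = 1 ([{fd=3, revents=POLLIN}]) <3.204518>',
]
for seconds, line in slow_syscalls(sample):
    print(f"{seconds:8.3f}s  {line}")
```

In this sample only the poll call is reported, which would mean the client spent its time waiting for the server's reply to SELECT VERSION().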
- TCP connection is set up slowly.
Use strace in the same way as when troubleshooting a slow initialization statement. If the following connect call takes a long time to complete:
connect(3, {sa_family=AF_FILE, path="/home/test/tmp/gaussdb_llt1/.s.PGSQL.61052"}, 110) = 0
Or,
connect(3, {sa_family=AF_INET, sin_port=htons(61052), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
It indicates that the physical connection between the client and the database is set up slowly. In this case, check whether the network is unstable or heavily loaded.
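A quick way to measure connection setup time independently of gsql is to time a plain TCP connect to the database port. The following is a minimal sketch. The function name time_tcp_connect and the 5-second default timeout are assumptions chosen for illustration.

```python
import socket
import time


def time_tcp_connect(host, port, timeout=5.0):
    """Return the seconds taken to open a TCP connection, or None on failure."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return None
```

For example, time_tcp_connect("127.0.0.1", 23508) uses the host and port from the gsql command shown earlier. Repeating the measurement several times helps distinguish a consistently slow network from an intermittent one.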
Problems in Setting Up Connections
- gsql: could not connect to server: No route to host
This problem generally occurs because an unreachable IP address or port number was specified. Check whether the values of the -h and -p parameters are correct.
- gsql: FATAL: Invalid username/password,login denied.
This problem generally occurs because an incorrect username or password was entered. Contact the database administrator to check whether the username and password are correct.
- The "libpq.so" loaded mismatch the version of gsql, please check it.
This problem occurs because the version of libpq.so used in the environment does not match that of gsql. Run the ldd gsql command to check which libpq.so is loaded, and then load the correct libpq.so by modifying the LD_LIBRARY_PATH environment variable.
- gsql: symbol lookup error: xxx/gsql: undefined symbol: libpqVersionString
This problem occurs because the version of libpq.so used in the environment does not match that of gsql (or because the PostgreSQL libpq.so exists in the environment). Run the ldd gsql command to check which libpq.so is loaded, and then load the correct libpq.so by modifying the LD_LIBRARY_PATH environment variable.
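When ldd prints many libraries, it can help to extract only the libpq entry and see which file it resolves to. The sketch below parses text in the usual ldd output format. The function name libpq_paths and the sample paths are hypothetical and do not come from a real installation.

```python
def libpq_paths(ldd_output):
    """Map each libpq entry in `ldd` output to its resolved path (or None)."""
    found = {}
    for line in ldd_output.splitlines():
        parts = line.split()
        if not parts or not parts[0].startswith("libpq.so"):
            continue
        # Typical form:    "libpq.so.5 => /path/libpq.so.5 (0x...)"
        # Unresolved form: "libpq.so.5 => not found"
        if len(parts) >= 3 and parts[1] == "=>" and parts[2].startswith("/"):
            found[parts[0]] = parts[2]
        else:
            found[parts[0]] = None
    return found


# Hypothetical ldd output, for illustration only.
sample = """\
\tlinux-vdso.so.1 (0x00007ffd0b5f2000)
\tlibpq.so.5 => /usr/lib64/libpq.so.5 (0x00007f3a1c000000)
\tlibc.so.6 => /lib64/libc.so.6 (0x00007f3a1bc00000)
"""
print(libpq_paths(sample))
```

If the reported path is not the library directory shipped with gsql, placing that directory first in LD_LIBRARY_PATH should make the loader pick the matching libpq.so.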
- gsql: connect to server failed: Connection timed out
Is the server running on host "xx.xxx.xxx.xxx" and accepting TCP/IP connections on port xxxx?
This problem is caused by network connection faults. Check the network connection between the client and the database server. If you cannot ping from the client to the database server, the network connection is abnormal. Contact network management personnel for troubleshooting.
ping -c 4 10.10.10.1
PING 10.10.10.1 (10.10.10.1) 56(84) bytes of data.
From 10.10.10.1: icmp_seq=2 Destination Host Unreachable
From 10.10.10.1 icmp_seq=2 Destination Host Unreachable
From 10.10.10.1 icmp_seq=3 Destination Host Unreachable
From 10.10.10.1 icmp_seq=4 Destination Host Unreachable
--- 10.10.10.1 ping statistics ---
4 packets transmitted, 0 received, +4 errors, 100% packet loss, time 2999ms
- gsql: FATAL: permission denied for database "postgres"
DETAIL: User does not have CONNECT privilege.
This problem occurs because the user does not have permission to connect to the database. To solve this problem, perform the following steps:
- Connect to the database as the database administrator.
- Grant the user the CONNECT privilege on the database, for example: GRANT CONNECT ON DATABASE postgres TO user1;
- gsql: FATAL: sorry, too many clients already, active/non-active: 197/3.
This problem occurs because the number of system connections exceeds the allowed maximum. Contact the database administrator (DBA) to release unnecessary sessions.
You can check the number of connections as described in Table 1.
You can view the session status in the PG_STAT_ACTIVITY view. To release unnecessary sessions, use the pg_terminate_backend function.
select datid,pid,state from pg_stat_activity;
 datid |       pid       | state
-------+-----------------+--------
 13205 | 139834762094352 | active
 13205 | 139834759993104 | idle
(2 rows)
The value of pid is the thread ID of the session. Terminate the session using its thread ID.
SELECT PG_TERMINATE_BACKEND(139834759993104);
If a command output similar to the following is displayed, the session is successfully terminated.
 PG_TERMINATE_BACKEND
----------------------
 t
(1 row)
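When many sessions need to be released, writing one PG_TERMINATE_BACKEND call per thread ID by hand is tedious. The sketch below builds those statements from rows shaped like the pg_stat_activity query result above. The function name terminate_statements is an assumption, and selecting sessions by the idle state is only one possible policy.

```python
def terminate_statements(rows, state="idle"):
    """Build one PG_TERMINATE_BACKEND call per session in the given state.

    `rows` are (datid, pid, state) tuples, as returned by the
    pg_stat_activity query shown above.
    """
    return [
        f"SELECT PG_TERMINATE_BACKEND({int(pid)});"
        for _datid, pid, row_state in rows
        if row_state == state
    ]


# The two rows from the example output above.
rows = [
    (13205, 139834762094352, "active"),
    (13205, 139834759993104, "idle"),
]
for statement in terminate_statements(rows):
    print(statement)
```

The generated statements are only text and should be reviewed before being run, because an idle session may still belong to an application connection pool that expects it to stay open.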
Table 1 Viewing the number of session connections
Description: View the maximum number of session connections allowed for a specific user.
Command: Run the following command to view the upper limit on the number of USER1's session connections. -1 indicates that no upper limit is set for the number of USER1's session connections.
SELECT ROLNAME,ROLCONNLIMIT FROM PG_ROLES WHERE ROLNAME='user1';
 rolname | rolconnlimit
---------+--------------
 user1   |           -1
(1 row)
Description: View the number of session connections that have been used by a specific user.
Command: Run the following command to view the number of session connections that have been used by USER1. 1 indicates the number of session connections that have been used by USER1.
SELECT COUNT(*) FROM dv_sessions WHERE USERNAME='user1';
 count
-------
     1
(1 row)
Description: View the maximum number of session connections allowed for a specific database.
Command: Run the following command to view the upper limit on the number of session connections to postgres. -1 indicates that no upper limit is set for the number of session connections to postgres.
SELECT DATNAME,DATCONNLIMIT FROM PG_DATABASE WHERE DATNAME='postgres';
 datname  | datconnlimit
----------+--------------
 postgres |           -1
(1 row)
Description: View the number of session connections that have been used by a specific database.
Command: Run the following command to view the number of session connections that have been used by postgres. 1 indicates the number of session connections that have been used by postgres.
SELECT COUNT(*) FROM PG_STAT_ACTIVITY WHERE DATNAME='postgres';
 count
-------
     1
(1 row)
Description: View the number of session connections that have been used by all users.
Command: Run the following command to view the number of session connections that have been used by all users:
SELECT COUNT(*) FROM dv_sessions;
 count
-------
    10
(1 row)
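The limits and counts in Table 1 are easier to act on when combined into a single headroom figure. The following sketch does that for one user or one database. The function name remaining_connections is hypothetical, and the only rule it encodes from the table is that -1 means no upper limit.

```python
def remaining_connections(limit, used):
    """Return how many more sessions are allowed, or None if unlimited.

    `limit` is a rolconnlimit or datconnlimit value; -1 means no upper limit.
    `used` is the matching COUNT(*) result.
    """
    if limit == -1:
        return None
    # Never report a negative headroom if the count already exceeds the limit.
    return max(limit - used, 0)


print(remaining_connections(-1, 1))  # user1 in Table 1: no limit set
```

A result of 0 for a user or database means new sessions for it will be rejected until existing ones are released.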
- gsql: wait xxx.xxx.xxx.xxx:xxxx timeout expired
When gsql initiates a connection request to the database, a 5-minute timeout period is used. If the database cannot correctly authenticate the client request and client identity within this period, gsql will exit the connection process for the current session, and will report the above error.
Generally, this problem is caused by the incorrect host and port (that is, the xxx part in the error information) specified by the -h and -p parameters. As a result, the communication fails. Occasionally, this problem is caused by network faults. To resolve this problem, check whether the host name and port number of the database are correct.
Other Faults
- A core dump or abnormal exit occurs due to a bus error.
Generally, this problem occurs when a shared dynamic library (a .so file in Linux) used by the process changes while the process is running. Alternatively, if the process binary file changes, the machine code that the OS loads for execution or the entry for loading a dependent library changes accordingly. In this case, the OS kills the process for protection purposes, generating a core dump file.
To resolve this problem, try again. In addition, do not run service programs in the cluster during O&M operations such as an upgrade, so that files replaced during the upgrade cannot cause this problem.
The stack in the core dump file may contain dl_main and the functions it calls. The OS uses dl_main to initialize a process and load the shared dynamic library. If the process has been initialized but the shared dynamic library has not been loaded, the process cannot be considered completely started.