Updated on 2023-10-23 GMT+08:00

Troubleshooting

Low Connection Performance

  • log_hostname is enabled, but the DNS configuration is incorrect.

    Connect to the database and run show log_hostname to check whether log_hostname is enabled.

    If it is enabled, the database uses DNS to resolve the name of the host where the client is deployed. If the database is configured with an incorrect or unreachable DNS server, setting up a database connection takes a long time. For details about this parameter, see the description of log_hostname in section "GUC Parameters" > "Error Reporting and Logging" > "Logging Content".
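
To gauge whether name resolution is the slow step, the lookup can be approximated from the database host. The following is a minimal Python sketch, not part of gsql or the database: it times a reverse DNS lookup with the standard socket module. The function name time_reverse_lookup is hypothetical, and the address passed in should be the IP address of the client whose connections are slow (127.0.0.1 is only a placeholder).

```python
import socket
import time


def time_reverse_lookup(ip_address):
    """Return (host name or None, elapsed seconds) for a reverse DNS lookup."""
    start = time.monotonic()
    try:
        # gethostbyaddr returns (hostname, aliaslist, ipaddrlist).
        host_name = socket.gethostbyaddr(ip_address)[0]
    except OSError:
        # Covers socket.herror and socket.gaierror: the address did not resolve.
        host_name = None
    return host_name, time.monotonic() - start


if __name__ == "__main__":
    # Placeholder address; replace it with the client IP address.
    name, elapsed = time_reverse_lookup("127.0.0.1")
    print(f"resolved to {name!r} in {elapsed:.3f} s")
```

A lookup that takes several seconds, or that fails only after a long wait, suggests that the DNS settings on the database host need to be corrected.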

Problems in Setting Up Connections

  • gsql: could not connect to server: No route to host

    This problem generally occurs because an unreachable IP address or port number was specified. Check whether the values of the -h and -p parameters are correct.
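
Before retrying gsql, it can help to confirm that the address and port accept TCP connections at all. The following is a minimal sketch that uses only the Python standard library. The function name can_reach and the 3-second timeout are assumptions chosen for illustration.

```python
import socket


def can_reach(host, port, timeout=3.0):
    """Try a TCP connection; return (True, "") or (False, error text)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, ""
    except OSError as exc:
        # "No route to host", "Connection refused" and timeouts all land here.
        return False, str(exc)
```

Calling can_reach with the values that were passed to -h and -p returns the operating system's error text on failure, which helps distinguish a wrong address from a closed port.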

  • gsql: FATAL: Invalid username/password,login denied.

    This problem generally occurs because an incorrect username or password was entered. Contact the database administrator to check whether the username and password are correct.

  • The connection to a DN succeeds when -h 127.0.0.1 is specified, but fails when -h 127.0.0.1 is omitted.

    When -h is omitted, gsql connects through a Unix domain socket instead of TCP/IP. Run the show unix_socket_directory SQL statement to check whether the Unix socket directory used by the DN is the same as the directory specified by the $PGHOST environment variable in the shell.

    If they are different, set $PGHOST to the directory specified by unix_socket_directory.

    For details about unix_socket_directory, see section "GUC Parameters" > "Connection and Authentication" > "Connection Settings".
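
The comparison between $PGHOST and unix_socket_directory can be sketched as follows. This is an illustrative helper, not a gsql feature: the function name socket_dir_matches is hypothetical, and the second argument stands in for the value returned by show unix_socket_directory, which has to be obtained separately.

```python
import os


def socket_dir_matches(pghost, unix_socket_directory):
    """Return True if $PGHOST points at the DN's Unix socket directory."""
    # A socket directory is an absolute path; an unset value, host name or IP cannot match.
    if not pghost or not pghost.startswith("/"):
        return False
    # realpath normalizes trailing slashes and resolves symbolic links.
    return os.path.realpath(pghost) == os.path.realpath(unix_socket_directory)


if __name__ == "__main__":
    # "/tmp" is a placeholder for the value shown by show unix_socket_directory.
    print(socket_dir_matches(os.environ.get("PGHOST"), "/tmp"))
```

If the function returns False, set $PGHOST to the directory reported by the database and reconnect without -h.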

  • The "libpq.so" loaded mismatch the version of gsql, please check it.

    This problem occurs because the version of libpq.so used in the environment does not match that of gsql. Run the ldd gsql command to check which libpq.so is loaded, and then load the correct libpq.so by modifying the LD_LIBRARY_PATH environment variable.

  • gsql: symbol lookup error: xxx/gsql: undefined symbol: libpqVersionString

    This problem occurs because the version of libpq.so used in the environment does not match that of gsql (or the PostgreSQL libpq.so exists in the environment). Run the ldd gsql command to check which libpq.so is loaded, and then load the correct libpq.so by modifying the LD_LIBRARY_PATH environment variable.
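
For both libpq.so errors, the key piece of information is the path that ldd resolves for libpq.so. The following is a minimal sketch that extracts that path from ldd output that has already been captured as text. The function name resolved_libpq is hypothetical, and the sketch assumes the usual "name => path (address)" line layout printed by ldd on Linux.

```python
def resolved_libpq(ldd_output):
    """Return the path that ldd resolved for libpq.so, or None if absent or not found."""
    for line in ldd_output.splitlines():
        fields = line.split()
        # Typical line: "libpq.so.5 => /some/dir/libpq.so.5 (0x...)"
        if len(fields) >= 3 and fields[0].startswith("libpq.so") and fields[1] == "=>":
            # "libpq.so.5 => not found" has no absolute path in the third field.
            return fields[2] if fields[2].startswith("/") else None
    return None
```

If the returned path is not inside the directory that ships with gsql, adjust LD_LIBRARY_PATH so that the matching library directory comes first.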

  • gsql: connect to server failed: Connection timed out

    Is the server running on host "xx.xxx.xxx.xxx" and accepting TCP/IP connections on port xxxx?

    This problem is caused by a network connection fault. Check the network connection between the client and the database server. If the client cannot ping the database server, the network connection is abnormal. Contact network management personnel for troubleshooting.

    ping -c 4 10.10.10.1
    PING 10.10.10.1 (10.10.10.1) 56(84) bytes of data.
    From 10.10.10.1 icmp_seq=1 Destination Host Unreachable
    From 10.10.10.1 icmp_seq=2 Destination Host Unreachable
    From 10.10.10.1 icmp_seq=3 Destination Host Unreachable
    From 10.10.10.1 icmp_seq=4 Destination Host Unreachable
    --- 10.10.10.1 ping statistics ---
    4 packets transmitted, 0 received, +4 errors, 100% packet loss, time 2999ms

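
When the ping check is scripted, the statistics line is the part worth inspecting. The following is a minimal sketch that reads the packet counts and loss percentage from captured ping output. The function name ping_summary is hypothetical, and the pattern assumes the Linux ping summary format shown in the sample output.

```python
import re

# Matches, for example:
# "4 packets transmitted, 0 received, +4 errors, 100% packet loss, time 2999ms"
_SUMMARY = re.compile(
    r"(\d+) packets transmitted, (\d+) (?:packets )?received.*?(\d+(?:\.\d+)?)% packet loss"
)


def ping_summary(ping_output):
    """Return (transmitted, received, loss percent), or None if no summary line is found."""
    match = _SUMMARY.search(ping_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2)), float(match.group(3))
```

A result with zero received packets corresponds to the unreachable case described for this error.
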
  • gsql: FATAL: sorry, too many clients already, active/non-active: 197/3.

    This problem occurs because the number of system connections exceeds the allowed maximum. Contact the database administrator to release unnecessary sessions.

    You can check the number of connections as described in Table 1.

    You can view the session status in the PG_STAT_ACTIVITY view. To release unnecessary sessions, use the pg_terminate_backend function.

    select datid,pid,state from pg_stat_activity;
     datid |       pid       | state  
    -------+-----------------+--------
     13205 | 139834762094352 | active
     13205 | 139834759993104 | idle
    (2 rows)

    The pid value is the thread ID of the session. Terminate the session using its thread ID.

    SELECT PG_TERMINATE_BACKEND(139834759993104);

    If information similar to the following is displayed, the session is successfully terminated:

    PG_TERMINATE_BACKEND
    ----------------------
     t
    (1 row)
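
When many sessions need to be released, the statements can be prepared from the rows returned by the PG_STAT_ACTIVITY query. The following is a minimal sketch, assuming the rows have already been fetched as (datid, pid, state) tuples. The function name terminate_statements is hypothetical, and the choice to target only idle sessions is an assumption made for illustration.

```python
def terminate_statements(sessions, state="idle"):
    """Build one PG_TERMINATE_BACKEND call per session in the given state."""
    return [
        f"SELECT PG_TERMINATE_BACKEND({int(pid)});"
        for _datid, pid, session_state in sessions
        if session_state == state
    ]


if __name__ == "__main__":
    # Rows copied from the sample pg_stat_activity output.
    rows = [(13205, 139834762094352, "active"), (13205, 139834759993104, "idle")]
    for statement in terminate_statements(rows):
        print(statement)
```

Review the generated statements before running them, because terminating a session discards its uncommitted work.
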

    Table 1 Viewing the number of session connections

    Description: View the maximum number of session connections allowed for a specific user.

    Command: Run the following command to view the upper limit of connections for user user1. -1 indicates that no upper limit is set for user1.

    SELECT ROLNAME,ROLCONNLIMIT FROM PG_ROLES WHERE ROLNAME='user1';
     rolname | rolconnlimit
    ---------+--------------
     user1   |           -1
    (1 row)

    Description: View the number of session connections that have been used by a specific user.

    Command: Run the following command to view the number of connections that have been used by user1. In this example, user1 has used 1 connection.

    SELECT COUNT(*) FROM dv_sessions WHERE USERNAME='user1';
     count
    -------
         1
    (1 row)

    Description: View the maximum number of session connections allowed for a specific database.

    Command: Run the following command to view the upper limit of session connections for the postgres database. -1 indicates that no upper limit is set for postgres.

    SELECT DATNAME,DATCONNLIMIT FROM PG_DATABASE WHERE DATNAME='postgres';
     datname  | datconnlimit
    ----------+--------------
     postgres |           -1
    (1 row)

    Description: View the number of session connections that have been used by a specific database.

    Command: Run the following command to view the number of session connections that have been used by postgres. In this example, postgres has used 1 session connection.

    SELECT COUNT(*) FROM PG_STAT_ACTIVITY WHERE DATNAME='postgres';
     count
    -------
         1
    (1 row)

    Description: View the number of session connections that have been used by all users.

    Command: Run the following command to view the number of session connections that have been used by all users:

    SELECT COUNT(*) FROM dv_sessions;
     count
    -------
        10
    (1 row)

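
The limits and counts from Table 1 are usually read together. The following is a minimal sketch of that comparison, assuming the limit (rolconnlimit or datconnlimit) and the used count have already been queried. The function name connection_headroom is hypothetical.

```python
def connection_headroom(conn_limit, used):
    """Return the remaining connections, or None when no upper limit is set."""
    # -1 means that no connection upper limit is configured.
    if conn_limit == -1:
        return None
    # Never report a negative value if the count already exceeds the limit.
    return max(conn_limit - used, 0)
```

A result of 0 for a user or database indicates that its own limit, not only the system-wide maximum, is blocking new sessions.
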
  • gsql: wait xxx.xxx.xxx.xxx:xxxx timeout expired

    When gsql initiates a connection request to the database, a 5-minute timeout period applies. If the database cannot authenticate the client request and client identity within this period, gsql abandons the connection attempt for the current session and reports the preceding error.

    Generally, this problem occurs because the host or port specified by the -h and -p parameters (that is, the xxx part in the error information) is incorrect, so communication fails. Occasionally, it is caused by a network fault. To resolve this problem, check whether the host name and port number of the database are correct.
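
To compare the endpoint that gsql actually tried with the intended one, the host and port can be read back from the error text. The following is a minimal sketch: the function name timed_out_endpoint is hypothetical, and the pattern assumes the message layout shown in the error above.

```python
import re


def timed_out_endpoint(error_text):
    """Return (host, port) from a "wait host:port timeout expired" message, or None."""
    match = re.search(r"wait (\S+):(\d+) timeout expired", error_text)
    if match is None:
        return None
    return match.group(1), int(match.group(2))
```

If the returned pair differs from the database address and port, correct the -h and -p values before investigating the network.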

  • gsql: could not receive data from server: Connection reset by peer.

    Check whether the DN logs contain information similar to "FATAL: cipher file "/data/coordinator/server.key.cipher" has group or world access". This error is usually caused by accidentally changing the permissions of data directories or key files. To correct the permissions, refer to the permissions of the corresponding files on a normal instance.
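
The condition named in the log message can be checked directly on the file. The following is a minimal sketch that uses the Python standard library to test whether any group or world permission bit is set. The function name has_group_or_world_access is hypothetical, and the sketch only detects the condition; the correct permissions should still be taken from a normal instance.

```python
import os
import stat


def has_group_or_world_access(path):
    """Return True if any group or other (world) permission bit is set on path."""
    mode = os.stat(path).st_mode
    return bool(mode & (stat.S_IRWXG | stat.S_IRWXO))
```

Running the check against the cipher file named in the log shows whether its permissions are the cause of the reset connection.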

Other Faults

  • A core dump or abnormal exit occurs due to a bus error.

    Generally, this problem occurs because a shared dynamic library (a .so file in Linux) that the process loads is changed while the process is running. Alternatively, if the process binary file is changed, the machine code that the OS loads for execution or the entry for loading a dependent library changes accordingly. In this case, the OS kills the process for protection purposes and generates a core dump file.

    To resolve this problem, retry the operation. In addition, do not run service programs against the database during O&M operations such as an upgrade, so that files replaced during the operation do not trigger this problem.

    The call stack in the core dump file may contain dl_main and the functions it calls. dl_main is used by the OS to initialize a process and load the shared dynamic libraries. If the process has been initialized but the shared dynamic libraries have not been loaded, the process cannot be considered completely started.
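
One way to confirm that files were replaced underneath a running process is to compare their modification times with the time the process started. The following is a minimal sketch under that assumption. The function name replaced_since is hypothetical, and the list of binary and library paths as well as the process start time (in seconds since the epoch) have to be supplied by the caller.

```python
import os


def replaced_since(paths, started_at):
    """Return the paths whose modification time is later than started_at (epoch seconds)."""
    return [path for path in paths if os.path.getmtime(path) > started_at]
```

Any path in the result was modified after the process started, which is consistent with the file replacement scenario described above.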