Updated on 2025-01-10 GMT+08:00

Why Did I See "Invalid argument" or "neighbour table overflow" During an Access to a Linux ECS?

Symptom

  1. When a Linux ECS sends a request to a server in the same subnet, the server receives the request but does not return a response. When the server pings the client, the message "ping: sendmsg: Invalid argument" is displayed.
    64 bytes from 192.168.0.54: icmp_seq=120 ttl=64 time=0.064 ms
    64 bytes from 192.168.0.54: icmp_seq=122 ttl=64 time=0.071 ms
    ping: sendmsg: Invalid argument
    ping: sendmsg: Invalid argument
    ping: sendmsg: Invalid argument
  2. "neighbor table overflow" is displayed in the /var/log/messages log file or the dmesg command output of a Linux ECS.
    [21208.317370] neighbour: ndisc_cache: neighbor table overflow!
    [21208.317425] neighbour: ndisc_cache: neighbor table overflow!
    [21208.317473] neighbour: ndisc_cache: neighbor table overflow!
    [21208.317501] neighbour: ndisc_cache: neighbor table overflow!

Root Cause

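The neighbour table is the kernel's cache of neighbour entries: the ARP cache for IPv4 and the neighbour discovery cache for IPv6 (reported as ndisc_cache in the log above). When the table overflows, the kernel cannot add entries for new neighbours, so packets to those addresses cannot be sent.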

You can run the following command to check the maximum size of the ARP cache table:

# cat /proc/sys/net/ipv4/neigh/default/gc_thresh3

The following parameters control the size of the ARP cache table:
/proc/sys/net/ipv4/neigh/default/gc_thresh1
/proc/sys/net/ipv4/neigh/default/gc_thresh2
/proc/sys/net/ipv4/neigh/default/gc_thresh3
  • gc_thresh1: The minimum number of entries to keep in the ARP cache. The garbage collector will not run if there are fewer than this number of entries in the cache.
  • gc_thresh2: The soft maximum number of entries to keep in the ARP cache. The garbage collector allows the number of entries to exceed this value for 5 seconds before collection is performed.
  • gc_thresh3: The hard maximum number of entries to keep in the ARP cache. The garbage collector will always run if there are more than this number of entries in the cache.
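
As a minimal sketch, the three thresholds can be read in one pass. The function name show_gc_thresholds is hypothetical, not a system command; its optional argument is the directory to read from and defaults to the IPv4 path listed above.

```shell
#!/bin/sh
# Print gc_thresh1..3 from a neighbour-table settings directory.
# show_gc_thresholds is an illustrative helper name, not a system command.
show_gc_thresholds() {
    dir="${1:-/proc/sys/net/ipv4/neigh/default}"
    for n in 1 2 3; do
        # Each file holds a single integer.
        printf 'gc_thresh%s = %s\n' "$n" "$(cat "$dir/gc_thresh$n")"
    done
}
```

Calling show_gc_thresholds with no argument reads the IPv4 values. Passing /proc/sys/net/ipv6/neigh/default should read the IPv6 values, assuming IPv6 is enabled on the ECS.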

To verify the actual number of IPv4 ARP entries, run the following command:

# ip -4 neigh show nud all | wc -l
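
To judge how close the table is to overflowing, the entry count can be compared with gc_thresh3. The sketch below assumes that the ip command from iproute2 is installed; the helper names neigh_usage_pct and check_neigh_usage are hypothetical. Because the log excerpt in the Symptom section names ndisc_cache (the IPv6 neighbour cache), the address family is a parameter.

```shell
#!/bin/sh
# Integer percentage of the table in use: count * 100 / limit.
neigh_usage_pct() {
    echo $(( $1 * 100 / $2 ))
}

# Report usage for one address family: 4 (ARP) or 6 (neighbour discovery).
# Illustrative helper; it only reads state and changes nothing.
check_neigh_usage() {
    family="${1:-4}"
    count=$(ip "-$family" neigh show nud all | wc -l)
    limit=$(cat "/proc/sys/net/ipv$family/neigh/default/gc_thresh3")
    echo "IPv$family neighbour entries: $count of $limit ($(neigh_usage_pct "$count" "$limit")%)"
}
```

Run check_neigh_usage 4 for the IPv4 table or check_neigh_usage 6 for the IPv6 table. A percentage close to 100 suggests that the table is near its hard limit.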

Solution

  1. Make sure that the number of servers in the subnet is less than the net.ipv4.neigh.default.gc_thresh3 value.
  2. Adjust the parameters: set gc_thresh3 to a value much greater than the number of servers in the same VPC network segment, and keep gc_thresh3 greater than gc_thresh2 and gc_thresh2 greater than gc_thresh1.

    For example, a subnet with a 20-bit mask contains 4,096 addresses (2^12), so it can accommodate at most about 4,096 servers. The gc_thresh3 value for this network segment must be much greater than 4,096.

    To apply the change temporarily (the settings are lost after a restart):
    # sysctl -w net.ipv4.neigh.default.gc_thresh1=2048
    # sysctl -w net.ipv4.neigh.default.gc_thresh2=4096
    # sysctl -w net.ipv4.neigh.default.gc_thresh3=8192

    To make the change permanent:

    Add the following content to the /etc/sysctl.conf file:
    net.ipv4.neigh.default.gc_thresh1 = 2048
    net.ipv4.neigh.default.gc_thresh2 = 4096
    net.ipv4.neigh.default.gc_thresh3 = 8192
    Add IPv6 configuration if required:
    net.ipv6.neigh.default.gc_thresh1 = 2048
    net.ipv6.neigh.default.gc_thresh2 = 4096
    net.ipv6.neigh.default.gc_thresh3 = 8192
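
The sizing rule and the ordering requirement above can be sketched as small shell helpers. The names subnet_addresses, thresholds_ordered, and apply_neigh_settings are hypothetical. The last helper assumes that the settings were added to /etc/sysctl.conf, which sysctl -p loads by default, and that it is run as root.

```shell
#!/bin/sh
# Number of addresses in an IPv4 subnet with the given mask length.
subnet_addresses() {
    echo $(( 1 << (32 - $1) ))
}

# Succeeds only when gc_thresh1 < gc_thresh2 < gc_thresh3.
thresholds_ordered() {
    [ "$1" -lt "$2" ] && [ "$2" -lt "$3" ]
}

# Reload /etc/sysctl.conf (requires root) and print the values now in effect.
apply_neigh_settings() {
    sysctl -p &&
    sysctl net.ipv4.neigh.default.gc_thresh1 \
           net.ipv4.neigh.default.gc_thresh2 \
           net.ipv4.neigh.default.gc_thresh3
}
```

For the 20-bit mask example, subnet_addresses 20 prints 4096, and thresholds_ordered 2048 4096 8192 succeeds for the values shown above. Running apply_neigh_settings after editing the file makes the permanent settings take effect without a restart.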