Connection Reset by peer means the remote side is terminating the session. This error is generated when the OS receives notification of TCP Reset (RST) from the remote peer.
Understanding Connection Reset by peer
Connection reset by peer means the TCP stream was abnormally closed from the other end. A TCP RST was received and the connection is now closed. This occurs when a packet is sent from our end of the connection but the other end does not recognize the connection; it will send back a packet with the RST bit set in order to forcibly close the connection.
“Connection reset by peer” is the TCP/IP equivalent of slamming the phone back on the hook. It’s more polite than merely not replying, leaving one hanging. But it’s not the FIN-ACK expected of the truly polite TCP/IP.
Understanding RST TCP Flag
RST is used to abort connections. It is very useful to troubleshoot a network connection problem.
RST (Reset the connection). Indicates that the connection is being aborted. For active connections, a node sends a TCP segment with the RST flag in response to a TCP segment received on the connection that is incorrect, causing the connection to fail.
The sending of an RST segment for an active connection forcibly terminates the connection, causing data stored in send and receive buffers or in transit to be lost. For TCP connections being established, a node sends an RST segment in response to a connection establishment request to deny the connection attempt. The sender will get Connection Reset by peer error.
Check network connectivity
The “ping” command is a tool used to test the availability of a network resource. The “ping” command sends a series of packets to a network resource and then measures the amount of time it takes for the packets to return.
If you want to ping a remote server, you can use the following command: ping <remote server>
In this example, “<remote server>” is the IP address or hostname of the remote server that you want to ping.
Ping the remote host we were connected to. If it doesn’t respond, it might be offline or there might be a network problem along the way. If it does respond, this problem might have been a transient one (so we can reconnect now)
If you are experiencing packet loss when pinging a remote server, there are a few things that you can do to troubleshoot the issue.
The first thing that you can do is check the network interface on the remote server. To do this, use the “ifconfig” command. The output of the “ifconfig” command will show you the status of all network interfaces on the system. If there is a problem with one of the interfaces, it will be shown in the output.
You can also use the “ip route” command to check routing information. The output of the “ip route” command will show you a list of all routes on the system. If there is a problem with one of the routes, it will be shown in the output.
If you are still experiencing packet loss, you can try to use a different network interface. To do this, use the “ping” command with the “-i” option. For example, the following command will use the eth0 interface:
ping -i eth0 google.com
Check remote service port is open
A port is a logical entity which acts as a endpoint of communication associated with an application or process on an Linux operating system. We can use some Linux commands to check remote port status.
Commands like nc, curl can be used to check if remote ports are open or not. For example, the following command will check if port 80 is open on google.com:
nc -zv google.com 80
The output of the above command should look something like this: Connection to google.com port 80 [tcp/80] succeeded!
This means that the port is open and we can establish a connection to it.
Check application log on remote server
For example, if the error is related with SSH. we can debug this on the remote server from sshd logs. The log entries will be in one of the files in the /var/log directory. SSHD will be logging something every time it drops our session.
Oct 22 12:09:10 server internal-sftp: session closed for local user fred from [192.0.2.33]
Check related Linux kernel parameters
Kernel parameter is also related to Connection Reset by peer error. The keepalive concept is very simple: when we set up a TCP connection, we associate a set of timers. Some of these timers deal with the keepalive procedure. When the keepalive timer reaches zero, we send our peer a keepalive probe packet with no data in it and the ACK flag turned on.
we can do this because of the TCP/IP specifications, as a sort of duplicate ACK, and the remote endpoint will have no arguments, as TCP is a stream-oriented protocol. On the other hand, we will receive a reply from the remote host (which doesn’t need to support keepalive at all, just TCP/IP), with no data and the ACK set.
If we receive a reply to we keepalive probe, we can assert that the connection is still up and running without worrying about the user-level implementation. In fact, TCP permits us to handle a stream, not packets, and so a zero-length data packet is not dangerous for the user program.
we usually use tcp keepalive for two tasks:
- Checking for dead peers
- Preventing disconnection due to network inactivity
Check Application heartbeat configuration
Connection Reset by peer error is also related to the application. Certain networking tools (HAproxy, AWS ELB) and equipment (hardware load balancers) may terminate “idle” TCP connections when there is no activity on them for a certain period of time. Most of the time it is not desirable.
We will use rabbitmq as an example. When heartbeats are enabled on a connection, it results in periodic light network traffic. Therefore heartbeats have a side effect of guarding client connections that can go idle for periods of time against premature closure by proxies and load balancers.
With a heartbeat timeout of 30 seconds the connection will produce periodic network traffic roughly every 15 seconds. Activity in the 5 to 15 second range is enough to satisfy the defaults of most popular proxies and load balancers. Also see the section on low timeouts and false positives above.
Check OS metric on peer side
Connection Reset by peer can be triggered by a busy system. we can setup a monitoring for our Linux system to the metrics like CPU, memory, network etc. If the system is too busy, the network will be impacted by this.
For example, we can use the “top” command to check the CPU usage. The output of the “top” command will show us the list of processes sorted by CPU usage. If there is a process which is using a lot of CPU, we can investigate this further to see if it is causing the network issues.
We can also use the “netstat” command to check network statistics. The output of the “netstat” command will show us a list of active network connections. If there are too many connections established, this could be causing the network issues.
We can also use the “ps” command to check the status of processes. The output of the “ps” command will show us a list of all processes running on the system. If there are too many processes running, this could be causing the network issues.
We can use these commands to troubleshoot network issues on a Linux system. By using these commands, we can narrow down the root cause of the issue and fix it.