10.24.2008

Unauthenticated OpenSSH sessions closed by remote host

At work we recently had a problem with ssh connections being closed unexpectedly on a fresh machine dedicated to cvs hosting. The problem was turned over to our server support team, but they didn't have any luck finding a fix. Well yesterday, we teamed up and dedicated ourselves to solving the problem, and I'm blogging our fix since I had very little luck finding an easy answer to this problem. I found plenty of people experiencing similar problems and the solutions offered were sound, but our problem was different even though my error message looked identical to theirs.

The problem manifested itself when trying to open multiple concurrent sessions on the machine. Once you had about 7 or 8 unauthenticated sessions open, the machine would start actively closing new connections. The following message was received on the client side:

ssh_exchange_identification: Connection closed by remote host


It may seem strange to be opening so many sessions at once and then expecting to have them sit waiting for you to enter your login credentials individually. However, that's essentially what our Integrated Development Environment does and it was in fact causing users to see this message.
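For anyone who wants to reproduce this without an IDE, here is a minimal sketch in Python that mimics the behavior: it opens a number of TCP connections to the ssh port, never authenticates, and records which ones receive the server's identification banner. The function name probe_unauthenticated and its parameters are my own invention for illustration, and the sketch assumes that a connection closed before any banner arrives corresponds to the ssh_exchange_identification error shown above.

```python
import socket


def probe_unauthenticated(host, port=22, count=12, timeout=5.0):
    """Open `count` connections without authenticating.

    Returns a list of booleans, one per attempt: True if the server sent
    an SSH identification banner, False if the connection failed or was
    closed before a banner arrived.
    """
    held = []     # keep sockets open so each session stays unauthenticated
    results = []
    try:
        for _ in range(count):
            try:
                sock = socket.create_connection((host, port), timeout=timeout)
            except OSError:
                results.append(False)
                continue
            held.append(sock)
            try:
                banner = sock.recv(256)
            except OSError:
                banner = b""
            # A server that drops the connection sends nothing before closing.
            results.append(banner.startswith(b"SSH-"))
    finally:
        for sock in held:
            sock.close()
    return results
```

Calling probe_unauthenticated("hostname", count=12) against an affected machine should, if the assumption holds, return True for the first several attempts and False once the server starts refusing connections.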

I ran ssh with the -v option to see if there were any descriptive debug messages that might help me out. Here is the (edited) output of that command:


ssh -v hostname
OpenSSH_versionnumber, OpenSSL versionnumber releasedate
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: Applying options for *
debug1: Connecting to hostname [ip.address] port 22.
debug1: Connection established.
debug1: identity file /home/username/.ssh/identity type -1
debug1: identity file /home/username/.ssh/id_rsa type 1
debug1: identity file /home/username/.ssh/id_dsa type -1
debug1: loaded 3 keys
ssh_exchange_identification: Connection closed by remote host


I was able to find several sources on the Internet where people were seeing the exact same error message and debug output as above. None of their solutions helped me solve my problem though, because they pointed to such fixes as increasing the values of /proc/sys/net/core/netdev_max_backlog and /proc/sys/net/core/somaxconn or adding things to /etc/hosts.deny and /etc/hosts.allow. As you can see from the debug output above, I was getting a connection established on port 22, and so I was relatively sure the Linux networking subsystem itself wasn't the culprit. Another suggestion I saw had to do with permissions of /var and /var/empty, but the permissions on this host were fine in that regard.

I was finally able to find the solution, which lay in the /etc/ssh/sshd_config file after all. I had our administrator change the value of the MaxStartups property from the default to 20, like so:

MaxStartups 20


This property's meaning is explained in the sshd_config manpage as:

     MaxStartups
             Specifies the maximum number of concurrent unauthenticated
             connections to the sshd daemon. Additional connections will be
             dropped until authentication succeeds or the LoginGraceTime
             expires for a connection. The default is 10.

             Alternatively, random early drop can be enabled by specifying
             the three colon separated values "start:rate:full" (e.g.,
             "10:30:60"). sshd will refuse connection attempts with a
             probability of "rate/100" (30%) if there are currently "start"
             (10) unauthenticated connections. The probability increases
             linearly and all connection attempts are refused if the number
             of unauthenticated connections reaches "full" (60).
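To make the random early drop arithmetic concrete, here is a small Python sketch of the refusal probability as I read the manpage text above: zero below "start", "rate/100" at "start", rising linearly to 1 at "full". The function name drop_probability is hypothetical, and this is a floating-point model of the description, not a copy of what sshd does internally.

```python
def drop_probability(unauthenticated, start=10, rate=30, full=60):
    """Estimate the chance that a new connection attempt is refused.

    `unauthenticated` is the current number of unauthenticated connections;
    `start`, `rate` and `full` are the three MaxStartups values. The defaults
    match the manpage example "10:30:60".
    """
    if unauthenticated < start:
        return 0.0   # below the threshold nothing is dropped
    if unauthenticated >= full:
        return 1.0   # at or above "full" every attempt is refused
    base = rate / 100.0
    # Linear interpolation from `base` at `start` up to 1.0 at `full`.
    return base + (1.0 - base) * (unauthenticated - start) / (full - start)
```

In this model, a single-value setting such as MaxStartups 20 can be represented with start and full both set to 20, which gives a hard cutoff: nothing is refused below 20 unauthenticated connections and everything is refused from 20 onward.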

After restarting sshd, our test case passed, and our IDE is now able to perform cvs operations the way it wants to without bringing the ssh daemon to its knees.

2 comments:

Erik said...

Interesting problem. Good job on the fix.

Forrest said...

Hey wow, hardwareguy. I give you kudos for still having my feed in your reader.