The Problem
An attempt to open an SSH connection to a system with a specific account failed with ‘Connection reset by peer’, while other users could connect to the same system over SSH without problems.
Below is an example showing the failed login to node [NODE2] with the account ‘oracle’.
[oracle@NODE1]$ ssh oracle@[NODE2]
oracle@[NODE2]'s password:
Read from remote host [NODE2]: Connection reset by peer
Connection to [NODE2] closed.
[oracle@NODE1]$ ssh root@[NODE2]
root@[NODE2]'s password:
Last login: Fri Mar 6 02:30:55 2009 from [NODE1]
The corresponding error can be found in /var/log/messages on node [NODE2]:
Feb 29 11:11:11 [NODE2] sshd[7194]: Accepted password for oracle from ::ffff:xx.xx.xx.xx port 24318 ssh2
Feb 29 11:11:11 [NODE2] sshd[7202]: fatal: setresuid 501: Resource temporarily unavailable
The Solution
More users have been added to this system, and the account is now hitting the ‘soft nofile‘ or ‘soft nproc‘ limit set in /etc/security/limits.conf. This file sets per-user limits on system resources.
In this example, the number of open files reported by lsof for the ‘oracle’ account on node NODE2 is higher than that account's ‘soft nofile’ limit.
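A quick way to see which limits a session is actually running with is the shell's ulimit builtin. A minimal check, run from the affected account's session (the option letters are bash's: -S selects the soft limit, -n is max open files, -u is max user processes):

```shell
# Print the soft limits currently in effect for this shell session.
# -S = soft limit, -n = max open files, -u = max user processes
echo "soft nofile: $(ulimit -Sn)"
echo "soft nproc:  $(ulimit -Su)"
```

If these values are lower than expected, the limits.conf entries below are the likely source.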
1. Check /etc/security/limits.conf:
[oracle@NODE2 ~]$ cat /etc/security/limits.conf
# /etc/security/limits.conf
#
#Each line describes a limit for a user in the form:
#
#<domain> <type> <item> <value>
#
#Where:
#<domain> can be:
#        - an user name
#        - a group name, with @group syntax
#        - the wildcard *, for default entry
#        - the wildcard %, can be also used with %group syntax,
#          for maxlogin limit
#
#<type> can have the two values:
#        - "soft" for enforcing the soft limits
#        - "hard" for enforcing hard limits
#
#<item> can be one of the following:
#        - core - limits the core file size (KB)
#        - data - max data size (KB)
#        - fsize - maximum filesize (KB)
#        - memlock - max locked-in-memory address space (KB)
#        - nofile - max number of open files
#        - rss - max resident set size (KB)
#        - stack - max stack size (KB)
#        - cpu - max CPU time (MIN)
#        - nproc - max number of processes
#        - as - address space limit
#        - maxlogins - max number of logins for this user
#        - maxsyslogins - max number of logins on the system
#        - priority - the priority to run user process with
#        - locks - max number of file locks the user can hold
#        - sigpending - max number of pending signals
#        - msgqueue - max memory used by POSIX message queues (bytes)
#
#<domain>      <type>  <item>         <value>
oracle hard nofile 65535
oracle soft nofile 4096
oracle hard nproc 20480
oracle soft nproc 2047
2. Check the processes run by user ‘oracle’:
[oracle@NODE2 ~]$ ps -u oracle | wc -l
489
3. Check the files opened by user ‘oracle’:
[oracle@[NODE2] ~]$ /usr/sbin/lsof -u oracle | wc -l
62490
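The counts above can be compared against the configured soft limits by parsing the limits.conf entries. A self-contained sketch, run here against a temporary copy of the example entries (on the affected node, point the awk commands at the real /etc/security/limits.conf instead):

```shell
# Extract the 'oracle' soft limits from a limits.conf-style file.
LIMITS=$(mktemp)
cat > "$LIMITS" <<'EOF'
oracle hard nofile 65535
oracle soft nofile 4096
oracle hard nproc 20480
oracle soft nproc 2047
EOF
# Fields in limits.conf are: domain type item value
soft_nofile=$(awk '$1=="oracle" && $2=="soft" && $3=="nofile" {print $4}' "$LIMITS")
soft_nproc=$(awk '$1=="oracle" && $2=="soft" && $3=="nproc" {print $4}' "$LIMITS")
echo "soft nofile=$soft_nofile soft nproc=$soft_nproc"
# prints: soft nofile=4096 soft nproc=2047
```

In this example, lsof's count of 62490 open files is well above the soft nofile value of 4096.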
Once you have identified which limits are being hit, follow the steps below to resolve the issue:
1. Modify /etc/security/limits.conf manually: increase the ‘soft nofile‘ value until it equals the ‘hard nofile‘ value, and increase the ‘soft nproc‘ value until it equals the ‘hard nproc‘ value.
[oracle@NODE2 ~]$ cat /etc/security/limits.conf
oracle hard nofile 65535
oracle soft nofile 65535
oracle hard nproc 20480
oracle soft nproc 20480
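The edit in step 1 can also be scripted with sed. A sketch, shown here against a temporary copy of the example entries rather than the live file (run the same substitutions as root on /etc/security/limits.conf itself; the new values apply to new login sessions, since pam_limits reads the file at login):

```shell
# Raise each 'soft' value to match its 'hard' counterpart.
F=$(mktemp)
cat > "$F" <<'EOF'
oracle hard nofile 65535
oracle soft nofile 4096
oracle hard nproc 20480
oracle soft nproc 2047
EOF
sed -i -e 's/^oracle soft nofile .*/oracle soft nofile 65535/' \
       -e 's/^oracle soft nproc .*/oracle soft nproc 20480/' "$F"
cat "$F"
```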
2. Check if the issue still exists:
[oracle@NODE1 ~]$ ssh oracle@NODE2
oracle@NODE2's password:
Last login: Fri Mar 6 02:33:01 2009 from NODE1
Different errors when other limit settings are reached
The error message differs when the ‘open files’ or ‘max user processes’ limit set in /etc/profile is reached.
1. Error on reaching the limit ‘open files’:
[oracle@NODE1 ~]$ ssh NODE2
oracle@NODE2's password:
-bash: ulimit: max user processes: cannot modify limit: Operation not permitted
-bash: /home/oracle/.bash_profile: Too many open files
2. Error on reaching the limit ‘max user processes’:
[oracle@NODE1 ~]$ ssh oracle@NODE2
oracle@NODE2's password:
-bash: ulimit: open files: cannot modify limit: Operation not permitted
-bash: fork: Resource temporarily unavailable