Ubuntu 16.04 - "IOException: Too many open files"
Posted: Fri Oct 06, 2017 4:50 pm
[Problem]
Starting a service with "sudo service <servicename> start", or letting it start automatically after a reboot, either fails or only half succeeds. The latter is especially nasty: I ran a "7 Days to Die" server that appeared to be up, but any player who tried to connect got a message like "Server is still initializing, please try again later". Studying the server log after trying a restart showed an "IOException: Too many open files".
Checking the open file limit of the shell:
Code:
$ ulimit -n
This showed that Ubuntu Server 16.04.2 LTS has a quite low default of 1024. Obviously not enough, so we tried to raise it in /etc/security/limits.conf:
Code:
* soft nofile 16384
* hard nofile 16384
Changes to this file only take effect after logging out and back in! We did that, checked again with ulimit -n, and everything looked fine. Unfortunately, the error happened again when executing:
Code:
$ sudo service 7days start
So we also checked the limits of the running process itself:
Code:
$ cat /proc/<PID>/limits
Code:
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            8388608              unlimited            bytes
Max core file size        0                    unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             47903                47903                processes
Max open files            1024                 1024                 files
Max locked memory         65536                65536                bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       47903                47903                signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us
Max open files was still set to 1024 for the process! A reboot did not change this either, so it was clear that limits.conf might help with other issues, but not with this specific one.
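Since only one line of that listing matters here, a small helper can pull it out directly. This is just a sketch: the function name open_file_limits is made up for illustration, and it assumes the file is in the /proc/<PID>/limits layout shown above.

```shell
# Print the soft and hard "Max open files" values from a file in
# /proc/<PID>/limits format. $1 is the path to that file.
# (open_file_limits is a hypothetical helper name, not a standard tool.)
open_file_limits() {
    # Fields: Max(1) open(2) files(3) soft(4) hard(5) units(6)
    awk '/^Max open files/ { print "soft=" $4, "hard=" $5 }' "$1"
}
```

Calling it as open_file_limits /proc/<PID>/limits for the server process (as root, or as the user that owns the process) should make it obvious whether the service really got the limit you expected.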
[Solution]
In the end, the problem was that systemd, which starts services on Ubuntu 16.04, applies its own set of limits when invoking a service and thus ignores the values set in limits.conf (that file is only applied by PAM to login sessions). The solution was to modify the /lib/systemd/system/7days.service configuration file and add "LimitNOFILE=1024000" to the [Service] section (note: a lower value, e.g. 16384, would probably have been sufficient). Below is the now working version as an example:
/lib/systemd/system/7days.service:
Code:
[Unit]
Description=7 Days to Die THC Private Server
After=network.target nss-lookup.target
[Service]
User=7days
Group=7days
Type=simple
PIDFile=/run/7days.pid
ExecStart=/home/7days/7days_server/startserver.sh -configfile=/home/7days/config/serverconfig.xml
ExecReload=/bin/kill -s HUP $MAINPID
ExecStop=/bin/kill -s QUIT $MAINPID
Restart=always
LimitNOFILE=1024000
[Install]
WantedBy=multi-user.target
After changes to service config files, the systemd daemon must be reloaded:
Code:
$ sudo systemctl daemon-reload
Then restart the 7 Days to Die server, with a stop first to make sure it's not running:
Code:
$ sudo service 7days stop
$ sudo service 7days start
This time, the server should come up nicely and accept connections.
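One thing that is easy to get wrong: systemd only honours LimitNOFILE inside the [Service] section, so a line that ends up under [Unit] or [Install] has no effect. The following is a rough sketch of a check for that; service_nofile is a made-up helper name, and it assumes a plain unit file with one "Key=Value" per line like the one above.

```shell
# Print the LimitNOFILE value of a unit file ($1), but only if the
# setting sits inside the [Service] section.
# (service_nofile is a hypothetical helper name for illustration.)
service_nofile() {
    awk -F= '
        /^\[/ { section = $0 }    # remember the current section header
        section == "[Service]" && $1 == "LimitNOFILE" { print $2 }
    ' "$1"
}
```

Running service_nofile /lib/systemd/system/7days.service should print 1024000 for the file shown above. Empty output would mean the setting is missing or in the wrong section. After the restart, "cat /proc/<PID>/limits" should then report the same value for Max open files.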
[Cause]
The most likely reason the server worked without issues for many months and then suddenly couldn't start anymore is that more region files are created as players discover more of the map. To make exploration less laggy, I used the command "visitmap -10000 -10000 10000 10000" in the command shell of the 7days telnet access. This generates the whole map in the background while people are playing on the server. The process takes about 36 hours and creates 1600 files, so on the next restart the default file limit for this process was too low.
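To see how close a running server actually gets to its limit, the number of open descriptors can be compared with the limit of the same process. A minimal sketch, assuming a Linux /proc filesystem; fd_usage is a hypothetical helper name, and reading another user's process needs root.

```shell
# Print how many file descriptors a process has open next to its
# soft "Max open files" limit. $1 is the PID.
# (fd_usage is a hypothetical helper name for illustration.)
fd_usage() {
    # Each entry in /proc/<PID>/fd is one open descriptor
    open=$(ls "/proc/$1/fd" | wc -l)
    # Field 4 of the "Max open files" line is the soft limit
    limit=$(awk '/^Max open files/ { print $4 }' "/proc/$1/limits")
    echo "open=$open limit=$limit"
}
```

Running fd_usage <PID> once before and once after a visitmap run would show whether the open file count really grows with the number of region files, and how much headroom a value like 16384 leaves.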