[uWSGI] Automatically recovering from uWSGI listen queue full

Roberto De Ioris roberto at unbit.it
Wed Sep 26 10:42:11 UTC 2018


> Hi, we are using uWSGI in emperor mode on AWS and we recently encountered a
> bad case of EBS drive degradation that forced us to restart uWSGI once the
> hard drive was working correctly again. My main question is: is there any
> configuration option that would have allowed us to automatically restart
> uWSGI?
>
> The details:
>
> 1. We use uWSGI 2.0.17 in emperor mode and nginx is used as a reverse
> proxy.
>
> 2. Our main application has the following configuration:
> ```
> [uwsgi]
> socket = 127.0.0.1:8999
> pythonpath = /path/to/app
> virtualenv = /path/to/virtualenv
> processes = 6
> max-requests = 100
> reload-on-rss = 100
> master = true
> harakiri = 20
> module = our_app.wsgi
> pidfile = /var/run/webapp/our_app.pid
> post-buffering = 4096
> logger = syslog:uwsgi.our_app
> disable-logging = true
> log-date = true
> log-slow = 1000
> log-5xx = true
> log-maxsize = 16777216
> ```
>
> 3. When the EBS drive started behaving erratically, we started receiving
> hundreds of harakiri notifications:
>
> `Sep 25 20:10:15 our_server uwsgi.our_app: Tue Sep 25 20:05:37 2018 - ***
> HARAKIRI ON WORKER 1 (pid: 3720, try: 382) **`
>
> 4. Then we started to see this log statement:
>
> `Sep 25 20:13:35 our_server uwsgi.our_app: Tue Sep 25 20:08:57 2018 - ***
> uWSGI listen queue of socket "127.0.0.1:8999" (fd: 6) full !!! (101/100)
> ***`
>
> 5. Once the hard drive recovered, new requests were still returning 504
> gateway timeout error for several minutes. Restarting uWSGI "fixed" the
> problem.
>
> My uninformed guess is that uWSGI was still trying to process the listen
> queue, but I wonder if there is some configuration parameter or some
> statistics emitted by uWSGI that we could use to automate a restart if this
> ever happens again.
>
> In any case, thanks for this wonderful piece of software!
>


Hi, technically this has nothing to do with uWSGI itself, but with how
sockets work:

as long as the socket stays open, the kernel will continue to fill its
backlog queue. In your case a full stop/restart (a graceful reload is not
enough, as the internal socket is not closed) should probably have been
triggered once EBS resumed.

Note that you can attach an alarm to the listen queue full event:

https://uwsgi-docs.readthedocs.io/en/latest/AlarmSubsystem.html

and possibly trigger the restart from it, but honestly your case is so
"apocalyptic" that a manual procedure is the safest thing to do
(imagine restarting uWSGI under a DoS: you would end up with both the
network and the system load destroyed ;)
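
As a minimal sketch of the alarm approach, assuming the "cmd" alarm
handler and the alarm-backlog option described in the Alarm Subsystem
docs linked above: the alarm name "lq_full" and the script path are
hypothetical placeholders, not anything from your setup.

```
[uwsgi]
; the alarm subsystem needs the master process
master = true
; "lq_full" is a hypothetical alarm name; the cmd: handler runs the given
; command and passes the alarm message on its stdin.
; /usr/local/bin/notify-ops.sh is a placeholder script you would write.
alarm = lq_full cmd:/usr/local/bin/notify-ops.sh
; raise the alarm when the listen queue (socket backlog) is full
alarm-backlog = lq_full
```

If the script is meant to do more than notify someone, it would have to
perform a real stop/start, since a graceful reload keeps the socket open.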

-- 
Roberto De Ioris
http://unbit.com

