posix_memalign error

Anoop Alias
posix_memalign error
July 31, 2018 06:30AM
I am repeatedly seeing errors like

######################
2018/07/31 03:46:33 [emerg] 2854560#2854560: posix_memalign(16, 16384) failed (12: Cannot allocate memory)
2018/07/31 03:54:09 [emerg] 2890190#2890190: posix_memalign(16, 16384) failed (12: Cannot allocate memory)
2018/07/31 04:08:36 [emerg] 2939230#2939230: posix_memalign(16, 16384) failed (12: Cannot allocate memory)
2018/07/31 04:24:48 [emerg] 2992650#2992650: posix_memalign(16, 16384) failed (12: Cannot allocate memory)
2018/07/31 04:42:09 [emerg] 3053092#3053092: posix_memalign(16, 16384) failed (12: Cannot allocate memory)
2018/07/31 04:42:17 [emerg] 3053335#3053335: posix_memalign(16, 16384) failed (12: Cannot allocate memory)
2018/07/31 04:42:28 [emerg] 3053937#3053937: posix_memalign(16, 16384) failed (12: Cannot allocate memory)
2018/07/31 04:47:54 [emerg] 3070638#3070638: posix_memalign(16, 16384) failed (12: Cannot allocate memory)
####################

on a few servers.

The servers have enough free memory and swap usage is 0, yet somehow the kernel denies the posix_memalign() call with ENOMEM (this is what I think is happening!).

The numbers requested are always 16 and 16k, which makes me suspicious.

I have no setting in nginx.conf that references 16k.

Is there any way to find out what requests this allocation and why it is not fulfilled?
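
For anyone double-checking the same way, a quick sketch to scan the whole include tree for a 16k setting (the /etc/nginx layout is an assumption):

# search every included config file for a 16k or 16384 value
grep -RinE '16k|16384' /etc/nginx/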


--
*Anoop P Alias*
Maxim Dounin
Re: posix_memalign error
July 31, 2018 03:40PM
Hello!

On Tue, Jul 31, 2018 at 09:52:29AM +0530, Anoop Alias wrote:

> [original message quoted in full; snipped]

There are at least some buffers which default to 16k - for
example, ssl_buffer_size (http://nginx.org/r/ssl_buffer_size).

You may try the debugging log to further find out where the particular
allocation happens; see here for details:

http://nginx.org/en/docs/debugging_log.html
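
For reference, a minimal sketch of what enabling it involves (this assumes nginx was built with --with-debug; the log path is an assumption):

# verify the binary supports debug logging
nginx -V 2>&1 | grep -o with-debug
# then, in nginx.conf:   error_log /var/log/nginx/error.log debug;
nginx -t && nginx -s reload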

But I don't really think it is worth the effort. The error is pretty
clear, and it's better to focus on why these allocations are
denied. Likely you are hitting some limit.

--
Maxim Dounin
http://mdounin.ru/
Anoop Alias
Re: posix_memalign error
August 02, 2018 07:20AM
Hi Maxim,

I enabled debug logging, and the memalign call happens on nginx reloads; the
ENOMEM occurs only on some reloads (not on all of them):

2018/08/02 05:59:08 [notice] 872052#872052: signal process started
2018/08/02 05:59:23 [notice] 871570#871570: signal 1 (SIGHUP) received from 872052, reconfiguring
2018/08/02 05:59:23 [debug] 871570#871570: wake up, sigio 0
2018/08/02 05:59:23 [notice] 871570#871570: reconfiguring
2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 0000000002B0DA00:16384 @16   <=== the memalign call on reload
2018/08/02 05:59:23 [debug] 871570#871570: malloc: 00000000087924D0:4560
2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 000000000E442E00:16384 @16
2018/08/02 05:59:23 [debug] 871570#871570: malloc: 0000000005650850:4096
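
A plausible origin for the 16/16384 pair (a reading of the nginx sources of this era, not confirmed against this exact build): ngx_create_pool() allocates every memory pool via ngx_memalign(NGX_POOL_ALIGNMENT, size, log), where NGX_POOL_ALIGNMENT is 16, and the cycle pool built on each reconfiguration uses NGX_CYCLE_POOL_SIZE, which equals NGX_DEFAULT_POOL_SIZE, i.e. 16 * 1024 bytes. A sketch to confirm against a matching source tree (path assumed):

# the constants live in src/core/ngx_palloc.h and src/core/ngx_cycle.h
grep -nE 'NGX_(POOL_ALIGNMENT|DEFAULT_POOL_SIZE|CYCLE_POOL_SIZE)' \
    nginx-*/src/core/ngx_palloc.h nginx-*/src/core/ngx_cycle.h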




2018/08/02 05:48:49 [debug] 871275#871275: bind() xxxx:443 #71
2018/08/02 05:48:49 [debug] 871275#871275: bind() xxxx:443 #72
2018/08/02 05:48:49 [debug] 871275#871275: bind() xxxx:443 #73
2018/08/02 05:48:49 [debug] 871275#871275: bind() xxxx:443 #74
2018/08/02 05:48:49 [debug] 871275#871275: add cleanup: 000000005340D728
2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000024D3260:4096
2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000517BAF10:4096
2018/08/02 05:48:49 [debug] 871275#871275: malloc: 0000000053854FC0:4096
2018/08/02 05:48:49 [debug] 871275#871275: malloc: 0000000053855FD0:4096
2018/08/02 05:48:49 [debug] 871275#871275: malloc: 0000000053856FE0:4096
2018/08/02 05:48:49 [debug] 871275#871275: malloc: 0000000053857FF0:4096
2018/08/02 05:48:49 [debug] 871275#871275: posix_memalign: 0000000053859000:16384 @16
2018/08/02 05:48:49 [debug] 871275#871275: malloc: 000000005385D010:4096
2018/08/02 05:48:49 [debug] 871275#871275: malloc: 000000005385E020:4096
2018/08/02 05:48:49 [debug] 871275#871275: malloc: 000000005385F030:4096
2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000536CD160:4096
2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000536CE170:4096
2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000536CF180:4096
2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000536D0190:4096
2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000536D11A0:4096
2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000536D21B0:4096
2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000536D31C0:4096
2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000536D41D0:4096
2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000536D51E0:4096
2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000536D61F0:4096
2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000536D7200:4096
2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000536D8210:4096
2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000536D9220:4096


In fact, there are a lot of such calls during a reload:

2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA17ED00:16384 @16
2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA1B0FF0:16384 @16
2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA1E12C0:16384 @16
2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA211590:16384 @16
2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA243880:16384 @16
2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA271B30:16384 @16
2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA2A3E20:16384 @16
2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA2D20D0:16384 @16
2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA3063E0:16384 @16
2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA334690:16384 @16
2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA366980:16384 @16
2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA396C50:16384 @16
2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA3C8F40:16384 @16
2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA3F9210:16384 @16
2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA4294E0:16384 @16
2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA45B7D0:16384 @16
2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA489A80:16384 @16
2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA4BBD70:16384 @16
2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA4EA020:16384 @16
2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA51E330:16384 @16
2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA54C5E0:16384 @16
2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA57E8D0:16384 @16
2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA5AEBA0:16384 @16
2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA5DEE70:16384 @16
2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA611160:16384 @16
2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA641430:16384 @16
2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA671700:16384 @16
2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA6A29E0:16384 @16
2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA6D5CE0:16384 @16
2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA707FD0:16384 @16
2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA736280:16384 @16
2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA768570:16384 @16
2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA796820:16384 @16
2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA7CAB30:16384 @16
2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA7F8DE0:16384 @16
2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA82B0D0:16384 @16
2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA85B3A0:16384 @16



What is perplexing is that the system has plenty of free (available) RAM:
#############
# free -g
              total        used        free      shared  buff/cache   available
Mem:            125          54          24           8          46          58
Swap:             0           0           0
#############

# ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 514579
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 514579
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited

#########################################
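
Note that ulimit -a reports the shell's limits, not necessarily what the running daemon inherited; a sketch to read the master process's actual limits (assuming the oldest nginx process is the master):

# limits of the running nginx master, not of the login shell
cat /proc/$(pgrep -o nginx)/limits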

There is nothing else limiting memory allocation.

Is there any way to identify or prevent this?
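
For reference, a few system-wide knobs that can produce ENOMEM even with free RAM, as a sketch (the cgroup path assumes cgroup v1, as on a 3.10-series kernel, and that a memory cgroup actually applies to nginx):

# strict overcommit (vm.overcommit_memory=2) can refuse allocations despite free RAM
sysctl vm.overcommit_memory vm.overcommit_ratio
# per-process ceiling on distinct mmap regions
sysctl vm.max_map_count
# any cgroup memory cap covering the nginx processes
cat /proc/$(pgrep -o nginx)/cgroup
cat /sys/fs/cgroup/memory/memory.limit_in_bytes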


On Tue, Jul 31, 2018 at 7:08 PM Maxim Dounin <[email protected]> wrote:

> [Maxim's reply quoted in full; snipped]


--
*Anoop P Alias*
Igor A. Ippolitov
Re: posix_memalign error
August 02, 2018 09:10AM
Anoop,

I doubt this will be the solution, but may we have a look at
/proc/buddyinfo and /proc/slabinfo at the moment when nginx can't allocate
memory?
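
A sketch of one way to capture both automatically at the moment of failure (the error log path is an assumption):

# snapshot buddyinfo and slabinfo whenever the allocation failure is logged
tail -Fn0 /var/log/nginx/error.log | grep --line-buffered 'posix_memalign' |
while read -r _; do
    { date; cat /proc/buddyinfo /proc/slabinfo; } >> /root/alloc-failures.log
done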

On 02.08.2018 08:15, Anoop Alias wrote:
> [Anoop's message, including the quoted thread, snipped]


Anoop Alias
Re: posix_memalign error
August 02, 2018 12:10PM
Hi Igor,

The error happens randomly

2018/08/02 06:52:42 [emerg] 874514#874514: posix_memalign(16, 16384) failed (12: Cannot allocate memory)
2018/08/02 09:42:53 [emerg] 872996#872996: posix_memalign(16, 16384) failed (12: Cannot allocate memory)
2018/08/02 10:16:14 [emerg] 877611#877611: posix_memalign(16, 16384) failed (12: Cannot allocate memory)
2018/08/02 10:16:48 [emerg] 879410#879410: posix_memalign(16, 16384) failed (12: Cannot allocate memory)
2018/08/02 10:17:55 [emerg] 876563#876563: posix_memalign(16, 16384) failed (12: Cannot allocate memory)
2018/08/02 10:20:21 [emerg] 879263#879263: posix_memalign(16, 16384) failed (12: Cannot allocate memory)
2018/08/02 10:20:51 [emerg] 878991#878991: posix_memalign(16, 16384) failed (12: Cannot allocate memory)

# date
Thu Aug 2 10:58:48 BST 2018

------------------------------------------
# cat /proc/buddyinfo
Node 0, zone      DMA      0      0      1      0      2      1      1      0      1      1      3
Node 0, zone    DMA32  11722  11057   4663   1647    609     72     10      7      1      0      0
Node 0, zone   Normal 755026 710760 398136  21462   1114     18      1      0      0      0      0
Node 1, zone   Normal 341295 801810 179604    256      0      0      0      0      0      0      0
-----------------------------------------
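For readability: each buddyinfo column counts free blocks of order 0, 1, 2, ... (4 KiB, 8 KiB, 16 KiB, ...), so the 16 KiB count is the third number per zone; a sketch:

# print the order-2 (16 KiB) free-block count per zone
awk '{print $1, $2, $3, $4, "order2_16k=" $7}' /proc/buddyinfo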


slabinfo - version: 2.1
# name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
SCTPv6 21 21 1536 21 8 : tunables 0 0 0 : slabdata 1 1 0
SCTP 0 0 1408 23 8 : tunables 0 0 0 : slabdata 0 0 0
kcopyd_job 0 0 3312 9 8 : tunables 0 0 0 : slabdata 0 0 0
dm_uevent 0 0 2608 12 8 : tunables 0 0 0 : slabdata 0 0 0
nf_conntrack_ffffffff81acbb00 14054 14892 320 51 4 : tunables 0 0 0 : slabdata 292 292 0
lvp_cache 36 36 224 36 2 : tunables 0 0 0 : slabdata 1 1 0
lve_struct 4140 4140 352 46 4 : tunables 0 0 0 : slabdata 90 90 0
fat_inode_cache 0 0 744 44 8 : tunables 0 0 0 : slabdata 0 0 0
fat_cache 0 0 40 102 1 : tunables 0 0 0 : slabdata 0 0 0
isofs_inode_cache 0 0 664 49 8 : tunables 0 0 0 : slabdata 0 0 0
ext4_inode_cache 30 30 1088 30 8 : tunables 0 0 0 : slabdata 1 1 0
ext4_xattr 0 0 88 46 1 : tunables 0 0 0 : slabdata 0 0 0
ext4_free_data 0 0 64 64 1 : tunables 0 0 0 : slabdata 0 0 0
ext4_allocation_context 32 32 128 32 1 : tunables 0 0 0 : slabdata 1 1 0
ext4_io_end 0 0 72 56 1 : tunables 0 0 0 : slabdata 0 0 0
ext4_extent_status 102 102 40 102 1 : tunables 0 0 0 : slabdata 1 1 0
jbd2_journal_handle 0 0 48 85 1 : tunables 0 0 0 : slabdata 0 0 0
jbd2_journal_head 0 0 112 36 1 : tunables 0 0 0 : slabdata 0 0 0
jbd2_revoke_table_s 256 256 16 256 1 : tunables 0 0 0 : slabdata 1 1 0
jbd2_revoke_record_s 0 0 32 128 1 : tunables 0 0 0 : slabdata 0 0 0
kvm_async_pf 0 0 136 30 1 : tunables 0 0 0 : slabdata 0 0 0
kvm_vcpu 0 0 18560 1 8 : tunables 0 0 0 : slabdata 0 0 0
xfs_dqtrx 992 992 528 31 4 : tunables 0 0 0 : slabdata 32 32 0
xfs_dquot 3264 3264 472 34 4 : tunables 0 0 0 : slabdata 96 96 0
xfs_ili 4342175 4774399 152 53 2 : tunables 0 0 0 : slabdata 90083 90083 0
xfs_inode 4915588 5486076 1088 30 8 : tunables 0 0 0 : slabdata 182871 182871 0
xfs_efd_item 2680 2760 400 40 4 : tunables 0 0 0 : slabdata 69 69 0
xfs_da_state 1088 1088 480 34 4 : tunables 0 0 0 : slabdata 32 32 0
xfs_btree_cur 1248 1248 208 39 2 : tunables 0 0 0 : slabdata 32 32 0
xfs_log_ticket 14874 15048 184 44 2 : tunables 0 0 0 : slabdata 342 342 0
xfs_ioend 12909 13104 104 39 1 : tunables 0 0 0 : slabdata 336 336 0
scsi_cmd_cache 5400 5652 448 36 4 : tunables 0 0 0 : slabdata 157 157 0
ve_struct 0 0 848 38 8 : tunables 0 0 0 : slabdata 0 0 0
ip6_dst_cache 1152 1152 448 36 4 : tunables 0 0 0 : slabdata 32 32 0
RAWv6 910 910 1216 26 8 : tunables 0 0 0 : slabdata 35 35 0
UDPLITEv6 0 0 1216 26 8 : tunables 0 0 0 : slabdata 0 0 0
UDPv6 832 832 1216 26 8 : tunables 0 0 0 : slabdata 32 32 0
tw_sock_TCPv6 1152 1376 256 32 2 : tunables 0 0 0 : slabdata 43 43 0
TCPv6 510 510 2176 15 8 : tunables 0 0 0 : slabdata 34 34 0
cfq_queue 3698 5145 232 35 2 : tunables 0 0 0 : slabdata 147 147 0
bsg_cmd 0 0 312 52 4 : tunables 0 0 0 : slabdata 0 0 0
mqueue_inode_cache 136 136 960 34 8 : tunables 0 0 0 : slabdata 4 4 0
hugetlbfs_inode_cache 1632 1632 632 51 8 : tunables 0 0 0 : slabdata 32 32 0
configfs_dir_cache 1472 1472 88 46 1 : tunables 0 0 0 : slabdata 32 32 0
dquot 0 0 256 32 2 : tunables 0 0 0 : slabdata 0 0 0
userfaultfd_ctx_cache 32 32 128 32 1 : tunables 0 0 0 : slabdata 1 1 0
fanotify_event_info 2336 2336 56 73 1 : tunables 0 0 0 : slabdata 32 32 0
dio 6171 6222 640 51 8 : tunables 0 0 0 : slabdata 122 122 0
pid_namespace 42 42 2192 14 8 : tunables 0 0 0 : slabdata 3 3 0
posix_timers_cache 1056 1056 248 33 2 : tunables 0 0 0 : slabdata 32 32 0
UDP-Lite 0 0 1088 30 8 : tunables 0 0 0 : slabdata 0 0 0
flow_cache 2268 2296 144 28 1 : tunables 0 0 0 : slabdata 82 82 0
xfrm_dst_cache 896 896 576 28 4 : tunables 0 0 0 : slabdata 32 32 0
ip_fib_alias 2720 2720 48 85 1 : tunables 0 0 0 : slabdata 32 32 0
RAW 3977 4224 1024 32 8 : tunables 0 0 0 : slabdata 132 132 0
UDP 4110 4110 1088 30 8 : tunables 0 0 0 : slabdata 137 137 0
tw_sock_TCP 4756 5216 256 32 2 : tunables 0 0 0 : slabdata 163 163 0
TCP 2705 2768 1984 16 8 : tunables 0 0 0 : slabdata 173 173 0
scsi_data_buffer 5440 5440 24 170 1 : tunables 0 0 0 : slabdata 32 32 0
blkdev_queue 154 154 2208 14 8 : tunables 0 0 0 : slabdata 11 11 0
blkdev_requests 4397688 4405884 384 42 4 : tunables 0 0 0 : slabdata 104902 104902 0
blkdev_ioc 11232 11232 112 36 1 : tunables 0 0 0 : slabdata 312 312 0
user_namespace 0 0 1304 25 8 : tunables 0 0 0 : slabdata 0 0 0
sock_inode_cache 12282 12282 704 46 8 : tunables 0 0 0 : slabdata 267 267 0
file_lock_cache 20056 20960 200 40 2 : tunables 0 0 0 : slabdata 524 524 0
net_namespace 6 6 5056 6 8 : tunables 0 0 0 : slabdata 1 1 0
shmem_inode_cache 16970 18952 712 46 8 : tunables 0 0 0 : slabdata 412 412 0
Acpi-ParseExt 39491 40432 72 56 1 : tunables 0 0 0 : slabdata 722 722 0
Acpi-State 1683 1683 80 51 1 : tunables 0 0 0 : slabdata 33 33 0
Acpi-Namespace 11424 11424 40 102 1 : tunables 0 0 0 : slabdata 112 112 0
task_delay_info 15336 15336 112 36 1 : tunables 0 0 0 : slabdata 426 426 0
taskstats 1568 1568 328 49 4 : tunables 0 0 0 : slabdata 32 32 0
proc_inode_cache 169897 190608 680 48 8 : tunables 0 0 0 : slabdata 3971 3971 0
sigqueue 2208 2208 168 48 2 : tunables 0 0 0 : slabdata 46 46 0
bdev_cache 792 792 896 36 8 : tunables 0 0 0 : slabdata 22 22 0
sysfs_dir_cache 74698 74698 120 34 1 : tunables 0 0 0 : slabdata 2197 2197 0
mnt_cache 163197 163424 256 32 2 : tunables 0 0 0 : slabdata 5107 5107 0
filp 64607 97257 320 51 4 : tunables 0 0 0 : slabdata 1907 1907 0
inode_cache 370744 370947 616 53 8 : tunables 0 0 0 : slabdata 6999 6999 0
dentry 1316262 2139228 192 42 2 : tunables 0 0 0 : slabdata 50934 50934 0
iint_cache 0 0 80 51 1 : tunables 0 0 0 : slabdata 0 0 0
buffer_head 1441470 2890290 104 39 1 : tunables 0 0 0 : slabdata 74110 74110 0
vm_area_struct 194998 196840 216 37 2 : tunables 0 0 0 : slabdata 5320 5320 0
mm_struct 2679 2760 1600 20 8 : tunables 0 0 0 : slabdata 138 138 0
files_cache 8680 8925 640 51 8 : tunables 0 0 0 : slabdata 175 175 0
signal_cache 3691 3780 1152 28 8 : tunables 0 0 0 : slabdata 135 135 0
sighand_cache 1950 2160 2112 15 8 : tunables 0 0 0 : slabdata 144 144 0
task_xstate 8070 8658 832 39 8 : tunables 0 0 0 : slabdata 222 222 0
task_struct 1913 2088 4080 8 8 : tunables 0 0 0 : slabdata 261 261 0
cred_jar 31699 33936 192 42 2 : tunables 0 0 0 : slabdata 808 808 0
anon_vma_chain 164026 168704 64 64 1 : tunables 0 0 0 : slabdata 2636 2636 0
anon_vma 84104 84594 88 46 1 : tunables 0 0 0 : slabdata 1839 1839 0
pid 11127 12576 128 32 1 : tunables 0 0 0 : slabdata 393 393 0
shared_policy_node 9350 9350 48 85 1 : tunables 0 0 0 : slabdata 110 110 0
numa_policy 62 62 264 31 2 : tunables 0 0 0 : slabdata 2 2 0
radix_tree_node 771778 1194312 584 28 4 : tunables 0 0 0 : slabdata 42654 42654 0
idr_layer_cache 2538 2565 2112 15 8 : tunables 0 0 0 : slabdata 171 171 0
dma-kmalloc-8192 0 0 8192 4 8 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-4096 0 0 4096 8 8 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-2048 0 0 2048 16 8 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-1024 0 0 1024 32 8 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-512 0 0 512 32 4 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-256 0 0 256 32 2 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-128 0 0 128 32 1 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-64 0 0 64 64 1 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-32 0 0 32 128 1 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-16 0 0 16 256 1 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-8 0 0 8 512 1 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-192 0 0 192 42 2 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-96 0 0 96 42 1 : tunables 0 0 0 : slabdata 0 0 0
kmalloc-8192 385 388 8192 4 8 : tunables 0 0 0 : slabdata 97 97 0
kmalloc-4096 9296 10088 4096 8 8 : tunables 0 0 0 : slabdata 1261 1261 0
kmalloc-2048 65061 133536 2048 16 8 : tunables 0 0 0 : slabdata 8346 8346 0
kmalloc-1024 11987 21120 1024 32 8 : tunables 0 0 0 : slabdata 660 660 0
kmalloc-512 107510 187072 512 32 4 : tunables 0 0 0 : slabdata 5846 5846 0
kmalloc-256 160498 199104 256 32 2 : tunables 0 0 0 : slabdata 6222 6222 0
kmalloc-192 144975 237426 192 42 2 : tunables 0 0 0 : slabdata 5653 5653 0
kmalloc-128 36799 108096 128 32 1 : tunables 0 0 0 : slabdata 3378 3378 0
kmalloc-96 99510 238896 96 42 1 : tunables 0 0 0 : slabdata 5688 5688 0
kmalloc-64 7978152 8593280 64 64 1 : tunables 0 0 0 : slabdata 134270 134270 0
kmalloc-32 2939882 3089664 32 128 1 : tunables 0 0 0 : slabdata 24138 24138 0
kmalloc-16 172057 172288 16 256 1 : tunables 0 0 0 : slabdata 673 673 0
kmalloc-8 109568 109568 8 512 1 : tunables 0 0 0 : slabdata 214 214 0
kmem_cache_node 893 896 64 64 1 : tunables 0 0 0 : slabdata 14 14 0
kmem_cache 612 612 320 51 4 : tunables 0 0 0 : slabdata 12 12 0

-------------------------------------------------


# uname -r
3.10.0-714.10.2.lve1.5.17.1.el7.x86_64

--------------------------------------------------------

Core part of the glances output:
http://i.imgur.com/La5JbQn.png
-----------------------------------------------------------

Thank you very much for looking into this


On Thu, Aug 2, 2018 at 12:37 PM Igor A. Ippolitov <[email protected]>
wrote:

> [Igor's message and the quoted thread snipped]



--
*Anoop P Alias*
Igor A. Ippolitov
Re: posix_memalign error
August 02, 2018 02:40PM
Anoop,

There are two guesses: either the mmap allocation limit is hit, or memory
is way too fragmented.
Could you please track the number of mapped regions for a worker with pmap,
and the number of 16k areas in the Normal zones (it is the third number in /proc/buddyinfo)?

You can also set vm.max_map_count to a higher number (say, 20 times the
default) and see whether the error goes away.

Please, let me know if increasing vm.max_map_count helps you.
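
A sketch of both checks (the worker-matching pattern and the exact figure are assumptions; the default vm.max_map_count is 65530):

# count mapped regions of one worker process
pmap $(pgrep -f 'nginx: worker' | head -n 1) | wc -l
# current limit, then roughly a 20x bump
sysctl vm.max_map_count
sysctl -w vm.max_map_count=1310600    # persist via /etc/sysctl.conf if it helps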

On 02.08.2018 13:06, Anoop Alias wrote:
> Hi Igor,
>
> The error happens randomly
>
> 2018/08/02 06:52:42 [emerg] 874514#874514: posix_memalign(16, 16384)
> failed (12: Cannot allocate memory)
> 2018/08/02 09:42:53 [emerg] 872996#872996: posix_memalign(16, 16384)
> failed (12: Cannot allocate memory)
> 2018/08/02 10:16:14 [emerg] 877611#877611: posix_memalign(16, 16384)
> failed (12: Cannot allocate memory)
> 2018/08/02 10:16:48 [emerg] 879410#879410: posix_memalign(16, 16384)
> failed (12: Cannot allocate memory)
> 2018/08/02 10:17:55 [emerg] 876563#876563: posix_memalign(16, 16384)
> failed (12: Cannot allocate memory)
> 2018/08/02 10:20:21 [emerg] 879263#879263: posix_memalign(16, 16384)
> failed (12: Cannot allocate memory)
> 2018/08/02 10:20:51 [emerg] 878991#878991: posix_memalign(16, 16384)
> failed (12: Cannot allocate memory)
>
> # date
> Thu Aug  2 10:58:48 BST 2018
>
> ------------------------------------------
> # cat /proc/buddyinfo
> Node 0, zone      DMA      0      0      1      0 2      1      1     
> 0      1      1      3
> Node 0, zone    DMA32  11722  11057   4663   1647 609     72     10   
>   7      1      0      0
> Node 0, zone   Normal 755026 710760 398136  21462  1114     18      1 
>     0      0      0      0
> Node 1, zone   Normal 341295 801810 179604    256 0      0      0     
> 0      0      0      0
> -----------------------------------------
>
>
> slabinfo - version: 2.1
> # name            <active_objs> <num_objs> <objsize> <objperslab>
> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> :
> slabdata <active_slabs> <num_slabs> <sharedavail>
> SCTPv6                21     21   1536   21    8 : tunables    0    0 
>   0 : slabdata      1      1      0
> SCTP                   0      0   1408   23    8 : tunables    0    0 
>   0 : slabdata      0      0      0
> kcopyd_job             0      0   3312    9    8 : tunables    0    0 
>   0 : slabdata      0      0      0
> dm_uevent              0      0   2608   12    8 : tunables    0    0 
>   0 : slabdata      0      0      0
> nf_conntrack_ffffffff81acbb00  14054  14892    320  51    4 :
> tunables    0    0    0 : slabdata    292 292      0
> lvp_cache             36     36    224   36    2 : tunables    0    0 
>   0 : slabdata      1      1      0
> lve_struct          4140   4140    352   46    4 : tunables    0    0 
>   0 : slabdata     90     90      0
> fat_inode_cache        0      0    744   44    8 : tunables    0    0 
>   0 : slabdata      0      0      0
> fat_cache              0      0     40  102    1 : tunables    0    0 
>   0 : slabdata      0      0      0
> isofs_inode_cache      0      0    664   49    8 : tunables    0    0 
>   0 : slabdata      0      0      0
> ext4_inode_cache      30     30   1088   30    8 : tunables    0    0 
>   0 : slabdata      1      1      0
> ext4_xattr             0      0     88   46    1 : tunables    0    0 
>   0 : slabdata      0      0      0
> ext4_free_data         0      0     64   64    1 : tunables    0    0 
>   0 : slabdata      0      0      0
> ext4_allocation_context     32     32    128   32    1 : tunables   
> 0    0    0 : slabdata      1      1      0
> ext4_io_end            0      0     72   56    1 : tunables    0    0 
>   0 : slabdata      0      0      0
> ext4_extent_status    102    102     40  102    1 : tunables    0   
> 0    0 : slabdata      1      1      0
> jbd2_journal_handle      0      0     48   85    1 : tunables    0   
> 0    0 : slabdata      0      0      0
> jbd2_journal_head      0      0    112   36    1 : tunables    0    0 
>   0 : slabdata      0      0      0
> jbd2_revoke_table_s    256    256     16  256    1 : tunables    0   
> 0    0 : slabdata      1      1      0
> jbd2_revoke_record_s      0      0     32  128    1 : tunables    0   
> 0    0 : slabdata      0      0      0
> kvm_async_pf           0      0    136   30    1 : tunables    0    0 
>   0 : slabdata      0      0      0
> kvm_vcpu               0      0  18560    1    8 : tunables    0    0 
>   0 : slabdata      0      0      0
> xfs_dqtrx            992    992    528   31    4 : tunables    0    0 
>   0 : slabdata     32     32      0
> xfs_dquot           3264   3264    472   34    4 : tunables    0    0 
>   0 : slabdata     96     96      0
> xfs_ili           4342175 4774399    152   53    2 : tunables    0   
> 0    0 : slabdata  90083  90083      0
> xfs_inode         4915588 5486076   1088   30    8 : tunables    0   
> 0    0 : slabdata 182871 182871      0
> xfs_efd_item        2680   2760    400   40    4 : tunables    0    0 
>   0 : slabdata     69     69      0
> xfs_da_state        1088   1088    480   34    4 : tunables    0    0 
>   0 : slabdata     32     32      0
> xfs_btree_cur       1248   1248    208   39    2 : tunables    0    0 
>   0 : slabdata     32     32      0
> xfs_log_ticket     14874  15048    184   44    2 : tunables    0    0 
>   0 : slabdata    342    342      0
> xfs_ioend          12909  13104    104   39    1 : tunables    0    0 
>   0 : slabdata    336    336      0
> scsi_cmd_cache      5400   5652    448   36    4 : tunables    0    0 
>   0 : slabdata    157    157      0
> ve_struct              0      0    848   38    8 : tunables    0    0 
>   0 : slabdata      0      0      0
> ip6_dst_cache       1152   1152    448   36    4 : tunables    0    0 
>   0 : slabdata     32     32      0
> RAWv6                910    910   1216   26    8 : tunables    0    0 
>   0 : slabdata     35     35      0
> UDPLITEv6              0      0   1216   26    8 : tunables    0    0 
>   0 : slabdata      0      0      0
> UDPv6                832    832   1216   26    8 : tunables    0    0 
>   0 : slabdata     32     32      0
> tw_sock_TCPv6       1152   1376    256   32    2 : tunables    0    0 
>   0 : slabdata     43     43      0
> TCPv6                510    510   2176   15    8 : tunables    0    0 
>   0 : slabdata     34     34      0
> cfq_queue           3698   5145    232   35    2 : tunables    0    0 
>   0 : slabdata    147    147      0
> bsg_cmd                0      0    312   52    4 : tunables    0    0 
>   0 : slabdata      0      0      0
> mqueue_inode_cache    136    136    960   34    8 : tunables    0   
> 0    0 : slabdata      4      4      0
> hugetlbfs_inode_cache   1632   1632    632   51    8 : tunables    0 
>   0    0 : slabdata     32     32      0
> configfs_dir_cache   1472   1472     88   46    1 : tunables    0   
> 0    0 : slabdata     32     32      0
> dquot                  0      0    256   32    2 : tunables    0    0 
>   0 : slabdata      0      0      0
> userfaultfd_ctx_cache     32     32    128   32    1 : tunables    0 
>   0    0 : slabdata      1      1      0
> fanotify_event_info   2336   2336     56   73    1 : tunables    0   
> 0    0 : slabdata     32     32      0
> dio                 6171   6222    640   51    8 : tunables    0    0 
>   0 : slabdata    122    122      0
> pid_namespace         42     42   2192   14    8 : tunables    0    0 
>   0 : slabdata      3      3      0
> posix_timers_cache   1056   1056    248   33    2 : tunables    0   
> 0    0 : slabdata     32     32      0
> UDP-Lite               0      0   1088   30    8 : tunables    0    0 
>   0 : slabdata      0      0      0
> flow_cache          2268   2296    144   28    1 : tunables    0    0 
>   0 : slabdata     82     82      0
> xfrm_dst_cache       896    896    576   28    4 : tunables    0    0 
>   0 : slabdata     32     32      0
> ip_fib_alias        2720   2720     48   85    1 : tunables    0    0 
>   0 : slabdata     32     32      0
> RAW                 3977   4224   1024   32    8 : tunables    0    0 
>   0 : slabdata    132    132      0
> UDP                 4110   4110   1088   30    8 : tunables    0    0 
>   0 : slabdata    137    137      0
> tw_sock_TCP         4756   5216    256   32    2 : tunables    0    0 
>   0 : slabdata    163    163      0
> TCP                 2705   2768   1984   16    8 : tunables    0    0 
>   0 : slabdata    173    173      0
> scsi_data_buffer    5440   5440     24  170    1 : tunables    0    0 
>   0 : slabdata     32     32      0
> blkdev_queue         154    154   2208   14    8 : tunables    0    0 
>   0 : slabdata     11     11      0
> blkdev_requests   4397688 4405884    384   42    4 : tunables    0   
> 0    0 : slabdata 104902 104902      0
> blkdev_ioc         11232  11232    112   36    1 : tunables    0    0 
>   0 : slabdata    312    312      0
> user_namespace         0      0   1304   25    8 : tunables    0    0 
>   0 : slabdata      0      0      0
> sock_inode_cache   12282  12282    704   46    8 : tunables    0    0 
>   0 : slabdata    267    267      0
> file_lock_cache    20056  20960    200   40    2 : tunables    0    0 
>   0 : slabdata    524    524      0
> net_namespace          6      6   5056    6    8 : tunables    0    0 
>   0 : slabdata      1      1      0
> shmem_inode_cache  16970  18952    712   46    8 : tunables    0    0 
>   0 : slabdata    412    412      0
> Acpi-ParseExt      39491  40432     72   56    1 : tunables    0    0 
>   0 : slabdata    722    722      0
> Acpi-State          1683   1683     80   51    1 : tunables    0    0 
>   0 : slabdata     33     33      0
> Acpi-Namespace     11424  11424     40  102    1 : tunables    0    0 
>   0 : slabdata    112    112      0
> task_delay_info    15336  15336    112   36    1 : tunables    0    0 
>   0 : slabdata    426    426      0
> taskstats           1568   1568    328   49    4 : tunables    0    0 
>   0 : slabdata     32     32      0
> proc_inode_cache  169897 190608    680   48    8 : tunables    0    0 
>   0 : slabdata   3971   3971      0
> sigqueue            2208   2208    168   48    2 : tunables    0    0 
>   0 : slabdata     46     46      0
> bdev_cache           792    792    896   36    8 : tunables    0    0 
>   0 : slabdata     22     22      0
> sysfs_dir_cache    74698  74698    120   34    1 : tunables    0    0 
>   0 : slabdata   2197   2197      0
> mnt_cache         163197 163424    256   32    2 : tunables    0    0 
>   0 : slabdata   5107   5107      0
> filp               64607  97257    320   51    4 : tunables    0    0 
>   0 : slabdata   1907   1907      0
> inode_cache       370744 370947    616   53    8 : tunables    0    0 
>   0 : slabdata   6999   6999      0
> dentry            1316262 2139228    192   42    2 : tunables    0   
> 0    0 : slabdata  50934  50934      0
> iint_cache             0      0     80   51    1 : tunables    0    0 
>   0 : slabdata      0      0      0
> buffer_head       1441470 2890290    104   39    1 : tunables    0   
> 0    0 : slabdata  74110  74110      0
> vm_area_struct    194998 196840    216   37    2 : tunables    0    0 
>   0 : slabdata   5320   5320      0
> mm_struct           2679   2760   1600   20    8 : tunables    0    0 
>   0 : slabdata    138    138      0
> files_cache         8680   8925    640   51    8 : tunables    0    0 
>   0 : slabdata    175    175      0
> signal_cache        3691   3780   1152   28    8 : tunables    0    0 
>   0 : slabdata    135    135      0
> sighand_cache       1950   2160   2112   15    8 : tunables    0    0 
>   0 : slabdata    144    144      0
> task_xstate         8070   8658    832   39    8 : tunables    0    0 
>   0 : slabdata    222    222      0
> task_struct         1913   2088   4080    8    8 : tunables    0    0 
>   0 : slabdata    261    261      0
> cred_jar           31699  33936    192   42    2 : tunables    0    0 
>   0 : slabdata    808    808      0
> anon_vma_chain    164026 168704     64   64    1 : tunables    0    0 
>   0 : slabdata   2636   2636      0
> anon_vma           84104  84594     88   46    1 : tunables    0    0 
>   0 : slabdata   1839   1839      0
> pid                11127  12576    128   32    1 : tunables    0    0 
>   0 : slabdata    393    393      0
> shared_policy_node   9350   9350     48   85    1 : tunables    0   
> 0    0 : slabdata    110    110      0
> numa_policy           62     62    264   31    2 : tunables    0    0 
>   0 : slabdata      2      2      0
> radix_tree_node   771778 1194312    584   28    4 : tunables    0   
> 0    0 : slabdata  42654  42654      0
> idr_layer_cache     2538   2565   2112   15    8 : tunables    0    0 
>   0 : slabdata    171    171      0
> dma-kmalloc-8192       0      0   8192    4    8 : tunables    0    0 
>   0 : slabdata      0      0      0
> dma-kmalloc-4096       0      0   4096    8    8 : tunables    0    0 
>   0 : slabdata      0      0      0
> dma-kmalloc-2048       0      0   2048   16    8 : tunables    0    0 
>   0 : slabdata      0      0      0
> dma-kmalloc-1024       0      0   1024   32    8 : tunables    0    0 
>   0 : slabdata      0      0      0
> dma-kmalloc-512        0      0    512   32    4 : tunables    0    0 
>   0 : slabdata      0      0      0
> dma-kmalloc-256        0      0    256   32    2 : tunables    0    0 
>   0 : slabdata      0      0      0
> dma-kmalloc-128        0      0    128   32    1 : tunables    0    0 
>   0 : slabdata      0      0      0
> dma-kmalloc-64         0      0     64   64    1 : tunables    0    0 
>   0 : slabdata      0      0      0
> dma-kmalloc-32         0      0     32  128    1 : tunables    0    0 
>   0 : slabdata      0      0      0
> dma-kmalloc-16         0      0     16  256    1 : tunables    0    0 
>   0 : slabdata      0      0      0
> dma-kmalloc-8          0      0      8  512    1 : tunables    0    0 
>   0 : slabdata      0      0      0
> dma-kmalloc-192        0      0    192   42    2 : tunables    0    0 
>   0 : slabdata      0      0      0
> dma-kmalloc-96         0      0     96   42    1 : tunables    0    0 
>   0 : slabdata      0      0      0
> kmalloc-8192         385    388   8192    4    8 : tunables    0    0 
>   0 : slabdata     97     97      0
> kmalloc-4096        9296  10088   4096    8    8 : tunables    0    0 
>   0 : slabdata   1261   1261      0
> kmalloc-2048       65061 133536   2048   16    8 : tunables    0    0 
>   0 : slabdata   8346   8346      0
> kmalloc-1024       11987  21120   1024   32    8 : tunables    0    0 
>   0 : slabdata    660    660      0
> kmalloc-512       107510 187072    512   32    4 : tunables    0    0 
>   0 : slabdata   5846   5846      0
> kmalloc-256       160498 199104    256   32    2 : tunables    0    0 
>   0 : slabdata   6222   6222      0
> kmalloc-192       144975 237426    192   42    2 : tunables    0    0 
>   0 : slabdata   5653   5653      0
> kmalloc-128        36799 108096    128   32    1 : tunables    0    0 
>   0 : slabdata   3378   3378      0
> kmalloc-96         99510 238896     96   42    1 : tunables    0    0 
>   0 : slabdata   5688   5688      0
> kmalloc-64        7978152 8593280     64   64    1 : tunables    0   
> 0    0 : slabdata 134270 134270      0
> kmalloc-32        2939882 3089664     32  128    1 : tunables    0   
> 0    0 : slabdata  24138  24138      0
> kmalloc-16        172057 172288     16  256    1 : tunables    0    0 
>   0 : slabdata    673    673      0
> kmalloc-8         109568 109568      8  512    1 : tunables    0    0 
>   0 : slabdata    214    214      0
> kmem_cache_node      893    896     64   64    1 : tunables    0    0 
>   0 : slabdata     14     14      0
> kmem_cache           612    612    320   51    4 : tunables    0    0 
>   0 : slabdata     12     12      0
>
> -------------------------------------------------
>
>
> # uname -r
> 3.10.0-714.10.2.lve1.5.17.1.el7.x86_64
>
> --------------------------------------------------------
>
> Core part of glances
> http://i.imgur.com/La5JbQn.png
> -----------------------------------------------------------
>
> Thank you very much for looking into this
>
>
> On Thu, Aug 2, 2018 at 12:37 PM Igor A. Ippolitov <[email protected]> wrote:
>
> Anoop,
>
> I doubt this will be the solution, but may we have a look at
> /proc/buddyinfo and /proc/slabinfo the moment when nginx can't
> allocate memory?
>
> On 02.08.2018 08:15, Anoop Alias wrote:
>> Hi Maxim,
>>
>> I enabled debug, and the memalign call happens on nginx reloads; the
>> ENOMEM shows up only on some of the reloads (not on all of them)
>>
>> 2018/08/02 05:59:08 [notice] 872052#872052: signal process started
>> 2018/08/02 05:59:23 [notice] 871570#871570: signal 1 (SIGHUP) received from 872052, reconfiguring
>> 2018/08/02 05:59:23 [debug] 871570#871570: wake up, sigio 0
>> 2018/08/02 05:59:23 [notice] 871570#871570: reconfiguring
>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 0000000002B0DA00:16384 @16      === > the memalign call on reload
>> 2018/08/02 05:59:23 [debug] 871570#871570: malloc: 00000000087924D0:4560
>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 000000000E442E00:16384 @16
>> 2018/08/02 05:59:23 [debug] 871570#871570: malloc: 0000000005650850:4096
>> 20
>>
>>
>>
>>
>> 2018/08/02 05:48:49 [debug] 871275#871275: bind() xxxx:443 #71
>> 2018/08/02 05:48:49 [debug] 871275#871275: bind() xxxx:443 #72
>> 2018/08/02 05:48:49 [debug] 871275#871275: bind() xxxx:443 #73
>> 2018/08/02 05:48:49 [debug] 871275#871275: bind() xxxx:443 #74
>> 2018/08/02 05:48:49 [debug] 871275#871275: add cleanup: 000000005340D728
>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000024D3260:4096
>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000517BAF10:4096
>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 0000000053854FC0:4096
>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 0000000053855FD0:4096
>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 0000000053856FE0:4096
>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 0000000053857FF0:4096
>> 2018/08/02 05:48:49 [debug] 871275#871275: posix_memalign: 0000000053859000:16384 @16
>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 000000005385D010:4096
>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 000000005385E020:4096
>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 000000005385F030:4096
>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000536CD160:4096
>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000536CE170:4096
>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000536CF180:4096
>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000536D0190:4096
>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000536D11A0:4096
>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000536D21B0:4096
>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000536D31C0:4096
>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000536D41D0:4096
>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000536D51E0:4096
>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000536D61F0:4096
>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000536D7200:4096
>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000536D8210:4096
>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000536D9220:4096
>>
>>
>> In fact, there are a lot of such calls during a reload
>>
>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA17ED00:16384 @16
>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA1B0FF0:16384 @16
>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA1E12C0:16384 @16
>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA211590:16384 @16
>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA243880:16384 @16
>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA271B30:16384 @16
>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA2A3E20:16384 @16
>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA2D20D0:16384 @16
>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA3063E0:16384 @16
>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA334690:16384 @16
>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA366980:16384 @16
>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA396C50:16384 @16
>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA3C8F40:16384 @16
>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA3F9210:16384 @16
>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA4294E0:16384 @16
>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA45B7D0:16384 @16
>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA489A80:16384 @16
>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA4BBD70:16384 @16
>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA4EA020:16384 @16
>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA51E330:16384 @16
>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA54C5E0:16384 @16
>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA57E8D0:16384 @16
>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA5AEBA0:16384 @16
>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA5DEE70:16384 @16
>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA611160:16384 @16
>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA641430:16384 @16
>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA671700:16384 @16
>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA6A29E0:16384 @16
>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA6D5CE0:16384 @16
>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA707FD0:16384 @16
>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA736280:16384 @16
>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA768570:16384 @16
>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA796820:16384 @16
>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA7CAB30:16384 @16
>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA7F8DE0:16384 @16
>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA82B0D0:16384 @16
>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA85B3A0:16384 @16
>>
>>
>>
>> What is perplexing is that the system has enough free (available) RAM
>> #############
>> # free -g
>>               total        used        free      shared  buff/cache   available
>> Mem:            125          54          24           8          46          58
>> Swap:             0           0           0
>> #############
>>
>> # ulimit -a
>> core file size          (blocks, -c) 0
>> data seg size           (kbytes, -d) unlimited
>> scheduling priority             (-e) 0
>> file size               (blocks, -f) unlimited
>> pending signals                 (-i) 514579
>> max locked memory       (kbytes, -l) 64
>> max memory size         (kbytes, -m) unlimited
>> open files                      (-n) 1024
>> pipe size            (512 bytes, -p) 8
>> POSIX message queues     (bytes, -q) 819200
>> real-time priority              (-r) 0
>> stack size              (kbytes, -s) 8192
>> cpu time               (seconds, -t) unlimited
>> max user processes              (-u) 514579
>> virtual memory          (kbytes, -v) unlimited
>> file locks                      (-x) unlimited
>>
>> #########################################
>>
>> There is nothing else limiting memory allocation
>>
>> Is there any way to identify this or prevent it?
>>
>>
>> On Tue, Jul 31, 2018 at 7:08 PM Maxim Dounin <[email protected]> wrote:
>>
>> [...]


_______________________________________________
nginx mailing list
nginx@nginx.org
http://mailman.nginx.org/mailman/listinfo/nginx
Anoop Alias
Re: posix_memalign error
August 04, 2018 07:00AM
Hi Igor,

Setting vm.max_map_count to 20x the normal value did not help
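
One more limit I want to rule out (my own guess, not something suggested
in the thread so far) is kernel overcommit accounting: malloc can return
ENOMEM with plenty of free RAM if Committed_AS runs into CommitLimit
under strict overcommit (vm.overcommit_memory=2). Cheap to check:

# sysctl vm.overcommit_memory vm.overcommit_ratio
# grep -E 'CommitLimit|Committed_AS' /proc/meminfo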

The issue happens on a group of servers, and within the group it shows up
only on servers that have ~10k server{} blocks

On servers with a lower number of server{} blocks, the ENOMEM issue does
not occur

Also, I find that the RAM usage of the nginx process is directly
proportional to the number of server{} blocks

For example on a server having the problem

# ps_mem| head -1 && ps_mem |grep nginx
Private + Shared = RAM used Program
1.0 GiB + 2.8 GiB = 3.8 GiB nginx (3)


That is for a single worker process with 4 threads in thread_pool
# pstree|grep nginx
|-nginx-+-nginx---4*[{nginx}]
| `-nginx

Whatever config change I try, the memory usage seems to depend mostly on
the number of server contexts defined
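
As a back-of-the-envelope figure (my own arithmetic): 3.8 GiB spread
over ~10k server{} blocks is roughly 400 KiB of configuration memory per
virtual host, and as far as I understand the reload mechanics, two full
configuration generations stay resident while the old worker drains.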

The issue mostly happens on nginx reload, when one extra worker process
is active in "shutting down" mode

I believe the memalign error is thrown by the worker being shut down:
the sites keep working after the error, and the pid mentioned in the
error is already gone when I check ps
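
A check I can script for the next occurrence (the log path is whatever
error_log points at; the one-liner is my own sketch): pull the pid out
of the last [emerg] line and see whether it is still around, and whether
an old worker is draining:

# pid=$(awk '/posix_memalign/ && /failed/ { split($4, a, "#"); p = a[1] } END { print p }' /var/log/nginx/error.log)
# ps -o pid,stat,args -p "$pid" 2>/dev/null || echo "pid $pid is already gone"
# ps ax | grep '[n]ginx: worker process is shutting down'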


# pmap 948965|grep 16K
00007f2923ff2000 16K r-x-- ngx_http_redis2_module.so
00007f2924fd7000 16K r---- libc-2.17.so
00007f2925431000 16K rw--- [ anon ]
00007f292584a000 16K rw--- [ anon ]

Aug 4 05:50:00 b kernel: SysRq : Show Memory
Aug 4 05:50:00 b kernel: Mem-Info:
Aug 4 05:50:00 b kernel: active_anon:7757394 inactive_anon:1021319
isolated_anon:0#012 active_file:3733324 inactive_file:2136476
isolated_file:0#012 unevictable:0 dirty:1766 writeback:6 wbtmp:0
unstable:0#012 slab_reclaimable:2003687 slab_unreclaimable:901391#012
mapped:316734 shmem:2381810 pagetables:63163 bounce:0#012 free:4851283
free_pcp:11332 free_cma:0
Aug 4 05:50:00 bravo kernel: Node 0 DMA free:15888kB min:8kB low:8kB
high:12kB active_anon:0kB inactive_anon:0kB active_file:0kB
inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB
present:15972kB managed:15888kB mlocked:0kB dirty:0kB writeback:0kB
mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB
kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB
local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0
all_unreclaimable? yes
Aug 4 05:50:00 b kernel: lowmem_reserve[]: 0 1679 64139 64139

# cat /proc/buddyinfo
Node 0, zone      DMA      0      0      1      0      2      1      1      0      1      1      3
Node 0, zone    DMA32   5284   6753   6677   1083    410     59      1      0      0      0      0
Node 0, zone   Normal 500327 638958 406737  14690    872    106     11      0      0      0      0
Node 1, zone   Normal 584840 291640    188      0      0      0      0      0      0      0      0
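
My attempt at reading that dump, following Igor's earlier hint that the
third number is the 16k one (the one-liner is my own, so the column
maths is worth double-checking): column n holds the count of free blocks
of 2^(n-1) pages, so with 4KiB pages the third column is 16KiB blocks:

# awk '/Normal/ { c = 0; for (i = 7; i <= NF; i++) c += $i * 2^(i-7); printf "Node %s zone %s: %.0f free chunks of 16KiB or larger\n", $2, $4, c }' /proc/buddyinfo

For Node 1 above that gives only 188 such chunks, i.e. nearly all of its
free memory sits in 4KiB and 8KiB fragments.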


The only correlation I see with the error is the number of server{}
blocks (close to 10k), which makes the nginx process consume ~4GB of
memory with a single worker process; the error then shows up when a
reload is done




On Thu, Aug 2, 2018 at 6:02 PM Igor A. Ippolitov <[email protected]>
wrote:

> Anoop,
>
> There are two guesses: either mmap allocations limit is hit or memory is
> way too fragmented.
> Could you please track amount of mapped regions for a worker with pmap and
> amount of 16k areas in Normal zones (it is the third number)?
>
> You can also set vm.max_map_count to a higher number (like 20 times higher
> than default) and look if the error is gone.
>
> Please, let me know if increasing vm.max_map_count helps you.
>
> On 02.08.2018 13:06, Anoop Alias wrote:
>
> Hi Igor,
>
> The error happens randomly
>
> 2018/08/02 06:52:42 [emerg] 874514#874514: posix_memalign(16, 16384) failed (12: Cannot allocate memory)
> 2018/08/02 09:42:53 [emerg] 872996#872996: posix_memalign(16, 16384) failed (12: Cannot allocate memory)
> 2018/08/02 10:16:14 [emerg] 877611#877611: posix_memalign(16, 16384) failed (12: Cannot allocate memory)
> 2018/08/02 10:16:48 [emerg] 879410#879410: posix_memalign(16, 16384) failed (12: Cannot allocate memory)
> 2018/08/02 10:17:55 [emerg] 876563#876563: posix_memalign(16, 16384) failed (12: Cannot allocate memory)
> 2018/08/02 10:20:21 [emerg] 879263#879263: posix_memalign(16, 16384) failed (12: Cannot allocate memory)
> 2018/08/02 10:20:51 [emerg] 878991#878991: posix_memalign(16, 16384) failed (12: Cannot allocate memory)
>
> # date
> Thu Aug 2 10:58:48 BST 2018
>
> ------------------------------------------
> # cat /proc/buddyinfo
> Node 0, zone      DMA      0      0      1      0      2      1      1      0      1      1      3
> Node 0, zone    DMA32  11722  11057   4663   1647    609     72     10      7      1      0      0
> Node 0, zone   Normal 755026 710760 398136  21462   1114     18      1      0      0      0      0
> Node 1, zone   Normal 341295 801810 179604    256      0      0      0      0      0      0      0
> -----------------------------------------
>
>
> slabinfo - version: 2.1
> [... full /proc/slabinfo listing snipped; identical to the copy quoted earlier in the thread ...]
>
> -------------------------------------------------
>
>
> # uname -r
> 3.10.0-714.10.2.lve1.5.17.1.el7.x86_64
>
> --------------------------------------------------------
>
> Core part of glances
> http://i.imgur.com/La5JbQn.png
> -----------------------------------------------------------
>
> Thank you very much for looking into this
>
>
> On Thu, Aug 2, 2018 at 12:37 PM Igor A. Ippolitov <[email protected]>
> wrote:
>
>> Anoop,
>>
>> I doubt this will be the solution, but may we have a look at
>> /proc/buddyinfo and /proc/slabinfo the moment when nginx can't allocate
>> memory?
>>
>> On 02.08.2018 08:15, Anoop Alias wrote:
>>
>> [...]
>>
>>
>> On Tue, Jul 31, 2018 at 7:08 PM Maxim Dounin <[email protected]> wrote:
>>
>>> [...]



--
*Anoop P Alias*
_______________________________________________
nginx mailing list
nginx@nginx.org
http://mailman.nginx.org/mailman/listinfo/nginx
Igor A. Ippolitov
Re: posix_memalign error
August 06, 2018 03:10PM
Anoop,

I suppose most of your 10k server{} blocks are very similar, right?
Please post your top-level configuration and a typical server{} block.

Also, how do you reload the configuration? With 'service nginx reload'
or maybe with some other command?

It looks like you have a lot of fragmented memory and only ~4GB free in
the second NUMA node.
So I'd say it is not surprising that allocations of 16k stripes
sometimes fail.
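
Rough arithmetic from the buddyinfo you posted, assuming 4KiB pages (my
reading of it, worth double-checking): Node 1 Normal has about
584840*4KiB + 291640*8KiB + 188*16KiB ≈ 4.5GB free, but only ~3MB of
that is in contiguous chunks of 16KiB or larger.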

Could you please additionally post numastat -m output, just to make sure
half of the memory belongs to the second CPU.
Then we'll have a look at whether memory utilization can be optimized
based on your configuration.
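
For reference, the exact commands I have in mind (assuming the numactl
package, which ships numastat, is installed):

# numastat -m | head -20
# numactl --hardware

The first gives a per-node MemFree/MemUsed breakdown; the second shows
the node sizes the kernel sees.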

Regards,
Igor.

On 04.08.2018 07:54, Anoop Alias wrote:
> [...]
>
>
>
>
> On Thu, Aug 2, 2018 at 6:02 PM Igor A. Ippolitov <[email protected]> wrote:
>
> [...]
>
> On 02.08.2018 13:06, Anoop Alias wrote:
>> [...]
>>
>>
>> On Thu, Aug 2, 2018 at 12:37 PM Igor A. Ippolitov
>> <[email protected] <mailto:[email protected]>> wrote:
>>
>> Anoop,
>>
>> I doubt this will be the solution, but may we have a look at
>> /proc/buddyinfo and /proc/slabinfo the moment when nginx
>> can't allocate memory?
>>
>> On 02.08.2018 08:15, Anoop Alias wrote:
>>> Hi Maxim,
>>>
>>> I enabled debug and the memalign call is happening on nginx
>>> reloads and the ENOMEM happen sometimes on the reload(not on
>>> all reloads)
>>>
>>> 2018/08/02 05:59:08 [notice] 872052#872052: signal process
>>> started
>>> 2018/08/02 05:59:23 [notice] 871570#871570: signal 1
>>> (SIGHUP) received from 872052, reconfiguring
>>> 2018/08/02 05:59:23 [debug] 871570#871570: wake up, sigio 0
>>> 2018/08/02 05:59:23 [notice] 871570#871570: reconfiguring
>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>> 0000000002B0DA00:16384 @16      === > the memalign call on
>>> reload
>>> 2018/08/02 05:59:23 [debug] 871570#871570: malloc:
>>> 00000000087924D0:4560
>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>> 000000000E442E00:16384 @16
>>> 2018/08/02 05:59:23 [debug] 871570#871570: malloc:
>>> 0000000005650850:4096
>>> 20
>>>
>>>
>>>
>>>
>>> 2018/08/02 05:48:49 [debug] 871275#871275: bind() xxxx:443 #71
>>> 2018/08/02 05:48:49 [debug] 871275#871275: bind() xxxx:443 #72
>>> 2018/08/02 05:48:49 [debug] 871275#871275: bind() xxxx:443 #73
>>> 2018/08/02 05:48:49 [debug] 871275#871275: bind() xxxx:443 #74
>>> 2018/08/02 05:48:49 [debug] 871275#871275: add cleanup:
>>> 000000005340D728
>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc:
>>> 00000000024D3260:4096
>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc:
>>> 00000000517BAF10:4096
>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc:
>>> 0000000053854FC0:4096
>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc:
>>> 0000000053855FD0:4096
>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc:
>>> 0000000053856FE0:4096
>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc:
>>> 0000000053857FF0:4096
>>> 2018/08/02 05:48:49 [debug] 871275#871275: posix_memalign:
>>> 0000000053859000:16384 @16
>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc:
>>> 000000005385D010:4096
>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc:
>>> 000000005385E020:4096
>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc:
>>> 000000005385F030:4096
>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc:
>>> 00000000536CD160:4096
>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc:
>>> 00000000536CE170:4096
>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc:
>>> 00000000536CF180:4096
>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc:
>>> 00000000536D0190:4096
>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc:
>>> 00000000536D11A0:4096
>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc:
>>> 00000000536D21B0:4096
>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc:
>>> 00000000536D31C0:4096
>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc:
>>> 00000000536D41D0:4096
>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc:
>>> 00000000536D51E0:4096
>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc:
>>> 00000000536D61F0:4096
>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc:
>>> 00000000536D7200:4096
>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc:
>>> 00000000536D8210:4096
>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc:
>>> 00000000536D9220:4096
>>>
>>>
>>> Infact there are lot of such calls during a reload
>>>
>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>> 00000000BA17ED00:16384 @16
>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>> 00000000BA1B0FF0:16384 @16
>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>> 00000000BA1E12C0:16384 @16
>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>> 00000000BA211590:16384 @16
>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>> 00000000BA243880:16384 @16
>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>> 00000000BA271B30:16384 @16
>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>> 00000000BA2A3E20:16384 @16
>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>> 00000000BA2D20D0:16384 @16
>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>> 00000000BA3063E0:16384 @16
>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>> 00000000BA334690:16384 @16
>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>> 00000000BA366980:16384 @16
>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>> 00000000BA396C50:16384 @16
>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>> 00000000BA3C8F40:16384 @16
>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>> 00000000BA3F9210:16384 @16
>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>> 00000000BA4294E0:16384 @16
>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>> 00000000BA45B7D0:16384 @16
>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>> 00000000BA489A80:16384 @16
>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>> 00000000BA4BBD70:16384 @16
>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>> 00000000BA4EA020:16384 @16
>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>> 00000000BA51E330:16384 @16
>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>> 00000000BA54C5E0:16384 @16
>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>> 00000000BA57E8D0:16384 @16
>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>> 00000000BA5AEBA0:16384 @16
>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>> 00000000BA5DEE70:16384 @16
>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>> 00000000BA611160:16384 @16
>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>> 00000000BA641430:16384 @16
>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>> 00000000BA671700:16384 @16
>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>> 00000000BA6A29E0:16384 @16
>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>> 00000000BA6D5CE0:16384 @16
>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>> 00000000BA707FD0:16384 @16
>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>> 00000000BA736280:16384 @16
>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>> 00000000BA768570:16384 @16
>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>> 00000000BA796820:16384 @16
>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>> 00000000BA7CAB30:16384 @16
>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>> 00000000BA7F8DE0:16384 @16
>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>> 00000000BA82B0D0:16384 @16
>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>> 00000000BA85B3A0:16384 @16
>>>
>>>
>>>
>>> What is perplexing is that the system has enough free
>>> (available RAM)
>>> #############
>>> # free -g
>>>               total        used free      shared 
>>> buff/cache   available
>>> Mem:            125          54 24           8          46 
>>>         58
>>> Swap:             0           0  0
>>> #############
>>>
>>> # ulimit -a
>>> core file size          (blocks, -c) 0
>>> data seg size           (kbytes, -d) unlimited
>>> scheduling priority             (-e) 0
>>> file size               (blocks, -f) unlimited
>>> pending signals                 (-i) 514579
>>> max locked memory       (kbytes, -l) 64
>>> max memory size         (kbytes, -m) unlimited
>>> open files                      (-n) 1024
>>> pipe size            (512 bytes, -p) 8
>>> POSIX message queues     (bytes, -q) 819200
>>> real-time priority              (-r) 0
>>> stack size              (kbytes, -s) 8192
>>> cpu time               (seconds, -t) unlimited
>>> max user processes              (-u) 514579
>>> virtual memory          (kbytes, -v) unlimited
>>> file locks                      (-x) unlimited
>>>
>>> #########################################
>>>
>>> There is no other thing limiting memory allocation
>>>
>>> Any way to prevent this or probably identify/prevent this
>>>
>>>
>>> On Tue, Jul 31, 2018 at 7:08 PM Maxim Dounin
>>> <[email protected] <mailto:[email protected]>> wrote:
>>>
>>> Hello!
>>>
>>> On Tue, Jul 31, 2018 at 09:52:29AM +0530, Anoop Alias wrote:
>>>
>>> > I am repeatedly seeing errors like
>>> >
>>> > ######################
>>> > 2018/07/31 03:46:33 [emerg] 2854560#2854560:
>>> posix_memalign(16, 16384)
>>> > failed (12: Cannot allocate memory)
>>> > 2018/07/31 03:54:09 [emerg] 2890190#2890190:
>>> posix_memalign(16, 16384)
>>> > failed (12: Cannot allocate memory)
>>> > 2018/07/31 04:08:36 [emerg] 2939230#2939230:
>>> posix_memalign(16, 16384)
>>> > failed (12: Cannot allocate memory)
>>> > 2018/07/31 04:24:48 [emerg] 2992650#2992650:
>>> posix_memalign(16, 16384)
>>> > failed (12: Cannot allocate memory)
>>> > 2018/07/31 04:42:09 [emerg] 3053092#3053092:
>>> posix_memalign(16, 16384)
>>> > failed (12: Cannot allocate memory)
>>> > 2018/07/31 04:42:17 [emerg] 3053335#3053335:
>>> posix_memalign(16, 16384)
>>> > failed (12: Cannot allocate memory)
>>> > 2018/07/31 04:42:28 [emerg] 3053937#3053937:
>>> posix_memalign(16, 16384)
>>> > failed (12: Cannot allocate memory)
>>> > 2018/07/31 04:47:54 [emerg] 3070638#3070638:
>>> posix_memalign(16, 16384)
>>> > failed (12: Cannot allocate memory)
>>> > ####################
>>> >
>>> > on a few servers
>>> >
>>> > The servers have enough memory free and the swap usage
>>> is 0, yet somehow
>>> > the kernel denies the posix_memalign with ENOMEM (
>>> this is what I think is
>>> > happening!)
>>> >
>>> > The numbers requested are always 16, 16k . This makes
>>> me suspicious
>>> >
>>> > I have no setting in nginx.conf that reference a 16k
>>> >
>>> > Is there any chance of finding out what requests this
>>> and why this is not
>>> > fulfilled
>>>
>>> There are at least some buffers which default to 16k - for
>>> example, ssl_buffer_size
>>> (http://nginx.org/r/ssl_buffer_size).
>>>
>>> You may try debugging log to futher find out where the
>>> particular
>>> allocation happens, see here for details:
>>>
>>> http://nginx.org/en/docs/debugging_log.html
>>>
>>> But I don't really think it worth the effort. The error
>>> is pretty
>>> clear, and it's better to focus on why these allocations
>>> are
>>> denied.  Likely you are hitting some limit.
>>>
>>> --
>>> Maxim Dounin
>>> http://mdounin.ru/
>>> _______________________________________________
>>> nginx mailing list
>>> nginx@nginx.org <mailto:[email protected]>
>>> http://mailman.nginx.org/mailman/listinfo/nginx
>>>
>>>
>>>
>>> --
>>> *Anoop P Alias*
>>>
>>>
>>>
>>> _______________________________________________
>>> nginx mailing list
>>> nginx@nginx.org <mailto:[email protected]>
>>> http://mailman.nginx.org/mailman/listinfo/nginx
>>
>>
>> _______________________________________________
>> nginx mailing list
>> nginx@nginx.org <mailto:[email protected]>
>> http://mailman.nginx.org/mailman/listinfo/nginx
>>
>>
>>
>> --
>> *Anoop P Alias*
>>
>>
>>
>> _______________________________________________
>> nginx mailing list
>> nginx@nginx.org <mailto:[email protected].org>
>> http://mailman.nginx.org/mailman/listinfo/nginx
>
>
> _______________________________________________
> nginx mailing list
> nginx@nginx.org <mailto:[email protected]>
> http://mailman.nginx.org/mailman/listinfo/nginx
>
>
>
> --
> *Anoop P Alias*
>
>
>
> _______________________________________________
> nginx mailing list
> nginx@nginx.org
> http://mailman.nginx.org/mailman/listinfo/nginx


_______________________________________________
nginx mailing list
nginx@nginx.org
http://mailman.nginx.org/mailman/listinfo/nginx
Anoop Alias
Re: posix_memalign error
August 06, 2018 10:00PM
Hi Igor,

Config is reloaded using

/usr/sbin/nginx -s reload

This is invoked from a Python/shell script (nginx is installed as part of a
web control panel).

The top-level Nginx config is in the gist below

https://gist.github.com/AnoopAlias/ba5ad6749a586c7e267672ee65b32b3a

It further includes ~8k server blocks (more on some servers). Of these,
about 2/3 are server {} blocks with TLS config and 1/3 are non-TLS ones.

# pwd
/etc/nginx/sites-enabled
# grep "server {" * | wc -l
7886

And yes, most of them are very similar, and they mostly proxy to an
upstream httpd.

I have tried removing all the loadable modules and even tried an older
version of nginx; all of them produce the error.


# numastat -m

Per-node system memory usage (in MBs):
                          Node 0          Node 1           Total
                 --------------- --------------- ---------------
MemTotal                65430.84        65536.00       130966.84
MemFree                  5491.26           40.89         5532.15
MemUsed                 59939.58        65495.11       125434.69
Active                  22295.61        21016.09        43311.70
Inactive                 8742.76         4662.48        13405.24
Active(anon)            16717.10        16572.19        33289.29
Inactive(anon)           2931.94         1388.14         4320.08
Active(file)             5578.50         4443.91        10022.41
Inactive(file)           5810.82         3274.34         9085.16
Unevictable                 0.00            0.00            0.00
Mlocked                     0.00            0.00            0.00
Dirty                       7.04            1.64            8.67
Writeback                   0.00            0.00            0.00
FilePages               18458.93        10413.97        28872.90
Mapped                    862.14          413.38         1275.52
AnonPages               12579.49        15264.37        27843.86
Shmem                    7069.52         2695.71         9765.23
KernelStack                18.34            3.03           21.38
PageTables                153.14          107.77          260.90
NFS_Unstable                0.00            0.00            0.00
Bounce                      0.00            0.00            0.00
WritebackTmp                0.00            0.00            0.00
Slab                     4830.68         2254.55         7085.22
SReclaimable             2061.05          921.72         2982.77
SUnreclaim               2769.62         1332.83         4102.45
AnonHugePages               4.00            2.00            6.00
HugePages_Total             0.00            0.00            0.00
HugePages_Free              0.00            0.00            0.00
HugePages_Surp              0.00            0.00            0.00
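
To watch how the per-node free memory moves around a reload, something like
this can be used (just a sketch):

# watch -n2 'numastat -m | grep -E "MemFree|MemUsed"'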


Thanks,





On Mon, Aug 6, 2018 at 6:33 PM Igor A. Ippolitov <[email protected]>
wrote:

> Anoop,
>
> I suppose, most of your 10k servers are very similar, right?
> Please post the top-level configuration and a typical server{}.
>
> Also, how do you reload the configuration? With 'service nginx reload' or
> maybe other commands?
>
> It looks like you have a lot of fragmented memory and only 4gb free in the
> second numa node.
> So I'd say it is not surprising that you are getting errors when
> allocating 16k stripes.
>
> Could you please also post numastat -m output, just to make sure half of
> the memory belongs to the second CPU.
> And we'll have a look at whether memory utilization can be optimized based
> on your configuration.
>
> Regards,
> Igor.
>
> [...]



--
*Anoop P Alias*
_______________________________________________
nginx mailing list
nginx@nginx.org
http://mailman.nginx.org/mailman/listinfo/nginx
Igor A. Ippolitov
Re: posix_memalign error
August 07, 2018 02:00PM
Anoop,

I don't see any trouble with your configuration.
Also, if you have 120G of RAM and a single worker, the problem is not
in nginx.
Do you have other software running on the host?

Basically, you are simply running out of memory: the numastat output you
posted shows only ~41MB of MemFree left on NUMA node 1.
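
If you want to confirm the fragmentation angle, you can watch how many
contiguous 16k areas are left: a 16k allocation is an order-2 block (4
contiguous 4k pages), which is the third count per zone in /proc/buddyinfo,
as suggested earlier in the thread. A rough sketch (field positions assume
the standard buddyinfo layout):

# awk '{print $1, $2, $3, $4, "order-2 free:", $7}' /proc/buddyinfo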

You can optimize your reload though: use "service nginx reload" (or
"kill -SIGHUP") to reload the nginx configuration.
When you do nginx -s reload, you make a new nginx process parse the
configuration (and that requires memory) before it sends a signal to the
running master. You can avoid this overhead with the 'service' command, as
it uses 'kill' as documented in the manual page.
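
Roughly, the difference (a sketch; the pid file path is the usual default
and may differ on your setup):

# /usr/sbin/nginx -s reload             ===> starts an extra nginx process
                                             that parses the full config,
                                             then signals the master
# kill -HUP $(cat /var/run/nginx.pid)   ===> signals the master directly;
                                             no extra parsing process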

On 06.08.2018 22:55, Anoop Alias wrote:
> [...]
>>>
>>> slabinfo - version: 2.1
>>> # name            <active_objs> <num_objs> <objsize>
>>> <objperslab> <pagesperslab> : tunables <limit> <batchcount>
>>> <sharedfactor> : slabdata <active_slabs> <num_slabs>
>>> <sharedavail>
>>> SCTPv6                21     21   1536  21    8 : tunables 
>>>   0    0    0 : slabdata      1      1      0
>>> SCTP                   0      0   1408  23    8 : tunables 
>>>   0    0    0 : slabdata      0      0      0
>>> kcopyd_job             0      0   3312   9    8 : tunables 
>>>   0    0    0 : slabdata      0      0      0
>>> dm_uevent              0      0   2608  12    8 : tunables 
>>>   0    0    0 : slabdata      0      0      0
>>> nf_conntrack_ffffffff81acbb00  14054 14892    320   51    4
>>> : tunables    0 0    0 : slabdata    292    292      0
>>> lvp_cache             36     36    224  36    2 : tunables 
>>>   0    0    0 : slabdata      1      1      0
>>> lve_struct          4140   4140    352  46    4 : tunables 
>>>   0    0    0 : slabdata     90     90      0
>>> fat_inode_cache        0      0    744  44    8 : tunables 
>>>   0    0    0 : slabdata      0      0      0
>>> fat_cache              0      0     40  102    1 : tunables    0    0    0 : slabdata      0      0      0
>>> isofs_inode_cache      0      0    664   49    8 : tunables    0    0    0 : slabdata      0      0      0
>>> ext4_inode_cache      30     30   1088   30    8 : tunables    0    0    0 : slabdata      1      1      0
>>> ext4_xattr             0      0     88   46    1 : tunables    0    0    0 : slabdata      0      0      0
>>> ext4_free_data         0      0     64   64    1 : tunables    0    0    0 : slabdata      0      0      0
>>> ext4_allocation_context     32     32    128   32    1 : tunables    0    0    0 : slabdata      1      1      0
>>> ext4_io_end            0      0     72   56    1 : tunables    0    0    0 : slabdata      0      0      0
>>> ext4_extent_status   102    102     40  102    1 : tunables    0    0    0 : slabdata      1      1      0
>>> jbd2_journal_handle      0      0     48   85    1 : tunables    0    0    0 : slabdata      0      0      0
>>> jbd2_journal_head      0      0    112   36    1 : tunables    0    0    0 : slabdata      0      0      0
>>> jbd2_revoke_table_s    256    256     16  256    1 : tunables    0    0    0 : slabdata      1      1      0
>>> jbd2_revoke_record_s      0      0     32  128    1 : tunables    0    0    0 : slabdata      0      0      0
>>> kvm_async_pf           0      0    136   30    1 : tunables    0    0    0 : slabdata      0      0      0
>>> kvm_vcpu               0      0  18560    1    8 : tunables    0    0    0 : slabdata      0      0      0
>>> xfs_dqtrx            992    992    528   31    4 : tunables    0    0    0 : slabdata     32     32      0
>>> xfs_dquot           3264   3264    472   34    4 : tunables    0    0    0 : slabdata     96     96      0
>>> xfs_ili           4342175 4774399    152   53    2 : tunables    0    0    0 : slabdata  90083  90083      0
>>> xfs_inode         4915588 5486076   1088   30    8 : tunables    0    0    0 : slabdata 182871 182871      0
>>> xfs_efd_item        2680   2760    400   40    4 : tunables    0    0    0 : slabdata     69     69      0
>>> xfs_da_state        1088   1088    480   34    4 : tunables    0    0    0 : slabdata     32     32      0
>>> xfs_btree_cur       1248   1248    208   39    2 : tunables    0    0    0 : slabdata     32     32      0
>>> xfs_log_ticket     14874  15048    184   44    2 : tunables    0    0    0 : slabdata    342    342      0
>>> xfs_ioend          12909  13104    104   39    1 : tunables    0    0    0 : slabdata    336    336      0
>>> scsi_cmd_cache      5400   5652    448   36    4 : tunables    0    0    0 : slabdata    157    157      0
>>> ve_struct              0      0    848   38    8 : tunables    0    0    0 : slabdata      0      0      0
>>> ip6_dst_cache       1152   1152    448   36    4 : tunables    0    0    0 : slabdata     32     32      0
>>> RAWv6                910    910   1216   26    8 : tunables    0    0    0 : slabdata     35     35      0
>>> UDPLITEv6              0      0   1216   26    8 : tunables    0    0    0 : slabdata      0      0      0
>>> UDPv6                832    832   1216   26    8 : tunables    0    0    0 : slabdata     32     32      0
>>> tw_sock_TCPv6       1152   1376    256   32    2 : tunables    0    0    0 : slabdata     43     43      0
>>> TCPv6                510    510   2176   15    8 : tunables    0    0    0 : slabdata     34     34      0
>>> cfq_queue           3698   5145    232   35    2 : tunables    0    0    0 : slabdata    147    147      0
>>> bsg_cmd                0      0    312   52    4 : tunables    0    0    0 : slabdata      0      0      0
>>> mqueue_inode_cache    136    136    960   34    8 : tunables    0    0    0 : slabdata      4      4      0
>>> hugetlbfs_inode_cache   1632   1632    632   51    8 : tunables    0    0    0 : slabdata     32     32      0
>>> configfs_dir_cache   1472   1472     88   46    1 : tunables    0    0    0 : slabdata     32     32      0
>>> dquot                  0      0    256   32    2 : tunables    0    0    0 : slabdata      0      0      0
>>> userfaultfd_ctx_cache     32     32    128   32    1 : tunables    0    0    0 : slabdata      1      1      0
>>> fanotify_event_info   2336   2336     56   73    1 : tunables    0    0    0 : slabdata     32     32      0
>>> dio                 6171   6222    640   51    8 : tunables    0    0    0 : slabdata    122    122      0
>>> pid_namespace         42     42   2192   14    8 : tunables    0    0    0 : slabdata      3      3      0
>>> posix_timers_cache   1056   1056    248   33    2 : tunables    0    0    0 : slabdata     32     32      0
>>> UDP-Lite               0      0   1088   30    8 : tunables    0    0    0 : slabdata      0      0      0
>>> flow_cache          2268   2296    144   28    1 : tunables    0    0    0 : slabdata     82     82      0
>>> xfrm_dst_cache       896    896    576   28    4 : tunables    0    0    0 : slabdata     32     32      0
>>> ip_fib_alias        2720   2720     48   85    1 : tunables    0    0    0 : slabdata     32     32      0
>>> RAW                 3977   4224   1024   32    8 : tunables    0    0    0 : slabdata    132    132      0
>>> UDP                 4110   4110   1088   30    8 : tunables    0    0    0 : slabdata    137    137      0
>>> tw_sock_TCP         4756   5216    256   32    2 : tunables    0    0    0 : slabdata    163    163      0
>>> TCP                 2705   2768   1984   16    8 : tunables    0    0    0 : slabdata    173    173      0
>>> scsi_data_buffer    5440   5440     24  170    1 : tunables    0    0    0 : slabdata     32     32      0
>>> blkdev_queue         154    154   2208   14    8 : tunables    0    0    0 : slabdata     11     11      0
>>> blkdev_requests   4397688 4405884    384   42    4 : tunables    0    0    0 : slabdata 104902 104902      0
>>> blkdev_ioc         11232  11232    112   36    1 : tunables    0    0    0 : slabdata    312    312      0
>>> user_namespace         0      0   1304   25    8 : tunables    0    0    0 : slabdata      0      0      0
>>> sock_inode_cache   12282  12282    704   46    8 : tunables    0    0    0 : slabdata    267    267      0
>>> file_lock_cache    20056  20960    200   40    2 : tunables    0    0    0 : slabdata    524    524      0
>>> net_namespace          6      6   5056    6    8 : tunables    0    0    0 : slabdata      1      1      0
>>> shmem_inode_cache  16970  18952    712   46    8 : tunables    0    0    0 : slabdata    412    412      0
>>> Acpi-ParseExt      39491  40432     72   56    1 : tunables    0    0    0 : slabdata    722    722      0
>>> Acpi-State          1683   1683     80   51    1 : tunables    0    0    0 : slabdata     33     33      0
>>> Acpi-Namespace     11424  11424     40  102    1 : tunables    0    0    0 : slabdata    112    112      0
>>> task_delay_info    15336  15336    112   36    1 : tunables    0    0    0 : slabdata    426    426      0
>>> taskstats           1568   1568    328   49    4 : tunables    0    0    0 : slabdata     32     32      0
>>> proc_inode_cache  169897 190608    680   48    8 : tunables    0    0    0 : slabdata   3971   3971      0
>>> sigqueue            2208   2208    168   48    2 : tunables    0    0    0 : slabdata     46     46      0
>>> bdev_cache           792    792    896   36    8 : tunables    0    0    0 : slabdata     22     22      0
>>> sysfs_dir_cache    74698  74698    120   34    1 : tunables    0    0    0 : slabdata   2197   2197      0
>>> mnt_cache         163197 163424    256   32    2 : tunables    0    0    0 : slabdata   5107   5107      0
>>> filp               64607  97257    320   51    4 : tunables    0    0    0 : slabdata   1907   1907      0
>>> inode_cache       370744 370947    616   53    8 : tunables    0    0    0 : slabdata   6999   6999      0
>>> dentry            1316262 2139228    192   42    2 : tunables    0    0    0 : slabdata  50934  50934      0
>>> iint_cache             0      0     80   51    1 : tunables    0    0    0 : slabdata      0      0      0
>>> buffer_head       1441470 2890290    104   39    1 : tunables    0    0    0 : slabdata  74110  74110      0
>>> vm_area_struct    194998 196840    216   37    2 : tunables    0    0    0 : slabdata   5320   5320      0
>>> mm_struct           2679   2760   1600   20    8 : tunables    0    0    0 : slabdata    138    138      0
>>> files_cache         8680   8925    640   51    8 : tunables    0    0    0 : slabdata    175    175      0
>>> signal_cache        3691   3780   1152   28    8 : tunables    0    0    0 : slabdata    135    135      0
>>> sighand_cache       1950   2160   2112   15    8 : tunables    0    0    0 : slabdata    144    144      0
>>> task_xstate         8070   8658    832   39    8 : tunables    0    0    0 : slabdata    222    222      0
>>> task_struct         1913   2088   4080    8    8 : tunables    0    0    0 : slabdata    261    261      0
>>> cred_jar           31699  33936    192   42    2 : tunables    0    0    0 : slabdata    808    808      0
>>> anon_vma_chain    164026 168704     64   64    1 : tunables    0    0    0 : slabdata   2636   2636      0
>>> anon_vma           84104  84594     88   46    1 : tunables    0    0    0 : slabdata   1839   1839      0
>>> pid                11127  12576    128   32    1 : tunables    0    0    0 : slabdata    393    393      0
>>> shared_policy_node   9350   9350     48   85    1 : tunables    0    0    0 : slabdata    110    110      0
>>> numa_policy           62     62    264   31    2 : tunables    0    0    0 : slabdata      2      2      0
>>> radix_tree_node   771778 1194312    584   28    4 : tunables    0    0    0 : slabdata  42654  42654      0
>>> idr_layer_cache     2538   2565   2112   15    8 : tunables    0    0    0 : slabdata    171    171      0
>>> dma-kmalloc-8192       0      0   8192    4    8 : tunables    0    0    0 : slabdata      0      0      0
>>> dma-kmalloc-4096       0      0   4096    8    8 : tunables    0    0    0 : slabdata      0      0      0
>>> dma-kmalloc-2048       0      0   2048   16    8 : tunables    0    0    0 : slabdata      0      0      0
>>> dma-kmalloc-1024       0      0   1024   32    8 : tunables    0    0    0 : slabdata      0      0      0
>>> dma-kmalloc-512        0      0    512   32    4 : tunables    0    0    0 : slabdata      0      0      0
>>> dma-kmalloc-256        0      0    256   32    2 : tunables    0    0    0 : slabdata      0      0      0
>>> dma-kmalloc-128        0      0    128   32    1 : tunables    0    0    0 : slabdata      0      0      0
>>> dma-kmalloc-64         0      0     64   64    1 : tunables    0    0    0 : slabdata      0      0      0
>>> dma-kmalloc-32         0      0     32  128    1 : tunables    0    0    0 : slabdata      0      0      0
>>> dma-kmalloc-16         0      0     16  256    1 : tunables    0    0    0 : slabdata      0      0      0
>>> dma-kmalloc-8          0      0      8  512    1 : tunables    0    0    0 : slabdata      0      0      0
>>> dma-kmalloc-192        0      0    192   42    2 : tunables    0    0    0 : slabdata      0      0      0
>>> dma-kmalloc-96         0      0     96   42    1 : tunables    0    0    0 : slabdata      0      0      0
>>> kmalloc-8192         385    388   8192    4    8 : tunables    0    0    0 : slabdata     97     97      0
>>> kmalloc-4096        9296  10088   4096    8    8 : tunables    0    0    0 : slabdata   1261   1261      0
>>> kmalloc-2048       65061 133536   2048   16    8 : tunables    0    0    0 : slabdata   8346   8346      0
>>> kmalloc-1024       11987  21120   1024   32    8 : tunables    0    0    0 : slabdata    660    660      0
>>> kmalloc-512       107510 187072    512   32    4 : tunables    0    0    0 : slabdata   5846   5846      0
>>> kmalloc-256       160498 199104    256   32    2 : tunables    0    0    0 : slabdata   6222   6222      0
>>> kmalloc-192       144975 237426    192   42    2 : tunables    0    0    0 : slabdata   5653   5653      0
>>> kmalloc-128        36799 108096    128   32    1 : tunables    0    0    0 : slabdata   3378   3378      0
>>> kmalloc-96         99510 238896     96   42    1 : tunables    0    0    0 : slabdata   5688   5688      0
>>> kmalloc-64        7978152 8593280     64   64    1 : tunables    0    0    0 : slabdata 134270 134270      0
>>> kmalloc-32        2939882 3089664     32  128    1 : tunables    0    0    0 : slabdata  24138  24138      0
>>> kmalloc-16        172057 172288     16  256    1 : tunables    0    0    0 : slabdata    673    673      0
>>> kmalloc-8         109568 109568      8  512    1 : tunables    0    0    0 : slabdata    214    214      0
>>> kmem_cache_node      893    896     64   64    1 : tunables    0    0    0 : slabdata     14     14      0
>>> kmem_cache           612    612    320   51    4 : tunables    0    0    0 : slabdata     12     12      0
>>>
>>> -------------------------------------------------
>>>
>>>
>>> # uname -r
>>> 3.10.0-714.10.2.lve1.5.17.1.el7.x86_64
>>>
>>> --------------------------------------------------------
>>>
>>> Core part of glances
>>> http://i.imgur.com/La5JbQn.png
>>> -----------------------------------------------------------
>>>
>>> Thank you very much for looking into this
>>>
>>>
>>> On Thu, Aug 2, 2018 at 12:37 PM Igor A. Ippolitov
>>> <[email protected]> wrote:
>>>
>>> Anoop,
>>>
>>> I doubt this will be the solution, but may we have a
>>> look at /proc/buddyinfo and /proc/slabinfo the moment
>>> when nginx can't allocate memory?
>>>
>>> On 02.08.2018 08:15, Anoop Alias wrote:
>>>> Hi Maxim,
>>>>
>>>> I enabled debug and the memalign call is happening on
>>>> nginx reloads and the ENOMEM happen sometimes on the
>>>> reload(not on all reloads)
>>>>
>>>> 2018/08/02 05:59:08 [notice] 872052#872052: signal process started
>>>> 2018/08/02 05:59:23 [notice] 871570#871570: signal 1 (SIGHUP) received from 872052, reconfiguring
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: wake up, sigio 0
>>>> 2018/08/02 05:59:23 [notice] 871570#871570: reconfiguring
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 0000000002B0DA00:16384 @16      === > the memalign call on reload
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: malloc: 00000000087924D0:4560
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 000000000E442E00:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: malloc: 0000000005650850:4096
>>>> 20
>>>>
>>>>
>>>>
>>>>
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: bind() xxxx:443 #71
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: bind() xxxx:443 #72
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: bind() xxxx:443 #73
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: bind() xxxx:443 #74
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: add cleanup: 000000005340D728
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000024D3260:4096
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000517BAF10:4096
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 0000000053854FC0:4096
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 0000000053855FD0:4096
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 0000000053856FE0:4096
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 0000000053857FF0:4096
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: posix_memalign: 0000000053859000:16384 @16
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 000000005385D010:4096
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 000000005385E020:4096
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 000000005385F030:4096
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000536CD160:4096
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000536CE170:4096
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000536CF180:4096
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000536D0190:4096
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000536D11A0:4096
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000536D21B0:4096
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000536D31C0:4096
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000536D41D0:4096
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000536D51E0:4096
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000536D61F0:4096
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000536D7200:4096
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000536D8210:4096
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000536D9220:4096
>>>>
>>>>
>>>> Infact there are lot of such calls during a reload
>>>>
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA17ED00:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA1B0FF0:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA1E12C0:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA211590:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA243880:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA271B30:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA2A3E20:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA2D20D0:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA3063E0:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA334690:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA366980:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA396C50:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA3C8F40:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA3F9210:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA4294E0:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA45B7D0:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA489A80:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA4BBD70:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA4EA020:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA51E330:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA54C5E0:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA57E8D0:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA5AEBA0:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA5DEE70:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA611160:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA641430:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA671700:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA6A29E0:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA6D5CE0:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA707FD0:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA736280:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA768570:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA796820:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA7CAB30:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA7F8DE0:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA82B0D0:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA85B3A0:16384 @16
>>>>
>>>>
>>>>
>>>> What is perplexing is that the system has enough free (available RAM)
>>>> #############
>>>> # free -g
>>>>               total        used       free      shared  buff/cache  available
>>>> Mem:            125          54         24           8          46         58
>>>> Swap:             0           0          0
>>>> #############
>>>>
>>>> # ulimit -a
>>>> core file size          (blocks, -c) 0
>>>> data seg size           (kbytes, -d) unlimited
>>>> scheduling priority  (-e) 0
>>>> file size               (blocks, -f) unlimited
>>>> pending signals  (-i) 514579
>>>> max locked memory       (kbytes, -l) 64
>>>> max memory size         (kbytes, -m) unlimited
>>>> open files (-n) 1024
>>>> pipe size            (512 bytes, -p) 8
>>>> POSIX message queues     (bytes, -q) 819200
>>>> real-time priority (-r) 0
>>>> stack size              (kbytes, -s) 8192
>>>> cpu time               (seconds, -t) unlimited
>>>> max user processes (-u) 514579
>>>> virtual memory          (kbytes, -v) unlimited
>>>> file locks (-x) unlimited
>>>>
>>>> #########################################
>>>>
>>>> There is no other thing limiting memory allocation
>>>>
>>>> Any way to prevent this or probably identify/prevent this
>>>>
>>>>
>>>> On Tue, Jul 31, 2018 at 7:08 PM Maxim Dounin
>>>> <[email protected]> wrote:
>>>>
>>>> Hello!
>>>>
>>>> On Tue, Jul 31, 2018 at 09:52:29AM +0530, Anoop
>>>> Alias wrote:
>>>>
>>>> > I am repeatedly seeing errors like
>>>> >
>>>> > ######################
>>>> > 2018/07/31 03:46:33 [emerg] 2854560#2854560:
>>>> posix_memalign(16, 16384)
>>>> > failed (12: Cannot allocate memory)
>>>> > 2018/07/31 03:54:09 [emerg] 2890190#2890190:
>>>> posix_memalign(16, 16384)
>>>> > failed (12: Cannot allocate memory)
>>>> > 2018/07/31 04:08:36 [emerg] 2939230#2939230:
>>>> posix_memalign(16, 16384)
>>>> > failed (12: Cannot allocate memory)
>>>> > 2018/07/31 04:24:48 [emerg] 2992650#2992650:
>>>> posix_memalign(16, 16384)
>>>> > failed (12: Cannot allocate memory)
>>>> > 2018/07/31 04:42:09 [emerg] 3053092#3053092:
>>>> posix_memalign(16, 16384)
>>>> > failed (12: Cannot allocate memory)
>>>> > 2018/07/31 04:42:17 [emerg] 3053335#3053335:
>>>> posix_memalign(16, 16384)
>>>> > failed (12: Cannot allocate memory)
>>>> > 2018/07/31 04:42:28 [emerg] 3053937#3053937:
>>>> posix_memalign(16, 16384)
>>>> > failed (12: Cannot allocate memory)
>>>> > 2018/07/31 04:47:54 [emerg] 3070638#3070638:
>>>> posix_memalign(16, 16384)
>>>> > failed (12: Cannot allocate memory)
>>>> > ####################
>>>> >
>>>> > on a few servers
>>>> >
>>>> > The servers have enough memory free and the swap
>>>> usage is 0, yet somehow
>>>> > the kernel denies the posix_memalign with ENOMEM
>>>> ( this is what I think is
>>>> > happening!)
>>>> >
>>>> > The numbers requested are always 16, 16k . This
>>>> makes me suspicious
>>>> >
>>>> > I have no setting in nginx.conf that reference a 16k
>>>> >
>>>> > Is there any chance of finding out what requests
>>>> this and why this is not
>>>> > fulfilled
>>>>
>>>> There are at least some buffers which default to
>>>> 16k - for
>>>> example, ssl_buffer_size
>>>> (http://nginx.org/r/ssl_buffer_size).
>>>>
>>>> You may try debugging log to futher find out where
>>>> the particular
>>>> allocation happens, see here for details:
>>>>
>>>> http://nginx.org/en/docs/debugging_log.html
>>>>
>>>> But I don't really think it worth the effort.  The
>>>> error is pretty
>>>> clear, and it's better to focus on why these
>>>> allocations are
>>>> denied.  Likely you are hitting some limit.
>>>>
>>>> --
>>>> Maxim Dounin
>>>> http://mdounin.ru/
>>>> _______________________________________________
>>>> nginx mailing list
>>>> nginx@nginx.org
>>>> http://mailman.nginx.org/mailman/listinfo/nginx
>>>>
>>>>
>>>>
>>>> --
>>>> *Anoop P Alias*
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> nginx mailing list
>>>> nginx@nginx.org
>>>> http://mailman.nginx.org/mailman/listinfo/nginx
>>>
>>>
>>> _______________________________________________
>>> nginx mailing list
>>> nginx@nginx.org
>>> http://mailman.nginx.org/mailman/listinfo/nginx
>>>
>>>
>>>
>>> --
>>> *Anoop P Alias*
>>>
>>>
>>>
>>> _______________________________________________
>>> nginx mailing list
>>> nginx@nginx.org
>>> http://mailman.nginx.org/mailman/listinfo/nginx
>>
>>
>> _______________________________________________
>> nginx mailing list
>> nginx@nginx.org
>> http://mailman.nginx.org/mailman/listinfo/nginx
>>
>>
>>
>> --
>> *Anoop P Alias*
>>
>>
>>
>> _______________________________________________
>> nginx mailing list
>> nginx@nginx.org
>> http://mailman.nginx.org/mailman/listinfo/nginx
>
>
> _______________________________________________
> nginx mailing list
> nginx@nginx.org
> http://mailman.nginx.org/mailman/listinfo/nginx
>
>
>
> --
> *Anoop P Alias*
>
>
>
> _______________________________________________
> nginx mailing list
> nginx@nginx.org
> http://mailman.nginx.org/mailman/listinfo/nginx


_______________________________________________
nginx mailing list
nginx@nginx.org
http://mailman.nginx.org/mailman/listinfo/nginx
Anoop Alias
Re: posix_memalign error
August 08, 2018 05:40AM
Hi Igor,

Yes, the server runs other software, including httpd with a similar number of
vhosts:

# grep "<VirtualHost" /etc/apache2/conf/httpd.conf|wc -l
5168

I haven't found any memory-related issues with the other software in the
logs.

In fact, httpd (event MPM) uses less memory to load a similar config:

# ps_mem| head -1 && ps_mem |grep httpd
Private + Shared = RAM used Program
585.6 MiB + 392.0 MiB = 977.6 MiB httpd (63)

# ps_mem| head -1 && ps_mem |grep nginx
Private + Shared = RAM used Program
999.8 MiB + 2.5 GiB = 3.5 GiB nginx (3)

The server is a shared-hosting one and runs CloudLinux, but as far as I
know, CloudLinux applies limits only to user-level processes, not to nginx.

The nginx HUP is needed because it is triggered by changes in the Apache
configuration, after which nginx has to reload the new config. For log file
reopening, SIGUSR1 is used.
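
For reference, here is a minimal sketch of how both signals can be sent to
the master straight from a Python script, assuming the conventional pid
file at /var/run/nginx.pid (check the 'pid' directive in nginx.conf for
your install):

import os
import signal

def signal_nginx(sig, pid_file="/var/run/nginx.pid"):
    # Read the master process pid and deliver the signal directly,
    # avoiding a fork/exec of a new nginx binary.
    with open(pid_file) as f:
        os.kill(int(f.read().strip()), sig)

signal_nginx(signal.SIGHUP)     # reconfigure: reload the configuration
# signal_nginx(signal.SIGUSR1)  # reopen the log files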





On Tue, Aug 7, 2018 at 5:50 PM Igor A. Ippolitov <[email protected]>
wrote:

> Anoop,
>
> I don't see any troubles with your configuration.
> Also, if you have 120G of RAM and a single worker - the problem is not in
> nginx.
> Do you have other software running on the host?
>
> Basically, you just run out of memory.
>
> You can optimize your reload though: use "service nginx reload" (or "kill
> -SIGHUP") to reload nginx configuration.
> When you do nginx -s reload - you make nginx parse configuration (and it
> requires memory) and then send a signal to the running master. You can
> avoid this overhead with 'service' command as it uses 'kill' documented in
> the manual page.
>
> On 06.08.2018 22:55, Anoop Alias wrote:
>
> Hi Igor,
>
> Config is reloaded using
>
> /usr/sbin/nginx -s reload
>
> this is invoked from a python/shell script ( Nginx is installed on a web
> control panel )
>
> The top-level Nginx config is in the gist below
>
> https://gist.github.com/AnoopAlias/ba5ad6749a586c7e267672ee65b32b3a
>
> It further includes ~8k server blocks or more in some servers. Out of this
> 2/3 are server {} blocks with TLS config and 1/3 non-TLS ones
>
> ]# pwd
> /etc/nginx/sites-enabled
> # grep "server {" *|wc -l
> 7886
>
> And yes most of them are very similar and mostly proxy to upstream httpd
>
> I have tried removing all the loadable modules and even tried an older
> version of nginx and all produce the error
>
>
> # numastat -m
>
> Per-node system memory usage (in MBs):
> Node 0 Node 1 Total
> --------------- --------------- ---------------
> MemTotal 65430.84 65536.00 130966.84
> MemFree 5491.26 40.89 5532.15
> MemUsed 59939.58 65495.11 125434.69
> Active 22295.61 21016.09 43311.70
> Inactive 8742.76 4662.48 13405.24
> Active(anon) 16717.10 16572.19 33289.29
> Inactive(anon) 2931.94 1388.14 4320.08
> Active(file) 5578.50 4443.91 10022.41
> Inactive(file) 5810.82 3274.34 9085.16
> Unevictable 0.00 0.00 0.00
> Mlocked 0.00 0.00 0.00
> Dirty 7.04 1.64 8.67
> Writeback 0.00 0.00 0.00
> FilePages 18458.93 10413.97 28872.90
> Mapped 862.14 413.38 1275.52
> AnonPages 12579.49 15264.37 27843.86
> Shmem 7069.52 2695.71 9765.23
> KernelStack 18.34 3.03 21.38
> PageTables 153.14 107.77 260.90
> NFS_Unstable 0.00 0.00 0.00
> Bounce 0.00 0.00 0.00
> WritebackTmp 0.00 0.00 0.00
> Slab 4830.68 2254.55 7085.22
> SReclaimable 2061.05 921.72 2982.77
> SUnreclaim 2769.62 1332.83 4102.45
> AnonHugePages 4.00 2.00 6.00
> HugePages_Total 0.00 0.00 0.00
> HugePages_Free 0.00 0.00 0.00
> HugePages_Surp 0.00 0.00 0.00
>
>
> Thanks,
>
>
>
>
>
> On Mon, Aug 6, 2018 at 6:33 PM Igor A. Ippolitov <[email protected]>
> wrote:
>
>> Anoop,
>>
>> I suppose, most of your 10k servers are very similar, right?
>> Please, post top level configuration and a typical server{}, please.
>>
>> Also, how do you reload configuration? With 'service nginx reload' or may
>> be other commands?
>>
>> It looks like you have a lot of fragmented memory and only 4gb free in
>> the second numa node.
>> So, I'd say this is OK that you are getting errors from allocating a 16k
>> stripes.
>>
>> Could you please post numastat -m output additionally. Just to make sure
>> you have half of the memory for the second CPU.
>> And we'll have a look if memory utilization may be optimized based on
>> your configuration.
>>
>> Regards,
>> Igor.
>>
>> On 04.08.2018 07:54, Anoop Alias wrote:
>>
>> Hi Igor,
>>
>> Setting vm.max_map_count to 20x the normal value did not help
>>
>> The issue happens on a group of servers and among the group, it shows up
>> only in servers which have ~10k server{} blocks
>>
>> On servers that have lower number of server{} blocks , the ENOMEM issue
>> is not there
>>
>> Also, I can find that the RAM usage of the Nginx process is directly
>> proportional to the number of server {} blocks
>>
>> For example on a server having the problem
>>
>> # ps_mem| head -1 && ps_mem |grep nginx
>> Private + Shared = RAM used Program
>> 1.0 GiB + 2.8 GiB = 3.8 GiB nginx (3)
>>
>>
>> That is for a single worker process with 4 threads in thread_pool
>> # pstree|grep nginx
>> |-nginx-+-nginx---4*[{nginx}]
>> | `-nginx
>>
>> Whatever config change I try the memory usage seem to mostly depend on
>> the number of server contexts defined
>>
>> Now the issue mostly happen in nginx reload ,when one more worker process
>> will be active in shutting down mode
>>
>> I believe the memalign error is thrown by the worker being shutdown, this
>> is because the sites work after the error and also the pid mentioned in the
>> error would have gone when I check ps
>>
>>
>> # pmap 948965|grep 16K
>> 00007f2923ff2000 16K r-x-- ngx_http_redis2_module.so
>> 00007f2924fd7000 16K r---- libc-2.17.so
>> 00007f2925431000 16K rw--- [ anon ]
>> 00007f292584a000 16K rw--- [ anon ]
>>
>> Aug 4 05:50:00 b kernel: SysRq : Show Memory
>> Aug 4 05:50:00 b kernel: Mem-Info:
>> Aug 4 05:50:00 b kernel: active_anon:7757394 inactive_anon:1021319 isolated_anon:0#012 active_file:3733324 inactive_file:2136476 isolated_file:0#012 unevictable:0 dirty:1766 writeback:6 wbtmp:0 unstable:0#012 slab_reclaimable:2003687 slab_unreclaimable:901391#012 mapped:316734 shmem:2381810 pagetables:63163 bounce:0#012 free:4851283 free_pcp:11332 free_cma:0
>> Aug 4 05:50:00 b kernel: Node 0 DMA free:15888kB min:8kB low:8kB high:12kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15972kB managed:15888kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
>> Aug 4 05:50:00 b kernel: lowmem_reserve[]: 0 1679 64139 64139
>>
>> # cat /proc/buddyinfo
>> Node 0, zone      DMA      0      0      1      0      2      1      1      0      1      1      3
>> Node 0, zone    DMA32   5284   6753   6677   1083    410     59      1      0      0      0      0
>> Node 0, zone   Normal 500327 638958 406737  14690    872    106     11      0      0      0      0
>> Node 1, zone   Normal 584840 291640    188      0      0      0      0      0      0      0      0
>>
>>
>> The only correlation I see in having the error is the number of server {}
>> blocks (close to 10k) which then makes the nginx process consume ~ 4GB of
>> mem with a single worker process and then a reload is done
>>
>>
>>
>>
>> On Thu, Aug 2, 2018 at 6:02 PM Igor A. Ippolitov <[email protected]>
>> wrote:
>>
>>> Anoop,
>>>
>>> There are two guesses: either mmap allocations limit is hit or memory
>>> is way too fragmented.
>>> Could you please track amount of mapped regions for a worker with pmap
>>> and amount of 16k areas in Normal zones (it is the third number)?
>>>
>>> You can also set vm.max_map_count to a higher number (like 20 times
>>> higher than default) and look if the error is gone.
>>>
>>> Please, let me know if increasing vm.max_map_count helps you.
>>>
>>> On 02.08.2018 13:06, Anoop Alias wrote:
>>>
>>> Hi Igor,
>>>
>>> The error happens randomly
>>>
>>> 2018/08/02 06:52:42 [emerg] 874514#874514: posix_memalign(16, 16384)
>>> failed (12: Cannot allocate memory)
>>> 2018/08/02 09:42:53 [emerg] 872996#872996: posix_memalign(16, 16384)
>>> failed (12: Cannot allocate memory)
>>> 2018/08/02 10:16:14 [emerg] 877611#877611: posix_memalign(16, 16384)
>>> failed (12: Cannot allocate memory)
>>> 2018/08/02 10:16:48 [emerg] 879410#879410: posix_memalign(16, 16384)
>>> failed (12: Cannot allocate memory)
>>> 2018/08/02 10:17:55 [emerg] 876563#876563: posix_memalign(16, 16384)
>>> failed (12: Cannot allocate memory)
>>> 2018/08/02 10:20:21 [emerg] 879263#879263: posix_memalign(16, 16384)
>>> failed (12: Cannot allocate memory)
>>> 2018/08/02 10:20:51 [emerg] 878991#878991: posix_memalign(16, 16384)
>>> failed (12: Cannot allocate memory)
>>>
>>> # date
>>> Thu Aug 2 10:58:48 BST 2018
>>>
>>> ------------------------------------------
>>> # cat /proc/buddyinfo
>>> Node 0, zone      DMA      0      0      1      0      2      1      1      0      1      1      3
>>> Node 0, zone    DMA32  11722  11057   4663   1647    609     72     10      7      1      0      0
>>> Node 0, zone   Normal 755026 710760 398136  21462   1114     18      1      0      0      0      0
>>> Node 1, zone   Normal 341295 801810 179604    256      0      0      0      0      0      0      0
>>> -----------------------------------------
>>>
>>>
>>> slabinfo - version: 2.1
>>> # name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
>>> SCTPv6 21 21 1536 21 8 : tunables 0 0 0 : slabdata 1 1 0
>>> SCTP 0 0 1408 23 8 : tunables 0 0 0 : slabdata 0 0 0
>>> kcopyd_job 0 0 3312 9 8 : tunables 0 0 0 : slabdata 0 0 0
>>> dm_uevent 0 0 2608 12 8 : tunables 0 0 0 : slabdata 0 0 0
>>> nf_conntrack_ffffffff81acbb00 14054 14892 320 51 4 : tunables 0 0 0 : slabdata 292 292 0
>>> lvp_cache 36 36 224 36 2 : tunables 0 0 0 : slabdata 1 1 0
>>> lve_struct 4140 4140 352 46 4 : tunables 0 0 0 : slabdata 90 90 0
>>> fat_inode_cache 0 0 744 44 8 : tunables 0 0 0 : slabdata 0 0 0
>>> fat_cache 0 0 40 102 1 : tunables 0 0 0 : slabdata 0 0 0
>>> isofs_inode_cache 0 0 664 49 8 : tunables 0 0 0 : slabdata 0 0 0
>>> ext4_inode_cache 30 30 1088 30 8 : tunables 0 0 0 : slabdata 1 1 0
>>> ext4_xattr 0 0 88 46 1 : tunables 0 0 0 : slabdata 0 0 0
>>> ext4_free_data 0 0 64 64 1 : tunables 0 0 0 : slabdata 0 0 0
>>> ext4_allocation_context 32 32 128 32 1 : tunables 0 0 0 : slabdata 1 1 0
>>> ext4_io_end 0 0 72 56 1 : tunables 0 0 0 : slabdata 0 0 0
>>> ext4_extent_status 102 102 40 102 1 : tunables 0 0 0 : slabdata 1 1 0
>>> jbd2_journal_handle 0 0 48 85 1 : tunables 0 0 0 : slabdata 0 0 0
>>> jbd2_journal_head 0 0 112 36 1 : tunables 0 0 0 : slabdata 0 0 0
>>> jbd2_revoke_table_s 256 256 16 256 1 : tunables 0 0 0 : slabdata 1 1 0
>>> jbd2_revoke_record_s 0 0 32 128 1 : tunables 0 0 0 : slabdata 0 0 0
>>> kvm_async_pf 0 0 136 30 1 : tunables 0 0 0 : slabdata 0 0 0
>>> kvm_vcpu 0 0 18560 1 8 : tunables 0 0 0 : slabdata 0 0 0
>>> xfs_dqtrx 992 992 528 31 4 : tunables 0 0 0 : slabdata 32 32 0
>>> xfs_dquot 3264 3264 472 34 4 : tunables 0 0 0 : slabdata 96 96 0
>>> xfs_ili 4342175 4774399 152 53 2 : tunables 0 0 0 : slabdata 90083 90083 0
>>> xfs_inode 4915588 5486076 1088 30 8 : tunables 0 0 0 : slabdata 182871 182871 0
>>> xfs_efd_item 2680 2760 400 40 4 : tunables 0 0 0 : slabdata 69 69 0
>>> xfs_da_state 1088 1088 480 34 4 : tunables 0 0 0 : slabdata 32 32 0
>>> xfs_btree_cur 1248 1248 208 39 2 : tunables 0 0 0 : slabdata 32 32 0
>>> xfs_log_ticket 14874 15048 184 44 2 : tunables 0 0 0 : slabdata 342 342 0
>>> xfs_ioend 12909 13104 104 39 1 : tunables 0 0 0 : slabdata 336 336 0
>>> scsi_cmd_cache 5400 5652 448 36 4 : tunables 0 0 0 : slabdata 157 157 0
>>> ve_struct 0 0 848 38 8 : tunables 0 0 0 : slabdata 0 0 0
>>> ip6_dst_cache 1152 1152 448 36 4 : tunables 0 0 0 : slabdata 32 32 0
>>> RAWv6 910 910 1216 26 8 : tunables 0 0 0 : slabdata 35 35 0
>>> UDPLITEv6 0 0 1216 26 8 : tunables 0 0 0 : slabdata 0 0 0
>>> UDPv6 832 832 1216 26 8 : tunables 0 0 0 : slabdata 32 32 0
>>> tw_sock_TCPv6 1152 1376 256 32 2 : tunables 0 0 0 : slabdata 43 43 0
>>> TCPv6 510 510 2176 15 8 : tunables 0 0 0 : slabdata 34 34 0
>>> cfq_queue 3698 5145 232 35 2 : tunables 0 0 0 : slabdata 147 147 0
>>> bsg_cmd 0 0 312 52 4 : tunables 0 0 0 : slabdata 0 0 0
>>> mqueue_inode_cache 136 136 960 34 8 : tunables 0 0 0 : slabdata 4 4 0
>>> hugetlbfs_inode_cache 1632 1632 632 51 8 : tunables 0 0 0 : slabdata 32 32 0
>>> configfs_dir_cache 1472 1472 88 46 1 : tunables 0 0 0 : slabdata 32 32 0
>>> dquot 0 0 256 32 2 : tunables 0 0 0 : slabdata 0 0 0
>>> userfaultfd_ctx_cache 32 32 128 32 1 : tunables 0 0 0 : slabdata 1 1 0
>>> fanotify_event_info 2336 2336 56 73 1 : tunables 0 0 0 : slabdata 32 32 0
>>> dio 6171 6222 640 51 8 : tunables 0 0 0 : slabdata 122 122 0
>>> pid_namespace 42 42 2192 14 8 : tunables 0 0 0 : slabdata 3 3 0
>>> posix_timers_cache 1056 1056 248 33 2 : tunables 0 0 0 : slabdata 32 32 0
>>> UDP-Lite 0 0 1088 30 8 : tunables 0 0 0 : slabdata 0 0 0
>>> flow_cache 2268 2296 144 28 1 : tunables 0 0 0 : slabdata 82 82 0
>>> xfrm_dst_cache 896 896 576 28 4 : tunables 0 0 0 : slabdata 32 32 0
>>> ip_fib_alias 2720 2720 48 85 1 : tunables 0 0 0 : slabdata 32 32 0
>>> RAW 3977 4224 1024 32 8 : tunables 0 0 0 : slabdata 132 132 0
>>> UDP 4110 4110 1088 30 8 : tunables 0 0 0 : slabdata 137 137 0
>>> tw_sock_TCP 4756 5216 256 32 2 : tunables 0 0 0 : slabdata 163 163 0
>>> TCP 2705 2768 1984 16 8 : tunables 0 0 0 : slabdata 173 173 0
>>> scsi_data_buffer 5440 5440 24 170 1 : tunables 0 0 0 : slabdata 32 32 0
>>> blkdev_queue 154 154 2208 14 8 : tunables 0 0 0 : slabdata 11 11 0
>>> blkdev_requests 4397688 4405884 384 42 4 : tunables 0 0 0 : slabdata 104902 104902 0
>>> blkdev_ioc 11232 11232 112 36 1 : tunables 0 0 0 : slabdata 312 312 0
>>> user_namespace 0 0 1304 25 8 : tunables 0 0 0 : slabdata 0 0 0
>>> sock_inode_cache 12282 12282 704 46 8 : tunables 0 0 0 : slabdata 267 267 0
>>> file_lock_cache 20056 20960 200 40 2 : tunables 0 0 0 : slabdata 524 524 0
>>> net_namespace 6 6 5056 6 8 : tunables 0 0 0 : slabdata 1 1 0
>>> shmem_inode_cache 16970 18952 712 46 8 : tunables 0 0 0 : slabdata 412 412 0
>>> Acpi-ParseExt 39491 40432 72 56 1 : tunables 0 0 0 : slabdata 722 722 0
>>> Acpi-State 1683 1683 80 51 1 : tunables 0 0 0 : slabdata 33 33 0
>>> Acpi-Namespace 11424 11424 40 102 1 : tunables 0 0 0 : slabdata 112 112 0
>>> task_delay_info 15336 15336 112 36 1 : tunables 0 0 0 : slabdata 426 426 0
>>> taskstats 1568 1568 328 49 4 : tunables 0 0 0 : slabdata 32 32 0
>>> proc_inode_cache 169897 190608 680 48 8 : tunables 0 0 0 : slabdata 3971 3971 0
>>> sigqueue 2208 2208 168 48 2 : tunables 0 0 0 : slabdata 46 46 0
>>> bdev_cache 792 792 896 36 8 : tunables 0 0 0 : slabdata 22 22 0
>>> sysfs_dir_cache 74698 74698 120 34 1 : tunables 0 0 0 : slabdata 2197 2197 0
>>> mnt_cache 163197 163424 256 32 2 : tunables 0 0 0 : slabdata 5107 5107 0
>>> filp 64607 97257 320 51 4 : tunables 0 0 0 : slabdata 1907 1907 0
>>> inode_cache 370744 370947 616 53 8 : tunables 0 0 0 : slabdata 6999 6999 0
>>> dentry 1316262 2139228 192 42 2 : tunables 0 0 0 : slabdata 50934 50934 0
>>> iint_cache 0 0 80 51 1 : tunables 0 0 0 : slabdata 0 0 0
>>> buffer_head 1441470 2890290 104 39 1 : tunables 0 0 0 : slabdata 74110 74110 0
>>> vm_area_struct 194998 196840 216 37 2 : tunables 0 0 0 : slabdata 5320 5320 0
>>> mm_struct 2679 2760 1600 20 8 : tunables 0 0 0 : slabdata 138 138 0
>>> files_cache 8680 8925 640 51 8 : tunables 0 0 0 : slabdata 175 175 0
>>> signal_cache 3691 3780 1152 28 8 : tunables 0 0 0 : slabdata 135 135 0
>>> sighand_cache 1950 2160 2112 15 8 : tunables 0 0 0 : slabdata 144 144 0
>>> task_xstate 8070 8658 832 39 8 : tunables 0 0 0 : slabdata 222 222 0
>>> task_struct 1913 2088 4080 8 8 : tunables 0 0 0 : slabdata 261 261 0
>>> cred_jar 31699 33936 192 42 2 : tunables 0 0 0 : slabdata 808 808 0
>>> anon_vma_chain 164026 168704 64 64 1 : tunables 0 0 0 : slabdata 2636 2636 0
>>> anon_vma 84104 84594 88 46 1 : tunables 0 0 0 : slabdata 1839 1839 0
>>> pid 11127 12576 128 32 1 : tunables 0 0 0 : slabdata 393 393 0
>>> shared_policy_node 9350 9350 48 85 1 : tunables 0 0 0 : slabdata 110 110 0
>>> numa_policy 62 62 264 31 2 : tunables 0 0 0 : slabdata 2 2 0
>>> radix_tree_node 771778 1194312 584 28 4 : tunables 0 0 0 : slabdata 42654 42654 0
>>> idr_layer_cache 2538 2565 2112 15 8 : tunables 0 0 0 : slabdata 171 171 0
>>> dma-kmalloc-8192 0 0 8192 4 8 : tunables 0 0 0 : slabdata 0 0 0
>>> dma-kmalloc-4096 0 0 4096 8 8 : tunables 0 0 0 : slabdata 0 0 0
>>> dma-kmalloc-2048 0 0 2048 16 8 : tunables 0 0 0 : slabdata 0 0 0
>>> dma-kmalloc-1024 0 0 1024 32 8 : tunables 0 0 0 : slabdata 0 0 0
>>> dma-kmalloc-512 0 0 512 32 4 : tunables 0 0 0 : slabdata 0 0 0
>>> dma-kmalloc-256 0 0 256 32 2 : tunables 0 0 0 : slabdata 0 0 0
>>> dma-kmalloc-128 0 0 128 32 1 : tunables 0 0 0 : slabdata 0 0 0
>>> dma-kmalloc-64 0 0 64 64 1 : tunables 0 0 0 : slabdata 0 0 0
>>> dma-kmalloc-32 0 0 32 128 1 : tunables 0 0 0 : slabdata 0 0 0
>>> dma-kmalloc-16 0 0 16 256 1 : tunables 0 0 0 : slabdata 0 0 0
>>> dma-kmalloc-8 0 0 8 512 1 : tunables 0 0 0 : slabdata 0 0 0
>>> dma-kmalloc-192 0 0 192 42 2 : tunables 0 0 0 : slabdata 0 0 0
>>> dma-kmalloc-96 0 0 96 42 1 : tunables 0 0 0 : slabdata 0 0 0
>>> kmalloc-8192 385 388 8192 4 8 : tunables 0 0 0 : slabdata 97 97 0
>>> kmalloc-4096 9296 10088 4096 8 8 : tunables 0 0 0 : slabdata 1261 1261 0
>>> kmalloc-2048 65061 133536 2048 16 8 : tunables 0 0 0 : slabdata 8346 8346 0
>>> kmalloc-1024 11987 21120 1024 32 8 : tunables 0 0 0 : slabdata 660 660 0
>>> kmalloc-512 107510 187072 512 32 4 : tunables 0 0 0 : slabdata 5846 5846 0
>>> kmalloc-256 160498 199104 256 32 2 : tunables 0 0 0 : slabdata 6222 6222 0
>>> kmalloc-192 144975 237426 192 42 2 : tunables 0 0 0 : slabdata 5653 5653 0
>>> kmalloc-128 36799 108096 128 32 1 : tunables 0 0 0 : slabdata 3378 3378 0
>>> kmalloc-96 99510 238896 96 42 1 : tunables 0 0 0 : slabdata 5688 5688 0
>>> kmalloc-64 7978152 8593280 64 64 1 : tunables 0 0 0 : slabdata 134270 134270 0
>>> kmalloc-32 2939882 3089664 32 128 1 : tunables 0 0 0 : slabdata 24138 24138 0
>>> kmalloc-16 172057 172288 16 256 1 : tunables 0 0 0 : slabdata 673 673 0
>>> kmalloc-8 109568 109568 8 512 1 : tunables 0 0 0 : slabdata 214 214 0
>>> kmem_cache_node 893 896 64 64 1 : tunables 0 0 0 : slabdata 14 14 0
>>> kmem_cache 612 612 320 51 4 : tunables 0 0 0 : slabdata 12 12 0
>>>
>>> -------------------------------------------------
>>>
>>>
>>> # uname -r
>>> 3.10.0-714.10.2.lve1.5.17.1.el7.x86_64
>>>
>>> --------------------------------------------------------
>>>
>>> Core part of glances
>>> http://i.imgur.com/La5JbQn.png
>>> -----------------------------------------------------------
>>>
>>> Thank you very much for looking into this
>>>
>>>
>>> On Thu, Aug 2, 2018 at 12:37 PM Igor A. Ippolitov <[email protected]>
>>> wrote:
>>>
>>>> Anoop,
>>>>
>>>> I doubt this will be the solution, but may we have a look at
>>>> /proc/buddyinfo and /proc/slabinfo the moment when nginx can't allocate
>>>> memory?
>>>>
>>>> On 02.08.2018 08:15, Anoop Alias wrote:
>>>>
>>>> Hi Maxim,
>>>>
>>>> I enabled debug and the memalign call is happening on nginx reloads and
>>>> the ENOMEM happen sometimes on the reload(not on all reloads)
>>>>
>>>> 2018/08/02 05:59:08 [notice] 872052#872052: signal process started
>>>> 2018/08/02 05:59:23 [notice] 871570#871570: signal 1 (SIGHUP) received from 872052, reconfiguring
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: wake up, sigio 0
>>>> 2018/08/02 05:59:23 [notice] 871570#871570: reconfiguring
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 0000000002B0DA00:16384 @16 === > the memalign call on reload
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: malloc: 00000000087924D0:4560
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 000000000E442E00:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: malloc: 0000000005650850:4096
>>>> 20
>>>>
>>>>
>>>>
>>>>
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: bind() xxxx:443 #71
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: bind() xxxx:443 #72
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: bind() xxxx:443 #73
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: bind() xxxx:443 #74
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: add cleanup: 000000005340D728
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000024D3260:4096
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000517BAF10:4096
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 0000000053854FC0:4096
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 0000000053855FD0:4096
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 0000000053856FE0:4096
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 0000000053857FF0:4096
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: posix_memalign: 0000000053859000:16384 @16
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 000000005385D010:4096
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 000000005385E020:4096
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 000000005385F030:4096
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000536CD160:4096
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000536CE170:4096
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000536CF180:4096
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000536D0190:4096
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000536D11A0:4096
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000536D21B0:4096
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000536D31C0:4096
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000536D41D0:4096
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000536D51E0:4096
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000536D61F0:4096
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000536D7200:4096
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000536D8210:4096
>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc: 00000000536D9220:4096
>>>>
>>>>
>>>> Infact there are lot of such calls during a reload
>>>>
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA17ED00:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA1B0FF0:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA1E12C0:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA211590:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA243880:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA271B30:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA2A3E20:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA2D20D0:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA3063E0:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA334690:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA366980:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA396C50:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA3C8F40:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA3F9210:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA4294E0:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA45B7D0:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA489A80:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA4BBD70:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA4EA020:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA51E330:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA54C5E0:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA57E8D0:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA5AEBA0:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA5DEE70:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA611160:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA641430:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA671700:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA6A29E0:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA6D5CE0:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA707FD0:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA736280:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA768570:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA796820:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA7CAB30:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA7F8DE0:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA82B0D0:16384 @16
>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign: 00000000BA85B3A0:16384 @16
>>>>
>>>>
>>>>
>>>> What is perplexing is that the system has enough free (available RAM)
>>>> #############
>>>> # free -g
>>>>               total        used        free      shared  buff/cache   available
>>>> Mem:            125          54          24           8          46          58
>>>> Swap:             0           0           0
>>>> #############
>>>>
>>>> # ulimit -a
>>>> core file size (blocks, -c) 0
>>>> data seg size (kbytes, -d) unlimited
>>>> scheduling priority (-e) 0
>>>> file size (blocks, -f) unlimited
>>>> pending signals (-i) 514579
>>>> max locked memory (kbytes, -l) 64
>>>> max memory size (kbytes, -m) unlimited
>>>> open files (-n) 1024
>>>> pipe size (512 bytes, -p) 8
>>>> POSIX message queues (bytes, -q) 819200
>>>> real-time priority (-r) 0
>>>> stack size (kbytes, -s) 8192
>>>> cpu time (seconds, -t) unlimited
>>>> max user processes (-u) 514579
>>>> virtual memory (kbytes, -v) unlimited
>>>> file locks (-x) unlimited
>>>>
>>>> #########################################
>>>>
>>>> There is no other thing limiting memory allocation
>>>>
>>>> Any way to prevent this or probably identify/prevent this
>>>>
>>>>
>>>> On Tue, Jul 31, 2018 at 7:08 PM Maxim Dounin <[email protected]>
>>>> wrote:
>>>>
>>>>> Hello!
>>>>>
>>>>> On Tue, Jul 31, 2018 at 09:52:29AM +0530, Anoop Alias wrote:
>>>>>
>>>>> > I am repeatedly seeing errors like
>>>>> >
>>>>> > ######################
>>>>> > 2018/07/31 03:46:33 [emerg] 2854560#2854560: posix_memalign(16,
>>>>> 16384)
>>>>> > failed (12: Cannot allocate memory)
>>>>> > 2018/07/31 03:54:09 [emerg] 2890190#2890190: posix_memalign(16,
>>>>> 16384)
>>>>> > failed (12: Cannot allocate memory)
>>>>> > 2018/07/31 04:08:36 [emerg] 2939230#2939230: posix_memalign(16,
>>>>> 16384)
>>>>> > failed (12: Cannot allocate memory)
>>>>> > 2018/07/31 04:24:48 [emerg] 2992650#2992650: posix_memalign(16,
>>>>> 16384)
>>>>> > failed (12: Cannot allocate memory)
>>>>> > 2018/07/31 04:42:09 [emerg] 3053092#3053092: posix_memalign(16,
>>>>> 16384)
>>>>> > failed (12: Cannot allocate memory)
>>>>> > 2018/07/31 04:42:17 [emerg] 3053335#3053335: posix_memalign(16,
>>>>> 16384)
>>>>> > failed (12: Cannot allocate memory)
>>>>> > 2018/07/31 04:42:28 [emerg] 3053937#3053937: posix_memalign(16,
>>>>> 16384)
>>>>> > failed (12: Cannot allocate memory)
>>>>> > 2018/07/31 04:47:54 [emerg] 3070638#3070638: posix_memalign(16,
>>>>> 16384)
>>>>> > failed (12: Cannot allocate memory)
>>>>> > ####################
>>>>> >
>>>>> > on a few servers
>>>>> >
>>>>> > The servers have enough memory free and the swap usage is 0, yet
>>>>> somehow
>>>>> > the kernel denies the posix_memalign with ENOMEM ( this is what I
>>>>> think is
>>>>> > happening!)
>>>>> >
>>>>> > The numbers requested are always 16, 16k . This makes me suspicious
>>>>> >
>>>>> > I have no setting in nginx.conf that reference a 16k
>>>>> >
>>>>> > Is there any chance of finding out what requests this and why this
>>>>> is not
>>>>> > fulfilled
>>>>>
>>>>> There are at least some buffers which default to 16k - for
>>>>> example, ssl_buffer_size (http://nginx.org/r/ssl_buffer_size).
>>>>>
>>>>> You may try debugging log to futher find out where the particular
>>>>> allocation happens, see here for details:
>>>>>
>>>>> http://nginx.org/en/docs/debugging_log.html
>>>>>
>>>>> But I don't really think it worth the effort. The error is pretty
>>>>> clear, and it's better to focus on why these allocations are
>>>>> denied. Likely you are hitting some limit.
>>>>>
>>>>> --
>>>>> Maxim Dounin
>>>>> http://mdounin.ru/
>>>>> _______________________________________________
>>>>> nginx mailing list
>>>>> nginx@nginx.org
>>>>> http://mailman.nginx.org/mailman/listinfo/nginx
>>>>>
>>>>
>>>>
>>>> --
>>>> *Anoop P Alias*
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> nginx mailing list
>>>> nginx@nginx.org
>>>> http://mailman.nginx.org/mailman/listinfo/nginx
>>>>
>>>>
>>>> _______________________________________________
>>>> nginx mailing list
>>>> nginx@nginx.org
>>>> http://mailman.nginx.org/mailman/listinfo/nginx
>>>
>>>
>>>
>>> --
>>> *Anoop P Alias*
>>>
>>>
>>>
>>> _______________________________________________
>>> nginx mailing list
>>> nginx@nginx.org
>>> http://mailman.nginx.org/mailman/listinfo/nginx
>>>
>>>
>>> _______________________________________________
>>> nginx mailing list
>>> nginx@nginx.org
>>> http://mailman.nginx.org/mailman/listinfo/nginx
>>
>>
>>
>> --
>> *Anoop P Alias*
>>
>>
>>
>> _______________________________________________
>> nginx mailing list
>> nginx@nginx.org
>> http://mailman.nginx.org/mailman/listinfo/nginx
>>
>>
>> _______________________________________________
>> nginx mailing list
>> nginx@nginx.org
>> http://mailman.nginx.org/mailman/listinfo/nginx
>
>
>
> --
> *Anoop P Alias*
>
>
>
> _______________________________________________
> nginx mailing list
> nginx@nginx.org
> http://mailman.nginx.org/mailman/listinfo/nginx
>
>
> _______________________________________________
> nginx mailing list
> nginx@nginx.org
> http://mailman.nginx.org/mailman/listinfo/nginx



--
*Anoop P Alias*
_______________________________________________
nginx mailing list
nginx@nginx.org
http://mailman.nginx.org/mailman/listinfo/nginx
Anoop Alias
Re: posix_memalign error
August 10, 2018 08:50AM
I may have found the root cause of this issue. Many thanks to Igor for the
valuable inputs

The issue had something to do with the fact that I was calling nginx -s
reload from the Python subprocess module,

and I believe the error was coming from the fork in Python: forking a
large parent process makes the kernel account for a copy of its address
space, so the fork can fail with ENOMEM even when plenty of memory is free


https://stackoverflow.com/questions/1367373/python-subprocess-popen-oserror-errno-12-cannot-allocate-memory
https://stackoverflow.com/questions/5306075/python-memory-allocation-error-using-subprocess-popen

As Igor suggested, I changed the subprocess call in Python to a signal
(SIGHUP to the master binary), and the logs don't have the ENOMEM error
anymore, at least in the past 12+ hours
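
For anyone hitting the same thing, here is a minimal sketch of the change,
assuming the conventional pid file at /var/run/nginx.pid (check the 'pid'
directive in nginx.conf) and the usual binary path:

import os
import signal

NGINX_PID_FILE = "/var/run/nginx.pid"  # adjust to your install

# Before (problematic): spawning a new nginx binary forks the large
# control-panel process first, and that fork is what hit ENOMEM.
# subprocess.check_call(["/usr/sbin/nginx", "-s", "reload"])

# After: no fork at all -- signal the running master directly.
with open(NGINX_PID_FILE) as f:
    master_pid = int(f.read().strip())
os.kill(master_pid, signal.SIGHUP)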

The memory usage for ~10k virtual hosts is still a bit high, but things
do work as expected and there are no more errors

Thank you all again

On Wed, Aug 8, 2018 at 9:03 AM Anoop Alias <[email protected]> wrote:

> Hi Igor,
>
> Yes, the server runs other software, including httpd with a similar number
> of vhosts:
>
> # grep "<VirtualHost" /etc/apache2/conf/httpd.conf|wc -l
> 5168
>
> I haven't found any memory-related issues with the other software in the
> logs.
>
> In fact, httpd (event MPM) uses less memory to load a similar config:
>
> # ps_mem| head -1 && ps_mem |grep httpd
> Private + Shared = RAM used Program
> 585.6 MiB + 392.0 MiB = 977.6 MiB httpd (63)
>
> # ps_mem| head -1 && ps_mem |grep nginx
> Private + Shared = RAM used Program
> 999.8 MiB + 2.5 GiB = 3.5 GiB nginx (3)
>
> The server is a shared-hosting one and runs CloudLinux, but as far as I
> know, CloudLinux applies limits only to user-level processes, not to nginx.
>
> The nginx HUP is needed because it is triggered by changes in the Apache
> configuration, after which nginx has to reload the new config. For log file
> reopening, SIGUSR1 is used.
>
>
>
>
>
> On Tue, Aug 7, 2018 at 5:50 PM Igor A. Ippolitov <[email protected]>
> wrote:
>
>> Anoop,
>>
>> I don't see any troubles with your configuration.
>> Also, if you have 120G of RAM and a single worker - the problem is not in
>> nginx.
>> Do you have other software running on the host?
>>
>> Basically, you just run out of memory.
>>
>> You can optimize your reload though: use "service nginx reload" (or "kill
>> -SIGHUP") to reload nginx configuration.
>> When you do nginx -s reload - you make nginx parse configuration (and it
>> requires memory) and then send a signal to the running master. You can
>> avoid this overhead with 'service' command as it uses 'kill' documented in
>> the manual page.
>>
>> On 06.08.2018 22:55, Anoop Alias wrote:
>>
>> Hi Igor,
>>
>> Config is reloaded using
>>
>> /usr/sbin/nginx -s reload
>>
>> this is invoked from a python/shell script ( Nginx is installed on a web
>> control panel )
>>
>> The top-level Nginx config is in the gist below
>>
>> https://gist.github.com/AnoopAlias/ba5ad6749a586c7e267672ee65b32b3a
>>
>> It further includes ~8k server blocks or more in some servers. Out of
>> this 2/3 are server {} blocks with TLS config and 1/3 non-TLS ones
>>
>> # pwd
>> /etc/nginx/sites-enabled
>> # grep "server {" *|wc -l
>> 7886
>>
>> And yes, most of them are very similar and mostly proxy to an upstream httpd
>>
>> I have tried removing all the loadable modules and even tried an older
>> version of nginx, and all of them produce the error
>>
>>
>> # numastat -m
>>
>> Per-node system memory usage (in MBs):
>> Node 0 Node 1 Total
>> --------------- --------------- ---------------
>> MemTotal 65430.84 65536.00 130966.84
>> MemFree 5491.26 40.89 5532.15
>> MemUsed 59939.58 65495.11 125434.69
>> Active 22295.61 21016.09 43311.70
>> Inactive 8742.76 4662.48 13405.24
>> Active(anon) 16717.10 16572.19 33289.29
>> Inactive(anon) 2931.94 1388.14 4320.08
>> Active(file) 5578.50 4443.91 10022.41
>> Inactive(file) 5810.82 3274.34 9085.16
>> Unevictable 0.00 0.00 0.00
>> Mlocked 0.00 0.00 0.00
>> Dirty 7.04 1.64 8.67
>> Writeback 0.00 0.00 0.00
>> FilePages 18458.93 10413.97 28872.90
>> Mapped 862.14 413.38 1275.52
>> AnonPages 12579.49 15264.37 27843.86
>> Shmem 7069.52 2695.71 9765.23
>> KernelStack 18.34 3.03 21.38
>> PageTables 153.14 107.77 260.90
>> NFS_Unstable 0.00 0.00 0.00
>> Bounce 0.00 0.00 0.00
>> WritebackTmp 0.00 0.00 0.00
>> Slab 4830.68 2254.55 7085.22
>> SReclaimable 2061.05 921.72 2982.77
>> SUnreclaim 2769.62 1332.83 4102.45
>> AnonHugePages 4.00 2.00 6.00
>> HugePages_Total 0.00 0.00 0.00
>> HugePages_Free 0.00 0.00 0.00
>> HugePages_Surp 0.00 0.00 0.00
>>
>>
>> Thanks,
>>
>>
>>
>>
>>
>> On Mon, Aug 6, 2018 at 6:33 PM Igor A. Ippolitov <[email protected]>
>> wrote:
>>
>>> Anoop,
>>>
>>> I suppose most of your 10k servers are very similar, right?
>>> Please post the top-level configuration and a typical server{} block.
>>>
>>> Also, how do you reload the configuration? With 'service nginx reload',
>>> or maybe some other command?
>>>
>>> It looks like you have a lot of fragmented memory and only 4GB free in
>>> the second NUMA node.
>>> So I'd say it is not surprising that you are getting errors when
>>> allocating 16k stripes.
>>>
>>> Could you please also post numastat -m output, just to make sure half
>>> of the memory belongs to the second CPU.
>>> Then we'll have a look at whether memory utilization can be optimized
>>> based on your configuration.
>>>
>>> Regards,
>>> Igor.
>>>
>>> On 04.08.2018 07:54, Anoop Alias wrote:
>>>
>>> Hi Igor,
>>>
>>> Setting vm.max_map_count to 20x the normal value did not help
>>>
>>> The issue happens on a group of servers, and among the group it shows
>>> up only on servers which have ~10k server{} blocks
>>>
>>> On servers that have a lower number of server{} blocks, the ENOMEM
>>> issue is not there
>>>
>>> Also, I can see that the RAM usage of the nginx process is directly
>>> proportional to the number of server {} blocks
>>>
>>> For example on a server having the problem
>>>
>>> # ps_mem| head -1 && ps_mem |grep nginx
>>> Private + Shared = RAM used Program
>>> 1.0 GiB + 2.8 GiB = 3.8 GiB nginx (3)
>>>
>>>
>>> That is for a single worker process with 4 threads in the thread_pool
>>> # pstree|grep nginx
>>> |-nginx-+-nginx---4*[{nginx}]
>>> | `-nginx
>>>
>>> Whatever config change I try, the memory usage seems to depend mostly
>>> on the number of server contexts defined
>>>
>>> The issue mostly happens on nginx reload, when one more worker process
>>> is active in shutting-down mode
>>>
>>> I believe the memalign error is thrown by the worker being shut down:
>>> the sites keep working after the error, and the pid mentioned in the
>>> error is already gone when I check ps
>>>
>>>
>>> # pmap 948965|grep 16K
>>> 00007f2923ff2000 16K r-x-- ngx_http_redis2_module.so
>>> 00007f2924fd7000 16K r---- libc-2.17.so
>>> 00007f2925431000 16K rw--- [ anon ]
>>> 00007f292584a000 16K rw--- [ anon ]
>>>
>>> Aug 4 05:50:00 b kernel: SysRq : Show Memory
>>> Aug 4 05:50:00 b kernel: Mem-Info:
>>> Aug 4 05:50:00 b kernel: active_anon:7757394 inactive_anon:1021319
>>> isolated_anon:0#012 active_file:3733324 inactive_file:2136476
>>> isolated_file:0#012 unevictable:0 dirty:1766 writeback:6 wbtmp:0
>>> unstable:0#012 slab_reclaimable:2003687 slab_unreclaimable:901391#012
>>> mapped:316734 shmem:2381810 pagetables:63163 bounce:0#012 free:4851283
>>> free_pcp:11332 free_cma:0
>>> Aug 4 05:50:00 bravo kernel: Node 0 DMA free:15888kB min:8kB low:8kB
>>> high:12kB active_anon:0kB inactive_anon:0kB active_file:0kB
>>> inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB
>>> present:15972kB managed:15888kB mlocked:0kB dirty:0kB writeback:0kB
>>> mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB
>>> kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB
>>> local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0
>>> all_unreclaimable? yes
>>> Aug 4 05:50:00 b kernel: lowmem_reserve[]: 0 1679 64139 64139
>>>
>>> # cat /proc/buddyinfo
>>> Node 0, zone      DMA      0      0      1      0      2      1      1      0      1      1      3
>>> Node 0, zone    DMA32   5284   6753   6677   1083    410     59      1      0      0      0      0
>>> Node 0, zone   Normal 500327 638958 406737  14690    872    106     11      0      0      0      0
>>> Node 1, zone   Normal 584840 291640    188      0      0      0      0      0      0      0      0
>>>
>>>
>>> The only correlation I see with the error is the number of server {}
>>> blocks (close to 10k), which makes the nginx process consume ~4GB of
>>> memory with a single worker process; the error then appears when a
>>> reload is done
>>>
>>>
>>>
>>>
>>> On Thu, Aug 2, 2018 at 6:02 PM Igor A. Ippolitov <[email protected]>
>>> wrote:
>>>
>>>> Anoop,
>>>>
>>>> There are two guesses: either the mmap allocation limit is hit, or
>>>> memory is way too fragmented.
>>>> Could you please track the number of mapped regions for a worker with
>>>> pmap, and the number of 16k areas in the Normal zones (it is the third
>>>> number in /proc/buddyinfo)?
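>>>>
>>>> For example (a rough illustration; <worker pid> stands for the nginx
>>>> worker's pid):
>>>>
>>>> pmap <worker pid> | wc -l
>>>> awk '/Normal/ {print $7}' /proc/buddyinfo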
>>>>
>>>> You can also set vm.max_map_count to a higher number (like 20 times
>>>> higher than the default) and see if the error goes away.
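>>>>
>>>> For example (assuming the usual default of 65530, so roughly 20x):
>>>>
>>>> sysctl -w vm.max_map_count=1310600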
>>>>
>>>> Please let me know if increasing vm.max_map_count helps you.
>>>>
>>>> On 02.08.2018 13:06, Anoop Alias wrote:
>>>>
>>>> Hi Igor,
>>>>
>>>> The error happens randomly
>>>>
>>>> 2018/08/02 06:52:42 [emerg] 874514#874514: posix_memalign(16, 16384)
>>>> failed (12: Cannot allocate memory)
>>>> 2018/08/02 09:42:53 [emerg] 872996#872996: posix_memalign(16, 16384)
>>>> failed (12: Cannot allocate memory)
>>>> 2018/08/02 10:16:14 [emerg] 877611#877611: posix_memalign(16, 16384)
>>>> failed (12: Cannot allocate memory)
>>>> 2018/08/02 10:16:48 [emerg] 879410#879410: posix_memalign(16, 16384)
>>>> failed (12: Cannot allocate memory)
>>>> 2018/08/02 10:17:55 [emerg] 876563#876563: posix_memalign(16, 16384)
>>>> failed (12: Cannot allocate memory)
>>>> 2018/08/02 10:20:21 [emerg] 879263#879263: posix_memalign(16, 16384)
>>>> failed (12: Cannot allocate memory)
>>>> 2018/08/02 10:20:51 [emerg] 878991#878991: posix_memalign(16, 16384)
>>>> failed (12: Cannot allocate memory)
>>>>
>>>> # date
>>>> Thu Aug 2 10:58:48 BST 2018
>>>>
>>>> ------------------------------------------
>>>> # cat /proc/buddyinfo
>>>> Node 0, zone      DMA      0      0      1      0      2      1      1      0      1      1      3
>>>> Node 0, zone    DMA32  11722  11057   4663   1647    609     72     10      7      1      0      0
>>>> Node 0, zone   Normal 755026 710760 398136  21462   1114     18      1      0      0      0      0
>>>> Node 1, zone   Normal 341295 801810 179604    256      0      0      0      0      0      0      0
>>>> -----------------------------------------
>>>>
>>>>
>>>> slabinfo - version: 2.1
>>>> # name <active_objs> <num_objs> <objsize> <objperslab>
>>>> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata
>>>> <active_slabs> <num_slabs> <sharedavail>
>>>> SCTPv6 21 21 1536 21 8 : tunables 0 0
>>>> 0 : slabdata 1 1 0
>>>> SCTP 0 0 1408 23 8 : tunables 0 0
>>>> 0 : slabdata 0 0 0
>>>> kcopyd_job 0 0 3312 9 8 : tunables 0 0
>>>> 0 : slabdata 0 0 0
>>>> dm_uevent 0 0 2608 12 8 : tunables 0 0
>>>> 0 : slabdata 0 0 0
>>>> nf_conntrack_ffffffff81acbb00 14054 14892 320 51 4 :
>>>> tunables 0 0 0 : slabdata 292 292 0
>>>> lvp_cache 36 36 224 36 2 : tunables 0 0
>>>> 0 : slabdata 1 1 0
>>>> lve_struct 4140 4140 352 46 4 : tunables 0 0
>>>> 0 : slabdata 90 90 0
>>>> fat_inode_cache 0 0 744 44 8 : tunables 0 0
>>>> 0 : slabdata 0 0 0
>>>> fat_cache 0 0 40 102 1 : tunables 0 0
>>>> 0 : slabdata 0 0 0
>>>> isofs_inode_cache 0 0 664 49 8 : tunables 0 0
>>>> 0 : slabdata 0 0 0
>>>> ext4_inode_cache 30 30 1088 30 8 : tunables 0 0
>>>> 0 : slabdata 1 1 0
>>>> ext4_xattr 0 0 88 46 1 : tunables 0 0
>>>> 0 : slabdata 0 0 0
>>>> ext4_free_data 0 0 64 64 1 : tunables 0 0
>>>> 0 : slabdata 0 0 0
>>>> ext4_allocation_context 32 32 128 32 1 : tunables 0
>>>> 0 0 : slabdata 1 1 0
>>>> ext4_io_end 0 0 72 56 1 : tunables 0 0
>>>> 0 : slabdata 0 0 0
>>>> ext4_extent_status 102 102 40 102 1 : tunables 0 0
>>>> 0 : slabdata 1 1 0
>>>> jbd2_journal_handle 0 0 48 85 1 : tunables 0
>>>> 0 0 : slabdata 0 0 0
>>>> jbd2_journal_head 0 0 112 36 1 : tunables 0 0
>>>> 0 : slabdata 0 0 0
>>>> jbd2_revoke_table_s 256 256 16 256 1 : tunables 0
>>>> 0 0 : slabdata 1 1 0
>>>> jbd2_revoke_record_s 0 0 32 128 1 : tunables 0
>>>> 0 0 : slabdata 0 0 0
>>>> kvm_async_pf 0 0 136 30 1 : tunables 0 0
>>>> 0 : slabdata 0 0 0
>>>> kvm_vcpu 0 0 18560 1 8 : tunables 0 0
>>>> 0 : slabdata 0 0 0
>>>> xfs_dqtrx 992 992 528 31 4 : tunables 0 0
>>>> 0 : slabdata 32 32 0
>>>> xfs_dquot 3264 3264 472 34 4 : tunables 0 0
>>>> 0 : slabdata 96 96 0
>>>> xfs_ili 4342175 4774399 152 53 2 : tunables 0
>>>> 0 0 : slabdata 90083 90083 0
>>>> xfs_inode 4915588 5486076 1088 30 8 : tunables 0
>>>> 0 0 : slabdata 182871 182871 0
>>>> xfs_efd_item 2680 2760 400 40 4 : tunables 0 0
>>>> 0 : slabdata 69 69 0
>>>> xfs_da_state 1088 1088 480 34 4 : tunables 0 0
>>>> 0 : slabdata 32 32 0
>>>> xfs_btree_cur 1248 1248 208 39 2 : tunables 0 0
>>>> 0 : slabdata 32 32 0
>>>> xfs_log_ticket 14874 15048 184 44 2 : tunables 0 0
>>>> 0 : slabdata 342 342 0
>>>> xfs_ioend 12909 13104 104 39 1 : tunables 0 0
>>>> 0 : slabdata 336 336 0
>>>> scsi_cmd_cache 5400 5652 448 36 4 : tunables 0 0
>>>> 0 : slabdata 157 157 0
>>>> ve_struct 0 0 848 38 8 : tunables 0 0
>>>> 0 : slabdata 0 0 0
>>>> ip6_dst_cache 1152 1152 448 36 4 : tunables 0 0
>>>> 0 : slabdata 32 32 0
>>>> RAWv6 910 910 1216 26 8 : tunables 0 0
>>>> 0 : slabdata 35 35 0
>>>> UDPLITEv6 0 0 1216 26 8 : tunables 0 0
>>>> 0 : slabdata 0 0 0
>>>> UDPv6 832 832 1216 26 8 : tunables 0 0
>>>> 0 : slabdata 32 32 0
>>>> tw_sock_TCPv6 1152 1376 256 32 2 : tunables 0 0
>>>> 0 : slabdata 43 43 0
>>>> TCPv6 510 510 2176 15 8 : tunables 0 0
>>>> 0 : slabdata 34 34 0
>>>> cfq_queue 3698 5145 232 35 2 : tunables 0 0
>>>> 0 : slabdata 147 147 0
>>>> bsg_cmd 0 0 312 52 4 : tunables 0 0
>>>> 0 : slabdata 0 0 0
>>>> mqueue_inode_cache 136 136 960 34 8 : tunables 0 0
>>>> 0 : slabdata 4 4 0
>>>> hugetlbfs_inode_cache 1632 1632 632 51 8 : tunables 0
>>>> 0 0 : slabdata 32 32 0
>>>> configfs_dir_cache 1472 1472 88 46 1 : tunables 0 0
>>>> 0 : slabdata 32 32 0
>>>> dquot 0 0 256 32 2 : tunables 0 0
>>>> 0 : slabdata 0 0 0
>>>> userfaultfd_ctx_cache 32 32 128 32 1 : tunables 0
>>>> 0 0 : slabdata 1 1 0
>>>> fanotify_event_info 2336 2336 56 73 1 : tunables 0
>>>> 0 0 : slabdata 32 32 0
>>>> dio 6171 6222 640 51 8 : tunables 0 0
>>>> 0 : slabdata 122 122 0
>>>> pid_namespace 42 42 2192 14 8 : tunables 0 0
>>>> 0 : slabdata 3 3 0
>>>> posix_timers_cache 1056 1056 248 33 2 : tunables 0 0
>>>> 0 : slabdata 32 32 0
>>>> UDP-Lite 0 0 1088 30 8 : tunables 0 0
>>>> 0 : slabdata 0 0 0
>>>> flow_cache 2268 2296 144 28 1 : tunables 0 0
>>>> 0 : slabdata 82 82 0
>>>> xfrm_dst_cache 896 896 576 28 4 : tunables 0 0
>>>> 0 : slabdata 32 32 0
>>>> ip_fib_alias 2720 2720 48 85 1 : tunables 0 0
>>>> 0 : slabdata 32 32 0
>>>> RAW 3977 4224 1024 32 8 : tunables 0 0
>>>> 0 : slabdata 132 132 0
>>>> UDP 4110 4110 1088 30 8 : tunables 0 0
>>>> 0 : slabdata 137 137 0
>>>> tw_sock_TCP 4756 5216 256 32 2 : tunables 0 0
>>>> 0 : slabdata 163 163 0
>>>> TCP 2705 2768 1984 16 8 : tunables 0 0
>>>> 0 : slabdata 173 173 0
>>>> scsi_data_buffer 5440 5440 24 170 1 : tunables 0 0
>>>> 0 : slabdata 32 32 0
>>>> blkdev_queue 154 154 2208 14 8 : tunables 0 0
>>>> 0 : slabdata 11 11 0
>>>> blkdev_requests 4397688 4405884 384 42 4 : tunables 0
>>>> 0 0 : slabdata 104902 104902 0
>>>> blkdev_ioc 11232 11232 112 36 1 : tunables 0 0
>>>> 0 : slabdata 312 312 0
>>>> user_namespace 0 0 1304 25 8 : tunables 0 0
>>>> 0 : slabdata 0 0 0
>>>> sock_inode_cache 12282 12282 704 46 8 : tunables 0 0
>>>> 0 : slabdata 267 267 0
>>>> file_lock_cache 20056 20960 200 40 2 : tunables 0 0
>>>> 0 : slabdata 524 524 0
>>>> net_namespace 6 6 5056 6 8 : tunables 0 0
>>>> 0 : slabdata 1 1 0
>>>> shmem_inode_cache 16970 18952 712 46 8 : tunables 0 0
>>>> 0 : slabdata 412 412 0
>>>> Acpi-ParseExt 39491 40432 72 56 1 : tunables 0 0
>>>> 0 : slabdata 722 722 0
>>>> Acpi-State 1683 1683 80 51 1 : tunables 0 0
>>>> 0 : slabdata 33 33 0
>>>> Acpi-Namespace 11424 11424 40 102 1 : tunables 0 0
>>>> 0 : slabdata 112 112 0
>>>> task_delay_info 15336 15336 112 36 1 : tunables 0 0
>>>> 0 : slabdata 426 426 0
>>>> taskstats 1568 1568 328 49 4 : tunables 0 0
>>>> 0 : slabdata 32 32 0
>>>> proc_inode_cache 169897 190608 680 48 8 : tunables 0 0
>>>> 0 : slabdata 3971 3971 0
>>>> sigqueue 2208 2208 168 48 2 : tunables 0 0
>>>> 0 : slabdata 46 46 0
>>>> bdev_cache 792 792 896 36 8 : tunables 0 0
>>>> 0 : slabdata 22 22 0
>>>> sysfs_dir_cache 74698 74698 120 34 1 : tunables 0 0
>>>> 0 : slabdata 2197 2197 0
>>>> mnt_cache 163197 163424 256 32 2 : tunables 0 0
>>>> 0 : slabdata 5107 5107 0
>>>> filp 64607 97257 320 51 4 : tunables 0 0
>>>> 0 : slabdata 1907 1907 0
>>>> inode_cache 370744 370947 616 53 8 : tunables 0 0
>>>> 0 : slabdata 6999 6999 0
>>>> dentry 1316262 2139228 192 42 2 : tunables 0
>>>> 0 0 : slabdata 50934 50934 0
>>>> iint_cache 0 0 80 51 1 : tunables 0 0
>>>> 0 : slabdata 0 0 0
>>>> buffer_head 1441470 2890290 104 39 1 : tunables 0
>>>> 0 0 : slabdata 74110 74110 0
>>>> vm_area_struct 194998 196840 216 37 2 : tunables 0 0
>>>> 0 : slabdata 5320 5320 0
>>>> mm_struct 2679 2760 1600 20 8 : tunables 0 0
>>>> 0 : slabdata 138 138 0
>>>> files_cache 8680 8925 640 51 8 : tunables 0 0
>>>> 0 : slabdata 175 175 0
>>>> signal_cache 3691 3780 1152 28 8 : tunables 0 0
>>>> 0 : slabdata 135 135 0
>>>> sighand_cache 1950 2160 2112 15 8 : tunables 0 0
>>>> 0 : slabdata 144 144 0
>>>> task_xstate 8070 8658 832 39 8 : tunables 0 0
>>>> 0 : slabdata 222 222 0
>>>> task_struct 1913 2088 4080 8 8 : tunables 0 0
>>>> 0 : slabdata 261 261 0
>>>> cred_jar 31699 33936 192 42 2 : tunables 0 0
>>>> 0 : slabdata 808 808 0
>>>> anon_vma_chain 164026 168704 64 64 1 : tunables 0 0
>>>> 0 : slabdata 2636 2636 0
>>>> anon_vma 84104 84594 88 46 1 : tunables 0 0
>>>> 0 : slabdata 1839 1839 0
>>>> pid 11127 12576 128 32 1 : tunables 0 0
>>>> 0 : slabdata 393 393 0
>>>> shared_policy_node 9350 9350 48 85 1 : tunables 0 0
>>>> 0 : slabdata 110 110 0
>>>> numa_policy 62 62 264 31 2 : tunables 0 0
>>>> 0 : slabdata 2 2 0
>>>> radix_tree_node 771778 1194312 584 28 4 : tunables 0 0
>>>> 0 : slabdata 42654 42654 0
>>>> idr_layer_cache 2538 2565 2112 15 8 : tunables 0 0
>>>> 0 : slabdata 171 171 0
>>>> dma-kmalloc-8192 0 0 8192 4 8 : tunables 0 0
>>>> 0 : slabdata 0 0 0
>>>> dma-kmalloc-4096 0 0 4096 8 8 : tunables 0 0
>>>> 0 : slabdata 0 0 0
>>>> dma-kmalloc-2048 0 0 2048 16 8 : tunables 0 0
>>>> 0 : slabdata 0 0 0
>>>> dma-kmalloc-1024 0 0 1024 32 8 : tunables 0 0
>>>> 0 : slabdata 0 0 0
>>>> dma-kmalloc-512 0 0 512 32 4 : tunables 0 0
>>>> 0 : slabdata 0 0 0
>>>> dma-kmalloc-256 0 0 256 32 2 : tunables 0 0
>>>> 0 : slabdata 0 0 0
>>>> dma-kmalloc-128 0 0 128 32 1 : tunables 0 0
>>>> 0 : slabdata 0 0 0
>>>> dma-kmalloc-64 0 0 64 64 1 : tunables 0 0
>>>> 0 : slabdata 0 0 0
>>>> dma-kmalloc-32 0 0 32 128 1 : tunables 0 0
>>>> 0 : slabdata 0 0 0
>>>> dma-kmalloc-16 0 0 16 256 1 : tunables 0 0
>>>> 0 : slabdata 0 0 0
>>>> dma-kmalloc-8 0 0 8 512 1 : tunables 0 0
>>>> 0 : slabdata 0 0 0
>>>> dma-kmalloc-192 0 0 192 42 2 : tunables 0 0
>>>> 0 : slabdata 0 0 0
>>>> dma-kmalloc-96 0 0 96 42 1 : tunables 0 0
>>>> 0 : slabdata 0 0 0
>>>> kmalloc-8192 385 388 8192 4 8 : tunables 0 0
>>>> 0 : slabdata 97 97 0
>>>> kmalloc-4096 9296 10088 4096 8 8 : tunables 0 0
>>>> 0 : slabdata 1261 1261 0
>>>> kmalloc-2048 65061 133536 2048 16 8 : tunables 0 0
>>>> 0 : slabdata 8346 8346 0
>>>> kmalloc-1024 11987 21120 1024 32 8 : tunables 0 0
>>>> 0 : slabdata 660 660 0
>>>> kmalloc-512 107510 187072 512 32 4 : tunables 0 0
>>>> 0 : slabdata 5846 5846 0
>>>> kmalloc-256 160498 199104 256 32 2 : tunables 0 0
>>>> 0 : slabdata 6222 6222 0
>>>> kmalloc-192 144975 237426 192 42 2 : tunables 0 0
>>>> 0 : slabdata 5653 5653 0
>>>> kmalloc-128 36799 108096 128 32 1 : tunables 0 0
>>>> 0 : slabdata 3378 3378 0
>>>> kmalloc-96 99510 238896 96 42 1 : tunables 0 0
>>>> 0 : slabdata 5688 5688 0
>>>> kmalloc-64 7978152 8593280 64 64 1 : tunables 0
>>>> 0 0 : slabdata 134270 134270 0
>>>> kmalloc-32 2939882 3089664 32 128 1 : tunables 0
>>>> 0 0 : slabdata 24138 24138 0
>>>> kmalloc-16 172057 172288 16 256 1 : tunables 0 0
>>>> 0 : slabdata 673 673 0
>>>> kmalloc-8 109568 109568 8 512 1 : tunables 0 0
>>>> 0 : slabdata 214 214 0
>>>> kmem_cache_node 893 896 64 64 1 : tunables 0 0
>>>> 0 : slabdata 14 14 0
>>>> kmem_cache 612 612 320 51 4 : tunables 0 0
>>>> 0 : slabdata 12 12 0
>>>>
>>>> -------------------------------------------------
>>>>
>>>>
>>>> # uname -r
>>>> 3.10.0-714.10.2.lve1.5.17.1.el7.x86_64
>>>>
>>>> --------------------------------------------------------
>>>>
>>>> Core part of glances
>>>> http://i.imgur.com/La5JbQn.png
>>>> -----------------------------------------------------------
>>>>
>>>> Thank you very much for looking into this
>>>>
>>>>
>>>> On Thu, Aug 2, 2018 at 12:37 PM Igor A. Ippolitov <[email protected]>
>>>> wrote:
>>>>
>>>>> Anoop,
>>>>>
>>>>> I doubt this will be the solution, but may we have a look at
>>>>> /proc/buddyinfo and /proc/slabinfo the moment when nginx can't allocate
>>>>> memory?
>>>>>
>>>>> On 02.08.2018 08:15, Anoop Alias wrote:
>>>>>
>>>>> Hi Maxim,
>>>>>
>>>>> I enabled debug logging, and the memalign call happens on nginx
>>>>> reloads; the ENOMEM appears only on some reloads (not on all of them)
>>>>>
>>>>> 2018/08/02 05:59:08 [notice] 872052#872052: signal process started
>>>>> 2018/08/02 05:59:23 [notice] 871570#871570: signal 1 (SIGHUP) received
>>>>> from 872052, reconfiguring
>>>>> 2018/08/02 05:59:23 [debug] 871570#871570: wake up, sigio 0
>>>>> 2018/08/02 05:59:23 [notice] 871570#871570: reconfiguring
>>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>>>> 0000000002B0DA00:16384 @16   <=== the memalign call on reload
>>>>> 2018/08/02 05:59:23 [debug] 871570#871570: malloc:
>>>>> 00000000087924D0:4560
>>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>>>> 000000000E442E00:16384 @16
>>>>> 2018/08/02 05:59:23 [debug] 871570#871570: malloc:
>>>>> 0000000005650850:4096
>>>>> 20
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> 2018/08/02 05:48:49 [debug] 871275#871275: bind() xxxx:443 #71
>>>>> 2018/08/02 05:48:49 [debug] 871275#871275: bind() xxxx:443 #72
>>>>> 2018/08/02 05:48:49 [debug] 871275#871275: bind() xxxx:443 #73
>>>>> 2018/08/02 05:48:49 [debug] 871275#871275: bind() xxxx:443 #74
>>>>> 2018/08/02 05:48:49 [debug] 871275#871275: add cleanup:
>>>>> 000000005340D728
>>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc:
>>>>> 00000000024D3260:4096
>>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc:
>>>>> 00000000517BAF10:4096
>>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc:
>>>>> 0000000053854FC0:4096
>>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc:
>>>>> 0000000053855FD0:4096
>>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc:
>>>>> 0000000053856FE0:4096
>>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc:
>>>>> 0000000053857FF0:4096
>>>>> 2018/08/02 05:48:49 [debug] 871275#871275: posix_memalign:
>>>>> 0000000053859000:16384 @16
>>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc:
>>>>> 000000005385D010:4096
>>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc:
>>>>> 000000005385E020:4096
>>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc:
>>>>> 000000005385F030:4096
>>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc:
>>>>> 00000000536CD160:4096
>>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc:
>>>>> 00000000536CE170:4096
>>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc:
>>>>> 00000000536CF180:4096
>>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc:
>>>>> 00000000536D0190:4096
>>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc:
>>>>> 00000000536D11A0:4096
>>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc:
>>>>> 00000000536D21B0:4096
>>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc:
>>>>> 00000000536D31C0:4096
>>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc:
>>>>> 00000000536D41D0:4096
>>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc:
>>>>> 00000000536D51E0:4096
>>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc:
>>>>> 00000000536D61F0:4096
>>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc:
>>>>> 00000000536D7200:4096
>>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc:
>>>>> 00000000536D8210:4096
>>>>> 2018/08/02 05:48:49 [debug] 871275#871275: malloc:
>>>>> 00000000536D9220:4096
>>>>>
>>>>>
>>>>> In fact, there are a lot of such calls during a reload
>>>>>
>>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>>>> 00000000BA17ED00:16384 @16
>>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>>>> 00000000BA1B0FF0:16384 @16
>>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>>>> 00000000BA1E12C0:16384 @16
>>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>>>> 00000000BA211590:16384 @16
>>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>>>> 00000000BA243880:16384 @16
>>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>>>> 00000000BA271B30:16384 @16
>>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>>>> 00000000BA2A3E20:16384 @16
>>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>>>> 00000000BA2D20D0:16384 @16
>>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>>>> 00000000BA3063E0:16384 @16
>>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>>>> 00000000BA334690:16384 @16
>>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>>>> 00000000BA366980:16384 @16
>>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>>>> 00000000BA396C50:16384 @16
>>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>>>> 00000000BA3C8F40:16384 @16
>>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>>>> 00000000BA3F9210:16384 @16
>>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>>>> 00000000BA4294E0:16384 @16
>>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>>>> 00000000BA45B7D0:16384 @16
>>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>>>> 00000000BA489A80:16384 @16
>>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>>>> 00000000BA4BBD70:16384 @16
>>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>>>> 00000000BA4EA020:16384 @16
>>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>>>> 00000000BA51E330:16384 @16
>>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>>>> 00000000BA54C5E0:16384 @16
>>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>>>> 00000000BA57E8D0:16384 @16
>>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>>>> 00000000BA5AEBA0:16384 @16
>>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>>>> 00000000BA5DEE70:16384 @16
>>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>>>> 00000000BA611160:16384 @16
>>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>>>> 00000000BA641430:16384 @16
>>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>>>> 00000000BA671700:16384 @16
>>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>>>> 00000000BA6A29E0:16384 @16
>>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>>>> 00000000BA6D5CE0:16384 @16
>>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>>>> 00000000BA707FD0:16384 @16
>>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>>>> 00000000BA736280:16384 @16
>>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>>>> 00000000BA768570:16384 @16
>>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>>>> 00000000BA796820:16384 @16
>>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>>>> 00000000BA7CAB30:16384 @16
>>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>>>> 00000000BA7F8DE0:16384 @16
>>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>>>> 00000000BA82B0D0:16384 @16
>>>>> 2018/08/02 05:59:23 [debug] 871570#871570: posix_memalign:
>>>>> 00000000BA85B3A0:16384 @16
>>>>>
>>>>>
>>>>>
>>>>> What is perplexing is that the system has enough free (available) RAM:
>>>>> #############
>>>>> # free -g
>>>>>        total  used  free  shared  buff/cache  available
>>>>> Mem:     125    54    24       8          46         58
>>>>> Swap:      0     0     0
>>>>> #############
>>>>>
>>>>> # ulimit -a
>>>>> core file size (blocks, -c) 0
>>>>> data seg size (kbytes, -d) unlimited
>>>>> scheduling priority (-e) 0
>>>>> file size (blocks, -f) unlimited
>>>>> pending signals (-i) 514579
>>>>> max locked memory (kbytes, -l) 64
>>>>> max memory size (kbytes, -m) unlimited
>>>>> open files (-n) 1024
>>>>> pipe size (512 bytes, -p) 8
>>>>> POSIX message queues (bytes, -q) 819200
>>>>> real-time priority (-r) 0
>>>>> stack size (kbytes, -s) 8192
>>>>> cpu time (seconds, -t) unlimited
>>>>> max user processes (-u) 514579
>>>>> virtual memory (kbytes, -v) unlimited
>>>>> file locks (-x) unlimited
>>>>>
>>>>> #########################################
>>>>>
>>>>> There is nothing else limiting memory allocation
>>>>>
>>>>> Is there any way to identify or prevent this?
>>>>>
>>>>>
>>>>> On Tue, Jul 31, 2018 at 7:08 PM Maxim Dounin <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Hello!
>>>>>>
>>>>>> On Tue, Jul 31, 2018 at 09:52:29AM +0530, Anoop Alias wrote:
>>>>>>
>>>>>> > I am repeatedly seeing errors like
>>>>>> >
>>>>>> > ######################
>>>>>> > 2018/07/31 03:46:33 [emerg] 2854560#2854560: posix_memalign(16,
>>>>>> 16384)
>>>>>> > failed (12: Cannot allocate memory)
>>>>>> > 2018/07/31 03:54:09 [emerg] 2890190#2890190: posix_memalign(16,
>>>>>> 16384)
>>>>>> > failed (12: Cannot allocate memory)
>>>>>> > 2018/07/31 04:08:36 [emerg] 2939230#2939230: posix_memalign(16,
>>>>>> 16384)
>>>>>> > failed (12: Cannot allocate memory)
>>>>>> > 2018/07/31 04:24:48 [emerg] 2992650#2992650: posix_memalign(16,
>>>>>> 16384)
>>>>>> > failed (12: Cannot allocate memory)
>>>>>> > 2018/07/31 04:42:09 [emerg] 3053092#3053092: posix_memalign(16,
>>>>>> 16384)
>>>>>> > failed (12: Cannot allocate memory)
>>>>>> > 2018/07/31 04:42:17 [emerg] 3053335#3053335: posix_memalign(16,
>>>>>> 16384)
>>>>>> > failed (12: Cannot allocate memory)
>>>>>> > 2018/07/31 04:42:28 [emerg] 3053937#3053937: posix_memalign(16,
>>>>>> 16384)
>>>>>> > failed (12: Cannot allocate memory)
>>>>>> > 2018/07/31 04:47:54 [emerg] 3070638#3070638: posix_memalign(16,
>>>>>> 16384)
>>>>>> > failed (12: Cannot allocate memory)
>>>>>> > ####################
>>>>>> >
>>>>>> > on a few servers
>>>>>> >
>>>>>> > The servers have enough memory free and the swap usage is 0, yet
>>>>>> somehow
>>>>>> > the kernel denies the posix_memalign with ENOMEM ( this is what I
>>>>>> think is
>>>>>> > happening!)
>>>>>> >
>>>>>> > The numbers requested are always 16, 16k . This makes me suspicious
>>>>>> >
>>>>>> > I have no setting in nginx.conf that reference a 16k
>>>>>> >
>>>>>> > Is there any chance of finding out what requests this and why this
>>>>>> is not
>>>>>> > fulfilled
>>>>>>
>>>>>> There are at least some buffers which default to 16k - for
>>>>>> example, ssl_buffer_size (http://nginx.org/r/ssl_buffer_size).
>>>>>>
>>>>>> You may try the debug log to further find out where the particular
>>>>>> allocation happens, see here for details:
>>>>>>
>>>>>> http://nginx.org/en/docs/debugging_log.html
>>>>>>
>>>>>> But I don't really think it is worth the effort. The error is pretty
>>>>>> clear, and it's better to focus on why these allocations are
>>>>>> denied. Likely you are hitting some limit.
>>>>>>
>>>>>> --
>>>>>> Maxim Dounin
>>>>>> http://mdounin.ru/
>>>>>> _______________________________________________
>>>>>> nginx mailing list
>>>>>> nginx@nginx.org
>>>>>> http://mailman.nginx.org/mailman/listinfo/nginx
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> *Anoop P Alias*
>>>>>
>>>>> _______________________________________________
>>>>> nginx mailing list
>>>>> nginx@nginx.org
>>>>> http://mailman.nginx.org/mailman/listinfo/nginx
>>>>
>>>>
>>>>
>>>> --
>>>> *Anoop P Alias*
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> nginx mailing list
>>>> nginx@nginx.org
>>>> http://mailman.nginx.org/mailman/listinfo/nginx
>>>
>>>
>>>
>>> --
>>> *Anoop P Alias*
>>>
>>>
>>>
>>> _______________________________________________
>>> nginx mailing list
>>> nginx@nginx.org
>>> http://mailman.nginx.org/mailman/listinfo/nginx
>>
>>
>>
>> --
>> *Anoop P Alias*
>>
>>
>>
>> _______________________________________________
>> nginx mailing list
>> nginx@nginx.org
>> http://mailman.nginx.org/mailman/listinfo/nginx
>
>
>
> --
> *Anoop P Alias*
>
>

--
*Anoop P Alias*
_______________________________________________
nginx mailing list
nginx@nginx.org
http://mailman.nginx.org/mailman/listinfo/nginx