Nginx throttling issue?

John Melom
Nginx throttling issue?
March 26, 2018 10:30PM
Hi,

I am load testing our system using Jmeter as a load generator. We execute a script consisting of an https request executing in a loop. The loop does not contain a think time, since at this point I am not trying to emulate a “real user”. I want to get a quick look at our system capacity. Load on our system is increased by increasing the number of Jmeter threads executing our script. Each Jmeter thread references different data.

Our system is in AWS with an ELB fronting Nginx, which serves as a reverse proxy for our Docker Swarm application cluster.

At moderate loads, a subset of our https requests start experiencing a 1-second delay in addition to their normal response time. The delay is not due to resource contention. System utilizations remain low. The response times cluster around 4 values: 0 milliseconds, 50 milliseconds, 1 second, and 1.050 seconds. Right now, I am most interested in understanding and eliminating the 1-second delay that gives the clusters at 1 second and 1.050 seconds.

The attachment shows a response time scatterplot from one of our runs. The x-axis is the number of seconds into the run, the y-axis is the response time in milliseconds. The plotted data shows the response time of requests at the time they occurred in the run.

If I run the test bypassing the ELB and Nginx, this delay does not occur.
If I bypass the ELB, but include Nginx in the request path, the delay returns.

This leads me to believe the 1 second delay is coming from Nginx.

One possible candidate is Nginx's DDoS protection. Since all requests come from the same Jmeter system, I expect they share the same originating IP address. I attempted to control DDoS throttling by setting limit_req as shown in the nginx.conf fragment below:

http {
    ...
    limit_req_zone $binary_remote_addr zone=perf:20m rate=10000r/s;
    ...
    server {
        ...
        location /myReq {
            limit_req zone=perf burst=600;
            proxy_pass http://xxx.xxx.xxx.xxx;
        }
        ...
    }
}

The thinking behind the values set in this conf file is that my aggregate demand would not exceed 10000 requests per second, so throttling of requests should not occur. If there were short bursts more intense than that, the burst value would buffer these requests.
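The rate/burst reasoning above can be sketched with the leaky-bucket accounting that limit_req performs. This is a simplified model, not nginx source: the timestamps and the rate/burst values are made up for illustration.

```python
# Simplified model of nginx limit_req leaky-bucket accounting.
# 'excess' tracks how far arrivals have outrun the configured rate;
# requests beyond 'burst' excess are rejected (503), and excess
# requests within the burst are delayed unless 'nodelay' is set.
def limit_req(arrivals, rate, burst):
    """arrivals: request timestamps in seconds; returns per-request action."""
    excess, last = 0.0, None
    actions = []
    for t in arrivals:
        if last is not None:
            # Excess drains at 'rate' between arrivals, grows by 1 per request.
            excess = max(0.0, excess - rate * (t - last) + 1.0)
        last = t
        if excess > burst:
            actions.append("rejected")   # over the burst: 503
        elif excess > 0:
            actions.append("delayed")    # held back unless 'nodelay'
        else:
            actions.append("ok")
    return actions

# Five back-to-back requests at rate=1 r/s with burst=3:
print(limit_req([0, 0, 0, 0, 0], rate=1, burst=3))
# -> ['ok', 'delayed', 'delayed', 'delayed', 'rejected']
```

With a 10000 r/s rate and burst=600, bursts well below that never accumulate excess, which is why the configuration above should not throttle the described load.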

This tuning did not change my results. I still get the 1 second delay.

Am I implementing this correctly?
Is there something else I should be trying?

The responses are not large, so I don’t believe limit_rate is the answer.
I have a small number of intense users, so limit_conn does not seem likely to be the answer either.

Thanks,

John Melom
Performance Test Engineer
Spōk, Inc.
+1 (952) 230 5311 Office
[email protected]

http://info.spok.com/spokmobilevid


________________________________
NOTE: This email message and any attachments are for the sole use of the intended recipient(s) and may contain confidential and/or privileged information. Any unauthorized review, use, disclosure or distribution is prohibited. If you have received this e-mail in error, please contact the sender by replying to this email, and destroy all copies of the original message and any material included with this email.
_______________________________________________
nginx mailing list
nginx@nginx.org
http://mailman.nginx.org/mailman/listinfo/nginx
Attachments:
image003.jpg (3.2 KB)
rawRespScatterplot.png (42.3 KB)
Peter Booth
Re: Nginx throttling issue?
March 26, 2018 11:00PM
You’re correct that this is the DDoS throttling. The real question is what do you want to do? JMeter with zero think time is an imperfect load generator; this is only one complication. The bigger one is the open/closed model issue. With your design you have back pressure from your system under test to your load generator. A JMeter virtual user will only ever issue a request when the prior one completes. Real users are not so well behaved, which is why your test results will always be over-optimistic with this design.

A better approach is to use a load generator that replicates the desired request distribution without triggering the DDoS protection. Wrk2, Tsung, and httperf are candidates, as well as the cloud-based load generator services. Also see Neil Gunther’s paper on how to combine multiple JMeter instances to replicate real-world traffic patterns.

Peter

Sent from my iPhone

> On Mar 26, 2018, at 4:21 PM, John Melom <[email protected]> wrote:
>
> [...]
Maxim Dounin
Re: Nginx throttling issue?
March 27, 2018 02:00PM
Hello!

On Mon, Mar 26, 2018 at 08:21:27PM +0000, John Melom wrote:

> [...]
>
> This leads me to believe the 1 second delay is coming from
> Nginx.

There are no magic 1 second delays in nginx - unless you've
configured something explicitly.

Most likely, the 1 second delay is coming from TCP retransmission
timeout during connection establishment due to listen queue
overflows. Check "netstat -s" to see if there are any listen
queue overflows on your hosts.
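
On Linux these counters appear in `netstat -s` or `nstat -az` output. A small sketch of how to pull the listen-related ones out; the sample values here are invented for illustration (zero values, as John later reports, would produce an empty result):

```python
# Scan nstat/netstat-style counter output for nonzero listen-queue
# counters, the symptom Maxim suggests checking for.
sample = """\
TcpExtListenOverflows 17 0.0
TcpExtListenDrops 17 0.0
TcpExtTCPFastOpenListenOverflow 0 0.0
TcpExtTCPSynRetrans 25 0.0
"""

def listen_overflows(nstat_text):
    """Return listen-related counters whose value is nonzero."""
    hits = {}
    for line in nstat_text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and "listen" in parts[0].lower():
            if int(parts[1]) > 0:
                hits[parts[0]] = int(parts[1])
    return hits

print(listen_overflows(sample))
# -> {'TcpExtListenOverflows': 17, 'TcpExtListenDrops': 17}
```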

[...]

--
Maxim Dounin
http://mdounin.ru/
John Melom
RE: Nginx throttling issue?
March 27, 2018 03:50PM
Peter,

Thanks for your reply.

What I’d really like is to understand how to tune nginx to avoid the delays when I run my tests.

I am comfortable with the overly optimistic results from my current “closed model” test design. Once I determine my system’s throughput limits I will introduce significant think times into my scripts so that much larger user populations are required to produce the same work demand. This will more closely approximate an “open model” test design.
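
The closed-model arithmetic behind that plan follows Little's law, X = N / (R + Z): throughput X for N virtual users with response time R and think time Z. A quick sketch with hypothetical numbers:

```python
# Little's law for a closed-loop load generator: X = N / (R + Z).
# All figures below are hypothetical, for illustration only.
def closed_model_throughput(n_users, resp_time_s, think_time_s):
    """Steady-state requests/second offered by n_users virtual users."""
    return n_users / (resp_time_s + think_time_s)

def users_needed(target_rps, resp_time_s, think_time_s):
    """Virtual users required to offer target_rps at a given think time."""
    return target_rps * (resp_time_s + think_time_s)

# 50 zero-think-time threads with 50 ms responses offer ~1000 req/s:
print(closed_model_throughput(50, 0.05, 0.0))   # ~1000
# The same ~1000 req/s with a 10 s think time needs ~10,050 users:
print(users_needed(1000, 0.05, 10.0))           # ~10050
```

This is why adding significant think times forces a much larger user population for the same offered load, moving the test closer to an open model.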

Could you provide more explanation as to why a different load generation tool would avoid triggering a DDOS response from nginx? My first guess would have been that they would also generate requests from a single IP address, and thus look the same as a JMeter load.

I did try my test with JMeter driving workload from 2 different machines at the same time. I ran each machine's workload at a low enough level that individually they did not trigger the 1 second delay. The combined workload did trigger the delay for each of the JMeter workload generators. I'm not sure how many machines would be required to avoid the collective response from nginx.

Thanks,

John


From: nginx [mailto:[email protected]] On Behalf Of Peter Booth
Sent: Monday, March 26, 2018 3:57 PM
To: nginx@nginx.org
Subject: Re: Nginx throttling issue?

[...]
John Melom
RE: Nginx throttling issue?
March 27, 2018 04:00PM
Maxim,

Thank you for your reply. I will look to see if "netstat -s" detects any listen queue overflows.

John


-----Original Message-----
From: nginx [mailto:[email protected]] On Behalf Of Maxim Dounin
Sent: Tuesday, March 27, 2018 6:55 AM
To: nginx@nginx.org
Subject: Re: Nginx throttling issue?

[...]
John Melom
RE: Nginx throttling issue?
April 04, 2018 11:30PM
Hi Maxim,

I've looked at the nstat data and found the following values for counters:

> nstat -az | grep -i listen
TcpExtListenOverflows 0 0.0
TcpExtListenDrops 0 0.0
TcpExtTCPFastOpenListenOverflow 0 0.0


nstat -az | grep -i retra
TcpRetransSegs 12157 0.0
TcpExtTCPLostRetransmit 0 0.0
TcpExtTCPFastRetrans 270 0.0
TcpExtTCPForwardRetrans 11 0.0
TcpExtTCPSlowStartRetrans 0 0.0
TcpExtTCPRetransFail 0 0.0
TcpExtTCPSynRetrans 25 0.0

Assuming the above "Listen" counters provide data about the overflow issue you mention, then there are no overflows on my system. While retransmissions are happening, it doesn't seem they are related to listen queue overflows.


Am I looking at the correct data items? Is my interpretation of the data correct? If so, do you have any other ideas I could investigate?

Thanks,

John

-----Original Message-----
From: nginx [mailto:[email protected]] On Behalf Of John Melom
Sent: Tuesday, March 27, 2018 8:52 AM
To: nginx@nginx.org
Subject: RE: Nginx throttling issue?

[...]
Peter Booth
Re: Nginx throttling issue?
April 05, 2018 06:50AM
John,

I think that you need to understand what is happening on your host throughout the duration of the test. Specifically, what is happening with the TCP connections. If you run netstat, grep for tcp, and do this in a loop every, say, five seconds, then you'll see how many connections get created at peak.
If the thing you are testing exists in production then you are lucky. You can do the same in production and see what it is that you need to replicate.

You didn’t mention whether you have persistent connections (HTTP keep-alive) configured. This is key to maximizing scalability. You did say that you were using SSL. If it were me, I’d use a load generator that more closely resembles the behavior of real users on a website. Wrk2, Tsung, httperf, and Gatling are examples of some that do. Using JMeter with zero think time is a very common anti-pattern that doesn’t behave anything like real users. I think of it as the lazy performance tester pattern.

Imagine a real web server under heavy load from human beings. You will see thousands of concurrent connections but fewer concurrent requests in flight. With the JMeter zero-think-time model you are either creating new connections or reusing them, so either you have a shitload of connections and your nginx process starts running out of file handles, or you are jamming requests down a single connection. Neither resembles reality.

If you are committed to using JMeter for some reason, then use more instances with real think times. Each instance's connections will have a different source port.

Sent from my iPhone

> On Apr 4, 2018, at 5:20 PM, John Melom <[email protected]> wrote:
>
> [...]
Richard Stanway via nginx
Re: Nginx throttling issue?
April 06, 2018 07:20PM
Even though it shouldn't be reaching your limits, limit_req delays requests in 1-second increments, which sounds like it could be responsible for this. You should see error log entries if this happens (severity warning). Have you tried without the limit_req option? You can also use the nodelay option to avoid the delaying behavior.

http://nginx.org/en/docs/http/ngx_http_limit_req_module.html#limit_req
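
Applied to the fragment John posted earlier (zone name and rate taken from his config; the upstream address remains elided), the nodelay variant would look like:

```nginx
http {
    limit_req_zone $binary_remote_addr zone=perf:20m rate=10000r/s;

    server {
        location /myReq {
            # 'nodelay' admits requests within the burst immediately
            # instead of pacing them at the zone rate; requests beyond
            # 'burst' are still rejected with 503.
            limit_req zone=perf burst=600 nodelay;
            proxy_pass http://xxx.xxx.xxx.xxx;
        }
    }
}
```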


On Thu, Apr 5, 2018 at 6:45 AM, Peter Booth <[email protected]> wrote:

> John,
>
> I think that you need to understand what is happening on your host
> throughout the duration of the test. Specifically, what is happening with
> the tcp connections. If you run netstat and grep for tcp and do this in a
> loop every say five seconds then you’ll see how many connections peak get
> created.
> If the thing you are testing exists in production then you are lucky. You
> can do the same in production and see what it is that you need to replicate.
>
> You didn’t mention whether you had persistent connections (HTTP
> keep-alive) configured. This is key to maximizing scalability. You did say
> that you were using SSL. If it were me, I’d use a load generator that more
> closely resembles the behavior of real users on a website: wrk2, Tsung,
> httperf, and Gatling are examples of some that do. Using JMeter with zero
> think time is a very common anti-pattern that doesn’t behave anything like
> real users. I think of it as the lazy performance tester pattern.
>
> Imagine a real web server under heavy load from human beings. You will see
> thousands of concurrent connections but fewer concurrent requests in
> flight. With the JMeter zero-think-time model you are either creating new
> connections or reusing them, so either you have a shitload of connections
> and your nginx process starts running out of file handles, or you are
> jamming requests down a single connection, neither of which resembles
> reality.
>
> If you are committed to using JMeter for some reason, then use more
> instances with real think times. Each instance’s connection will have a
> different source port.
>
> Sent from my iPhone
>
> > On Apr 4, 2018, at 5:20 PM, John Melom <[email protected]> wrote:
> >
> > Hi Maxim,
> >
> > I've looked at the nstat data and found the following values for
> counters:
> >
> >> nstat -az | grep -i listen
> > TcpExtListenOverflows 0 0.0
> > TcpExtListenDrops 0 0.0
> > TcpExtTCPFastOpenListenOverflow 0 0.0
> >
> >
> > nstat -az | grep -i retra
> > TcpRetransSegs 12157 0.0
> > TcpExtTCPLostRetransmit 0 0.0
> > TcpExtTCPFastRetrans 270 0.0
> > TcpExtTCPForwardRetrans 11 0.0
> > TcpExtTCPSlowStartRetrans 0 0.0
> > TcpExtTCPRetransFail 0 0.0
> > TcpExtTCPSynRetrans 25 0.0
> >
> > Assuming the above "Listen" counters provide data about the overflow
> > issue you mention, there are no overflows on my system. While
> > retransmissions are happening, it doesn't seem they are related to
> > listen queue overflows.
> >
> >
> > Am I looking at the correct data items? Is my interpretation of the
> > data correct? If so, do you have any other ideas I could investigate?
> >
> > Thanks,
> >
> > John
> >
> > -----Original Message-----
> > From: nginx [mailto:[email protected]] On Behalf Of John Melom
> > Sent: Tuesday, March 27, 2018 8:52 AM
> > To: nginx@nginx.org
> > Subject: RE: Nginx throttling issue?
> >
> > Maxim,
> >
> > Thank you for your reply. I will look to see if "netstat -s" detects
> any listen queue overflows.
> >
> > John
> >
> >
> > -----Original Message-----
> > From: nginx [mailto:[email protected]] On Behalf Of Maxim Dounin
> > Sent: Tuesday, March 27, 2018 6:55 AM
> > To: nginx@nginx.org
> > Subject: Re: Nginx throttling issue?
> >
> > Hello!
> >
> >> On Mon, Mar 26, 2018 at 08:21:27PM +0000, John Melom wrote:
> >>
> >> I am load testing our system using Jmeter as a load generator.
> >> We execute a script consisting of an https request executing in a
> >> loop. The loop does not contain a think time, since at this point I
> >> am not trying to emulate a “real user”. I want to get a quick look at
> >> our system capacity. Load on our system is increased by increasing
> >> the number of Jmeter threads executing our script. Each Jmeter thread
> >> references different data.
> >>
> >> Our system is in AWS with an ELB fronting Nginx, which serves as a
> >> reverse proxy for our Docker Swarm application cluster.
> >>
> >> At moderate loads, a subset of our https requests start experiencing
> >> a 1 second delay in addition to their normal response time. The
> >> delay is not due to resource contention.
> >> System utilizations remain low. The response times cluster around 4
> >> values: 0 milliseconds, 50 milliseconds, 1 second, and 1.050
> >> seconds. Right now, I am most interested in understanding and
> >> eliminating the 1 second delay that gives the clusters at 1 second and
> >> 1.050 seconds.
> >>
> >> The attachment shows a response time scatterplot from one of our runs.
> >> The x-axis is the number of seconds into the run, the y-axis is the
> >> response time in milliseconds. The plotted data shows the response
> >> time of requests at the time they occurred in the run.
> >>
> >> If I run the test bypassing the ELB and Nginx, this delay does not
> >> occur.
> >> If I bypass the ELB, but include Nginx in the request path, the delay
> >> returns.
> >>
> >> This leads me to believe the 1 second delay is coming from Nginx.
> >
> > There are no magic 1 second delays in nginx - unless you've configured
> > something explicitly.
> >
> > Most likely, the 1 second delay is coming from TCP retransmission
> > timeout during connection establishment due to listen queue overflows.
> > Check "netstat -s" to see if there are any listen queue overflows on
> > your hosts.
> >
> > [...]
> >
> > --
> > Maxim Dounin
> > http://mdounin.ru/
Maxim Dounin
Re: Nginx throttling issue?
April 09, 2018 03:40PM
Hello!

On Fri, Apr 06, 2018 at 07:11:36PM +0200, Richard Stanway via nginx wrote:

> Even though it shouldn't be reaching your limits, limit_req does delay in
> 1 second increments which sounds like it could be responsible for this. You

Delays as introduced by limit_req (again, only if explicitly
configured) use millisecond granularity. In the particular case,
configured with rate=10000r/s and burst=600, the maximum possible
delay would be 60ms.
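
The 60ms figure follows directly from burst divided by rate; a toy
calculation (not nginx source) to make that explicit:

```python
# Toy model of the worst-case limit_req delay: queued requests drain at
# "rate" per second, so the last of "burst" queued requests waits
# burst / rate seconds.
def max_limit_req_delay_ms(rate_per_sec: float, burst: int) -> float:
    return burst / rate_per_sec * 1000.0

print(max_limit_req_delay_ms(10_000, 600))  # prints 60.0
```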

--
Maxim Dounin
http://mdounin.ru/