HTTP Server closed connections keep being in CLOSED_WAIT state #6330
Comments
I've done some more testing on this. It doesn't turn into a non-HTTP error against every server, and I have no idea why it happens with one server I'm using, but the problem remains. Against many servers, after the FIN_ACK is sent and JMeter leaves the connection in CLOSE_WAIT state, JMeter first sends the FIN_ACK back when it wants to do a new request, which cleans up the connection, and then does the request. Against some servers it doesn't send the FIN_ACK first, but just sends the request over the existing connection, and the server sends a RST because the connection is either cleaned up or half-closed. It is a serious issue, because once the server tries to close the connection (after its keep-alive timeout), the connection remains half-closed on both ends (CLOSE_WAIT on the JMeter client, FIN_WAIT_2 on the webserver), still taking/tracking a socket that should not be there. Browsers clean up the socket immediately, so it's gone on both the client and the server, which is the behaviour we need to see in JMeter |
I've made a setup with basic nginx, with the keep-alive timeout changed to 2 seconds, and a script that does a request (with keep-alive) and then, 4 seconds later, a new request. The 2nd request fails due to this RST and the connection being in CLOSE_WAIT state. It only happens with the HttpClient4 implementation. The Java implementation has no issues |
So I've found the root cause of the issue: it's due to the config httpclient4.validate_after_inactivity. This defaults to 4900 (ms), meaning that if the server starts closing a connection and a new request is made before the connection has been idle for 4900 ms, JMeter just sends the request packet over the connection in CLOSE_WAIT state, resulting in a connection reset error. I think this is a bad implementation in the HttpClient4 library. It should simply remove a connection once the server requests it. A work-around is to set httpclient4.validate_after_inactivity low (900). This only resolves the unexpected "Non HTTP response" errors; it doesn't resolve the unrealistically high number of connections left in CLOSE_WAIT state on JMeter clients and in FIN_WAIT_2 state on webservers |
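The timing in the comment above can be sketched in a few lines of Java. This only illustrates the decision as described in this thread; it is not HttpClient's actual code. The class and method names are hypothetical, and only the 4900 ms default and the 900 ms workaround value come from the comment.

```java
class ValidateAfterInactivitySketch {
    /**
     * Hypothetical helper: a pooled connection is re-validated before reuse
     * only once it has been idle for at least the configured threshold.
     */
    static boolean shouldValidate(long idleMillis, long validateAfterInactivityMillis) {
        return idleMillis >= validateAfterInactivityMillis;
    }

    public static void main(String[] args) {
        // With a 2 s server keep-alive timeout, the server has already sent
        // its FIN in all three cases below.
        System.out.println(shouldValidate(4000, 4900)); // no check: request goes out on the half-closed socket
        System.out.println(shouldValidate(6000, 4900)); // check runs: stale connection is replaced first
        System.out.println(shouldValidate(4000, 900));  // lowered threshold: check runs after 4 s of idle time
    }
}
```

Under these assumptions, the failing window is the idle time between the server's keep-alive timeout and the validation threshold.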
I experienced the same issue - thanks for posting here! |
Well I created a ticket with this right? But indeed, it should actually be resolved and behave like a browser, or any http client. So it should respond to the FIN packet and close the socket on the jmeter side and send a FIN back, so the server can remove the half closed state as well |
Ah you are right - I thought it is still the bugzilla tracker... But since 2022 it was migrated here. So everything is fine :-) |
Great find, I've been suffering from this issue as well. From the above it sounds like an issue was also filed with Apache? Would you mind providing the link, I've been searching in the Apache Jira without success . |
No, the only issue, is this issue here. But as it's a httpclient bug I think it should be reported there I guess |
any update on this ticket ? |
Sorry, after analysing and putting everything on a plate, nothing happens. Not sure how big the JMeter team is, but I've made some interesting proposals and even PRs, and nothing is happening. The last JMeter update was more than a year ago, so it feels like a dead project lately. |
Well, could you please clarify why you would expect exactly this behavior? In other words, it means "server is done sending its bytes"; however, it is fine for a TCP connection to be in a half-closed state, and the client can still send data and the server can receive it.
Well, this boils down to the expected outcome of your test. In practice, people use JMeter to simulate actual applications. So you should configure JMeter exactly the same as the application/microservice/browser you try to impersonate. For instance, if the application/microservice/browser does not expect the server to close a connection shortly, then the application would fall into the same issue of trying to send data over a broken connection. In other words, "Non HTTP response message: : failed to respond" surfaces a configuration error (assuming you've configured JMeter the same as your app). If the application performs connection re-validation, you should configure JMeter to do so. Does that make sense?
If your webserver closes connections immediately while JMeter is trying to keep them alive, I expect it might be the following: WDYT? |
This appears as a critical bug for my use case as well. |
For the transaction controller, the fix is either to not enable 'Generate parent sample' or to make a build of this PR: #6386 Regarding CLOSE_WAIT: lowering httpclient4.validate_after_inactivity, to 900 ms for example, doesn't fix the server side, but it helps with the errors. |
First of all, it is not fine for a tcp connection to be in half-closed state. The client can't send data where the server would receive it. At least not in a FIN_WAIT_2 / CLOSE_WAIT state, because this means the server is trying to shut down the socket, but the client didn't do it yet. If the client then sends a packet as if the connection would still be open, it would get a RST which is not fine.
You are right about simulating the client's behaviour. In most cases, the client you are simulating is either a browser or some kind of microservice doing API calls. Either way, whenever the other end (the server) sends a FIN/ACK packet, the client should simply respond with a FIN/ACK as well, so both ends can close the socket (similar to a SYN request, where the other end has to respond with SYN/ACK to accept it or not). None of the browsers, at least, do nothing after a FIN/ACK. They all accept the disconnect and do not try to send a new request over a half-closed socket.
In a normal browser-webserver situation, or with HTTP/1.1 defaults, keep-alive is the default and the server dictates when to close the connection, not the client. The client has no idea when (the keep-alive time in the header could give it a hint, but it's just a hint). But when the server closes a connection, e.g. after 2 seconds, which is what many webservers do, we are already in an unrealistic situation, because JMeter doesn't actively respond to the closed connection at the TCP level and keeps the TCP socket in a half-open state on both ends (server and client/JMeter). This causes the obvious errors stated in this issue. So this ticket is about 2 issues:
|
First, if the server was about to close the connection it should probably send Second, I'm afraid the only way to tell if "server" closes a connection is to read/write something to the connection. That means if the server silently closes a connection (which is (un)fortunately allowed by various HTTP RFCs) the client does not get an immediate notification, thus it can't discard the connections right away. Frankly, it is not clear what clients are supposed to do with all this. It is not clear how Java implementation handles "silent connection close"
Looks so. However, as the only way to detect "server-side closure" is to write data, HttpClient4 should detect "IOException when writing headers" and retry the request. |
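A minimal sketch of the retry idea from the comment above, in plain Java. It is not HttpClient's or JMeter's retry mechanism; the `Attempt` interface, `withRetry` and `demo` are hypothetical names, and the failing first attempt is simulated rather than sent over a real socket.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

class RetryOnIoErrorSketch {
    /** Hypothetical functional interface for one attempt at sending a request. */
    interface Attempt<T> {
        T run(int attemptNumber) throws IOException;
    }

    /** Runs the attempt up to maxAttempts times, retrying only on IOException. */
    static <T> T withRetry(Attempt<T> attempt, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        IOException last = null;
        for (int i = 1; i <= maxAttempts; i++) {
            try {
                return attempt.run(i);
            } catch (IOException e) {
                last = e; // e.g. "Connection reset" on a stale pooled connection
            }
        }
        throw new UncheckedIOException("all " + maxAttempts + " attempts failed", last);
    }

    /** Simulates a stale connection failing once, then a fresh connection succeeding. */
    static String demo() {
        return withRetry(n -> {
            if (n == 1) {
                throw new IOException("Connection reset");
            }
            return "200 OK on attempt " + n;
        }, 2);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

A blind retry like this is only safe for idempotent requests, since the server may already have processed a request whose response was lost.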
See https://datatracker.ietf.org/doc/html/rfc9293#name-half-closed-connections
Could you clarify (e.g. refer to an RFC) why you think half-closed connections are "not fine"? |
In my case 1 or 2 out of approx. 8000 to 10000 transactions are failing with "Non HTTP response message: connection reset by peer" or a connection timeout. I think JMeter is closing the connection before it receives the complete response from the server, and I can see it's producing code 499 in the server log. |
No, with keep-alive (http/1.1 standards), it can disconnect whenever it wants. In many cases it can be after just 2 seconds. The connection: close is only a hint from the client to request the server to close the connection immediately after the response. If the server does a connection: keep-alive it certainly doesn't mean it would keep the connection open for unlimited time, and it will disconnect at some point (could vary between near instant, to 1, 2, or whatever seconds).
Sending a FIN/ACK is, I think, not a silently closed connection, but a clear message to the client that the server is actively closing the connection. In fact, the client gets the packet/message, so there is nothing silent about it.
It is clear: it should reconnect if it wants to do a new/next request. The problem is in httpclient4, where it is not actively closing the socket/responding to the FIN/ACK. It knows the FIN/ACK is there, because if you want to do a request AFTER the httpclient4.validate_after_inactivity time, it does know the socket was requested to be closed and acts how it is supposed to, but it should have done so immediately when it got the FIN/ACK.
This is the whole idea of this ticket: it doesn't do so, and JMeter throws this non-HTTP error (because httpclient sends data on the half-closed socket and receives a RST from the server). https://datatracker.ietf.org/doc/html/rfc9293#name-closing-a-connection In the end, we can look into standards, but if it is throwing errors where it is not supposed to, and if it is not acting as browsers/clients do, there is a bug |
@jgaalen , please refer to RFCs or the public documentation. Otherwise it is hard to tell where all your conclusions come from. Many parts of your messages violate or contradict RFCs and Java documentation. |
Please double-check. The Java side does not know there's FIN. "FIN" is not exposed in Java APIs. If you know an API, please clarify which one exposes "FIN from the server". Then, the application (e.g. Java application) can't tell if the server "died completely", "fully closed the stream" or "closed the write part of the stream only". See golang/go#67337 (comment) |
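A small loopback experiment illustrates the point made above. It is a sketch using only `java.net`; the class and method names are hypothetical. After the peer has sent its FIN, `Socket.isConnected()` and `Socket.isClosed()` on the client still report an open connection, and the end of the stream only shows up once the client reads.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class PeerFinDemo {
    /** Returns {isConnected (1/0), isClosed (1/0), result of read()} seen by the client after the peer's FIN. */
    static int[] observeAfterPeerFin() {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket accepted = server.accept()) {
            accepted.shutdownOutput();                 // server side sends FIN
            int connected = client.isConnected() ? 1 : 0;
            int closed = client.isClosed() ? 1 : 0;    // only reflects a local close()
            client.setSoTimeout(2000);                 // do not block forever if the FIN is delayed
            int read = client.getInputStream().read(); // end of stream is reported here
            return new int[] {connected, closed, read};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] r = observeAfterPeerFin();
        System.out.println("isConnected=" + r[0] + " isClosed=" + r[1] + " read=" + r[2]);
    }
}
```

The `-1` from `read()` says the peer is done sending; it does not say whether the peer closed the connection fully or only shut down its write side.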
Perhaps screenshots make it more obvious what is going on. I've made a test case with a simple nginx and a keep-alive timeout setting of 2 seconds. This is a tcpdump which shows what is going on:
Here we can see it does the request. Then, 2 seconds later, the webserver sends the FIN to close the socket. JMeter/client only responds with an ACK. At this moment, the connection is in CLOSE_WAIT state at the client (JMeter) and in FIN_WAIT_2 state at the server (nginx). You cannot argue this is good and expected behaviour from JMeter. It should simply not try to send the request on this half-closed socket. Even though it is technically allowed to keep the socket in a half-closed/open state, it should not send a new request over it, as that leads to obvious errors which are not realistic.
This screenshot shows the behaviour of a browser (Firefox). You can see that 2 s after the response the server closes the connection, and Firefox (the client) immediately closes the connection on its side as well, clearing the socket on both ends.
This screenshot shows the behaviour when we wait 6 seconds rather than 4 seconds (passing the 4900 ms default of httpclient4.validate_after_inactivity); we can see better behaviour. Before it tries to send the request, it first closes the connection on the client side as well (notifying the server with a FIN/ACK) and creates a new connection.
A workaround is to set httpclient4.validate_after_inactivity to maybe 1 ms, so it always evaluates whether the socket is in a half-closed state and acts as it should |
Here's a way to reproduce |
Httpclient validates the connection before making a request if |
"What it does it attempts a read() with 1ms timeout." |
So setting httpclient4.validate_after_inactivity to 1 ms would at least solve the unexpected non-HTTP errors (but not the high number of CLOSE_WAIT/FIN_WAIT_2 states) |
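The read() with a 1 ms timeout mentioned in this exchange can be sketched roughly as follows. This is a simplified illustration over a loopback socket, not HttpClient's implementation; `looksStale` and `demo` are hypothetical names. A real client would also have to buffer any byte the probe reads, which this sketch ignores.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

class StaleCheckSketch {
    /** Probes the socket with a 1 ms read: end of stream or an I/O error means "stale". */
    static boolean looksStale(Socket socket) {
        try {
            int previousTimeout = socket.getSoTimeout();
            try {
                socket.setSoTimeout(1);
                return socket.getInputStream().read() < 0; // -1: the peer has sent FIN
            } catch (SocketTimeoutException e) {
                return false; // nothing arrived within 1 ms, so the connection looks usable
            } finally {
                socket.setSoTimeout(previousTimeout);
            }
        } catch (IOException e) {
            return true; // reset or other failure: treat the connection as unusable
        }
    }

    /** Returns {stale before the server closes, stale after the server closes}. */
    static boolean[] demo() {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket accepted = server.accept()) {
            boolean before = looksStale(client);
            accepted.close();  // server closes, like a keep-alive timeout firing
            Thread.sleep(200); // give the FIN time to reach the client
            boolean after = looksStale(client);
            return new boolean[] {before, after};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] r = demo();
        System.out.println("stale before close: " + r[0] + ", stale after close: " + r[1]);
    }
}
```

The probe costs up to 1 ms per check on a healthy idle connection, which is the trade-off of running it before every request.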
It might add an artificial |
So basically this already happens for every request happening after 4900ms of idle time on a socket? |
I have to say @vlsi that I find this very demotivating, and I feel you're not taking this issue seriously and are deflecting everything. Also on other topics, I've made some contributions to this open source project, but everything ends up in the void. What is the status of JMeter anyway? The last update was more than a year ago! That isn't a good sign for a tool as widely used as JMeter, so what is that about? If you want to keep JMeter alive and the community contributing, then this is, I think, not the way to go. |
Hi jgaalen. Firstly, I (and I think others too) appreciate your PRs and discussion. About JMeter updates: they have been slow for some years. About your contributions: it takes time to check and merge them. Unfortunately, JMeter committers don't always have time, sometimes for days or weeks. |
@jgaalen , you raise good points; however, please consider that the fix would be much faster if you provided a solution, and faster still if the solution included the relevant tests and documentation. This specific issue is caused by Apache HttpClient 4 not properly retrying the requests. We can't do much from JMeter's perspective here. Of course, we could add a retry loop. The issue does not impact the systems I work with, so I'm not that eager to add a retry or something like that. If you feel confident, you could implement it and suggest a PR. I appreciate that you provided tcpdumps and pictures; however, please consider that it took significant time to persuade you there's no way to detect and react to FIN in Java code. In the meantime, I asked the HttpClient developers about the retry, and they say HttpClient 5 should not be impacted and that they are not going to fix HttpClient 4: https://issues.apache.org/jira/browse/HTTPCLIENT-2359 So yet another option for JMeter is to implement an HttpClient5-based option for the HTTP Sampler. My wild guess is it should be enough to duplicate Regarding your PRs,
|
Expected behavior
If the server closes an HTTP connection, JMeter should handle the closed connection immediately and not keep it in CLOSE_WAIT state until the next request, which could also lead to "Non HTTP response" errors
Actual behavior
Currently, after JMeter does an HTTP request, the connection is kept alive and the server waits for the next request. If the server closes the connection after X seconds and sends a FIN/ACK packet, JMeter does nothing. If the next request is done within 5 seconds after the FIN/ACK packet, it fails immediately with a "Non HTTP response message: : failed to respond" error, because the socket is already being closed and can't pick up new packets.
If it tries to send a request later than 5 seconds, it somehow gracefully sends a FIN/ACK back, creates a new connection, and does the request.
Steps to reproduce the problem
Set keep-alive timeout to 2 seconds on the server, so it shuts down the connection after 2 seconds.
Then create a jmeter script to send request 1, then wait 3 seconds, then send request 2. It will end up with the Non HTTP response error, because it tries to send the request on a closed socket.
Now wait 6 seconds rather than 3, and somehow it does work without an error.
JMeter Version
5.6.3
Java Version
17
OS Version
Mac + Ubuntu