[RTFACT-16743] Issue with nginx configuration on large file downloads Created: 18/May/18  Updated: 24/Sep/19  Resolved: 01/Jul/19

Status: Resolved
Project: Artifactory Binary Repository
Component/s: None
Affects Version/s: 5.6.2, 5.10.4, 6.3.2
Fix Version/s: None

Type: Bug Priority: Normal
Reporter: Scott Mosher Assignee: Ariel Kabov
Resolution: Not a Bug Votes: 3
Labels: QF, QF-P1

Issue Links: Trigger
Regression: Yes
Sprint: Pam - Sprint 1

 Description   

After upgrading from 5.4.4 to 5.10.4, users experience issues downloading large files (>1GB). curl returns:

curl -L -o again.txt -G http://myserver/artifactory/example-repo-local/test.txt

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
 67 1524M   67 1032M    0     0  3785k      0  0:06:52  0:04:39  0:02:13 1430k

curl: (18) transfer closed with 515411771 bytes remaining to read

 

The transfer always closes around the 1GB mark.  The request log returns a 200 and displays the correct size of the file.  The Artifactory logs show a 499:

 

[WARN ] (o.a.r.ArtifactoryResponseBase:137) - Client Closed Request 499: java.net.SocketTimeoutException

 

Reproduce

  1. Start up a 5.4.4 instance
  2. Set up an nginx proxy with the Artifactory-generated config
  3. Deploy a 1.5 GB file
  4. Try to download with curl (curl -L -o again.txt -G http://myserver/artifactory/example-repo-local/test.txt)
  5. Download will succeed
  6. Upgrade to 5.10.4
  7. Download once again to see the error and failed download

Workaround

Add the proxy_max_temp_file_size directive to the nginx configuration and set it to 0 or some larger value. Its default value is 1024m (1GB).
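
For illustration, a minimal sketch of the workaround inside an nginx location block; the proxy_pass target is a placeholder, and only the proxy_max_temp_file_size line is the point here:

 location / {
     proxy_pass http://localhost:8081/artifactory/;   # placeholder upstream
     proxy_max_temp_file_size 0;                      # 0 disables temp-file buffering entirely
 }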

 

We would like to know what changed in the upgrade process that makes adding this property to nginx necessary for large files.



 Comments   
Comment by Ariel Kabov [ 07/Nov/18 ]

This reproduces only with slow download speeds.
To reproduce it more easily, you can add a rate limit to NGINX, such as:

 limit_rate 2m;
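
For illustration, a minimal sketch of where such a limit could go in the generated config; the location block and upstream address are assumptions:

 location / {
     limit_rate 2m;                                   # throttle each response to 2 MB/s to simulate a slow client
     proxy_pass http://localhost:8081/artifactory/;   # placeholder upstream
 }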
Comment by Michael Huettermann [ 07/Nov/18 ]

Hi team,

as already suggested separately, it would be a nice and helpful feature to be able to add one's own, additional NGINX directives to the NGINX configuration that Artifactory generates and continuously updates automatically, unless you switch this auto-update feature off (but that can only be a workaround).

Thanks.

Michael

Comment by Ariel Kabov [ 07/Nov/18 ]

Hi Michael Huettermann,
What do you mean by automatically updated NGINX configuration? Are you talking about our Helm chart?
If that's the case, then we have this discussion here.

Comment by Michael Huettermann [ 07/Nov/18 ]

Hi Ariel. No, I mean the mechanism you provide here: https://github.com/jfrog/artifactory-docker-examples/tree/master/docker-compose/artifactory. In this scenario, you are auto-updating the NGINX configuration. Of course, if the nginx config is decoupled, it is sufficient to just add the directive, and you are done.

Comment by Ariel Kabov [ 08/Nov/18 ]

Michael Huettermann I have looked into it.
You can disable the NGINX auto-reload by passing this ENV variable:

SKIP_AUTO_UPDATE_CONFIG=true
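
For the Docker Compose setup linked above, a minimal sketch of passing that variable; the service name and compose layout are assumptions, not the exact file from the example repository:

 # docker-compose.yml fragment (hypothetical service name)
 services:
   nginx:
     environment:
       - SKIP_AUTO_UPDATE_CONFIG=true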
Comment by Michael Huettermann [ 08/Nov/18 ]

Hi Ariel.
Yes, I know. My idea was that the additional nginx directive should persist without switching off the auto-update. In an earlier comment above, I said that switching it off is only a workaround. I think this is not really a bug; the request to add the missing feature is more of an RFE.
Thanks.
Best regards
Michael

Comment by Christian Bremer [ 23/Nov/18 ]

We are experiencing the same problem, and setting proxy_max_temp_file_size to 0 does not work.
I have restarted the nginx Docker process and can see that the configuration file changed with the new value, but I am no nginx expert, so it might not have been set correctly either.

I am also using https://github.com/jfrog/artifactory-docker-examples/tree/master/docker-compose/artifactory

A more detailed step-by-step workaround would be appreciated.

We are using Artifactory 6.3.3

 

Comment by Lior Gur (Inactive) [ 25/Jun/19 ]

This issue happens regardless of an upgrade.

I reproduced it on the latest version without any upgrade.

As mentioned in this ticket, the solution is to add the proxy_max_temp_file_size parameter to the nginx configuration.

This directive sets the maximum size of the temporary file nginx writes when the proxied response is bigger than the proxy buffers. Once that limit is reached, the rest of the response is served synchronously from the upstream server, avoiding further disk buffering.

If you configure proxy_max_temp_file_size to 0, temporary files are disabled entirely.

Alternatively, set proxy_max_temp_file_size to a value greater than the file size; for example, if files are usually larger than 1GB, set proxy_max_temp_file_size to 2048m.
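
For those asking for a more detailed workaround, a minimal sketch of where the directive could sit in the generated artifactory.conf; the file path and surrounding block are abbreviated assumptions:

 # /etc/nginx/conf.d/artifactory.conf (abbreviated)
 location / {
     proxy_pass http://localhost:8081/artifactory/;   # placeholder upstream
     proxy_max_temp_file_size 2048m;                  # raise the 1024m default above the largest artifact
 }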

Comment by Michael Huettermann [ 02/Jul/19 ]

Hi Lior Gur, I don't think the issue is resolved, for the reasons I've summarized above.
What is the recommended approach to cope with this behavior while relying on your Docker Compose setup? (Others in this thread pointed to that too.)
This issue prevented some of my customers from using this approach, and in one case the customer decided to adopt a different tool as binary repository manager. The feedback (also submitted separately) has not been incorporated in any way for roughly one entire year now, so I'm wondering why this is not important to you, or whether your Docker-based solution is no longer supported.
Thank you. (also to Ariel Kabov for commenting here)

Comment by Michael Huettermann [ 19/Sep/19 ]

Hello. Do you meanwhile perhaps have any insights regarding the questions raised above?

thanks.

Comment by Ariel Kabov [ 20/Sep/19 ]

After revisiting the issue to get to a root cause, we have a conclusion.

Prior to Artifactory version 5.6.0, the bundled Tomcat was 8.0.x.
In that version of Tomcat, the WriteTimeout was infinite by default.

This WriteTimeout parameter is not really mentioned in the Tomcat documentation, but in essence it is a timeout that applies when Tomcat is writing to a socket.
If nothing is written by Tomcat to the open socket within the configured WriteTimeout, Tomcat closes the connection.
This didn't happen on older Artifactory versions (prior to 5.6) because the timeout was infinite.

Starting with Tomcat 8.5.x, the WriteTimeout is set based on the configured connectionTimeout. As the default connectionTimeout is 60 seconds, the WriteTimeout is 60 seconds as well.
This leads us to an additional workaround for the issue at hand: increase the Tomcat connectionTimeout.
I can confirm that with connectionTimeout="-1", the reported issue doesn't reproduce.
(You cannot set the WriteTimeout explicitly, so we have to change the connectionTimeout.)
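
For reference, a minimal sketch of how this looks in Tomcat's server.xml; the port and other attributes here are illustrative defaults, not the exact shipped file:

 <!-- $ARTIFACTORY_HOME/tomcat/conf/server.xml (fragment) -->
 <!-- connectionTimeout="-1" disables the timeout, and with it the derived WriteTimeout -->
 <Connector port="8081" protocol="HTTP/1.1"
            connectionTimeout="-1"
            maxThreads="200"/>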

This issue is not a bug; it is actually the expected behaviour. We would rather define the behaviour prior to 5.6 as problematic.
Tomcat limited the WriteTimeout for a reason: with an infinite configuration, it resulted in connection exhaustion in several cases.

Since NGINX buffers proxied responses to temporary files, and the default temp-file limit is 1GB, downloading a larger file over a network connection that is too slow means the buffering to disk finishes much faster than the client download. While the client is still draining the buffered 1GB, NGINX reads nothing further from the upstream connection, so Tomcat writes nothing to the socket for longer than its 60-second WriteTimeout and closes the connection, which results in the error.

With regards to the discussion around the Docker Compose setup, we should all remember that the Docker Compose files available on JFrog's GitHub are an example.
When taking them into a production environment, many things should be taken into consideration, and it is quite possible that modifications to the NGINX configuration will be required when large files are downloaded over a slow network.

While we could ship the Artifactory default NGINX configuration without a "proxy_max_temp_file_size" limitation, such a configuration can exhaust disk space in some environments, which is something we wish to avoid.

Hope that clarifies.

Comment by Idan Bidani [ 20/Sep/19 ]

Thanks for the background info and explanation. I found this issue today after a long internal investigation with the network team trying to isolate the problem. RTFACT-16398

The workarounds (resolutions) offered on the other ticket seem to do the trick, but I think this issue is a performance regression in Artifactory. I really like that JFrog provides the reverse proxy config generation to make sure it is optimized for the best intended performance. Now that you have identified the root cause, I think the generated config should be updated with one of the resolutions to ensure large companies can continue to enjoy Artifactory.

I hope to get a recommendation between turning off proxy_buffering completely, increasing the proxy_max_temp_file_size parameter (with an explanation to users that the value may vary per usage/environment), or an alternative option for serving large files.
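
For reference, the two options would look something like this in the nginx config; placement inside the Artifactory location block is an assumption:

 # Option 1: disable response buffering, so nginx streams from upstream at client speed
 proxy_buffering off;

 # Option 2: keep buffering, but raise the temp-file cap above the largest expected artifact
 proxy_max_temp_file_size 8192m;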

Thanks

Comment by Michael Huettermann [ 24/Sep/19 ]

Hi Ariel Kabov. Thanks for taking care of this. I'd like to slightly disagree: this is not expected behavior from a customer/user perspective. Since the Docker Compose setup, including the Docker images, is provided by you, it is probably a bit too easy to now say that it is just an example.

Since this issue has existed for around two years now and has irritated a couple of my customers, I'd like to again suggest providing public guidance on how to solve it, or a feature as part of the delivery that copes with it. I like Idan Bidani's idea. Some time ago, I suggested making this configurable, i.e. being able to change the nginx config even with auto-update enabled.

Thank you!
