[RTFACT-18735] Assume offline doesn't work for slow PyPI servers Created: 11/Mar/19  Updated: 06/May/19

Status: Open
Project: Artifactory Binary Repository
Component/s: Artifact Storage, PyPI
Affects Version/s: 6.7.0
Fix Version/s: None

Type: Improvement Priority: Normal
Reporter: Rafael Cunha de Almeida Assignee: Unassigned
Resolution: Unresolved Votes: 1
Labels: None
Remaining Estimate: 3 days
Time Spent: Not Specified
Original Estimate: 3 days

Support Tickets:

Man Group plc - Support Case

Product Comments: 22-Apr-2019: According to Uriah, Artifactory assumes the remote resource is offline only when the error returned is in the 502-505 range.
In this scenario, the remote resource returned no response at all, so the repo was not assumed offline.

No plans to fix at the moment (raised for the first time).
Support Comments: Reproduced by Support (Ohad Levy)
01/05/19: Customer changed the Jira issue type to Improvement.

Description

I've configured a remote PyPI repo pointing to another Artifactory instance which was very slow. That caused read timeouts while retrieving the package index (either /simple/ or /simple/<pkg-name>/). However, the repo was never marked offline. That's particularly bad when the remote is part of a virtual repo, as every request to that virtual repo waits until the connection to the remote server times out before it responds.
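
To make the impact concrete, here is a minimal sketch of the rule quoted in the product comments (only 502-505 responses count as offline) and of what it means for consecutive requests. This is an illustrative model, not Artifactory source code: the names assumesOffline and remoteWaitsMs and the treatTimeoutAsOffline switch are hypothetical, and the model ignores the point at which an offline repo would be retried.

```javascript
// Illustrative model only; this is not Artifactory source code.

// Returns true when the outcome of a remote request would make the repo
// "assumed offline". Under the rule quoted in the product comments, only
// HTTP 502-505 responses count; a missing response (timeout) does not.
// treatTimeoutAsOffline is a hypothetical switch for the requested behaviour.
function assumesOffline(outcome, treatTimeoutAsOffline = false) {
  if (outcome.kind === 'response') {
    return outcome.status >= 502 && outcome.status <= 505;
  }
  return treatTimeoutAsOffline && outcome.kind === 'timeout';
}

// Time that each of `count` consecutive requests spends waiting on a remote
// that never answers within the socket timeout.
function remoteWaitsMs(count, socketTimeoutMs, treatTimeoutAsOffline = false) {
  let offline = false;
  const waits = [];
  for (let i = 0; i < count; i++) {
    if (offline) {
      waits.push(0); // remote is skipped while assumed offline
      continue;
    }
    waits.push(socketTimeoutMs); // request blocks until the read times out
    offline = assumesOffline({ kind: 'timeout' }, treatTimeoutAsOffline);
  }
  return waits;
}

console.log(remoteWaitsMs(3, 1500));       // current behaviour: [ 1500, 1500, 1500 ]
console.log(remoteWaitsMs(3, 1500, true)); // requested behaviour: [ 1500, 0, 0 ]
```

With the current rule every request pays the full socket timeout. If a read timeout also counted as offline, only the first request would.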

How to reproduce

It's easy to reproduce the issue using toxy (https://github.com/h2non/toxy). All you need to do is point the proxy at another Artifactory instance and set the latency high (e.g. 1500ms, if that is the socket timeout configured in your remote repo). This is what my toxy configuration looks like:

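const toxy = require('toxy')

const proxy = toxy()
const rules = proxy.rules
const poisons = proxy.poisons

// Inject the poisons on every path and forward traffic to the
// upstream Artifactory instance.
proxy
  .all('/*')
  .poison(poisons.slowRead({ chunk: 1, threshold: 2000 }))
  .poison(toxy.poisons.latency(1500)) // 1500 ms of added latency
  .withRule(rules.method('GET'))
  .forward('http://artifactory2:8081')

// The remote repo in Artifactory points at this port.
proxy.listen(3000)
console.log('Server listening on port:', 3000)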

I've set up a test virtual repo containing a local repo and a remote repo pointing to http://artifactory2:3000 (I bound the proxy locally on that port). After that, this curl always took more than 1.5s:

time curl http://artifactory/artifactory/api/pypi/test/simple/

Artifactory configuration:

$ curl http://artifactory/artifactory/api/repositories/test
{
  "key" : "test",
  "packageType" : "pypi",
  "description" : "",
  "repositories" : [ "local-python", "remote-python" ],
  "rclass" : "virtual"
}
$ curl http://artifactory/artifactory/api/repositories/remote-python
{
  "key" : "remote-python",
  "packageType" : "pypi",
  "description" : "",
  "url" : "http://artifactory2:3000/artifactory/pypi/api/local-python/",
  "rclass" : "remote"
}
$ curl http://artifactory/artifactory/api/repositories/local-python
{
  "key" : "local-python",
  "packageType" : "pypi",
  "description" : "",
  "rclass" : "local"
}
$ curl http://artifactory2/artifactory/api/repositories/local-python
{
  "key" : "local-python",
  "packageType" : "pypi",
  "description" : "",
  "rclass" : "local"
}

Logs

As the logs show, no message about setting the remote repo offline ever appears:

2019-03-11 11:57:14,715 [http-nio-8081-exec-5] [ERROR] (o.a.r.RemoteRepoBase:757) - IO error while trying to download resource 'remote-python:.pypi/simple.html': java.net.SocketTimeoutException: Read timed out
2019-03-11 11:57:14,716 [http-nio-8081-exec-5] [ERROR] (o.a.a.p.r.r.PypiRemoteIndexProvider:135) - Could not retrieve remote index from http://artifactory2:3000/artifactory/api/pypi/local-python/simple/:
Read timed out
2019-03-11 11:57:21,896 [http-nio-8081-exec-9] [ERROR] (o.a.r.RemoteRepoBase:757) - IO error while trying to download resource 'remote-python:.pypi/simple.html': java.net.SocketTimeoutException: Read timed out
2019-03-11 11:57:21,897 [http-nio-8081-exec-9] [ERROR] (o.a.a.p.r.r.PypiRemoteIndexProvider:135) - Could not retrieve remote index from http://artifactory2:3000/artifactory/api/pypi/local-python/simple/:
Read timed out
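
When checking a longer log for the same pattern, a small script can help. The sketch below counts read timeouts per repository key, based on the RemoteRepoBase line format in the excerpt above. The helper names are hypothetical. The check for the word "offline" is an assumption, since this ticket does not show the exact wording Artifactory logs when it marks a repo offline.

```javascript
// Matches the RemoteRepoBase error format seen in the excerpt above and
// captures the repository key that precedes the ':' in the resource name.
const READ_TIMEOUT =
  /download resource '([^:']+):[^']*': java\.net\.SocketTimeoutException: Read timed out/;

// Count read timeouts per repository key.
function countReadTimeouts(logText) {
  const counts = {};
  for (const line of logText.split(/\r?\n/)) {
    const match = READ_TIMEOUT.exec(line);
    if (match) {
      counts[match[1]] = (counts[match[1]] || 0) + 1;
    }
  }
  return counts;
}

// Assumption: any message about a repo going offline would contain the word
// "offline". The exact wording Artifactory uses is not shown in this ticket.
function mentionsOffline(logText) {
  return /offline/i.test(logText);
}

// The two RemoteRepoBase lines from the excerpt above.
const sample = [
  "2019-03-11 11:57:14,715 [http-nio-8081-exec-5] [ERROR] (o.a.r.RemoteRepoBase:757) - IO error while trying to download resource 'remote-python:.pypi/simple.html': java.net.SocketTimeoutException: Read timed out",
  "2019-03-11 11:57:21,896 [http-nio-8081-exec-9] [ERROR] (o.a.r.RemoteRepoBase:757) - IO error while trying to download resource 'remote-python:.pypi/simple.html': java.net.SocketTimeoutException: Read timed out",
].join('\n');

console.log(countReadTimeouts(sample)); // { 'remote-python': 2 }
console.log(mentionsOffline(sample));   // false
```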

Generated at Tue Oct 15 13:31:39 UTC 2019 using JIRA 7.6.16#76018-sha1:9ed376192612a49536ac834c64177a0fed6290f5.