[RTFACT-18735] Assume offline doesn't work for slow PyPI servers
Created: 11/Mar/19  Updated: 06/May/19

Status: Open
Project: Artifactory Binary Repository
Component/s: Artifact Storage, PyPI
Affects Version/s: 6.7.0
Fix Version/s: None

Type: Improvement
Priority: Normal
Reporter: Rafael Cunha de Almeida
Assignee: Unassigned
Resolution: Unresolved
Votes: 0
Labels: None


Description

I've configured a remote PyPI repo pointing to another Artifactory instance that was very slow. That caused read timeouts while retrieving the package index (either /simple/ or /simple/<pkg-name>/). However, the repo was never set offline. That's particularly bad when the remote is part of a virtual repo, because every request to that virtual repo waits until the connection to the remote server times out before it responds.
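
To make the expected behaviour concrete, here is a minimal sketch of what "assume offline" could look like if read timeouts counted as failures. It is not Artifactory's implementation: OfflineTracker, reposToQuery and the 300-second period are hypothetical names and values chosen for illustration, and the clock is injectable only to keep the example deterministic.

```javascript
// Hypothetical sketch, not Artifactory code: remember which remotes failed
// recently so a virtual repo can skip them instead of waiting for a timeout.
class OfflineTracker {
  constructor(assumedOfflinePeriodMs, now = () => Date.now()) {
    this.assumedOfflinePeriodMs = assumedOfflinePeriodMs
    this.now = now
    this.offlineUntil = new Map() // repo key -> timestamp (ms) until which it is skipped
  }

  // Call on any failed remote request, including a read timeout.
  recordFailure(repoKey) {
    this.offlineUntil.set(repoKey, this.now() + this.assumedOfflinePeriodMs)
  }

  // Call on a successful remote request to bring the repo back online.
  recordSuccess(repoKey) {
    this.offlineUntil.delete(repoKey)
  }

  isAssumedOffline(repoKey) {
    const until = this.offlineUntil.get(repoKey)
    return until !== undefined && this.now() < until
  }
}

// Resolution order of a virtual repo, minus the remotes assumed offline.
function reposToQuery(tracker, repoKeys) {
  return repoKeys.filter((key) => !tracker.isAssumedOffline(key))
}
```

With a tracker like this, only the first request after a failure would pay the timeout; later requests to the virtual repo would be served from the remaining repos until the offline period expires.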

How to reproduce

The issue is easy to reproduce with toxy (https://github.com/h2non/toxy). Point the proxy at another Artifactory instance and set the latency at least as high as the socket timeout configured on your remote repo (e.g. 1500 ms). This is what my toxy configuration looks like:

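const toxy = require('toxy')

const proxy = toxy()
const rules = proxy.rules
const poisons = proxy.poisons

proxy
  .all('/*')
  // Read incoming data slowly: 1-byte chunks, 2000 ms threshold
  .poison(poisons.slowRead({ chunk: 1, threshold: 2000 }))
  // Add 1500 ms of latency, matching the remote repo's socket timeout
  .poison(poisons.latency(1500))
  // Only for GET requests
  .withRule(rules.method('GET'))
  // Forward everything to the real Artifactory instance
  .forward('http://artifactory2:8081')

proxy.listen(3000)
console.log('Server listening on port:', 3000)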

I've set up a virtual repo named test containing a local repo and a remote repo pointing to http://artifactory2:3000 (the port the proxy listens on). With that in place, this curl always takes more than 1.5s:

time curl http://artifactory/artifactory/api/pypi/test/simple/
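
A single timed curl shows the delay, but the point of the report is that the delay repeats on every request. The sketch below times several consecutive requests and counts how many took at least as long as the socket timeout. The helper names (timeRequest, countStalled, probe) are made up for this example, and the 1500 ms value is the timeout assumed above.

```javascript
const http = require('http')

// Resolve with the wall-clock time in ms of one GET, whether it succeeds or fails.
function timeRequest(url) {
  return new Promise((resolve) => {
    const start = process.hrtime.bigint()
    const done = () => resolve(Number(process.hrtime.bigint() - start) / 1e6)
    http
      .get(url, (res) => {
        res.resume() // drain the body so 'end' fires
        res.on('end', done)
        res.on('error', done)
      })
      .on('error', done)
  })
}

// Count the requests that took at least as long as the remote socket timeout.
function countStalled(durationsMs, socketTimeoutMs) {
  return durationsMs.filter((ms) => ms >= socketTimeoutMs).length
}

// Issue the requests one after another and summarise the timings.
async function probe(url, attempts, socketTimeoutMs) {
  const durations = []
  for (let i = 0; i < attempts; i++) {
    durations.push(await timeRequest(url))
  }
  return { durations, stalled: countStalled(durations, socketTimeoutMs) }
}

// Usage (not run automatically):
// probe('http://artifactory/artifactory/api/pypi/test/simple/', 5, 1500).then(console.log)
```

If the remote were being marked offline after the first failure, only the first request should be counted as stalled. With the behaviour described in this issue, every request is.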

Artifactory configuration:

$ curl http://artifactory/artifactory/api/repositories/test
{
  "key" : "test",
  "packageType" : "pypi",
  "description" : "",
  "repositories" : [ "local-python", "remote-python" ],
  "rclass" : "virtual"
}
$ curl http://artifactory/artifactory/api/repositories/remote-python
{
  "key" : "remote-python",
  "packageType" : "pypi",
  "description" : "",
  "url" : "http://artifactory2:3000/artifactory/api/pypi/local-python/",
  "rclass" : "remote"
}
$ curl http://artifactory/artifactory/api/repositories/local-python
{
  "key" : "local-python",
  "packageType" : "pypi",
  "description" : "",
  "rclass" : "local"
}
$ curl http://artifactory2/artifactory/api/repositories/local-python
{
  "key" : "local-python",
  "packageType" : "pypi",
  "description" : "",
  "rclass" : "local"
}

Logs

As you can see, the logs never show a message about the remote being set offline; the same read timeout simply repeats on each request:

2019-03-11 11:57:14,715 [http-nio-8081-exec-5] [ERROR] (o.a.r.RemoteRepoBase:757) - IO error while trying to download resource 'remote-python:.pypi/simple.html': java.net.SocketTimeoutException: Read timed out
2019-03-11 11:57:14,716 [http-nio-8081-exec-5] [ERROR] (o.a.a.p.r.r.PypiRemoteIndexProvider:135) - Could not retrieve remote index from http://artifactory2:3000/artifactory/api/pypi/local-python/simple/:
Read timed out
2019-03-11 11:57:21,896 [http-nio-8081-exec-9] [ERROR] (o.a.r.RemoteRepoBase:757) - IO error while trying to download resource 'remote-python:.pypi/simple.html': java.net.SocketTimeoutException: Read timed out
2019-03-11 11:57:21,897 [http-nio-8081-exec-9] [ERROR] (o.a.a.p.r.r.PypiRemoteIndexProvider:135) - Could not retrieve remote index from http://artifactory2:3000/artifactory/api/pypi/local-python/simple/:
Read timed out
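
For longer log files, a small script can confirm that the same remote keeps timing out. This is a rough sketch that relies only on the RemoteRepoBase message format shown in the excerpt above; countReadTimeouts is a hypothetical helper, and the log text has to be supplied by the caller.

```javascript
// Count "Read timed out" download errors per repository key in a log excerpt.
// The pattern is derived from the RemoteRepoBase lines quoted in this issue.
function countReadTimeouts(logText) {
  const pattern =
    /IO error while trying to download resource '([^:']+):[^']*': java\.net\.SocketTimeoutException/
  const counts = {}
  for (const line of logText.split('\n')) {
    const match = pattern.exec(line)
    if (match) {
      const repoKey = match[1]
      counts[repoKey] = (counts[repoKey] || 0) + 1
    }
  }
  return counts
}
```

A count that keeps growing for one repo key over a short time span suggests the remote is still being queried on every request rather than being skipped.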

Generated at Mon Jul 15 18:03:45 UTC 2019 using JIRA 7.6.3#76005-sha1:8a4e38d34af948780dbf52044e7aafb13a7cae58.