When two Artifactory instances are chained (pip -> art2 -> art1 -> internet) to serve PyPI packages through remote repositories, art2 only serves packages to pip that are already cached at art1. Uncached artefacts return 404. The expected behaviour is that uncached artefacts are fetched from the internet.
In practical terms this means, starting with empty caches everywhere, the following call fails with 404.
But the following sequence succeeds.
Fetching metadata for packages works (pip can always determine the list of available versions for a package), but pip then fails to download the actual package file.
I have already investigated a bit by examining the HTTP requests.
When pip directly talks to Artifactory (pip -> art1 -> internet), the package URLs announced by Artifactory through the package metadata are of the form
Pip then requests exactly that URL, which works as expected. If the package is missing, the request triggers an upstream lookup and initiates downloading from upstream.
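To check which file URLs an index page actually announces, the links can be extracted and resolved the same way a client would. The following is only a minimal sketch using the Python standard library. The host name, repository key and sample HTML in the usage note are invented for illustration and do not reproduce the real Artifactory output.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag on a simple-index page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def announced_urls(index_url, html):
    """Resolve each link on the page against the URL it was served from."""
    parser = LinkCollector()
    parser.feed(html)
    return [urljoin(index_url, href) for href in parser.hrefs]
```

For example, `announced_urls("https://art1.example/api/pypi/pypi-remote/simple/demo/", page_html)` returns the absolute URLs that pip would request for the (hypothetical) package `demo`, which can then be compared with the request that art2 sends to art1.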
However, when one Artifactory instance talks to another upstream Artifactory instance (pip -> art2 -> art1 -> internet), the request art2 -> art1 is:
Note that there is only one "packages" in the URL.
This URL does not trigger any further upstream lookup and therefore causes a 404 downstream for uncached artefacts.
The funny part is that once an artefact is cached, any URL works, regardless of how many "packages/..." levels it contains. This explains why only uncached artefacts are hit by this issue.
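The observation can be checked systematically by requesting the same file with a varying number of "packages/" levels, once for a cached and once for an uncached artefact. This is a rough sketch, not a definitive test: `repo_base` and `file_path` are placeholders to be filled in from your own setup, and the helper names are my own.

```python
import urllib.error
import urllib.request


def url_variants(repo_base, file_path, max_levels=3):
    """Build URLs for one file with 1..max_levels leading 'packages/' segments."""
    base = repo_base.rstrip("/")
    path = file_path.lstrip("/")
    return [f"{base}/{'packages/' * n}{path}" for n in range(1, max_levels + 1)]


def status_of(url, timeout=10):
    """Return the HTTP status code of a GET request, including error codes."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 404 and similar responses arrive as exceptions; report their code.
        return err.code


def report(repo_base, file_path, max_levels=3):
    """Print the status code for every URL variant of one file."""
    for url in url_variants(repo_base, file_path, max_levels):
        print(status_of(url), url)
```

Calling `report(...)` against art1 with the path of an uncached file, and again after the file has been cached, should show which variants trigger the upstream lookup and which return 404. Note that a successful request caches the artefact, so each uncached variant needs a fresh file to give a meaningful result.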