  1. Artifactory Binary Repository
  2. RTFACT-18495

SHA256 hashes in PyPi repository metadata

    Details

    • Release Notes:
      Yes

      Description

      The repository index that Artifactory generates for PyPi clients such as ‘pip’ includes only md5 hashes, whereas the current best practice is to use sha256. Relying on a weak hash algorithm for integrity protection is always a potential security concern, but it also prevents the use of Artifactory for PyPi in any environment where clients are configured in FIPS mode. The reason is that 'pip' uses OpenSSL for checksum verification, and on machines with the kernel configured for FIPS mode OpenSSL refuses to use hash algorithms that are not approved under FIPS 140-2. This results in an error like "Unknown hash algorithm 'md5'" when running "pip" against Artifactory on any client in FIPS mode.
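
      To check whether a given client is affected, a minimal sketch along the following lines can probe which algorithms Python's hashlib will instantiate on that host. The helper name usable_hash_algorithms is made up for illustration, and the assumption that a FIPS-mode host rejects md5 with a ValueError reflects typical behaviour rather than something confirmed for every Python build.

```python
import hashlib


def usable_hash_algorithms(candidates=("md5", "sha256")):
    """Return the candidate algorithms that hashlib can instantiate on this host.

    Hypothetical helper for illustration only.
    """
    usable = []
    for name in candidates:
        try:
            hashlib.new(name)
        except ValueError:
            # Assumption: on a FIPS-mode host, OpenSSL-backed hashlib
            # usually refuses md5 and raises ValueError here.
            continue
        usable.append(name)
    return usable


if __name__ == "__main__":
    # The result depends on the host; no fixed output is implied.
    print(usable_hash_algorithms())
```

      If md5 is missing from the result on a client, that client would not be able to verify "#md5=" links from the index.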

      To be more specific, since a support request showed some confusion because Artifactory explicitly advertises SHA256 in its own API: this issue concerns the Artifactory implementation of the PyPi "legacy" API used by pip. This is the index page found at e.g.

      https://<artifactory_hostname>/artifactory/api/pypi/<repository_name>/simple

      Looking at the index listing for a specific package, taking jupyter as an arbitrary example:

      https://<artifactory_hostname>/artifactory/api/pypi/<repository_name>/simple/jupyter/

      You will see that the links provided to the package files are in the format:

      ../../packages/jupyter/1.0.0/jupyter-1.0.0-py2.py3-none-any.whl#md5=e19e7c80e9bdcd4b5946e318f9a60767

      The “#md5=” component is used by pip to verify the downloaded file. PyPi.org, the Warehouse software which runs PyPi, and other PyPi repository implementations like Sonatype Nexus now provide a sha256 hash here instead of md5. This is a best practice, and also required in some cases such as ours where ‘pip’ will refuse to trust MD5 hashes.
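
      To make the role of the fragment concrete, the sketch below splits a file link from the index into its filename, hash algorithm, and digest. It is only an illustration of the link format described above, not pip's actual code; the function name link_hash is hypothetical.

```python
from urllib.parse import urlsplit


def link_hash(link):
    """Split a simple-index file link into (filename, algorithm, hex digest).

    Hypothetical helper; returns None for algorithm and digest when the
    link carries no "#<algorithm>=<digest>" fragment.
    """
    parts = urlsplit(link)
    filename = parts.path.rsplit("/", 1)[-1]
    algorithm, sep, digest = parts.fragment.partition("=")
    if not sep:
        return filename, None, None
    return filename, algorithm.lower(), digest.lower()


if __name__ == "__main__":
    print(link_hash(
        "../../packages/jupyter/1.0.0/"
        "jupyter-1.0.0-py2.py3-none-any.whl"
        "#md5=e19e7c80e9bdcd4b5946e318f9a60767"
    ))
```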

      Artifactory still emits md5 in these links even for artifacts that have a sha256 hash available through other Artifactory APIs (all of ours do).
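
      To see how many links in a given index page are affected, something like the following sketch could be run against the page's HTML. It assumes the index is plain HTML with one anchor per file, as in the examples in this ticket; the class and function names are hypothetical, and fetching the page is left out.

```python
from html.parser import HTMLParser


class HashAuditParser(HTMLParser):
    """Collect the hash algorithm named in each anchor's href fragment."""

    def __init__(self):
        super().__init__()
        self.algorithms = {}  # href -> algorithm name, or None if absent

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Everything after "#" is the fragment, e.g. "md5=<digest>".
        _, _, fragment = href.partition("#")
        algorithm, sep, _ = fragment.partition("=")
        self.algorithms[href] = algorithm.lower() if sep else None


def links_without_sha256(index_html):
    """Return the hrefs whose fragment does not name sha256."""
    parser = HashAuditParser()
    parser.feed(index_html)
    return [href for href, algo in parser.algorithms.items() if algo != "sha256"]
```

      Any href returned by links_without_sha256 is one that a FIPS-mode client could not verify with sha256.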

      For example, if we look up the same package in the pypi.org API:

      https://pypi.org/simple/jupyter/

      Notice that the link is in the form:

      https://files.pythonhosted.org/packages/83/df/0f5dd132200728a86190397e1ea87cd76244e42d39ec5e88efd25b2abd7e/jupyter-1.0.0-py2.py3-none-any.whl#sha256=5b290f93b98ffbc21c0c7e749f054b3267782166d72fa5e3ed1ed4eaf34a2b78

      Here a sha256 hash is used instead. Additionally, PEP 503, which documents the legacy “simple” repository API, states:

      “Repositories SHOULD choose a hash function from one of the ones guaranteed to be available via the hashlib module in the Python standard library (currently md5, sha1, sha224, sha256, sha384, sha512). The current recommendation is to use sha256.”
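
      As a rough illustration of what a client does with such a hash, the sketch below checks downloaded bytes against the digest from the link, accepting only the algorithms listed in the PEP text quoted above. It is an assumption-laden example, not pip's implementation; verify_download and PEP503_ALGORITHMS are names invented here.

```python
import hashlib
import hmac

# The algorithms named in the PEP 503 text quoted in this ticket.
PEP503_ALGORITHMS = ("md5", "sha1", "sha224", "sha256", "sha384", "sha512")


def verify_download(data, algorithm, expected_hex):
    """Return True if data hashes to expected_hex under the given algorithm.

    Hypothetical helper. Raises ValueError for an algorithm outside the
    PEP 503 list, or if hashlib refuses the algorithm on this host
    (assumed to be the case for md5 in FIPS mode).
    """
    if algorithm not in PEP503_ALGORITHMS:
        raise ValueError("unsupported hash algorithm: %r" % algorithm)
    digest = hashlib.new(algorithm, data).hexdigest()
    # Constant-time comparison of the two hex strings.
    return hmac.compare_digest(digest, expected_hex.lower())
```

      With a sha256 link, this check works the same on FIPS and non-FIPS clients, which is the behaviour requested here.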

            People

            Assignee:
            shahafg Shahaf Golan
            Reporter:
            j_crawford Jesse Crawford
            Votes:
            17
            Watchers:
            25
