  1. Artifactory Binary Repository
  2. RTFACT-20333

Performance problems when deploying artifacts to a very large folder

    Details

    • Type: Bug
    • Status: Done
    • Resolution: Done
    • Affects Version/s: 6.12.2
    • Fix Version/s: 6.17.0
    • Component/s: Performance
    • Labels:
    • Environment:

      PostgreSQL database

      Artifactory HA

      Docker repository

    • Severity:
      Critical

      Description

      Symptoms: Regular REST API requests take a long time to complete, the UI hangs, there is high database activity, and the problem goes away after a build completes.

       

      In Artifactory, the database has a "nodes" table that tracks some of the artifact metadata. This table holds a row for each file in Artifactory, including its folder path. Some SQL queries appear to be inefficient when artifacts are deployed to folders that already contain a very large number of items.

      As a result, Artifactory performs poorly at very large scale.

       

      Steps to reproduce:

      1. Deploy at least 7,000,000 files to a local Docker repository.
         These don't have to be actual Docker images; plaintext files will work.
      2. Deploy around 800,000 to 1,000,000 files to a single folder in this large repository.
      3. Rapidly deploy Docker images to that single large folder, simulating a build.
         A good target is at least 100 Docker images, pushed in rapid succession.
      4. Observe that Artifactory stops responding during the rapid deployment.
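
      The bulk-deploy steps above could be scripted roughly as follows. This is a minimal sketch, not the script used for this report: the base URL, the file naming scheme, and the helper names (artifact_url, deploy_placeholder, fill_folder) are assumptions made for illustration, and it relies on plaintext files being deployable with a plain authenticated HTTP PUT to the repository path.

```python
import base64
import urllib.request


def artifact_url(base_url, repo, folder, index):
    """Build the deploy URL for the index-th placeholder file (hypothetical naming)."""
    return f"{base_url.rstrip('/')}/{repo}/{folder.strip('/')}/file-{index:07d}.txt"


def deploy_placeholder(url, user, password):
    """PUT a small plaintext payload to the URL and return the HTTP status code."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(url, data=b"placeholder\n", method="PUT")
    request.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(request) as response:
        return response.status


def fill_folder(base_url, repo, folder, count, user, password):
    """Deploy `count` placeholder files into one folder, one request at a time."""
    for index in range(count):
        deploy_placeholder(artifact_url(base_url, repo, folder, index), user, password)
```

      For example, fill_folder("http://localhost:8081/artifactory", "docker-local", "big-folder", 800_000, "admin", "<password>") would cover the single-folder step, assuming an instance at that address. A sequential loop is slow at this scale, so running several such loops in parallel is probably needed in practice.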

       

      The logs indicate that this performance problem is coming from the database: 

      20191015151553|290|REQUEST|172.19.0.1|admin|HEAD|/api/docker/docker-local/v2/centos/blobs/sha256:729ec3a6ada3a6d26faca9b4779a037231f1762f759ef34c08bdd61bf52cd704|HTTP/1.1|200|0

      20191015151553|10|REQUEST|172.19.0.1|admin|HEAD|/api/docker/docker-local/v2/centos/blobs/sha256:0f3e07c0138fbe05abcb7a9cc7d63d9bd4c980c3f61fea5efa32e7c4217ef4da|HTTP/1.1|200|0

      20191119002934| 30068 [30 seconds] |REQUEST|RESTRICTED_IP|admin|PUT|/api/docker/docker-local/v2/big-folder/centos/manifests/39|HTTP/1.1|201|529

      2019-11-19 00:29:34,011 [http-nio-8081-exec-57] [DEBUG] (o.j.s.JdbcHelper    :191) - Query returned in 28.02 secs : 'select distinct  n.repo as itemRepo,n.node_id as itemId,n.node_path as itemPath,n.node_name as itemName,n.node_type as itemType  from  nodes n  where ( n.repo = 'docker-local' and n.depth >= 4 and( n.node_path like 'big-folder/centos/88/%' or n.node_path = 'big-folder/centos/88') and n.node_type = 1) '
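
      To find the slow requests in a larger request log, the pipe-delimited lines above can be filtered by duration. The following is a sketch under assumptions: the field order is inferred from the lines shown here, the second field is taken to be the request duration in milliseconds (consistent with the "[30 seconds]" annotation), and the function names and the 10-second threshold are made up for illustration.

```python
import re

# Sample lines copied from this report, blob digests shortened to keep the lines
# brief; the third carries the "[30 seconds]" annotation.
SAMPLE = [
    "20191015151553|290|REQUEST|172.19.0.1|admin|HEAD|/api/docker/docker-local/v2/centos/blobs/sha256:729e|HTTP/1.1|200|0",
    "20191015151553|10|REQUEST|172.19.0.1|admin|HEAD|/api/docker/docker-local/v2/centos/blobs/sha256:0f3e|HTTP/1.1|200|0",
    "20191119002934| 30068 [30 seconds] |REQUEST|RESTRICTED_IP|admin|PUT|/api/docker/docker-local/v2/big-folder/centos/manifests/39|HTTP/1.1|201|529",
]


def parse_request_line(line):
    """Parse one request log line into a dict, or return None if it does not match."""
    fields = [field.strip() for field in line.strip().split("|")]
    if len(fields) < 10 or fields[2] != "REQUEST":
        return None
    # Take the leading integer so annotations after the duration are ignored.
    match = re.match(r"\d+", fields[1])
    if match is None:
        return None
    return {
        "timestamp": fields[0],
        "duration_ms": int(match.group()),
        "method": fields[5],
        "path": fields[6],
        "status": int(fields[8]),
    }


def slow_requests(lines, threshold_ms=10_000):
    """Return the parsed entries whose duration is at least threshold_ms."""
    parsed = (parse_request_line(line) for line in lines)
    return [entry for entry in parsed if entry is not None and entry["duration_ms"] >= threshold_ms]


if __name__ == "__main__":
    for entry in slow_requests(SAMPLE):
        print(entry["duration_ms"], entry["method"], entry["path"])
```

      Applied to the sample lines, only the PUT of the manifest under big-folder passes the threshold.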

       

      I believe the query took 28 seconds because the SELECT statement filters on a path with a trailing wildcard (big-folder/centos/88/%), and the nodes table contains a very large number of rows along that path.

       

              People

              Assignee:
              Unassigned
              Reporter:
              patrickr Patrick Russell
              Votes:
              3
              Watchers:
              10
