  1. Artifactory Binary Repository
  2. RTFACT-20333

Performance problems when deploying artifacts to a very large folder

    Details

    • Type: Bug
    • Status: Done
    • Priority: 2 - Critical
    • Resolution: Done
    • Affects Version/s: 6.12.2
    • Fix Version/s: 6.17.0
    • Component/s: Performance
    • Labels:
    • Environment:

      PostgreSQL database

      Artifactory HA

      Docker repository

    • Severity:
      Critical

      Description

      Symptoms: Regular REST API requests take a long time to complete, the UI hangs, and database activity is high. The problem goes away once the build completes.

       

      In Artifactory, the database has a "nodes" table which tracks some of the artifact metadata. This table contains a row for each file in Artifactory, including its folder path. Some SQL queries appear to be inefficient when deploying artifacts to folders that already contain a very large number of items.

      As a result, Artifactory performs poorly at very large scale.

       

      Steps to reproduce:

      1. Deploy at least 7,000,000 files to a local Docker repository
        1. They don't have to be actual Docker images; plaintext files will work
      2. Deploy around 800,000 - 1,000,000 files to a single folder in this large repository
      3. Rapidly deploy Docker images to the single large folder, simulating a build
        1. A good target is at least 100 Docker images, pushed in rapid succession
      4. Observe that Artifactory stops responding during the rapid deployment
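
      The bulk-deploy steps can be scripted. Below is a minimal Python sketch, not a tested reproduction script. It assumes files are deployed with an HTTP PUT to <base URL>/<repo>/<path> using basic auth. The helper names and the file naming scheme are hypothetical, chosen only for illustration. It covers the plaintext filler files only; the rapid Docker pushes would still be done with a Docker client.

```python
import base64
import urllib.request
from typing import Iterator


def filler_names(count: int, prefix: str = "file") -> Iterator[str]:
    """Yield unique plaintext file names such as file-0000000.txt."""
    for i in range(count):
        yield f"{prefix}-{i:07d}.txt"


def deploy_url(base_url: str, repo: str, folder: str, name: str) -> str:
    """Build the target URL for one file inside a repository folder."""
    return f"{base_url.rstrip('/')}/{repo}/{folder.strip('/')}/{name}"


def deploy(url: str, payload: bytes, user: str, password: str) -> int:
    """Upload one file with HTTP PUT and basic auth; return the status code."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        url,
        data=payload,
        method="PUT",
        headers={"Authorization": f"Basic {token}"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status


def fill_folder(base_url: str, repo: str, folder: str, count: int,
                user: str, password: str) -> None:
    """Deploy `count` small text files into a single folder."""
    for name in filler_names(count):
        url = deploy_url(base_url, repo, folder, name)
        # The file body is just its own name; the content does not matter here.
        deploy(url, name.encode(), user, password)
```

      For example, fill_folder("http://localhost:8081/artifactory", "docker-local", "big-folder", 800000, user, password) would populate the single large folder; the base URL here is a placeholder for the instance under test.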

       

      The logs indicate that this performance problem is coming from the database: 

      20191015151553|290|REQUEST|172.19.0.1|admin|HEAD|/api/docker/docker-local/v2/centos/blobs/sha256:729ec3a6ada3a6d26faca9b4779a037231f1762f759ef34c08bdd61bf52cd704|HTTP/1.1|200|0

      20191015151553|10|REQUEST|172.19.0.1|admin|HEAD|/api/docker/docker-local/v2/centos/blobs/sha256:0f3e07c0138fbe05abcb7a9cc7d63d9bd4c980c3f61fea5efa32e7c4217ef4da|HTTP/1.1|200|0

      20191119002934| 30068 [30 seconds] |REQUEST|RESTRICTED_IP|admin|PUT|/api/docker/docker-local/v2/big-folder/centos/manifests/39|HTTP/1.1|201|529

      2019-11-19 00:29:34,011 [http-nio-8081-exec-57] [DEBUG] (o.j.s.JdbcHelper    :191) - Query returned in 28.02 secs : 'select distinct  n.repo as itemRepo,n.node_id as itemId,n.node_path as itemPath,n.node_name as itemName,n.node_type as itemType  from  nodes n  where ( n.repo = 'docker-local' and n.depth >= 4 and( n.node_path like 'big-folder/centos/88/%' or n.node_path = 'big-folder/centos/88') and n.node_type = 1) '
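
      To find the slow requests in a larger request log, the pipe-delimited lines above can be filtered by their second field. This is a rough Python sketch that assumes the field order seen in the excerpts (timestamp, duration in milliseconds, entry type, client IP, user, method, path, protocol, status, size). The function names and the 10-second threshold are illustrative choices.

```python
import re
from typing import Iterable, List, NamedTuple, Optional


class RequestEntry(NamedTuple):
    timestamp: str
    duration_ms: int
    method: str
    path: str
    status: int


def parse_request_line(line: str) -> Optional[RequestEntry]:
    """Parse one request-log line; return None if it does not fit the layout."""
    fields = [field.strip() for field in line.strip().split("|")]
    if len(fields) != 10 or fields[2] != "REQUEST":
        return None
    # Take only the leading integer, so an annotated value such as
    # "30068 [30 seconds]" still parses as 30068.
    match = re.match(r"\d+", fields[1])
    if match is None or not fields[8].isdigit():
        return None
    return RequestEntry(fields[0], int(match.group()), fields[5],
                        fields[6], int(fields[8]))


def slow_requests(lines: Iterable[str],
                  threshold_ms: int = 10_000) -> List[RequestEntry]:
    """Return the parsed entries whose duration meets the threshold."""
    entries = (parse_request_line(line) for line in lines)
    return [e for e in entries if e is not None and e.duration_ms >= threshold_ms]
```

      Run against the three request lines above, only the PUT of the manifest (30068 ms) is reported; the two HEAD requests (290 ms and 10 ms) fall below the threshold.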

       

      I believe the query took 28 seconds because the SELECT statement matches on a path containing a wildcard (big-folder/centos/88/%), and the nodes table holds a very large number of rows under that path.
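
      To make the shape of that query concrete, here is a small Python sketch that runs the same SELECT against an in-memory SQLite table. It is only an illustration of what the query matches, not of PostgreSQL's performance. The table layout is an assumption inferred from the column names in the logged query (node_path holding the parent folder, node_type = 1 meaning a file), and the sample rows are made up.

```python
import sqlite3
from typing import List

# Same shape as the logged query, with the literals turned into parameters.
QUERY = """
select distinct n.repo as itemRepo, n.node_id as itemId, n.node_path as itemPath,
       n.node_name as itemName, n.node_type as itemType
from nodes n
where (n.repo = ? and n.depth >= ?
       and (n.node_path like ? or n.node_path = ?)
       and n.node_type = 1)
"""


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory nodes table with a handful of made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table nodes (node_id integer primary key, node_type integer, "
        "repo text, node_path text, node_name text, depth integer)"
    )
    rows = [
        (1, 0, "docker-local", "big-folder/centos", "88", 3),             # folder
        (2, 1, "docker-local", "big-folder/centos/88", "manifest.json", 4),
        (3, 1, "docker-local", "big-folder/centos/88", "layer.bin", 4),
        (4, 1, "docker-local", "big-folder/centos/89", "manifest.json", 4),
        (5, 1, "other-repo", "big-folder/centos/88", "manifest.json", 4),
        (6, 1, "docker-local", "big-folder/centos/88/sub", "x.txt", 5),
    ]
    conn.executemany("insert into nodes values (?, ?, ?, ?, ?, ?)", rows)
    return conn


def files_under(conn: sqlite3.Connection, repo: str, folder: str) -> List[int]:
    """Return the ids of all files at or below `folder`, as the query sees them."""
    # Assumed depth rule: a file directly inside a folder with k path segments
    # has depth k + 1 (3 segments -> depth 4, as in the logged query).
    min_depth = folder.count("/") + 2
    cursor = conn.execute(QUERY, (repo, min_depth, folder + "/%", folder))
    return sorted(row[1] for row in cursor)
```

      With the sample rows, files_under(build_demo_db(), "docker-local", "big-folder/centos/88") selects the two files directly in the folder plus the one in its subfolder, which is the set the wildcard branch and the equality branch cover together.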

       

            People

            Assignee:
            Unassigned
            Reporter:
            patrickr Patrick Russell
            Votes:
            3
            Watchers:
            10
