Uploaded image for project: 'Artifactory Binary Repository'
  1. Artifactory Binary Repository
  2. RTFACT-18151

createOrhpanItems repair API uses a broken (and complex) SQL query


    • Type: Bug
    • Status: Open
    • Priority: Normal
    • Resolution: Unresolved
    • Affects Version/s: 5.5.2, 6.6.0
    • Fix Version/s: None
    • Component/s: Database
    • Labels:



      Migrating from case insensitive database, could have path aggregations (and conflicts) if there were files deployed in similar manner as below:


      You would see the two files above under fatherDir/childrepo only.

      Artifactory has a built-in mechanism that scans and detects which folders has no father and create the folder automatically.

      However the query used for it (below in the JIRA) is broken for at least 2 cases and considered very complex (O(n^2) complexity) which can cause it to run for days (and possibility not complete) with millions of artifacts, and other registered paths scenarios.


      Set up replication between two instances, original one should be case insensitive - target should be sensitive

      Steps to reproduce 3 scenario the query would be broken for:

      1. If there are paths in other repositories which have the same parent paths as in the repository the query was issued for. E.g.:

      a. Create paths like below on one repo like below (this already creates a conflict the query should repair):
      curl -i -Lvv -X PUT -uadmin:password "localhost:8081/artifactory/generic-local/father/child/foo.bar״
      curl -i -Lvv -X PUT -uadmin:password "localhost:8081/artifactory/generic-local/father/chilD/foo.bar"

      b. Create similar parent path like below on a different repo:
      curl -i -Lvv -X PUT -uadmin:password "localhost:8081/artifactory/generic-local-2/father/chilD/foo.bar"

      The createOrphans repair REST API for generic-local repo will return an empty result.

      2. If a father path under the same repository will contain the same directory component but a different ancestor to the father:

      a. Create paths like below on one repo like below (this already creates a conflict the query should repair):
      curl -i -Lvv -X PUT -uadmin:password "localhost:8081/artifactory/generic-local/father/child/foo.bar״
      curl -i -Lvv -X PUT -uadmin:password "localhost:8081/artifactory/generic-local/father/chilD/foo.bar"

      b. Create a different ancestor path while keeping the same orphan father directory like below:
      curl -i -Lvv -X PUT -uadmin:password "localhost:8081/artifactory/generic-local/boo/chilD/foo.bar2"

      3. Half a million of conflicts out of total million artifacts on a single repository:

      a. Create large repository with conflicts (half million affected)
      b. Issue the repair createOrphan API for that repo:


      c. It will "never" complete

      "SELECT * FROM nodes n1" +
                                  " WHERE n1.repo = ?" +
                                  " AND n1.node_path like ?" +
                                  " AND n1.node_name NOT IN" +
                                  " (SELECT n2.node_name FROM nodes n2, nodes n3" +
                                  " WHERE (n2.node_path like '%/%' AND n2.node_path like CONCAT('%/', n3.node_name))" +
                                  " OR (n2.node_path not like '%/%' AND n2.node_path like CONCAT('%', n3.node_name)))",
                          path.getRepo(), emptyIfNullOrDot(path.getPath()) + "%"




            • Assignee:
              andreik Andrei Komarov
            • Votes:
              0 Vote for this issue
              1 Start watching this issue


              • Created: