Uploaded image for project: 'Artifactory Binary Repository'
  1. Artifactory Binary Repository
  2. RTFACT-18151

createOrhpanItems repair API uses a broken (and complex) SQL query

    XMLWordPrintable

    Details

    • Type: Bug
    • Status: Open
    • Priority: Normal
    • Resolution: Unresolved
    • Affects Version/s: 5.5.2, 6.6.0
    • Fix Version/s: None
    • Component/s: Database
    • Labels:
      None

      Description

      Overview:

      Migrating from case insensitive database, could have path aggregations (and conflicts) if there were files deployed in similar manner as below:

      fatherDir/childrepo/file.ext
      fatherDir/ChiLdrEpo/file2.ext

      You would see the two files above under fatherDir/childrepo only.

      Artifactory has a built-in mechanism that scans and detects which folders has no father and create the folder automatically.

      However the query used for it (below in the JIRA) is broken for at least 2 cases and considered very complex (O(n^2) complexity) which can cause it to run for days (and possibility not complete) with millions of artifacts, and other registered paths scenarios.

      Workaround:

      Set up replication between two instances, original one should be case insensitive - target should be sensitive


      Steps to reproduce 3 scenario the query would be broken for:

      1. If there are paths in other repositories which have the same parent paths as in the repository the query was issued for. E.g.:

      a. Create paths like below on one repo like below (this already creates a conflict the query should repair):
      curl -i -Lvv -X PUT -uadmin:password "localhost:8081/artifactory/generic-local/father/child/foo.bar״
      curl -i -Lvv -X PUT -uadmin:password "localhost:8081/artifactory/generic-local/father/chilD/foo.bar"

      b. Create similar parent path like below on a different repo:
      curl -i -Lvv -X PUT -uadmin:password "localhost:8081/artifactory/generic-local-2/father/chilD/foo.bar"

      The createOrphans repair REST API for generic-local repo will return an empty result.


      2. If a father path under the same repository will contain the same directory component but a different ancestor to the father:

      a. Create paths like below on one repo like below (this already creates a conflict the query should repair):
      curl -i -Lvv -X PUT -uadmin:password "localhost:8081/artifactory/generic-local/father/child/foo.bar״
      curl -i -Lvv -X PUT -uadmin:password "localhost:8081/artifactory/generic-local/father/chilD/foo.bar"

      b. Create a different ancestor path while keeping the same orphan father directory like below:
      curl -i -Lvv -X PUT -uadmin:password "localhost:8081/artifactory/generic-local/boo/chilD/foo.bar2"


      3. Half a million of conflicts out of total million artifacts on a single repository:

      a. Create large repository with conflicts (half million affected)
      b. Issue the repair createOrphan API for that repo:

      api/repairPaths/createOrphanItems/<repoKey>

      c. It will "never" complete

      "SELECT * FROM nodes n1" +
                                  " WHERE n1.repo = ?" +
                                  " AND n1.node_path like ?" +
                                  " AND n1.node_name NOT IN" +
                                  " (SELECT n2.node_name FROM nodes n2, nodes n3" +
                                  " WHERE (n2.node_path like '%/%' AND n2.node_path like CONCAT('%/', n3.node_name))" +
                                  " OR (n2.node_path not like '%/%' AND n2.node_path like CONCAT('%', n3.node_name)))",
                          path.getRepo(), emptyIfNullOrDot(path.getPath()) + "%"
      

        Attachments

          Activity

            People

            Assignee:
            Unassigned
            Reporter:
            andreik Andrei Komarov
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

              Dates

              Created:
              Updated: