  1. Artifactory Binary Repository
  2. RTFACT-20423

Please provide a REST service to copy / move files in bulk

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Normal
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels: None
    • Environment:

      Description

      This JIRA is an enhancement request to the Artifactory REST API.

      The current Artifactory REST API features a service named Copy Item to copy a given item (either a file or a folder) to another location in Artifactory.

      When using the service to copy a folder, the request only returns after the copy completes, which makes sense from a user's point of view. The time this takes varies with the number of items in the folder being copied.

      I'm currently implementing an automation that creates a snapshot of a repository, which means copying the full repository content to another location in Artifactory. Some of the repositories I need to snapshot contain up to 26,000 files.

      The sheer number of files to copy means that I'm hitting the limits of the current Copy Item REST service. I ran a test: the copy eventually completes after 18 minutes (thanks to 6 parallel copy threads) but, as expected, those 26,000 copy requests sent to Artifactory induce a high DB load and heavy network traffic. In terms of performance, this is a worst-case scenario.
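
      For illustration, the per-file approach described above can be sketched as follows. This is a minimal sketch, not the actual automation: the URL follows the documented Copy Item form (POST /api/copy/<source>?to=<target>), while the function names, the base URL and the copy_one callable that actually sends the request are assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import quote


def copy_item_url(base_url, source, target):
    """Build the URL for one Copy Item call: POST <base>/api/copy/<source>?to=<target>."""
    return f"{base_url.rstrip('/')}/api/copy/{quote(source.lstrip('/'))}?to={quote(target)}"


def copy_all(pairs, copy_one, workers=6):
    """Run copy_one(source, target) for every pair on a small thread pool.

    copy_one is a hypothetical callable that sends the POST request and
    returns the HTTP status code. Every pair still costs one round trip,
    so 26,000 files mean 26,000 requests regardless of the pool size.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns the results in the same order as the input pairs.
        return list(pool.map(lambda pair: copy_one(*pair), pairs))
```

      The thread pool only shortens the wall-clock time; the number of requests, and therefore the DB and network load, stays the same.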

      I tried improving things by having Artifactory perform whole-folder copies: instead of issuing 26,000 file copy requests, the snapshot would send 5,000 file copy requests and 25 folder copy requests. I'm not going to elaborate on how this works, but it turns out that this approach doesn't really work because of metadata files: when performing a folder copy, if the folder contains metadata files (such as "repodata") and the user doesn't have permission to overwrite them in the destination, Artifactory throws an error because the user lacks delete / overwrite permissions. One solution would be to tell Artifactory "Copy this folder over there but skip all the metadata", which is not currently possible.

      My need could be addressed by a new REST service that performs a set of copy / move operations in a transactional way. This would provide a scalable version of the current "Copy Item" / "Move Item" services.

      The new service could be invoked by sending a POST request to "/api/batch" with a JSON payload with the following structure:

      {
          "operations": [
              {
                  "from": "/repo/dir1/dir2/file.txt",
                  "to": "/another-repo/dir1/dir2/file.txt",
                  "type": "copy"
              },
              {       
                  "from": "/yet-another-repo/file3.txt",
                  "to": "/another-repo/dir1/dir2/file4.txt",
                  "type": "copy"
              }
          ]     
      }
      

      where:

      • "from" denotes the source location of the item to copy / move
      • "to" denotes the target location to copy / move the item to
      • "type" denotes the type of operation to perform. Expected values: "copy" or "move".

      The items should be copied / moved in the same DB transaction to ensure that the process is transactional and performant.

      The service should return an HTTP 200 (OK) if the operation succeeds or an HTTP 4xx if it fails.
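
      A client for the proposed service could look roughly like the sketch below. The "/api/batch" endpoint does not exist yet; the payload shape is the one proposed in this request, and the function names and the base URL are assumptions made for the example. The request is only prepared, not sent.

```python
import json
import urllib.request

ALLOWED_TYPES = {"copy", "move"}


def build_batch_payload(operations):
    """Validate (source, target, type) triples and return the payload as a dict."""
    entries = []
    for source, target, op_type in operations:
        if op_type not in ALLOWED_TYPES:
            raise ValueError(f"unsupported operation type: {op_type!r}")
        entries.append({"from": source, "to": target, "type": op_type})
    return {"operations": entries}


def build_batch_request(base_url, payload):
    """Prepare (but do not send) the POST request to the proposed /api/batch endpoint."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/batch",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

      Sending the prepared request with urllib.request.urlopen would raise urllib.error.HTTPError on a 4xx response, which matches the proposed contract: either the whole set of operations succeeds (200) or the caller gets an error.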

      Note: I'm thinking that the service could also be extended to cover item deletions, as in the following example:

      {
          "operations": [
              {
                  "from": "/repo/dir1/dir2/file.txt",
                  "to": "/another-repo/dir1/dir2/file.txt",
                  "type": "copy"
              },
              {       
                  "from": "/yet-another-repo/file3.txt",
                  "to": "/another-repo/dir1/dir2/file4.txt",
                  "type": "copy"
              },
              {       
                  "location": "/yet-another-repo/file.txt",
                  "type": "delete"
              }
          ]     
      }
      

      Having this new service would remove the limitations of the current Copy Item service.

      Just to be clear, I don't intend the service to be able to handle 26,000 items at once. It would be nice if it could support up to 500 operations at a time.
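
      With a limit of 500 operations per call, a client would split a large snapshot into batches before calling the service. A minimal sketch, where the function name and the default limit are assumptions taken from the figure above:

```python
def chunk_operations(operations, max_per_request=500):
    """Yield successive slices of at most max_per_request operations."""
    if max_per_request < 1:
        raise ValueError("max_per_request must be at least 1")
    for start in range(0, len(operations), max_per_request):
        yield operations[start:start + max_per_request]
```

      For a repository of 26,000 files this gives 52 requests instead of 26,000. Each batch would presumably run in its own transaction, so the snapshot as a whole would not be atomic, only each group of up to 500 operations.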

            People

            • Assignee: Unassigned
            • Reporter: Francois Ritaly (francois.ritaly)