Affects Version/s: 7.35.0
Fix Version/s: None
Component/s: Docker Image
Artifactory Cold Storage uses a straightforward configuration system to archive unused files. For most repository types this works as expected: if a jar file isn't downloaded for X days, it gets archived.
Docker images are different. They consist of layers and a manifest.json file. Artifactory treats the layers and the manifest as individual artifacts, while the parent folder is the "image".
A Docker client downloads only the layers of an image that it needs; for example, it skips shared base layers. Artifactory sees this behavior as some artifacts in a folder being downloaded while others are not.
This limitation is noted in our "artifactCleanup" User Plugin.
If you try to set up Cold Storage with a straightforward AQL pattern on a Docker repository, you will corrupt its Docker images, because individual layers are archived based on their own download statistics while the rest of the image stays behind.
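The failure mode described above can be illustrated with a small simulation. This is only a sketch: the paths, dates, and the archive_per_artifact helper are hypothetical, and it does not reproduce Artifactory's actual policy engine. It only shows what happens when a "not downloaded for X days" rule is applied to each artifact on its own.

```python
from datetime import datetime, timedelta

# Hypothetical download statistics for one image folder (paths and dates are made up).
NOW = datetime(2022, 4, 10)
last_downloaded = {
    "myimage/latest/manifest.json": NOW - timedelta(days=1),
    "myimage/latest/sha256__app-layer": NOW - timedelta(days=1),
    # Shared base layer: clients skipped it, so its statistics look stale.
    "myimage/latest/sha256__base-layer": NOW - timedelta(days=90),
}


def archive_per_artifact(stats, now, max_age_days):
    """Naive rule: select every artifact not downloaded within max_age_days."""
    cutoff = now - timedelta(days=max_age_days)
    return sorted(path for path, ts in stats.items() if ts < cutoff)


archived = archive_per_artifact(last_downloaded, NOW, max_age_days=30)
print(archived)  # ['myimage/latest/sha256__base-layer']
```

Only the base layer is selected, so the image folder is left with a manifest that references a layer that is no longer in the hot repository.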
What is the impact?
Users who set up Cold Storage on a Docker repository will end up with corrupt, broken Docker images. Chunks of images will end up in the Cold Storage repository, and the affected images are not usable until those layers are restored.
Due to the nature of the archive system, you have to fully restore the entire Docker repository to restore the corrupt images. Recovery is possible, but involves rolling back all archived Docker repositories.
While this bug is in effect, Docker repositories can't be used with Cold Storage.
What is the expected behavior?
Artifactory needs to recognize when Cold Storage is set up against Docker repositories and change its behavior: it needs to track only the manifest.json file's statistics.
If the manifest.json is not downloaded within X days, the parent "folder", that is, the entire Docker image, is archived.
Note that the existing Cold Storage documentation on this subject assumes that the user can go through every single Docker image and archive each one via a manually applied System Property. The application should handle the cleanup itself.
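The manifest-based rule requested above could look roughly like the following. This is a minimal sketch under stated assumptions: the images_to_archive function, the stats mapping (artifact path to last download time), and the sample paths are all hypothetical and are not part of Artifactory.

```python
from datetime import datetime, timedelta
from pathlib import PurePosixPath


def images_to_archive(stats, now, max_age_days):
    """Return image folders whose manifest.json was not downloaded within max_age_days.

    stats maps an artifact path to its last download time (hypothetical structure).
    Layer statistics are ignored; only manifest.json decides for the whole folder.
    """
    cutoff = now - timedelta(days=max_age_days)
    folders = []
    for path, ts in stats.items():
        p = PurePosixPath(path)
        if p.name != "manifest.json":
            continue  # skip layers entirely
        if ts < cutoff:
            folders.append(str(p.parent))
    return sorted(folders)


now = datetime(2022, 4, 10)
stats = {
    "app/1.0/manifest.json": now - timedelta(days=2),
    "app/1.0/sha256__base": now - timedelta(days=90),   # stale layer, active image
    "old/0.1/manifest.json": now - timedelta(days=60),
    "old/0.1/sha256__base": now - timedelta(days=1),    # fresh layer, stale image
}
result = images_to_archive(stats, now, max_age_days=30)
print(result)  # ['old/0.1']
```

With this rule an image is archived or kept as a whole folder, regardless of which individual layers clients happened to pull.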
Steps to reproduce:
1] Set up 2 Artifactory installations
2] Set up a Docker image with partially pulled layers:
Manually download a single layer and the manifest.json, then wait a day or move the system clock forward
3] Configure them for Hot/Cold Storage (Hot and Cold instances)
4] Set up an Archive Policy using this AQL pattern to archive images older than one day:
6] Observe that only the downloaded Docker layer remains; the rest are archived.
Artifactory version which the bug was reproduced on: 7.35
DB type & Version: PostgreSQL
Is this an HA env? Yes - 3 HA Nodes
Is this On-Prem or SaaS? On-Prem
Installation type: Package-manager installed