In previous versions of Artifactory, medium and larger instances holding tens or hundreds of millions of artifacts could experience slow Artifactory Garbage Collection (GC) runs.
The main issue was the trashcan cleanup phase, which is the first step of the GC. It could be slow, and sometimes it never finished, depending on the amount of data, the type of DB, and the resources available to the DB.
When a trashcan repo is configured, GC runs a new cleanup strategy by default. The new strategy fetches trashcan candidates that have been in the trashcan repo for more than 14 days (or any other configured period), undeploys them, and immediately checks whether any other reference to the checksum exists. If no other reference exists, it also deletes the binary.
This has several benefits over the old strategy:
1. One thread fetches the candidates and dispatches the work, while multiple threads perform the cleanup (configurable, 'artifactory.gc.numberOfWorkersThreads=3').
2. The query that finds the candidates is much faster: milliseconds to seconds, compared with seconds to minutes, or even a query that never finishes on a large DB that lacks resources.
3. The binary is deleted immediately if no other reference to its checksum exists (unlike the previous strategy, which cleaned up binaries only after the first phase, the trashcan cleanup, had completed).
4. The default trashcan batch size was increased from 100 to 10000 items (configurable, 'artifactory.trashcan.max.search.results=10000').
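The flow described above can be sketched as a small in-memory model. This is only an illustration, not Artifactory's implementation: `TrashItem`, `ToyStore`, and `run_trashcan_gc` are hypothetical names, the store stands in for the DB and the binary store, and the defaults simply mirror the values quoted above (14 days, 3 workers, batch of 10000).

```python
import threading
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class TrashItem:
    path: str
    sha1: str
    days_in_trash: int


class ToyStore:
    """Hypothetical in-memory stand-in for the DB and the binary store."""

    def __init__(self, trash_items, other_refs):
        self._lock = threading.Lock()
        self.trash = list(trash_items)
        # Reference count per checksum: references outside the trashcan
        # plus one reference for every trashcan entry.
        self.ref_count = Counter(other_refs)
        for item in self.trash:
            self.ref_count[item.sha1] += 1
        self.deleted_binaries = set()

    def fetch_candidates(self, retention_days, batch_size):
        # Only items that have been in the trashcan longer than the retention period.
        old = [i for i in self.trash if i.days_in_trash > retention_days]
        return old[:batch_size]

    def cleanup(self, item):
        # Undeploy the item, then delete the binary right away
        # if nothing else references its checksum.
        with self._lock:
            self.trash.remove(item)
            self.ref_count[item.sha1] -= 1
            if self.ref_count[item.sha1] == 0:
                self.deleted_binaries.add(item.sha1)


def run_trashcan_gc(store, retention_days=14, workers=3, batch_size=10000):
    # The calling thread fetches the candidates and dispatches the work;
    # the pool threads perform the cleanup.
    candidates = store.fetch_candidates(retention_days, batch_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(store.cleanup, candidates))
    return len(candidates)
```

In this model, an item whose checksum is still referenced elsewhere is removed from the trashcan but its binary is kept, while an item with no remaining references has its binary deleted in the same pass.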
Every 20 executions (configurable, 'artifactory.gc.skipFullGcBetweenMinorIterations=20'), Artifactory runs the old Full GC in addition to the trashcan GC strategy. This ensures that unreferenced binaries that already exist (binaries that were already unreferenced, or artifacts that were manually deleted from the trashcan repo) also get deleted.
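The resulting schedule can be illustrated with a short sketch. It assumes that the Full GC runs on every 20th execution counted from 1; the exact counting used by Artifactory is not stated here, and `plan_runs` is a hypothetical helper, not part of the product.

```python
def plan_runs(total_runs, skip_full_gc_between_minor_iterations=20):
    """Return, for each GC execution, the list of phases it would run.

    Assumption: the Full GC is added on every Nth execution (1-based),
    where N is the configured interval.
    """
    plan = []
    for run in range(1, total_runs + 1):
        phases = ["trashcan"]  # the trashcan strategy runs on every execution
        if run % skip_full_gc_between_minor_iterations == 0:
            phases.append("full")  # old Full GC runs in addition
        plan.append(phases)
    return plan
```

Under this assumption, 40 executions with the default interval include two Full GC runs, on executions 20 and 40.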