Advanced Cleanup Using Artifactory Query Language (AQL)

Every Artifactory administrator has their own methodology and policies for managing binaries within Artifactory. However, cleaning up artifacts and freeing storage space is a need that every administrator shares. As you probably know, Artifactory provides a few cleanup methods out of the box, such as deleting complete versions, limiting the number of snapshots and deleting unused cached artifacts. Still, for many use cases these methods are not flexible enough and don’t satisfy the very specific deletion requirements that different administrators may have. As a result, systems get cluttered with very old, unused and unnecessary artifacts that consume a lot of storage space.

Artifactory Query Language (AQL) is specially designed to find artifacts stored within Artifactory’s repositories based on any number of search criteria. Its syntax offers a simple way to formulate complex queries that combine search criteria, filters, sorting options, and field output parameters. AQL is exposed as a RESTful API which, when possible, uses data streaming to provide output data, resulting in extremely fast response times and low memory consumption.

Datastore Disk

Does the “Datastore disk is too high” warning in Artifactory’s UI or log sound familiar? At this point, most people will find themselves saying “I wish JFrog would develop a cleanup method that answers my exact needs”, but since everybody’s exact needs are different, we can’t really do that. Or can we? Of course we can. The Groovy script below, which runs any AQL query, does exactly that. All you have to do is provide the criteria for cleanup.

Let’s see an example. Say I want to delete all files that meet the following criteria:

  • They are the largest 100 files which were created by the “jenkins-builder” user.
  • Their extension is either tar, zip or rpm.
  • They have never been downloaded.
  • Their size is greater than 100 MB.
  • They are tagged with the property ‘qa=approved’.

 

The AQL query that returns the files answering all of these criteria looks like this:

items.find(
	{
		"type":"file",
		"created_by":"jenkins-builder",
		"size":{"$gt":"100000000"},
		"stat.downloads":{"$eq":null},
		"@qa":"approved",
		"$or":[
			{"name":{"$match":"*.tar"}},
			{"name":{"$match":"*.zip"}},
			{"name":{"$match":"*.rpm"}}
		]
	}
)
.sort({"$desc":["size","name"]})
.limit(100)
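If you would rather generate the query than type it by hand, the criteria above can be assembled from plain Groovy maps and serialized with `groovy.json.JsonOutput`. The following is only a sketch: the helper name `buildCleanupQuery` and its parameters are hypothetical and are not part of the cleanup script.

```groovy
import groovy.json.JsonOutput

// Hypothetical helper (not part of the original script): builds an AQL
// items.find() query for never-downloaded, qa=approved files that were
// created by one user and have one of the given extensions.
String buildCleanupQuery(String user, List<String> extensions, long minSizeBytes, int limit) {
    def criteria = [
        type            : 'file',
        created_by      : user,
        size            : ['$gt': minSizeBytes.toString()],
        'stat.downloads': ['$eq': null],
        '@qa'           : 'approved',
        // One {"name":{"$match":"*.ext"}} clause per extension
        '$or'           : extensions.collect { ext -> [name: ['$match': '*.' + ext]] }
    ]
    def sort = ['$desc': ['size', 'name']]
    return ('items.find(' + JsonOutput.toJson(criteria) + ')' +
            '.sort(' + JsonOutput.toJson(sort) + ')' +
            '.limit(' + limit + ')')
}

def query = buildCleanupQuery('jenkins-builder', ['tar', 'zip', 'rpm'], 100000000L, 100)
println(query)
```

The printed string is a single-line form of the query shown above, so it can be used as the “query” value in the script.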

And here is the Groovy script that deletes the files returned by the AQL query. All you need to do is:

  • Replace the “query” parameter value with the one above (or with your own AQL query)
  • Replace the value of “artifactoryURL” with your own Artifactory server URL
  • Replace the credentials ‘admin:password’ with your own credentials.
  • Set the “dryRun” parameter. If set to “true” (as below), the files will only be listed. If set to “false”, the files will really be deleted (so I suggest running once with “true” first).

How does it work? The script sends the AQL query to Artifactory and, in response, receives the list of artifacts that meet all our criteria as a JSON object. For each artifact in the list, it constructs the artifact’s path (see constructPath()) and, if “dryRun” is false, sends an HTTP DELETE request for that path. If the server is not accessible or the user has insufficient permissions, it prints a message to the output.
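To make the response handling concrete, here is a small self-contained sketch that parses a sample AQL response and builds the delete paths without contacting a server. The sample JSON is illustrative only: the repository, paths, file names and sizes are made up, and the helper name `toDeletePath` is hypothetical. The size total is an extra that the script itself does not compute.

```groovy
import groovy.json.JsonSlurper

// Illustrative AQL response; repository, paths, names and sizes are made up.
def sampleResponse = '''
{
  "results": [
    {"repo": "libs-release-local", "path": "org/acme/app/1.0", "name": "app-1.0.zip", "size": 150000000},
    {"repo": "libs-release-local", "path": ".", "name": "bundle.tar", "size": 120000000}
  ],
  "range": {"start_pos": 0, "end_pos": 2, "total": 2}
}
'''

// Same rule as constructPath() in the script: a path of "." means the file
// sits at the repository root, so the path segment is left out.
String toDeletePath(Map item) {
    if (item['path'].toString() == '.') {
        return item['repo'] + '/' + item['name']
    }
    return item['repo'] + '/' + item['path'] + '/' + item['name']
}

def items = new JsonSlurper().parseText(sampleResponse).results
def paths = items.collect { toDeletePath(it) }
long totalBytes = items.sum { it['size'] as long }

// Dry-run style listing: nothing is deleted here.
paths.each { println("Would delete: ${it}") }
println("Space that would be reclaimed: ${totalBytes} bytes")
```

Each resulting path is relative to the Artifactory base URL, which is why the script can pass it directly as the path of the DELETE request.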

(For the most recent script, visit our GitHub account)

@Grab(group = 'org.codehaus.groovy.modules.http-builder', module = 'http-builder', version = '0.6')
import groovyx.net.http.RESTClient
import groovyx.net.http.HttpResponseException
import org.apache.http.conn.HttpHostConnectException
/**
* Created by shaybagants on 4/30/15.
*/

def query = 'items.find({"type":"file","name":{"$match":"jfrog-artifact-3.*.tar.gz"}})' // replace this with your AQL query
def artifactoryURL = 'https://localhost:8081/artifactory/' // replace this with your Artifactory server
def restClient = new RESTClient(artifactoryURL)
restClient.setHeaders(['Authorization': 'Basic ' + "admin:password".getBytes('iso-8859-1').encodeBase64()]) //replace the 'admin:password' with your own credentials
def dryRun = true //set the value to false if you want the script to actually delete the artifacts

def itemsToDelete = getAqlQueryResult(restClient, query)
if (itemsToDelete != null && itemsToDelete.size() > 0) {
    delete(restClient, itemsToDelete, dryRun)
} else {
    println('Nothing to delete')
}

/**
 * Send the AQL to Artifactory and collect the response.
 * Returns the list of artifact paths, or null if the request failed.
 */
public List getAqlQueryResult(RESTClient restClient, String query) {
    def response
    try {
        response = restClient.post(path: 'api/search/aql',
                body: query,
                requestContentType: 'text/plain'
        )
    } catch (Exception e) {
        println(e.message)
    }
    if (response != null && response.getData()) {
        def results = []
        response.getData().results.each {
            results.add(constructPath(it))
        }
        return results
    } else return null
}

/**
 * Construct the full path from the returned items.
 * If the path is '.' (the file is at the repository root), ignore it and
 * construct the full path from the repo and the file name only.
 */
public constructPath(Map item) {
    if (item.path.toString().equals(".")) {
        return item.repo + "/" + item.name
    }
    return item.repo + "/" + item.path + "/" + item.name
}

/**
 * Send a DELETE request to Artifactory for each one of the returned items.
 * In a dry run the items are only listed and no request is sent.
 */
public delete(RESTClient restClient, List itemsToDelete, def dryRun) {
    def dryMessage = (dryRun) ? "*** This is a dry run ***" : ""
    itemsToDelete.each {
        println("Trying to delete artifact: '$it'. $dryMessage")
        try {
            if (!dryRun) {
                restClient.delete(path: it)
                // Report success only when a DELETE was actually sent
                println("Artifact '$it' has been successfully deleted.")
            }
        } catch (HttpResponseException e) {
            println("Cannot delete artifact '$it': $e.message" +
                    ", $e.statusCode")
        } catch (HttpHostConnectException e) {
            println("Cannot delete artifact '$it': $e.message")
        }
    }
}