What is the ‘indexed_archives_entries’ table and how do I clean it?

The “indexed_archives_entries” table holds an index of the files contained in archive files, so that their content is searchable through “Archive Search” (formerly “Class Search”). When a new archive file is deployed to Artifactory, its content is indexed and the table is updated. Archive files can be jar, war, zip, or any other type defined in the ${ARTIFACTORY_HOME}/etc/mimetypes.xml file.


Entries are deleted from this table whenever an indexed artifact is deleted by the garbage collector, so the table should not hold index entries for archives that no longer exist.

To disable archive indexing for a specific file type, edit mimetypes.xml and change the value of that file type's “index” attribute from “true” to “false”. Disabling future indexing of a mimetype does not delete its existing indexes from the DB.
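
As a rough illustration of that change, the sketch below flips the “index” attribute for one mimetype from the command line. It assumes that each mimetype element sits on a single line and carries a type="..." attribute, which may not match your file. The function name disable_archive_indexing and the example type are hypothetical. A backup copy is kept next to the file with a .bak suffix, and the result should still be reviewed by hand.

```shell
# Sketch: set index="false" for one mimetype entry in mimetypes.xml.
# Assumption: the whole <mimetype .../> element is written on one line.
disable_archive_indexing() {
  file="$1"        # path to mimetypes.xml
  mime_type="$2"   # value of the entry's type attribute
  # Only touch lines whose type attribute matches; keep a backup as FILE.bak.
  sed -i.bak "\|type=\"${mime_type}\"| s|index=\"true\"|index=\"false\"|" "$file"
}

# Example call (the mimetype shown is only an illustration):
# disable_archive_indexing "${ARTIFACTORY_HOME}/etc/mimetypes.xml" "application/zip"
```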

The steps below delete the existing indexes from the DB so that only the mimetypes still marked for indexing can be re-indexed. Follow the instructions carefully: they delete data directly from the database, which can cause corruption if not executed properly:


  1. Edit the ${ARTIFACTORY_HOME}/etc/mimetypes.xml file to disable indexing of the desired file type

  2. Shut down Artifactory

  3. Delete all rows from the ‘indexed_archives_entries’ table using the following SQL statement:

DELETE FROM artdb.indexed_archives_entries;

  4. Delete all rows from the ‘indexed_archives’ table:

DELETE FROM artdb.indexed_archives;

  5. Delete all rows from the ‘archive_names’ table:

DELETE FROM artdb.archive_names;

  6. Delete all rows from the ‘archive_paths’ table:

DELETE FROM artdb.archive_paths;

  7. Start Artifactory
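
The four DELETE statements above can also be collected into a single script, which is easier to review before it is run. The sketch below only prints the statements, in the same order as the steps. Wrapping them in START TRANSACTION / COMMIT is an addition of this sketch rather than part of the procedure, and it assumes a database that accepts those statements. The function name archive_index_cleanup_sql and the output file name are hypothetical.

```shell
# Sketch: emit the cleanup statements in the documented order.
archive_index_cleanup_sql() {
  schema="${1:-artdb}"   # schema name as used in the statements above
  # Assumption: the database accepts START TRANSACTION / COMMIT.
  echo "START TRANSACTION;"
  for table in indexed_archives_entries indexed_archives archive_names archive_paths; do
    echo "DELETE FROM ${schema}.${table};"
  done
  echo "COMMIT;"
}

# Example (hypothetical file name): write the script, review it, then run it
# with your database's own client while Artifactory is stopped.
# archive_index_cleanup_sql artdb > archive_index_cleanup.sql
```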


Optional:


If you want to restore archive indexing data after modifying the mimetypes.xml and cleaning the above tables, you can run the following REST query:

 curl -X POST -u admin:password "http://{server_name}:{port_number}/artifactory/api/archiveIndex/*"

This is an internal REST call that asynchronously and recursively calculates and saves the indexes of the configured mimetypes, so that their content is searchable through Archive Search. Specify a single repository key in place of ‘*’ to re-index only that repository, or use ‘*’ to trigger the calculation for all local and cache repositories.

Note that this call requires an admin user.

Also note that this process can be very resource-intensive. Run it when the server is not heavily loaded. We recommend re-indexing one repo at a time so as not to put too much stress on the system.
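
One way to follow that recommendation is to loop over repository keys and issue one request per repository, with a pause in between. The sketch below is only an illustration: the function name reindex_repos, the ARTIFACTORY_AUTH and REINDEX_PAUSE_SECONDS variables, and the repository keys in the example call are all hypothetical. Because the calculation runs asynchronously, the pause only spaces the requests out; it does not wait for a repository to finish.

```shell
# Sketch: trigger archive re-indexing for one repository at a time.
# Assumptions: ARTIFACTORY_AUTH holds "user:password" for an admin user;
# REINDEX_PAUSE_SECONDS is a hypothetical knob for spacing out the requests.
reindex_repos() {
  base_url="$1"   # e.g. http://{server_name}:{port_number}/artifactory
  shift
  for repo in "$@"; do
    echo "Requesting archive re-index for ${repo}"
    # -f makes curl return non-zero on an HTTP error, so the loop stops early.
    curl -fsS -X POST -u "${ARTIFACTORY_AUTH}" \
      "${base_url}/api/archiveIndex/${repo}" || return 1
    # Crude spacing between requests; this does not wait for completion.
    sleep "${REINDEX_PAUSE_SECONDS:-300}"
  done
}

# Example call (repository keys are placeholders):
# ARTIFACTORY_AUTH="admin:password"
# reindex_repos "http://{server_name}:{port_number}/artifactory" repo-one repo-two
```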