Artifactory comes with a built-in embedded Derby database that can be reliably used to store data (metadata) for production-level repositories up to hundreds of gigabytes in size.
However, Artifactory supports pluggable database implementations allowing you to change the default to use other popular databases.
Artifactory currently supports the following databases:
- Derby (The default embedded database)
- MySQL v5.5 and above with InnoDB
- Oracle version 10g and above
- Microsoft SQL Server 2008 and above
- PostgreSQL v9.2 and above
For each of the supported databases you can find the corresponding properties file inside
Choosing the Right Database
As the default database, Derby provides good performance since it runs in the same process as Artifactory. However, under intensive usage or high load, performance may degrade because Artifactory and the database compete for shared JVM resources such as caches and memory. For Artifactory servers that need to support heavy load, consider using an external database such as MySQL or PostgreSQL, both of which are common choices in Artifactory installations.
Any of the other supported databases is also a fair choice and may be the practical choice to make if your organization is already using one of them.
Accessing a Remote Database
When using an external database, you need a reliable, stable and low-latency network connection to ensure proper functioning of your system.
When using a fullDB configuration, we strongly recommend a high-bandwidth connection to accommodate the transfer of large BLOBs over the network.
Modes of Operation
Artifactory supports two modes of operation:
- Metadata in the database and binaries stored on the file system (This is the default and recommended configuration).
- Metadata and binaries stored as BLOBs in the database
Artifactory uniquely stores artifacts using checksum-based storage.
When a file is uploaded to Artifactory, its SHA1 checksum is first calculated and the file is then renamed to its checksum. It is then hosted in the configured filestore in a directory whose name is made up of the first two characters of the checksum. For example, a file whose checksum is "ac3f5e56..." would be stored in directory "ac"; a file whose checksum is "dfe12a4b..." would be stored in directory "df" and so forth. The example below shows the "d4" directory that contains two files whose checksum begins with "d4".
In parallel, Artifactory creates a database entry mapping the file's checksum to the path it was uploaded to in a repository. This way of storing binaries optimizes many operations in Artifactory since they are implemented through simple database transactions rather than by actually manipulating files.
Artifactory stores any binary file only once. This is what we call "once and once only storage". The first time a file is uploaded, Artifactory runs the required checksum calculations when storing the file. If the file is uploaded again (to a different location, for example), the upload is implemented as a simple database transaction that creates another record mapping the file's checksum to its new location. There is no need to store the file again. No matter how many times a file is uploaded, the filestore only hosts a single copy of the file.
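The storage scheme described above can be illustrated with a minimal Python sketch. It is not Artifactory's implementation: the `ChecksumStore` class, its methods, and the repository paths are hypothetical names chosen for illustration, and an in-memory dictionary stands in for the database.

```python
import hashlib
import os
import tempfile


class ChecksumStore:
    """Toy checksum-based filestore: one physical copy per unique content."""

    def __init__(self, root):
        self.root = root
        # Repository path -> SHA-1 checksum (stands in for the database table).
        self.paths = {}

    def upload(self, repo_path, data):
        checksum = hashlib.sha1(data).hexdigest()
        # The directory name is the first two characters of the checksum.
        directory = os.path.join(self.root, checksum[:2])
        target = os.path.join(directory, checksum)
        if not os.path.exists(target):
            # First upload of this content: write the file, named by its checksum.
            os.makedirs(directory, exist_ok=True)
            with open(target, "wb") as fh:
                fh.write(data)
        # Every upload, first or repeated, only adds a path -> checksum record.
        self.paths[repo_path] = checksum
        return checksum

    def stored_files(self):
        """Return the names of all physical files in the filestore."""
        return sorted(
            name
            for _, _, files in os.walk(self.root)
            for name in files
        )


store = ChecksumStore(tempfile.mkdtemp())
first = store.upload("libs-release/org/acme/app-1.0.jar", b"example binary content")
second = store.upload("libs-snapshot/copy/app-1.0.jar", b"example binary content")
```

After both uploads, `store.paths` holds two records pointing at the same checksum, while `store.stored_files()` lists a single file.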
Copying and Moving Files
Copying and moving a file is implemented by simply adding and removing database references and, correspondingly, performance of these actions is that of a database transaction.
Deleting a file is also a simple database transaction in which the corresponding database record is deleted. The file itself is not directly deleted, even if the last database entry pointing to it is removed. So-called "orphaned" files are removed in the background by Artifactory's garbage collection processes.
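As a rough sketch of why these operations are cheap, the following Python model treats copy, move and delete as changes to a path-to-checksum mapping and leaves physical cleanup to a separate garbage collection pass. The `ReferenceTable` class and its method names are assumptions made for illustration and do not reflect Artifactory's internal API.

```python
class ReferenceTable:
    """Toy model: repository paths are references to checksummed binaries."""

    def __init__(self):
        self.paths = {}         # repository path -> checksum (the "database")
        self.filestore = set()  # checksums that have a physical file

    def add(self, path, checksum):
        self.filestore.add(checksum)
        self.paths[path] = checksum

    def copy(self, src, dst):
        # Copy: add a second reference; no file is duplicated.
        self.paths[dst] = self.paths[src]

    def move(self, src, dst):
        # Move: add the new reference and remove the old one.
        self.paths[dst] = self.paths.pop(src)

    def delete(self, path):
        # Delete: remove the reference only; the binary stays in the filestore.
        del self.paths[path]

    def collect_garbage(self):
        # Background cleanup: remove binaries that no path refers to any more.
        orphans = self.filestore - set(self.paths.values())
        self.filestore -= orphans
        return orphans


table = ReferenceTable()
table.add("repo-a/app.jar", "d4a1")
table.copy("repo-a/app.jar", "repo-b/app.jar")
table.move("repo-b/app.jar", "repo-c/app.jar")
table.delete("repo-a/app.jar")
table.delete("repo-c/app.jar")
files_before_gc = set(table.filestore)  # the binary is still present here
removed = table.collect_garbage()
```

In this model the binary survives the deletion of its last reference and disappears only when `collect_garbage()` runs.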
Upload, download and replication
Before moving files from one location to another, Artifactory sends checksum headers. If the files already exist in the destination, they are not transferred even if they exist under a different path.
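The checksum-first exchange can be sketched as a small simulation. This is an illustrative model only: the `Destination` class and the `transfer` function are hypothetical, and a method call stands in for the checksum headers sent over the network.

```python
import hashlib


class Destination:
    """Toy destination that tracks stored content by SHA-1 checksum."""

    def __init__(self):
        self.blobs = {}          # checksum -> content
        self.paths = {}          # path -> checksum
        self.bytes_received = 0  # how much content was actually transferred

    def has_checksum(self, checksum):
        return checksum in self.blobs

    def link(self, path, checksum):
        # Content is already present: record the new path only.
        self.paths[path] = checksum

    def receive(self, path, data):
        checksum = hashlib.sha1(data).hexdigest()
        self.blobs[checksum] = data
        self.bytes_received += len(data)
        self.paths[path] = checksum


def transfer(dest, path, data):
    """Send the checksum first; send the content only if it is missing."""
    checksum = hashlib.sha1(data).hexdigest()
    if dest.has_checksum(checksum):
        dest.link(path, checksum)
        return "skipped"
    dest.receive(path, data)
    return "sent"


dest = Destination()
payload = b"release artifact"
first_result = transfer(dest, "repo-a/app.jar", payload)
second_result = transfer(dest, "repo-b/other/app.jar", payload)
```

The second call targets a different path, yet no content is transferred because the destination already holds a file with the same checksum.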
Filesystem performance is greatly improved because actions on the filestore are implemented as database transactions, so there is never any need to do a write-lock on the filesystem.
Searching for a file by its checksum is extremely fast since Artifactory is actually searching through the database for the specified checksum.
Since the database is a layer of indirection between the filestore and the displayed layout, any layout can be supported, whether for one of the standard packaging formats such as Maven1, Maven2, npm, NuGet etc. or for any custom layout.
Before You Start
Changing the database does not automatically transfer your data to the new database. Please follow the steps below to back up your data so that you can restore it after the change.
Backup Your Current Installation
When changing the database for an existing installation you must first perform a Full System Export using the "Exclude Content" option. Once your new database is set up and configured, you will import this data to re-populate your Artifactory metadata content.
Make sure to back up your current Artifactory system before updating to a new database. Your Artifactory instance must be disconnected from the network to avoid usage during this procedure.
Set Up the New Database
To set up your new database you need to perform the following steps:
- Create a database instance
- Create an Artifactory user for the database
- Install the appropriate JDBC driver
- Copy the relevant database configuration file
- Configure the corresponding
- Start Artifactory
- Import the metadata using Full System Import
These steps are fully detailed in the specific documentation page for each of the supported databases listed in the Overview.
Once you have set up your database, you can configure it to support your expected load with the following two parameters:
|The maximum number of pooled database connections (default: 100).|
|The maximum number of pooled idle database connections (default: 10).|