
Overview

JFrog Artifactory offers flexible filestore management that is configurable to meet a variety of needs in terms of binary storage providers, storage size, and redundancy. Not only are you now able to use different storage providers, but you can also chain a series of providers together to build complex structures of binary providers and support seamless and unlimited growth in storage.

Artifactory offers flexible filestore management through the binarystore.xml configuration file located in the $ARTIFACTORY_HOME/etc folder. By modifying this file you can implement a variety of different binary storage configurations.

Take care when modifying binarystore.xml

Making changes to this file may result in losing binaries stored in Artifactory!

If you are not sure of what you are doing, please contact JFrog Support for assistance.

Chains and Binary Providers

The binarystore.xml file specifies a chain with a set of binary providers. A binary provider represents a type of object storage feature, such as “cached filesystem”. Binary providers can be embedded into one another to form chains that represent a coherent filestore (a minimal sketch follows the list below). Artifactory comes with a built-in set of chains that correspond to the binary.provider.type parameter that was used in previous versions of Artifactory. The built-in set of chains available in Artifactory are:

  • file-system
  • cache-fs
  • full-db
  • full-db-direct
  • s3
  • s3Old
  • google-storage
  • azure-blob-storage
  • double-shards
  • redundant-shards
  • cluster-file-system
  • cluster-s3
  • cluster-google-storage
  • cluster-azure-blob-storage
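
To illustrate how providers nest, the following is a minimal custom chain (a hypothetical sketch, not one of the built-in templates) in which a cache-fs provider wraps the basic file-system provider, so downloads are served from the cache while binaries persist in the local filestore:

<config version="v1">
	<chain> <!-- custom chain: a cache layer wrapping a local filestore -->
		<provider id="cache-fs" type="cache-fs">				<!-- LRU download cache -->
			<provider id="file-system" type="file-system"/>		<!-- local or mounted filestore -->
		</provider>
	</chain>
</config>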

Configuring a Built-in Filestore

To configure Artifactory to use one of the built-in filestores, you only need some basic configuration elements.

Basic Configuration Elements

For basic filestore configuration, the binarystore.xml file is quite simple and contains the basic tags or elements that are described below along with the attributes that they may include:

config tag

The <config> tag specifies a filestore configuration. It includes a version attribute to allow versioning of configurations.

<config version="v1">
…
</config>

chain element

The config tag contains a chain element that defines the structure of the filestore. To use one of the built-in filestores, the chain element needs to include the corresponding template attribute. For example, to use the built-in basic “file system” template, all you need is the following configuration:

<config version="v1">
	<chain template="file-system"/>
</config>

Built-in Templates

The following sections describe the basic chain templates that come built-in with Artifactory, ready for you to use out-of-the-box, as well as other binary providers that are included in the default chains.

Additional information about every template can be found below, under the Built-in Chain Templates section.

file-system

The most basic filestore configuration for Artifactory, used for a local or mounted filestore.

cache-fs

Works the same way as file-system but also has a binary LRU (Least Recently Used) cache for download requests. Improves performance of instances with high IOPS (I/O operations) or slow NFS access.

full-db

All the metadata and the binaries are stored as BLOBs in the database, with an additional layer of caching.

full-db-direct

All the metadata and the binaries are stored as BLOBs in the database without caching.

s3

This is the setting used for S3 Object Storage using the JetS3t library.

s3Old

This is the setting used for S3 Object Storage using JClouds as the underlying framework.

google-storage

This is the setting used for Google Cloud Storage as the remote filestore.

azure-blob-storage

This is the setting used for Azure Blob Storage as the remote filestore.

double-shards

A pure sharding configuration that uses 2 physical mounts with 1 copy (which means each artifact is saved only once).

redundant-shards

A pure sharding configuration that uses 2 physical mounts with 2 copies (which means each shard stores a copy of each artifact).

cluster-file-system

A filestore configuration where each node has its own local filestore (just like the file-system chain) and is connected to all other nodes via dynamically allocated Remote Binary Providers using the Sharding-Cluster provider.

cluster-s3

This is the setting used for S3 Object Storage using the JetS3t library. It is based on the same sharding and dynamic provider logic as the cluster-file-system template.

cluster-google-storage

This is the setting used for Google Cloud Storage using the JetS3t library. It is based on the same sharding and dynamic provider logic as the cluster-file-system template.

cluster-azure-blob-storage

This is the setting used for Azure Blob Storage. It is based on the same sharding and dynamic provider logic as the cluster-file-system template.


Modifying an Existing Filestore

To accommodate any specific requirements you may have for your filestore, you may modify one of the existing chain templates either by extending it with additional binary providers or by overriding one of its attributes. For example, the built-in filesystem chain template stores binaries under the $ARTIFACTORY_HOME/data/filestore directory. To modify the template so that it stores binaries under $FILESTORE/binaries you could extend it as follows:

<!-- file-system chain template structure  -->  
<config version="v1">   
	<chain template="file-system"/>
	<provider id="file-system" type="file-system"> 				<!-- Modify the "file-system" binary provider -->
		<baseDataDir>$FILESTORE/binaries</baseDataDir>		<!-- Override the <baseDataDir> attribute -->
	</provider>
</config>

Built-in Chain Templates

Artifactory comes with a set of chain templates built-in, allowing you to set up a variety of different filestores out-of-the-box. However, to override the built-in filestores, you need to be familiar with the attributes available for each binary provider that is used in them. These are described in the following sections, which also show the template configuration and what is 'under the hood' in each template. Usage examples can also be found for all templates.

Filesystem Binary Provider

This is the basic filestore configuration for Artifactory and is used for a local or mounted filestore.

file-system template configuration

If you choose to use the file-system template, your binarystore.xml configuration file should look like this:

<config version="v1">
	<chain template="file-system"/>
</config>

 

What's in the template? 
While you don't need to configure anything else in your binarystore.xml, this is what the file-system template looks like under the hood.

In this example, the filestore and temp folder are located under the root directory of the machine.  

<config version="v1">
	<chain template="file-system"/>
	<provider id="file-system" type="file-system">
		<baseDataDir>/var/opt/jfrog/data</baseDataDir>
        <fileStoreDir>/filestore</fileStoreDir>   
		<tempDir>/temp</tempDir>
	</provider>
</config>

Where:

type
file-system
baseDataDir

Default: $ARTIFACTORY_HOME/data

The root directory where Artifactory should store data files.
fileStoreDir

Default: filestore

The root folder of binaries for the filestore. If the value specified starts with a forward slash (“/”) the value is considered the fully qualified path to the filestore folder. Otherwise, it is considered relative to the baseDataDir.
tempDir

Default: temp

A temporary folder under baseDataDir into which files are written for internal use by Artifactory. This must be on the same disk as the fileStoreDir.
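
For example, the following sketch (the paths are hypothetical) stores binaries under /mnt/storage/binaries by combining an absolute baseDataDir with a fileStoreDir that is relative to it:

<config version="v1">
	<chain template="file-system"/>
	<provider id="file-system" type="file-system">
		<baseDataDir>/mnt/storage</baseDataDir>		<!-- absolute root directory for data files -->
		<fileStoreDir>binaries</fileStoreDir>		<!-- no leading "/", so resolved relative to baseDataDir -->
	</provider>
</config>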

Cached Filesystem Binary Provider

The cache-fs serves as a binary LRU (Least Recently Used) cache for all upload/download requests. This can improve Artifactory's performance since frequent requests will be served from the cache-fs (as in the case of the S3 binary provider).

The cache-fs binary provider will be the closest filestore layer of Artifactory. This means that if the filestore is mounted, we would like the cache-fs to be local on the Artifactory server itself (if the filestore is local, then cache-fs is meaningless). In the case of an HA configuration, the cache-fs will be mounted, and the recommendation is for each node to have its own cache-fs layer.

 

cache-fs template configuration

If you choose to use the cache-fs template, your binarystore.xml configuration file should look like this:

<config version="v1">
	<chain template="cache-fs"/>
</config>
What's in the template?
While you don't need to configure anything else in your binarystore.xml, this is what the cache-fs template looks like under the hood. 

This example sets the cache-fs size to be 10GB and its location (absolute path since it starts with a "/") to be /cache/filestore. 

<config version="v1">
	<chain template="cache-fs"/>
	<provider id="cache-fs" type="cache-fs">
    	<cacheProviderDir>/cache/filestore</cacheProviderDir>
		<maxCacheSize>10000000000</maxCacheSize>
	</provider>
</config>

Where:

type
cache-fs
maxCacheSize

Default: 5000000000 (5GB)

The maximum storage allocated for the cache in bytes.
cacheProviderDir

Default: cache

The root folder of binaries for the filestore cache. If the value specified starts with a forward slash (“/”) it is considered the fully qualified path to the filestore folder. Otherwise, it is considered relative to the baseDataDir.

Full-DB Binary Provider

This binary provider saves all the metadata and binary content as BLOBs in the database with an additional layer of caching on the filesystem. 
Caching can improve Artifactory's performance since frequent requests will be served from the cache-fs before reaching out to the database. 

 

full-db template configuration

If you choose to use the full-db template, your binarystore.xml configuration file should look like this:

<config version="v1">
    <chain template="full-db"/>
 </config>
What's in the template? 

While you don't need to configure anything else in your binarystore.xml, this is what the full-db template looks like under the hood. 
For details about the cache-fs provider, please refer to Cached Filesystem Binary Provider.
The blob provider is what handles the actual saving of metadata and binary content as BLOBs in the database. 

<config version="v1">
    <chain template="full-db"/>
	<provider id="cache-fs" type="cache-fs">
    	<provider id="blob" type="blob"/>
	</provider>
 </config>

 

Full-DB-Direct Binary Provider

This binary provider saves all the metadata and binary content as BLOBs in the database without using a caching layer. 
full-db-direct template configuration

If you choose to use the full-db-direct template, your binarystore.xml configuration file should look like this:

 

<config version="v1">
    <chain template="full-db-direct"/>
</config>
What's in the template? 

While you don't need to configure anything else in your binarystore.xml, this is what the full-db-direct template looks like under the hood. 
The blob provider is what handles the actual saving of metadata and binary content as BLOBs in the database.

<config version="v1">
    <chain template="full-db-direct"/>
	<provider id="blob" type="blob"/>
</config>

 

Cloud Storage Providers

Using cloud storage providers is only available with an Enterprise license. 
As part of its universal approach, Artifactory supports a variety of cloud storage providers described in detail in the sections below. These providers will typically be wrapped with other binary providers to ensure that the binary resources are always available from Artifactory (for example, to enable Artifactory to serve files when requested even if they have not yet reached the cloud storage due to upload latency). 

S3 Binary Provider

Artifactory provides templates to let you configure storage on an S3 cloud provider with two options: s3 and s3Old.

  • The s3 template is used for configuring S3 Object Storage using the JetS3t library.
  • The s3Old template is used for configuring S3 Object Storage using JClouds as the underlying framework.

These binary providers for cloud storage solutions have a very similar selection of parameters.

type

s3 or s3Old

testConnection

Default: true

When set to true, the binary provider uploads and downloads a file when Artifactory starts up to verify that the connection to the cloud storage provider is fully functional.

useSignature

Default: false.

When set to true, requests to AWS S3 are signed. Available from AWS S3 version 4. For details, please refer to Signing AWS API requests in the AWS S3 documentation.

multiPartLimit

Default: 100,000,000 bytes

File size threshold over which file uploads are chunked and multi-threaded.

identity

Your cloud storage provider identity.

credential

Your cloud storage provider authentication credential.

region

The region offered by your cloud storage provider with which you want to work.

bucketName

Your globally unique bucket name.

path

Default: filestore

The path relative to the bucket where binary files are stored.
proxyIdentity

Corresponding parameters if you are accessing the cloud storage provider through a proxy server.

proxyCredential
proxyPort
proxyHost
port

The cloud storage provider’s port.

endPoint

The cloud storage provider’s URL.

roleName

Only available on S3.

The IAM role configured on your Amazon server for authentication.

When this parameter is used, the refreshCredentials parameter must be set to true.

refreshCredentials

Default: false. Only available on S3.

When true, the owner's credentials are automatically renewed if they expire.

When roleName is used, this parameter must be set to true.

httpsOnly

Default: true. Only available on S3.

Set to true if you only want to access your cloud storage provider through a secure https connection.

httpsPort

Default: 443. Must be set if httpsOnly is true. The https port for the secure connection.

When this value is specified, the port needs to be removed from the endPoint.

providerID

Set to S3. Only available for s3Old.

s3AwsVersion

Default: 'AWS4-HMAC-SHA256' (AWS signature version 4). Only available on S3.

Can be set to 'AWS2' if AWS signature version 2 is needed. Please refer to the AWS documentation for more information.

<property name="s3service.disable-dns-buckets" value="true"></property>
Artifactory by default prepends the bucketName in front of the endpoint (e.g. mybucket.s3.aws.com) to create an URL that it access the S3 bucket with. S3 providers such as Amazon AWS uses this convention.
However, this is not the case for some S3 providers use the bucket name as part of the context URL (e.g. s3provider.com/mybucket); so Artifactory needs to have following perimeter added in order for the URI to be compatible with the S3 providers. S3 providers that use this URI format includes OpenStack, CEPH, CleverSafe, and EMC ECS.
The snippets below show the basic template configuration and examples that use the S3 binary provider to support several configurations (CEPH, CleverSafe and more). 
s3 template configuration
Because you must configure the s3 provider with parameters specific to your account (but can leave all other parameters with the recommended values), if you choose to use this template, your binarystore.xml configuration file should look like this:

 

<config version="2">
    <chain template="s3"/>
    <provider id="s3" type="s3">
       <endpoint>http://s3.amazonaws.com</endpoint>
       <identity>[ENTER IDENTITY HERE]</identity>
       <credential>[ENTER CREDENTIALS HERE]</credential>
       <path>[ENTER PATH HERE]</path>
       <bucketName>[ENTER BUCKET NAME HERE]</bucketName>
    </provider>
</config>

 

What's in the template?

While you don't need to configure anything else in your binarystore.xml, this is what the s3 template looks like under the hood.

<config version="v1">
    <chain template="s3"/>
    <provider id="cache-fs" type="cache-fs">
        <provider id="eventual" type="eventual">
            <provider id="retry" type="retry">
                <provider id="s3" type="s3"/>
            </provider>
        </provider>
    </provider>
</config>


For details about the cache-fs provider, please refer to Cached Filesystem Binary Provider.
For details about the eventual provider, please refer to Eventual Binary Provider.
For details about the retry provider, please refer to Retry Binary Provider.
Example 1

A configuration for OpenStack Object Store Swift.

 

<config version="v1">
	<chain template="s3"/>
	<provider id="s3" type="s3">
    	<identity>XXXXXXXXX</identity>
    	<credential>XXXXXXXX</credential>     
    	<endpoint><My OpenStack Server></endpoint>
    	<bucketName><My OpenStack Container></bucketName>
    	<httpsOnly>false</httpsOnly> 
    	<property name="s3service.disable-dns-buckets" value="true"></property>                               
	</provider>
</config>

Example 2
A configuration for CEPH. 

 

<config version="v1">
	<chain template="s3"/>
	<provider id="s3" type="s3">
		<identity>XXXXXXXXXX</identity>
    	<credential>XXXXXXXXXXXXXXXXX</credential>     
    	<endpoint><My Ceph server></endpoint>  			<!-- Specifies the CEPH endpoint -->
	    <bucketName>[My Ceph Bucket Name]</bucketName>
		<property name="s3service.disable-dns-buckets" value="true"></property>                               
    	<httpsOnly>false</httpsOnly>                            
	</provider>
</config>
Example 3
A configuration for CleverSafe.

<config version="v1">
	<chain template="s3"/>
	<provider id="s3" type="s3">
    	<identity>XXXXXXXXX</identity>
	    <credential>XXXXXXXX</credential>     
    	<endpoint>[My CleverSafe Server]</endpoint> 	<!-- Specifies the CleverSafe endpoint -->
	    <bucketName>[My CleverSafe Bucket]</bucketName>
    	<httpsOnly>false</httpsOnly> 
		<property name="s3service.disable-dns-buckets" value="true"></property>                               
	</provider>
</config>
Example 4
A configuration for S3 with a proxy between Artifactory and the S3 bucket.

<config version="v1">
	<chain template="s3"/>
	<provider id="s3" type="s3">
	    <identity>XXXXXXXXXX</identity>
		<credential>XXXXXXXXXXXXXXXXX</credential>     
	    <endpoint>[My S3 server]</endpoint>
    	<bucketName>[My S3 Bucket Name]</bucketName>
	    <proxyHost>[http proxy host name]</proxyHost>
    	<proxyPort>[http proxy port number]</proxyPort>
	    <proxyIdentity>XXXXX</proxyIdentity>
    	<proxyCredential>XXXX</proxyCredential>                          
	</provider>
</config>
Example 5
A configuration for S3 using an IAM role instead of an IAM user.

<config version="v1">
	<chain template="s3"/>
	<provider id="s3" type="s3">
		<roleName>XXXXXX</roleName>
		<endpoint>s3.amazonaws.com</endpoint>
		<bucketName>[mybucketname]</bucketName>
		<refreshCredentials>true</refreshCredentials>
	</provider>
</config>
Example 6
A configuration for S3 when using server side encryption. 

<config version="v1">
	<chain template="s3"/>
	<provider id="s3" type="s3">
    	<identity>XXXXXXXXX</identity>
    	<credential>XXXXXXXX</credential>    
    	<endpoint>s3.amazonaws.com</endpoint>
    	<bucketName>[mybucketname]</bucketName>
    	<property name="s3service.server-side-encryption" value="AES256"></property>  
	</provider>
</config>

Example 7
A configuration for S3 when using EMC Elastic Cloud Storage (ECS).

<config version="v1">    
	<chain template="s3"/>
    <provider id="s3" type="s3">
        <identity>XXXXXXXXXX</identity>
        <credential>XXXXXXXXXXXXXXXXX</credential>    
        <endpoint><My ECS server></endpoint>     <!-- e.g. https://emc-ecs.mycompany.com -->
        <httpsPort><My ECS Server SSL Port></httpsPort>     <!-- Required only if HTTPS port other than 443 is used -->
        <bucketName>[My ECS Bucket Name]</bucketName>
        <property name="s3service.disable-dns-buckets" value="true"></property>                              
    </provider>
</config>

S3Old Binary Provider

The snippet below shows an example that uses the s3Old binary provider, where JClouds is the underlying framework.

s3Old template configuration

A configuration for AWS.
Because you must configure the s3Old provider with parameters specific to your account (but can leave all other parameters with the recommended values), if you choose to use this template, your binarystore.xml configuration file should look like this:

 

<config version="v1">
    <chain template="s3Old"/>
    <provider id="s3Old" type="s3Old">
        <identity>XXXXXXXXX</identity>
        <credential>XXXXXXXX</credential>    
        <endpoint>s3.amazonaws.com</endpoint>
        <bucketName>[mybucketname]</bucketName>                        
    </provider>
</config>

 

What's in the template?

While you don't need to configure anything else in your binarystore.xml, this is what the s3Old template looks like under the hood. 

 

<config version="v1">
    <chain template="s3Old"/>
    <provider id="cache-fs" type="cache-fs">
        <provider id="eventual" type="eventual">
            <provider id="retry" type="retry">
                <provider id="s3Old" type="s3Old"/>
            </provider>
        </provider>
    </provider>
</config>

For details about the cache-fs provider, please refer to Cached Filesystem Binary Provider.
For details about the eventual provider, please refer to Eventual Binary Provider.
For details about the retry provider, please refer to Retry Binary Provider.

 

Google Storage Binary Provider

The google-storage template is used for configuring Google Cloud Storage as the remote filestore.

The snippets below show the basic template configuration and an example that uses the Google Cloud Storage binary provider.

This binary provider uses the following set of parameters:

type

google-storage

testConnection

Default: true

When set to true, the binary provider uploads and downloads a file when Artifactory starts up to verify that the connection to the cloud storage provider is fully functional.

multiPartLimit

Default: 100,000,000 bytes

File size threshold over which file uploads are chunked and multi-threaded.

identity

Your cloud storage provider identity.

credential

Your cloud storage provider authentication credential.

bucketName

Your globally unique bucket name.

path

Default: filestore

The path relative to the bucket where binary files are stored.
proxyIdentity

Corresponding parameters if you are accessing the cloud storage provider through a proxy server.

proxyCredential
proxyPort
proxyHost
port

The cloud storage provider’s port.

endPoint

The cloud storage provider’s URL.

httpsOnly

Default: true.

Set to true if you only want to access your cloud storage provider through a secure https connection.

httpsPort

Default: 443. Must be set if httpsOnly is true. The https port for the secure connection.

When this value is specified, the port needs to be removed from the endPoint.

bucketExists

Default: false.

When set to true, it indicates to the binary provider that a bucket already exists in Google Cloud Storage and therefore does not need to be created.

google-storage template configuration

Because you must configure the google-storage provider with parameters specific to your account (but can leave all other parameters with the recommended values), if you choose to use this template, your binarystore.xml configuration file should look like this:

<config version="v1">
	<chain template="google-storage"/>
 
	<provider id="google-storage" type="google-storage">
		<endpoint>commondatastorage.googleapis.com</endpoint>
		<bucketName><BUCKET NAME></bucketName>
		<identity>XXXXXX</identity>
		<credential>XXXXXXX</credential>
	</provider>
</config>

What's in the template?

While you don't need to configure anything else in your binarystore.xml, this is what the google-storage template looks like under the hood.

 

<config version="v1">
	<chain template="google-storage"/>
    <provider id="cache-fs" type="cache-fs">
        <provider id="eventual" type="eventual">
            <provider id="retry" type="retry">
                <provider id="google-storage" type="google-storage"/>
            </provider>
        </provider>
    </provider>
</config>

For details about the cache-fs provider, please refer to Cached Filesystem Binary Provider.
For details about the eventual provider, please refer to Eventual Binary Provider.
For details about the retry provider, please refer to Retry Binary Provider.

 

Example 1
A configuration with a dynamic property from the JetS3t library. In this example, the httpclient.max-connections parameter sets the maximum number of simultaneous connections to allow globally (default is 100).

<config version="v1">
	<chain template="google-storage"/>
	<provider id="google-storage" type="google-storage">
		<endpoint>commondatastorage.googleapis.com</endpoint>
		<bucketName><BUCKET NAME></bucketName>  
		<identity>XXXXXX</identity>
		<credential>XXXXXXX</credential>
		<property name="httpclient.max-connections" value=150></property>
	</provider>
</config> 

Azure Blob Storage Binary Provider

The azure-blob-storage template is used for configuring Azure Blob Storage as the remote filestore.

The snippets below show the basic template configuration and an example that uses the Azure Blob Storage binary provider.

This binary provider uses the following set of parameters:

testConnection

Default: true

When true, Artifactory uploads and downloads a file when starting up to verify that the connection to the cloud storage provider is fully functional.

accountName

Your cloud storage provider identity. The storage account can be a General-purpose storage account or a Blob storage account, which is specialized for storing objects/blobs.

accountKey

Your cloud storage provider authentication credential.

containerName

Your globally unique container name on Azure Blob Storage.

endpoint

The hostname. You should only use the default value unless you need to contact a different endpoint for testing purposes.

httpsOnly

Default: true.

Set to true if you only want to access through a secure https connection.

The following snippet shows the default chain that uses azure-blob-storage as the binary provider:

 <config version="1">
    <chain template="azure-blob-storage"/>
    <provider id="azure-blob-storage" type="azure-blob-storage">
        <accountName>XXXXXXXX</accountName>
        <accountKey>XXXXXXXX</accountKey>
        <endpoint>https://<ACCOUNT_NAME>.blob.core.windows.net/</endpoint>
        <containerName><NAME></containerName>
    </provider>
</config> 

Eventual Binary Provider

This binary provider is not independent and will always be used as part of a template chain for a remote filestore that may exhibit upload latency (e.g. S3 or GCS). To overcome potential latency, files are first written to a folder called “eventual” under the baseDataDir in local storage, and then later uploaded to persistent storage with the cloud provider. The default location of the eventual folder is under the $ARTIFACTORY_HOME/data folder (or $CLUSTER_HOME/ha-data in the case of an HA configuration using a version of Artifactory below 5.0) and is not configurable. You need to make sure that Artifactory has full read/write permissions to this location.

There are three additional folders under the eventual folder:

  • _pre: part of the persistence mechanism that ensures all files are valid before being uploaded to the remote filestore
  • _add: handles upload of files to the remote filestore
  • _delete: handles deletion of files from the remote filestore
Example

The example below shows a configuration that uses S3 for persistent storage after temporary storage with an eventual binary provider. The eventual provider configures 10 parallel threads for uploading and a lock timeout of 180 seconds.

<!-- The S3 binary provider configuration -->
<config version="v1">
	<chain template="s3"/>
	<provider id="s3" type="s3">
    	<identity>XXXXXXXXX</identity>
		<credential>XXXXXXXX</credential>     
		<endpoint><My OpenStack Server></endpoint>
		<bucketName><My OpenStack Container></bucketName>
		<httpsOnly>false</httpsOnly> 
    	<property name="s3service.disable-dns-buckets" value="true"></property>                               
	</provider>
 
<!-- The eventual provider configuration -->
	<provider id="eventual" type="eventual">
		<numberOfThreads>10</numberOfThreads>	
		<timeout>180000</timeout>
	</provider>
</config>

Where:

type
eventual
timeout
The maximum amount of time a file may be locked while it is being written to or deleted from the filesystem.
dispatchInterval

Default: 5000 ms

The interval at which the provider scans the “eventual” folder to check for files that should be uploaded to persistent storage.

numberOfThreads

Default: 5

The number of parallel threads that should be allocated for uploading files to persistent storage.

Retry Binary Provider

This binary provider is not independent and will always be used as part of a more complex template chain of providers. In case of a failure in a read or write operation, this binary provider notifies its underlying provider in the hierarchy to retry the operation.

 

type
retry
interval

Default: 5000 ms

The time interval to wait before retries.
maxTrys

Default: 5

The maximum number of attempts to read or write before responding with failure.
Example

The example below shows a configuration that uses S3 for persistent storage, but uses a retry provider to keep retrying (up to a maximum of 10 times) in case the upload fails.

<!-- The S3 binary provider configuration -->
<config version="v1">
	<chain template="s3"/>
	<provider id="s3" type="s3">
    	<identity>XXXXXXXXX</identity>
	   	<credential>XXXXXXXX</credential>     
	   	<endpoint><My OpenStack Server></endpoint>
	   	<bucketName><My OpenStack Container></bucketName>
	   	<httpsOnly>false</httpsOnly> 
    	<property name="s3service.disable-dns-buckets" value="true"></property>                               
	</provider>

<!-- The retry provider configuration -->
	<provider id="retry" type="retry">
		<maxTrys>10</maxTrys>
	</provider>
</config>

 

Double Shards, Redundant Shards

These binary providers are only available with an Enterprise license. 

Double Shards Binary Provider

The double-shards template is used for a pure sharding configuration that uses 2 physical mounts with 1 copy (which means each artifact is saved only once). To learn more about the different sharding capabilities, refer to Filestore Sharding.

double-shards template configuration

If you choose to use the double-shards template, your binarystore.xml configuration file should look like this:

<config version="4">
    <chain template="double-shards"/>
</config>
What's in the template? 

While you don't need to configure anything else in your binarystore.xml, this is what the double-shards template looks like under the hood. 
For details about the cache-fs provider, please refer to Cached Filesystem Binary Provider.
For details about the sharding provider, please refer to Sharding Binary Provider.
For details about the state-aware sub-provider, please refer to State-Aware Binary Provider.

<config version="4">
    <chain template="double-shards"/>
	<provider id="cache-fs" type="cache-fs">
        <provider id="sharding" type="sharding">
            <redundancy>1</redundancy>
            <sub-provider id="shard-fs-1" type="state-aware"/>
            <sub-provider id="shard-fs-2" type="state-aware"/>
        </provider>
    </provider>
</config>

Redundant Shards Binary Provider

The redundant-shards template is used for a pure sharding configuration that uses 2 physical mounts with 2 copies (which means each shard stores a copy of each artifact). To learn more about the different sharding capabilities, refer to Filestore Sharding.

redundant-shards template configuration

If you choose to use the redundant-shards template, your binarystore.xml configuration file should look like this:

<config version="4">
    <chain template="redundant-shards"/>
</config>
What's in the template? 

While you don't need to configure anything else in your binarystore.xml, this is what the redundant-shards template looks like under the hood. 
Details about the cache-fs provider can be found in the Cached Filesystem Binary Provider section.
Details about the sharding provider can be found in the Sharding Binary Provider section.
Details about the state-aware sub-provider can be found in the State-Aware Binary Provider section.

<config version="4">
    <chain template="redundant-shards"/>
	<provider id="cache-fs" type="cache-fs">
        <provider id="sharding" type="sharding">
            <redundancy>2</redundancy>
            <sub-provider id="shard-state-aware-1" type="state-aware"/>
            <sub-provider id="shard-state-aware-2" type="state-aware"/>
        </provider>
    </provider>
</config>

Sharding Binary Provider

Artifactory offers a Sharding Binary Provider that lets you manage your binaries in a sharded filestore. A sharded filestore is one that is implemented on a number of physical mounts (M), which store binary objects with redundancy (R), where R <= M.
This binary provider is not independent and will always be used as part of a more complex template chain of providers. To learn about sharding, refer to Filestore Sharding.

type
sharding
readBehavior

This parameter dictates the strategy for reading binaries from the mounts that make up the sharded filestore.

Possible values are:

roundRobin (default): Binaries are read from each mount using a round robin strategy.

writeBehavior

This parameter dictates the strategy for writing binaries to the mounts that make up the sharded filestore. Possible values are:

roundRobin (default): Binaries are written to each mount using a round robin strategy.

freeSpace: Binaries are written to the mount with the greatest absolute volume of free space available.

percentageFreeSpace: Binaries are written to the mount with the greatest percentage of free space available.

redundancy
Default: r = 1

The number of copies that should be stored for each binary in the filestore. Note that redundancy must be less than or equal to the number of mounts in your system for Artifactory to work with this configuration.

concurrentStreamWaitTimeout

Default: 30,000 ms

To support the specified redundancy, accumulates the write stream in a buffer, and uses “r” threads (according to the specified redundancy) to write to each of the redundant copies of the binary being written. A binary can only be considered written once all redundant threads have completed their write operation. Since all threads are competing for the write stream buffer, each one will complete the write operation at a different time. This parameter specifies the amount of time (ms) that any thread will wait for all the others to complete their write operation.

If a write operation fails, you can try increasing the value of this parameter.

concurrentStreamBufferKb

Default: 32 Kb
The size of the write buffer used to accumulate the write stream before being replicated for writing to the “r” redundant copies of the binary.

If a write operation fails, you can try increasing the value of this parameter.

maxBalancingRunTime

Default: 3,600,000 ms (1 hour)
Once a failed mount has been restored, this parameter specifies how long each balancing session may run before it lapses until the next Garbage Collection has completed. For more details about balancing, please refer to Using Balancing to Recover from Mount Failure.

To restore your system to full redundancy more quickly after a mount failure, you may increase the value of this parameter. If you find this causes an unacceptable degradation of overall system performance, you can consider decreasing the value of this parameter, but this means that the overall time taken for Artifactory to restore full redundancy will be longer.
freeSpaceSampleInterval

Default: 3,600,000 ms (1 hour)

To implement its write behavior, Artifactory needs to periodically query the mounts in the sharded filestore to check for free space. Since this check may be a resource intensive operation, you may use this parameter to control the time interval between free space checks.

If you anticipate a period of intensive upload of large volumes of binaries, you can consider decreasing the value of this parameter in order to reduce the transient imbalance between mounts in your system.
minSpareUploaderExecutor

Default: 2

Artifactory maintains a pool of threads to execute writes to each redundant unit of storage. Depending on the intensity of write activity, eventually, some of the threads may become idle and are then candidates for being killed. However, Artifactory does need to maintain some threads alive for when write activities begin again. This parameter specifies the minimum number of threads that should be kept alive to supply redundant storage units.

uploaderCleanupIdleTime

Default: 120,000 ms (2 min)

The maximum period of time threads may remain idle before becoming candidates for being killed.
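
For example, the following sketch (values are illustrative) overrides the sharding provider of the double-shards template so that binaries are written to the mount with the greatest free space, with free space sampled every 30 minutes:

<config version="4">
    <chain template="double-shards"/>
    <provider id="sharding" type="sharding">
        <writeBehavior>freeSpace</writeBehavior>						<!-- favor the emptier mount -->
        <freeSpaceSampleInterval>1800000</freeSpaceSampleInterval>		<!-- 30 minutes, in ms -->
    </provider>
</config>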

 

State-Aware Binary Provider

This binary provider is not independent and will always be used in the sharding or sharding-cluster providers. The provider is aware of whether its underlying disk is functioning or not. It is identical to the basic filesystem provider; however, it can also recover from errors (the parent provider is responsible for recovery) with the addition of the checkPeriod field.

 

type
state-aware
checkPeriod

Default: 15000 ms

The minimum time to wait between trying to re-activate the provider if it had fatal errors at any point.

zone
The name of the sharding zone the provider is part of (only applicable under a sharding provider).
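
For example, each state-aware sub-provider of the double-shards template could be tuned by overriding it at the top level of binarystore.xml (a sketch; the mount path is hypothetical):

<config version="4">
    <chain template="double-shards"/>
    <provider id="shard-fs-1" type="state-aware">
        <fileStoreDir>/mnt/shard1/filestore</fileStoreDir>	<!-- this shard's mount -->
        <checkPeriod>30000</checkPeriod>					<!-- wait 30 seconds between re-activation attempts -->
    </provider>
</config>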

Configuring Sharding for HA Cluster

These binary providers are only available with an Enterprise license. 

For a High Availability cluster, Artifactory offers templates that support sharding-cluster for File-System, S3, Google Storage, and Azure Blob Storage. To learn more about the different sharding capabilities, refer to Filestore Sharding.
When configuring your filestore on an HA cluster, you need to place the binarystore.xml under $ARTIFACTORY_HOME/etc in the primary node and it will be synced to the other members in the cluster. 

File System Cluster Binary Provider

When using the cluster-file-system template, each node has its own local filestore (just like in the file-system binary provider) and is connected to all other cluster nodes via dynamically allocated Remote Binary Providers using the Sharding-Cluster Binary Provider.

cluster-file-system template configuration
If you choose to use the cluster-file-system template, your binarystore.xml configuration file should look like this:

 

<config version="2">
	<chain template="cluster-file-system"/>
</config>
What's in the template?
While you don't need to configure anything else in your binarystore.xml, this is what the cluster-file-system template looks like under the hood. 
Details about the cache-fs provider can be found in the Cached Filesystem Binary Provider section.
Details about the sharding-cluster can be found in the Sharding-Cluster Binary Provider section.
Details about the state-aware sub-provider can be found in the State-Aware Binary Provider section.
<config version="2">
	<chain> <!--template="cluster-file-system"-->
        <provider id="cache-fs" type="cache-fs">
            <provider id="sharding-cluster" type="sharding-cluster">
                <sub-provider id="state-aware" type="state-aware"/>
                <dynamic-provider id="remote-fs" type="remote"/>
            </provider>
        </provider>
    </chain>
 
	<provider id="state-aware" type="state-aware">
        <zone>local</zone>
    </provider>

    <!-- Shard dynamic remote provider configuration -->
    <provider id="remote-fs" type="remote">
        <zone>remote</zone>
    </provider>

    <provider id="sharding-cluster" type="sharding-cluster">
        <readBehavior>crossNetworkStrategy</readBehavior>
        <writeBehavior>crossNetworkStrategy</writeBehavior>
        <redundancy>2</redundancy>
        <property name="zones" value="local,remote"/>
    </provider>
 
</config>

S3 Cluster Binary Provider

This is the setting used for S3 Object Storage using the JetS3t library when configuring filestore sharding for an HA cluster. It is based on the same sharding and dynamic provider logic as the cluster-file-system template.
When using the cluster-s3 template, data is temporarily stored on the file system of each node using the Eventual Binary Provider, and is then passed on to your S3 object storage for persistent storage.
Each node has its own local filestore (just like in the file-system binary provider) and is connected to all other cluster nodes via dynamically allocated Remote Binary Providers using the Sharding-Cluster Binary Provider.

cluster-s3 template configuration
Because you must configure the s3 provider with parameters specific to your account (but can leave all other parameters with the recommended values), if you choose to use the cluster-s3 template, your binarystore.xml configuration file should look like this:

 

<config version="2">
	<chain template="cluster-s3"/>
	<provider id="s3" type="s3">
       <endpoint>http://s3.amazonaws.com</endpoint>
       <identity>[ENTER IDENTITY HERE]</identity>
       <credential>[ENTER CREDENTIALS HERE]</credential>
       <path>[ENTER PATH HERE]</path>
       <bucketName>[ENTER BUCKET NAME HERE]</bucketName>
    </provider>
</config>
What's in the template? 

While you don't need to configure anything else in your binarystore.xml, this is what the cluster-s3 template looks like under the hood. 

<config version="2">
	<chain> <!--template="cluster-s3"-->
    	<provider id="cache-fs-eventual-s3" type="cache-fs">
        	<provider id="sharding-cluster-eventual-s3" type="sharding-cluster">
            	<sub-provider id="eventual-cluster-s3" type="eventual-cluster">
                	<provider id="retry-s3" type="retry">
                    	<provider id="s3" type="s3"/>
	                </provider>
    	        </sub-provider>
        	    <dynamic-provider id="remote-s3" type="remote"/>
	        </provider>
    	</provider>
	</chain> 
 
	<provider id="sharding-cluster-eventual-s3" type="sharding-cluster">
    	<readBehavior>crossNetworkStrategy</readBehavior>
	    <writeBehavior>crossNetworkStrategy</writeBehavior>
    	<redundancy>2</redundancy>
	    <property name="zones" value="local,remote"/>
	</provider>

	<provider id="remote-s3" type="remote">
    	<zone>remote</zone>
	</provider>

	<provider id="eventual-cluster-s3" type="eventual-cluster">
    	<zone>local</zone>
	</provider>
	<provider id="s3" type="s3">
       <endpoint>http://s3.amazonaws.com</endpoint>
       <identity>[ENTER IDENTITY HERE]</identity>
       <credential>[ENTER CREDENTIALS HERE]</credential>
       <path>[ENTER PATH HERE]</path>
       <bucketName>[ENTER BUCKET NAME HERE]</bucketName>
    </provider>
</config>

Details about the cache-fs provider can be found in the Cached Filesystem Binary Provider section.
Details about the sharding-cluster can be found in the Sharding-Cluster Binary Provider section.
Details about the eventual-cluster sub-provider can be found in the Eventual Binary Provider section.
Details about the retry provider can be found in the Retry Binary Provider section. 
Details about the remote dynamic provider can be found in the Remote Binary Provider section.

Google Storage Cluster Binary Provider

This is the setting used for Google Cloud Storage using the JetS3t library when configuring filestore sharding for an HA cluster. It is based on the same sharding and dynamic provider logic as the cluster-file-system template.
When using the cluster-google-storage template, data is temporarily stored on the file system of each node using the Eventual Binary Provider, and is then passed on to your Google storage for persistent storage.
Each node has its own local filestore (just like in the file-system binary provider) and is connected to all other cluster nodes via dynamically allocated Remote Binary Providers using the Sharding-Cluster Binary Provider.

cluster-google-storage template configuration
Because you must configure the google-storage provider with parameters specific to your account (but can leave all other parameters with the recommended values), if you choose to use the cluster-google-storage template, your binarystore.xml configuration file should look like this:

 

<config version="2">
	<chain template="cluster-google-storage"/>
	<provider id="google-storage" type="google-storage">
		<endpoint>commondatastorage.googleapis.com</endpoint>
		<bucketName><BUCKET NAME></bucketName>
		<identity>XXXXXX</identity>
		<credential>XXXXXXX</credential>
	</provider>
</config>
What's in the template? 

While you don't need to configure anything else in your binarystore.xml, this is what the cluster-google-storage template looks like under the hood. 

<config version="2">
	<chain> <!--template="cluster-google-storage"-->
    	<provider id="cache-fs-eventual-google-storage" type="cache-fs">
        	<provider id="sharding-cluster-eventual-google-storage" type="sharding-cluster">
            	<sub-provider id="eventual-cluster-google-storage" type="eventual-cluster">
                	<provider id="retry-google-storage" type="retry">
                    	<provider id="google-storage" type="google-storage"/>
	                </provider>
    	        </sub-provider>
        	    <dynamic-provider id="remote-google-storage" type="remote"/>
	        </provider>
    	</provider>
	</chain> 
 
	<provider id="sharding-cluster-eventual-google-storage" type="sharding-cluster">
    	<readBehavior>crossNetworkStrategy</readBehavior>
	    <writeBehavior>crossNetworkStrategy</writeBehavior>
    	<redundancy>2</redundancy>
	    <property name="zones" value="local,remote"/>
	</provider>

	<provider id="remote-google-storage" type="remote">
    	<zone>remote</zone>
	</provider>

	<provider id="eventual-cluster-google-storage" type="eventual-cluster">
    	<zone>local</zone>
	</provider>

	<provider id="google-storage" type="google-storage">
		<endpoint>commondatastorage.googleapis.com</endpoint>
		<bucketName><BUCKET NAME></bucketName>
		<identity>XXXXXX</identity>
		<credential>XXXXXXX</credential>
	</provider>
</config>
Details about the cache-fs provider can be found in the Cached Filesystem Binary Provider section.
Details about the sharding-cluster can be found in the Sharding-Cluster Binary Provider section.
Details about the eventual-cluster sub-provider can be found in the Eventual Binary Provider section.
Details about the retry provider can be found in the Retry Binary Provider section. 
Details about the remote dynamic provider can be found in the Remote Binary Provider section.

Azure Blob Storage Cluster Binary Provider

This is the setting used for Azure Blob Storage. It is based on the same sharding and dynamic provider logic as the cluster-file-system template.
When using the cluster-azure-blob-storage template, data is temporarily stored on the file system of each node using the Eventual Binary Provider, and is then passed on to your Azure Blob Storage for persistent storage.
Each node has its own local filestore (just like in the file-system binary provider) and is connected to all other cluster nodes via dynamically allocated Remote Binary Providers using the Sharding-Cluster Binary Provider.

cluster-azure-blob-storage template configuration
Because you must configure the azure-blob-storage provider with parameters specific to your account (but can leave all other parameters with the recommended values), if you choose to use the cluster-azure-blob-storage template, your binarystore.xml configuration file should look like this:

 

<config version="2">
	<chain template="cluster-azure-blob-storage"/>
    <provider id="azure-blob-storage" type="azure-blob-storage">
        <accountName>XXXXXX</accountName>
        <accountKey>XXXXXX</accountKey>
        <endpoint>https://<ACCOUNT_NAME>.blob.core.windows.net/</endpoint>
        <containerName><NAME></containerName>
    </provider>
</config>
What's in the template? 

While you don't need to configure anything else in your binarystore.xml, this is what the cluster-azure-blob-storage template looks like under the hood:

<config version="2">
	<chain> <!--template="cluster-azure-blob-storage"-->
		<provider id="cache-fs-eventual-azure-blob-storage" type="cache-fs">
			<provider id="sharding-cluster-eventual-azure-blob-storage" type="sharding-cluster">
				<sub-provider id="eventual-cluster-azure-blob-storage" type="eventual-cluster">
					<provider id="retry-azure-blob-storage" type="retry">
						<provider id="azure-blob-storage" type="azure-blob-storage"/>
					</provider>
				</sub-provider>
				<dynamic-provider id="remote-azure-blob-storage" type="remote"/>
			</provider>
		</provider>
	</chain>

	<!-- cluster eventual Azure Blob Storage Service default chain -->
	<provider id="sharding-cluster-eventual-azure-blob-storage" type="sharding-cluster">
		<readBehavior>crossNetworkStrategy</readBehavior>
		<writeBehavior>crossNetworkStrategy</writeBehavior>
		<redundancy>2</redundancy>
		<lenientLimit>1</lenientLimit>
		<property name="zones" value="local,remote"/>
	</provider>

	<provider id="remote-azure-blob-storage" type="remote">
		<zone>remote</zone>
	</provider>

	<provider id="eventual-cluster-azure-blob-storage" type="eventual-cluster">
		<zone>local</zone>
	</provider>

	<!--cluster eventual template-->
	<provider id="azure-blob-storage" type="azure-blob-storage">
		<accountName>XXXXXX</accountName>
		<accountKey>XXXXXX</accountKey>
		<endpoint>https://<ACCOUNT_NAME>.blob.core.windows.net/</endpoint>
		<containerName><NAME></containerName>
	</provider>
</config>
Details about the cache-fs provider can be found in the Cached Filesystem Binary Provider section.
Details about the sharding-cluster can be found in the Sharding-Cluster Binary Provider section.
Details about the eventual-cluster sub-provider can be found in the Eventual Binary Provider section.
Details about the retry provider can be found in the Retry Binary Provider section. 
Details about the remote dynamic provider can be found in the Remote Binary Provider section.

Sharding-Cluster Binary Provider

The sharding-cluster binary provider can be used together with other binary providers for both local and cloud-native storage. It adds a crossNetworkStrategy parameter to be used as read and write behaviors for validation of the redundancy values and the balance mechanism. It must include a Remote Binary Provider in its dynamic-provider setting to allow synchronizing providers across the cluster.

The Sharding-Cluster provider listens to cluster topology events and creates or removes dynamic providers based on the current state of nodes in the cluster.

type
sharding-cluster
zones

The zones defined in the sharding mechanism. Read/write strategies take providers based on zones.

lenientLimit

Default: 1 (From version 5.4. Note that for filestores configured with a custom chain and not using the built-in templates, the default value of the lenientLimit parameter is 0 to maintain consistency with previous versions.)

The minimum number of filestores that must be active for writes to continue. For example, if lenientLimit is set to 2 and your setup includes 4 filestores, writing will continue even if 2 of them go down. If a 3rd filestore goes down, writing will stop.

Typically this is used to address transient failures of an individual binary store, with the assumption that the balance mechanism will make up for it over time.

dynamic-provider
The type of provider that can be added and removed dynamically based on cluster topology changes. Currently only the Remote Binary Provider is supported as a dynamic provider.
Example
<config version="v1">
	<chain>
    	<provider id="cache-fs" type="cache-fs">    
			<provider id="sharding-cluster" type="sharding-cluster">
				<sub-provider id="state-aware" type="state-aware"/>
			 	<dynamic-provider id="remote" type="remote"/>
			 	<property name="zones" value="remote"/>
			</provider>
		</provider>
	</chain>
 
	<provider id="sharding-cluster" type="sharding-cluster">
		<readBehavior>crossNetworkStrategy</readBehavior>
 		<writeBehavior>crossNetworkStrategy</writeBehavior>
 		<redundancy>2</redundancy>
 		<lenientLimit>1</lenientLimit>
	</provider>
 
  	<provider id="state-aware" type="state-aware">
       <fileStoreDir>filestore1</fileStoreDir>
   	</provider>
 
	<provider id="remote" type="remote">
		<checkPeriod>15000</checkPeriod>
	 	<connectionTimeout>5000</connectionTimeout>
 		<socketTimeout>15000</socketTimeout>
	 	<maxConnections>200</maxConnections>
 		<connectionRetry>2</connectionRetry>
 		<zone>remote</zone>
	</provider>
</config>

Remote Binary Provider

This binary provider is not independent and will always be used as part of a more complex template chain of providers. In case of a failure in a read or write operation, this binary provider notifies its parent provider in the hierarchy.

The remote Binary Provider links a node to all other nodes in the cluster, meaning it enables each node to 'see' the filestore of every other node.

type
remote
connectionTimeout

Default: 5000 ms

Time before timing out an outgoing connection.
socketTimeout

Default: 15000 ms

Time before timing out an established connection (i.e. no data is sent over the wire).
maxConnections

Default: 200

Maximum outgoing connections from the provider.

connectionRetry

Default: 2

How many times to retry connecting to the remote endpoint.

zone
The name of the sharding zone the provider is part of (only applicable under a sharding provider).
checkPeriod

Default: 15000 ms

The minimum time to wait between trying to re-activate the provider if it had fatal errors at any point.

 

Example

The following is an example of how a remote binary provider may be configured. To see how this can be integrated with a complete binarystore.xml configuration, please refer to the example under Sharding-Cluster Binary Provider.

<provider id="remote" type="remote">
	<checkPeriod>15000</checkPeriod>
 	<connectionTimeout>5000</connectionTimeout>
 	<socketTimeout>15000</socketTimeout>
 	<maxConnections>200</maxConnections>
 	<connectionRetry>2</connectionRetry>
 	<zone>remote</zone>
</provider>

Configuring a Custom Filestore From Scratch

In addition to the built-in filestore chain templates described above, you may construct a custom chain template to accommodate any filestore structure you need.
Since the different binary providers in the filestore must be compatible with each other, misconfiguration might lead to data loss. For configuring a custom filestore, please contact JFrog Support.


Configuring the Filestore for Older Artifactory Versions

For versions of Artifactory below 4.6, the filestore used is configured in the $ARTIFACTORY_HOME/etc/storage.properties file as follows:

binary.provider.type

filesystem (default)
This means that metadata is stored in the database, but binaries are stored in the file system. The default location is under $ARTIFACTORY_HOME/data/filestore; however, this can be modified.

fullDb
All the metadata and the binaries are stored as BLOBs in the database.

cachedFS
Works the same way as filesystem but also has a binary LRU (Least Recently Used) cache for upload/download requests. Improves performance of instances with high IOPS (I/O Operations) or slow NFS access.

S3
This is the setting used for S3 Object Storage.

binary.provider.cache.maxSize
This value specifies the maximum cache size (in bytes) to allocate on the system for caching BLOBs.
binary.provider.filesystem.dir
If binary.provider.type is set to filesystem this value specifies the location of the binaries (default: $ARTIFACTORY_HOME/data/filestore).
binary.provider.cache.dir
The location of the cache. This should be set to your $ARTIFACTORY_HOME directory directly (not on the NFS).
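
For reference, a minimal storage.properties for such an older version might look like this (a sketch; the directory value is illustrative):

# Store binaries on the file system and metadata in the database
binary.provider.type=filesystem
# Custom binaries location (default: $ARTIFACTORY_HOME/data/filestore)
binary.provider.filesystem.dir=/storage/artifactory/filestore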