  1. Artifactory Binary Repository
  2. RTFACT-20601

Remote download stats should not be keyed to an IP address in the download count metrics


    Details

    • Type: Improvement
    • Status: Open
    • Priority: Normal
    • Resolution: Unresolved
    • Affects Version/s: 6.13.1
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      In an immutable infrastructure, IP addresses are constantly changing:
      https://www.digitalocean.com/community/tutorials/what-is-immutable-infrastructure

      This means that whenever a node in an Artifactory HA cluster is (re)deployed, its VM is recycled, and because IPs are dynamic, the address changes every time a new VM is deployed.
      This pollutes the remote_stats table in the database, because its rows are coupled to Artifactory IP addresses. Note that, specifically for Artifactory, the node.id is cycled as well.

      The remote_stats table has the following columns:

        node_id   |    origin     | download_count | last_downloaded | last_downloaded_by | path

      The origin column in the remote_stats table above is populated with the IP address of each Artifactory instance. This is not useful, because the IPs are dynamic in an immutable infrastructure. As a result, an AQL query for remote downloads returns an array with one entry per origin IP, mixing downloads recorded under previous IPs with those recorded under the current ones:

      "path" : "pool/n/nodejs-node",
      "name" : "public.deb",
      "created" : "2019-02-06T23:17:00.993Z",
      "stats" : [

        { "downloaded" : "2019-10-23T22:38:14.861Z", "downloads" : 277, "remote_downloaded" : "2019-09-06T00:04:28.015Z", "remote_downloads" : 4 },
        { "downloaded" : "2019-10-23T22:38:14.861Z", "downloads" : 277, "remote_downloaded" : "2019-10-03T22:14:17.134Z", "remote_downloads" : 20 },
        { "downloaded" : "2019-10-23T22:38:14.861Z", "downloads" : 277, "remote_downloaded" : "2019-10-30T22:38:07.371Z", "remote_downloads" : 1 },
        { "downloaded" : "2019-10-23T22:38:14.861Z", "downloads" : 277, "remote_downloaded" : "2019-10-04T19:33:28.873Z", "remote_downloads" : 35 },
        {
      "downloaded" : "2019-10-23T22:38:14.861Z",
      "downloads" : 277,
      "remote_downloaded" : "2019-09-18T23:16:25.916Z",
      "remote_downloads" : 6

       
      Because Artifactory does not use Tomcat's RemoteIpValve
      (https://tomcat.apache.org/tomcat-8.5-doc/api/org/apache/catalina/valves/RemoteIpValve.html),
      we cannot use the X-Forwarded-For request header to overwrite the origin value in the remote_stats table.
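
For context, the kind of resolution a valve such as RemoteIpValve performs can be sketched as follows. This is a simplified illustration, not Tomcat's or Artifactory's code: the function name `resolve_client_ip` and the `trusted_proxies` parameter are hypothetical, and "internal proxy" is approximated here as any private or loopback address.

```python
import ipaddress

def resolve_client_ip(remote_addr, x_forwarded_for, trusted_proxies=()):
    """Scan X-Forwarded-For right to left and return the first hop that is
    neither an internal address nor a trusted proxy (simplified sketch)."""
    def is_proxy(addr):
        ip = ipaddress.ip_address(addr)
        return ip.is_private or ip.is_loopback or addr in trusted_proxies

    # Only rewrite the address when the direct peer is itself a proxy.
    if not is_proxy(remote_addr):
        return remote_addr

    hops = [h.strip() for h in (x_forwarded_for or "").split(",") if h.strip()]
    for hop in reversed(hops):
        if not is_proxy(hop):
            return hop
    # Every hop was a proxy (or the header was absent): keep the peer address.
    return remote_addr
```

With such a step in place, a request arriving from `10.128.0.51` with `X-Forwarded-For: 35.193.64.65, 10.128.0.7` would resolve to `35.193.64.65` rather than the address of the recycled VM.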

      Instead of using the IP addresses of the VMs as the origin values in the remote_stats table, a more static value, such as the server name or a hash of the license, could be used as the origin. The counts would then be more meaningful, and the historical download-count data in the remote_stats table would not be lost.

      example:
      select * from stats_remote;

      node_id | origin       | download_count | last_downloaded | last_downloaded_by | path
      2260136 | 35.193.64.65 | 2              | 1576608482492   | admin              | 35.193.64.65
      2260136 | 10.128.0.51  | 150            | 1576608269970   | admin              | 10.128.0.51
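
To illustrate the requested behaviour, here is a minimal sketch of how the two example rows above would collapse into one if the origin were a stable identifier instead of an IP. The function name `merge_by_stable_origin` and the label `artifactory-ha-prod` are hypothetical; this is not how Artifactory stores or migrates these rows.

```python
# Rows copied from the example above:
# (node_id, origin, download_count, last_downloaded)
rows = [
    (2260136, "35.193.64.65", 2, 1576608482492),
    (2260136, "10.128.0.51", 150, 1576608269970),
]

def merge_by_stable_origin(rows, stable_origin):
    """Collapse per-IP rows into one row per node_id under a stable origin,
    summing the counts and keeping the latest download timestamp."""
    merged = {}
    for node_id, _origin, count, last_downloaded in rows:
        total, latest = merged.get(node_id, (0, 0))
        merged[node_id] = (total + count, max(latest, last_downloaded))
    return [
        (node_id, stable_origin, total, latest)
        for node_id, (total, latest) in merged.items()
    ]

# "artifactory-ha-prod" is a made-up stable identifier (e.g. a server name).
merged_rows = merge_by_stable_origin(rows, "artifactory-ha-prod")
```

Under this assumption the artifact would carry a single remote count of 152 that survives VM redeployments.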
      

      AQL:

      items.find(
        {
             "repo" : "npm-remote-cache",
             "path" : { "$match" : "a/*" },
             "name" : { "$match" : "a*" }
        }
      ).include("created","path","name","stat.downloaded","stat.downloads","stat.remote_downloads","stat.remote_downloaded")
      

      returns:

      {
      "results" : [ {
        "path" : "a/-",
        "name" : "a-0.0.6.tgz",
        "created" : "2019-10-24T15:49:20.052Z",
        "stats" : [ {
          "downloaded" : "2019-12-17T18:47:57.366Z",
          "downloads" : 6,
          "remote_downloaded" : "2019-12-17T18:44:29.970Z",
          "remote_downloads" : 150
        }, {
          "downloaded" : "2019-12-17T18:47:57.366Z",
          "downloads" : 6,
          "remote_downloaded" : "2019-12-17T18:48:02.492Z",
          "remote_downloads" : 2
        } ]
      } ],
      "range" : {
        "start_pos" : 0,
        "end_pos" : 1,
        "total" : 1
      }
      }
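
Until the origin is decoupled from the IP, a client can combine the per-origin entries itself. The sketch below assumes the response has the shape shown above; the function name `total_remote_downloads` is hypothetical, and it relies on the timestamps sharing one ISO-8601 UTC format so that string comparison orders them correctly.

```python
def total_remote_downloads(aql_response):
    """Sum remote_downloads across all per-origin stats entries of each item
    and keep the most recent remote_downloaded timestamp."""
    totals = {}
    for item in aql_response.get("results", []):
        key = f'{item["path"]}/{item["name"]}'
        stats = item.get("stats", [])
        totals[key] = {
            "remote_downloads": sum(s.get("remote_downloads", 0) for s in stats),
            "remote_downloaded": max(
                (s["remote_downloaded"] for s in stats if "remote_downloaded" in s),
                default=None,
            ),
        }
    return totals

# The response above, reduced to the fields this sketch reads.
response = {
    "results": [{
        "path": "a/-",
        "name": "a-0.0.6.tgz",
        "stats": [
            {"remote_downloaded": "2019-12-17T18:44:29.970Z", "remote_downloads": 150},
            {"remote_downloaded": "2019-12-17T18:48:02.492Z", "remote_downloads": 2},
        ],
    }]
}
summary = total_remote_downloads(response)
```

This only hides the problem on the client side; the stale rows remain in the database.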
      

       

       


            People

            Assignee: Unassigned
            Reporter: Manoj Tuguru (manojt)
            Votes: 0
            Watchers: 1
