Details

    • Type: Bug
    • Status: Open
    • Priority: Normal
    • Resolution: Unresolved
    • Affects Version/s: 6.2.0
    • Fix Version/s: None
    • Component/s: None
    • Labels: None
    • Environment:

      Database: Oracle 12C
      Oracle client driver: ojdbc7.jar (12.1.0.2.0, md5: 03efd5a0799c4cc0f043c8ee6a83940d)
      OS: CentOS 7.4 (running inside docker with docker.bintray.io/jfrog/artifactory-pro:6.2.0)

Description

After running Artifactory for approximately 90 days, we ended up with two misbehaving DB sessions that were using up all of the database's CPUs, which made Artifactory very slow. After restarting Artifactory, everything went back to normal. Unfortunately, we have not been able to reproduce the behaviour so far.
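
In case it helps anyone hitting the same symptom, sessions like these can be surfaced with something along the following lines against Oracle's standard v$ views. This is a sketch only (the exact queries our DBA used may have differed); the CPU statistic is cumulative per session, and "fetch first" requires 12c:

    -- Sketch: list the sessions that have burned the most CPU and the
    -- statement each is currently running. Standard 12c v$ views only.
    select s.sid, s.serial#, s.username, s.program,
           st.value as cpu_centiseconds, q.sql_id, q.executions
      from v$session s
      join v$sesstat st on st.sid = s.sid
      join v$statname sn on sn.statistic# = st.statistic#
      left join v$sql q on q.sql_id = s.sql_id
                       and q.child_number = s.sql_child_number
     where sn.name = 'CPU used by this session'
     order by st.value desc
     fetch first 10 rows only;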

This query was being run insistently, in a loop, by the two Artifactory sessions (I was not able to see how the query would be generated in artifactory-oss, so I presume it is part of proprietary code in Artifactory):

    select distinct
        n.repo as itemRepo, n.node_path as itemPath, n.node_name as itemName,
        n.created as itemCreated, n.modified as itemModified, n.updated as itemUpdated,
        n.created_by as itemCreatedBy, n.modified_by as itemModifiedBy,
        n.node_type as itemType, n.bin_length as itemSize, n.node_id as itemId,
        n.depth as itemDepth, n.sha1_actual as itemActualSha1,
        n.sha1_original as itemOriginalSha1, n.md5_actual as itemActualMd5,
        n.md5_original as itemOriginalMd5, n.sha256 as itemSha2
    from nodes n
    left outer join node_props np100 on np100.node_id = n.node_id
    where (n.repo = :1 and (np100.prop_key = :2 and np100.prop_value like :3))
      and n.node_type = :4

Although we have Maven, Docker, and Bower repos, among others, in that instance, the five repos we saw being passed to that query were all PyPI repos (they are also the ones with the most activity). Looking at the logs, it doesn't seem like the GC was running when the issue happened (and all GC reports said it took 13 s or less).
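
For reference, the values being passed as binds can be inspected via the standard v$sql_bind_capture view; a minimal sketch, assuming you have the sql_id of the looping statement (capture is periodic, so this shows a sample rather than every execution):

    -- Sketch: show sampled bind values (:1..:4 above) for one statement.
    select child_number, name, position, datatype_string,
           value_string, last_captured
      from v$sql_bind_capture
     where sql_id = 'abcdefghijklm'  -- hypothetical sql_id, substitute your own
     order by child_number, position;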

I ran the following command:

    $ grep 'Starting indexing' artifactory.log | wc -l
    714
      

That number is slightly higher than our average, but we have seen higher counts before. In terms of the number of events per indexing run, the average is 4.2 and the largest is 94. The incident happened around the time the "Starting indexing of 94 events" entry was logged, but that run finished only 2 seconds after it started, so I wouldn't expect it to be the culprit. Neither the rate nor the duration of these indexing runs seems to justify the high usage we saw.

The Artifactory process itself did not show any unexpected behaviour at the time in terms of memory or CPU usage.

Our DBA noticed something weird that could be related. The Oracle SQL cache held 95 versions (child cursors) of that SQL statement, due to bind-variable mismatches: the value being passed by the client keeps changing its declaration, e.g.:

    variable a varchar2(100);
    select count(*) from p where name = :a

    variable a varchar2(400);
    select count(*) from p where name = :a
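
Oracle records why each extra child cursor could not be shared, so the mismatch can be confirmed from the standard v$sql_shared_cursor view; a minimal sketch (bind_length_upgradeable = 'Y' is the flag set when a bind's declared length grows between executions, and the threshold below is illustrative):

    -- Sketch: statements with many child cursors, and how many of those
    -- children exist because a bind's declared length was upgraded.
    select sql_id,
           count(*) as child_cursors,
           sum(case when bind_length_upgradeable = 'Y' then 1 else 0 end)
             as bind_length_mismatches
      from v$sql_shared_cursor
     group by sql_id
    having count(*) > 50
     order by child_cursors desc;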
      

People

    • Assignee: Unassigned
    • Reporter: Rafael Cunha de Almeida