[RTFACT-12909] Docker hitting scaling issues when a particular base layer is a component of thousands of images Created: 28/Nov/16  Updated: 15/Oct/18  Resolved: 26/Mar/17

Status: Resolved
Project: Artifactory Binary Repository
Component/s: Docker
Affects Version/s: 4.12.1, 4.14.2
Fix Version/s: 5.2.0

Type: Performance Priority: High
Reporter: Mark Mielke Assignee: Yuval Reches
Resolution: Fixed Votes: 1
Labels: None

Issue Links:
relates to RTFACT-14296 Docker hitting scaling issues when a ... Open
relates to RTFACT-17630 Improving docker virtual manifest sync Resolved
is related to RTFACT-12908 Database queries for node_props do no... Resolved
Sprint: Leap 7, Leap 9


We are experiencing significant slowdown due to our use of Docker repository hosting with Artifactory. One of the issues we are facing seems to be a scaling issue in Artifactory's handling of Docker layers: when the same layer is used by thousands of images, the processing overhead for Docker operations becomes progressively worse. This is the problematic query:

[2016-11-28 08:35:39.748 UTC] <artifactory@> artifactory: [unknown]: LOG:  duration: 270.904 ms  execute S_5: select distinct  n.repo as itemRepo,n.node_path as itemPath,n.node_name as itemName,n.created as itemCreated,n.modified as itemModified,n.updated as itemUpdated,n.created_by as itemCreatedBy,n.modified_by as itemModifiedBy,n.node_type as itemType,n.bin_length as itemSize,n.node_id as itemId,n.depth as itemDepth,n.sha1_actual as itemActualSha1,n.sha1_original as itemOriginalSha1,n.md5_actual as itemActualMd5,n.md5_original as itemOriginalMd5 from  nodes n left outer join node_props np100 on np100.node_id = n.node_id where (( np100.prop_key = $1 and  np100.prop_value = $2) and n.node_type = $3) and(n.repo != $4 or n.repo is null) 
[2016-11-28 08:35:39.748 UTC] <artifactory@> artifactory: [unknown]: DETAIL:  parameters: $1 = 'sha256', $2 = 'a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4', $3 = '1', $4 = 'auto-trashcan'
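For readability, here is the same prepared statement with the bind parameters from the DETAIL line substituted in (reformatted by me; semantics unchanged):

```sql
SELECT DISTINCT
       n.repo      AS itemRepo,
       n.node_path AS itemPath,
       n.node_name AS itemName,
       -- ...remaining item columns exactly as in the log line above...
       n.node_id   AS itemId
FROM nodes n
LEFT OUTER JOIN node_props np100 ON np100.node_id = n.node_id
WHERE np100.prop_key   = 'sha256'
  AND np100.prop_value = 'a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4'
  AND n.node_type = 1
  AND (n.repo != 'auto-trashcan' OR n.repo IS NULL);
```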

In our case, we have a particular layer that shows up in 10,000+ images. Note that in the below results, the query is taking advantage of an index I created, as described in RTFACT-12908:


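Judging from the "Bitmap Index Scan on node_props_prop_value_idx_2" in the plan below, the index is on node_props(prop_value). A likely shape (my assumption - the exact DDL is tracked in RTFACT-12908) is:

```sql
-- Assumed shape of the workaround index; the name matches the plan's
-- "Bitmap Index Scan on node_props_prop_value_idx_2".
CREATE INDEX node_props_prop_value_idx_2 ON node_props (prop_value);
```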
This previously took 250+ milliseconds with 12,000+ matching nodes. We removed several hundred obsolete Docker images and brought it down to just over 10,000 matching nodes. This now takes 170 - 200 milliseconds to query:

artifactory=# select count(*) from node_props where prop_value = 'a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4' and prop_key = 'sha256';
(1 row)

Time: 9.996 ms
artifactory=# explain analyze execute statement1('sha256', 'a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4', 1, 'auto-trashcan');         
                                                                                                    QUERY PLAN                                                                                                     
 Unique  (cost=120.46..120.50 rows=1 width=258) (actual time=162.251..171.768 rows=10181 loops=1)
   ->  Sort  (cost=120.46..120.46 rows=1 width=258) (actual time=162.250..166.517 rows=10181 loops=1)
         Sort Key: n.repo, n.node_path, n.node_name, n.created, n.modified, n.updated, n.created_by, n.modified_by, n.bin_length, n.node_id, n.depth, n.sha1_actual, n.sha1_original, n.md5_actual, n.md5_original
         Sort Method: external merge  Disk: 2880kB
         ->  Nested Loop  (cost=5.18..120.45 rows=1 width=258) (actual time=2.276..46.881 rows=10181 loops=1)
               ->  Bitmap Heap Scan on node_props np100  (cost=4.76..111.99 rows=1 width=8) (actual time=2.246..11.370 rows=10181 loops=1)
                     Recheck Cond: ((prop_value)::text = $2)
                     Filter: ((prop_key)::text = $1)
                     Heap Blocks: exact=4703
                     ->  Bitmap Index Scan on node_props_prop_value_idx_2  (cost=0.00..4.76 rows=28 width=0) (actual time=1.532..1.532 rows=10231 loops=1)
                           Index Cond: ((prop_value)::text = $2)
               ->  Index Scan using nodes_pk on nodes n  (cost=0.42..8.45 rows=1 width=258) (actual time=0.003..0.003 rows=1 loops=10181)
                     Index Cond: (node_id = np100.node_id)
                     Filter: ((((repo)::text <> $4) OR (repo IS NULL)) AND (node_type = $3))
 Execution time: 172.888 ms
(15 rows)

Time: 173.297 ms

Joining and outputting 10,000+ records takes 173 milliseconds. Does the Docker plugin really need to get ALL of the results from this query? Or might it be satisfied with only a few results?
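If a single candidate would suffice, dropping DISTINCT and bounding the result would let PostgreSQL stop after the first match. A sketch of such a bounded variant (my illustration, not the query Artifactory actually issues):

```sql
-- Returns at most one node carrying the layer digest; without DISTINCT
-- and with LIMIT 1, PostgreSQL can stop as soon as a row qualifies.
SELECT n.repo, n.node_path, n.node_name, n.node_id
FROM nodes n
JOIN node_props np ON np.node_id = n.node_id
WHERE np.prop_key   = 'sha256'
  AND np.prop_value = 'a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4'
  AND n.node_type = 1
  AND (n.repo != 'auto-trashcan' OR n.repo IS NULL)
LIMIT 1;
```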

In our case, this query runs many times a second during peak, where peak coincides with continuous build operations that invoke Docker push and pull from Artifactory. Before I implemented the index hack, the slow-query log reported a variety of SHA-256 values. After the index hack sped up how fast PostgreSQL finds the SHA-256 values, only a few values such as the one above remain a concern. Because the 10,000+ results are being prepared and processed multiple times a second, it is really affecting the performance of Artifactory as a whole. It makes Artifactory laggy during the day for all users - whether via the UI, Maven, Docker, or any of the other interfaces.

Comment by Mark Mielke [ 30/Nov/16 ]

After further investigation, we determined that this particular layer is common to most images, as it is produced by a very basic set of Dockerfile instructions. I expect that if you query any Artifactory database that hosts Docker repositories with a large number of images, you will find that some layers are much more common than others - and you may well have this exact same layer in your own database.

Comment by Brandon Sanders [ 11/Jan/17 ]

We have this same layer on 190,000 images in our repository and due to regulatory requirements it is problematic for us to prune those images. Even with the index from RTFACT-12908 and all operations done in memory this query takes 4-5 seconds to run.

Comment by Mark Mielke [ 13/Mar/17 ]

I was investigating whether we could work around this in any way until JFrog ships a patch, and I encountered two aspects I want to make sure get covered:

1) The "SELECT DISTINCT" forces PostgreSQL to fetch all the records and de-duplicate them before it can start returning rows. So I think the "DISTINCT" needs to be dropped to turn this into any sort of partial query. In our case, we are up to 17,000 matching rows for the "empty layer" used to create environment variables, and the PostgreSQL side alone is taking 500+ ms.

2) The current implementation uses an "AqlEagerResult". If I am understanding this correctly, findBlobsGlobally() runs canRead() and other checks on all 17,000 results before dumping them into a List and passing it to the caller. So not only is the PostgreSQL side doing quite a lot of processing - the Java side is too.

For a minimum fix here, I am thinking it should use AqlLazyResult with a streamable query from PostgreSQL (i.e. not "SELECT DISTINCT") and process one item at a time, exiting the loop as soon as it gets a first result that sufficiently meets requirements. In our case, of the 17,000 rows - probably 16,990 of the rows would be acceptable candidates, and it should normally find 1 of them within the first 10 results.

For other use cases, I wonder if something more sophisticated might be required. If you had 500 repositories, each owned by a different team, and they all used this very common "empty image" layer, the canRead() might have to scan 500 or more rows before finding a result. It would still be faster than scanning 17,000 rows, but I would wonder if there was some way to make it more likely that a local result would be found? For example, maybe the query should start by looking for rows where property "sha256" is a particular value and the repository is the same?
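The idea above could be expressed by ordering matches so that same-repo rows come first. A sketch, assuming the requesting repository is known and bound as :current_repo (both named parameters are my placeholders):

```sql
-- Prefer a candidate in the requesting repository; PostgreSQL still has
-- to examine all matches to sort them, but the Java side receives only
-- one row, and canRead() is far more likely to succeed on the first try.
SELECT n.repo, n.node_path, n.node_name, n.node_id
FROM nodes n
JOIN node_props np ON np.node_id = n.node_id
WHERE np.prop_key   = 'sha256'
  AND np.prop_value = :layer_sha256
  AND n.node_type = 1
  AND (n.repo != 'auto-trashcan' OR n.repo IS NULL)
ORDER BY (n.repo = :current_repo) DESC
LIMIT 1;
```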

Maybe you already had all these ideas and are working on a better fix. I just want to make sure we don't receive this fix and still see 500+ ms queries because one of the elements of the problem wasn't recognized.


Comment by Mark Mielke [ 28/May/17 ]

This issue was partially addressed. Our server has gone from failing to surviving, but the system load is still quite high during peak, and we remain concerned. I opened a new issue to capture the remaining concerns: RTFACT-14296

Generated at Sun Jan 26 22:41:44 UTC 2020 using JIRA 7.6.16#76018-sha1:9ed376192612a49536ac834c64177a0fed6290f5.