Version control is one of the top two predictors of deployment lead time, deployment frequency and MTTR. Why is it so important for DevOps organizations?
During JavaOne last September, I participated in an online panel on the subject of artifact repositories, as part of Continuous Discussions (#c9d9), a series of community panels about Agile, Continuous Delivery and DevOps. Continuous Discussions is a community initiative by Electric Cloud, which powers Continuous Delivery at businesses like SpaceX, Cisco, GE and E*TRADE by automating their build, test and deployment processes.
Watch the recording of the panel right here:
Or just read a few insights from my contribution to the panel:
Benefits of artifact repositories
I’m fascinated by the answers we got, because that’s exactly what we hear from everybody, and since we try to be the user-feedback dream company, it aligns perfectly with what we preach. What I heard are two things. The first is the tooling of continuous integration and deployment being the pipes that software goes through, from the developers all the way to production. That means automation: a REST API around everything, being able to connect from any direction, the CI server on one side and the deployment tools on the other, all the compliance and the scanning and all those parts, one pure pipe. The other is information: what I know about artifacts, the metadata I have about my artifacts and about third-party artifacts. That goes back to the question of what a version is. Is a version enough? Does the version express the state of the artifact? When I look at an artifact, can I know what QA level it passed? Where is it in my continuous delivery pipeline? All of that is metadata.
I would say the biggest difference between an artifact repository and having some files thrown somewhere that you fetch from comes down to those same two things. File systems are dumb: we don’t know much about the files sitting there, and file systems aren’t very friendly to automation. A binary artifact repository is still file-system storage underneath, but with those two layers on top. You can have information about the artifacts: you can add it, you can query it, and you can automate around it. Your deployment tool should be able to ask the repository: give me all the artifacts from the latest build, but only if they passed certain quality gates and match the target platform, and do it automatically so I don’t need any human intervention in my delivery process. Those two pillars, I would say, are the essence of a good artifact repository, and they bring tons of different features with them. For example, it must be universal, because it needs to support everything in one place, both from the automation perspective and from the information perspective: you need to track all those different pieces from all those different technologies into one piece of information. You have a JAR file that goes through those Russian dolls: packed inside a WAR, packed inside a Debian package, packed inside a Docker image, and then made part of a Vagrant box. When you have a full Vagrant box, you need to be able to point a finger at the line of Java code that screwed up everything. That’s only possible when you can follow this stream of metadata through the whole chain of Russian dolls, and for that you need one tool. So universal is a big deal, and it’s a consequence of those two: automation and information.
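The automated, metadata-driven query described above can be sketched roughly like this. This is a minimal illustration, not a real repository API; the `Artifact` shape, field names and `query` function are all hypothetical:

```python
# Hypothetical sketch: query artifacts by build + metadata, the way a
# deployment tool would ask an artifact repository, with no human in the loop.
from dataclasses import dataclass, field

@dataclass
class Artifact:
    name: str
    build: int
    metadata: dict = field(default_factory=dict)

def query(artifacts, qa_gate, platform):
    """Return artifacts from the latest build that passed the given QA
    gate and match the target platform."""
    latest = max(a.build for a in artifacts)
    return [
        a for a in artifacts
        if a.build == latest
        and a.metadata.get("qa") == qa_gate
        and a.metadata.get("platform") == platform
    ]

repo = [
    Artifact("app.war", 41, {"qa": "passed", "platform": "linux"}),
    Artifact("app.war", 42, {"qa": "passed", "platform": "linux"}),
    Artifact("app.deb", 42, {"qa": "failed", "platform": "linux"}),
]
print([a.name for a in query(repo, "passed", "linux")])  # ['app.war']
```

The point is that the selection criteria live in metadata, not in file paths, so the delivery pipeline can make the decision without a human inspecting anything.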
Challenges with enterprise repos: dependency management and traceability
Package management is a surprisingly hard domain. It doesn’t look that way and it doesn’t feel that way; how hard can it be to grab a couple of files from a server and put them in some known place on your file system? But for decades people couldn’t get it right. I’ve given talks about this, which I called “Dependency Management: Welcome to Hell,” where we go through all the different packaging types we learned about at JFrog when we added support for them to Artifactory and Bintray, and how they all suck: some of them in the same way, others in various bizarre and different ways, but all of them suck.
The problem is that these problems leak through to the users. When something like the npm “unpublish” gate happens, everybody suffers; that’s just bad design of the npm registry, but it affects everybody. The dependency management that is baked into an artifact repository is meant to shield developers from those problems. Getting back to the npm gate: thanks to npm caching, our users didn’t even notice when it happened, and that’s true for all those technologies. Artifactory started out as exactly this shield, back when Maven Central was as complete a mess as the npm registry is right now. This is where we started, and now it is absolutely clear to us that we need to keep doing that for every new technology we support.
Docker Hub suffers from massive slowness every couple of days, and again, whoever uses Artifactory doesn’t even know about it. Dependency management is a core piece of a central repository because it abstracts and simplifies dependency management for the users, making them genuinely more productive. For me, that’s the most important piece.
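The “shield” behavior described above, where a caching repository keeps serving packages even after an upstream unpublish or outage, can be sketched like this. Everything here is illustrative (the class names, the in-memory registry, the package name); it is not how Artifactory is implemented:

```python
# Sketch of a caching remote repository: refresh from upstream when possible,
# fall back to the cached copy when upstream fails, so users never notice.

class UpstreamGone(Exception):
    pass

class RemoteRegistry:
    def __init__(self, packages):
        self.packages = packages
    def fetch(self, name):
        if name not in self.packages:
            raise UpstreamGone(name)
        return self.packages[name]

class CachingRepo:
    def __init__(self, upstream):
        self.upstream = upstream
        self.cache = {}
    def fetch(self, name):
        try:
            self.cache[name] = self.upstream.fetch(name)  # refresh cache
        except UpstreamGone:
            if name not in self.cache:
                raise  # never cached; nothing to shield with
        return self.cache[name]  # serve cached copy on upstream failure

upstream = RemoteRegistry({"left-pad": b"1.0.0"})
repo = CachingRepo(upstream)
repo.fetch("left-pad")              # cached on first request
del upstream.packages["left-pad"]   # the "unpublish" event upstream
print(repo.fetch("left-pad"))       # still served from the cache
```

A build that already pulled the package through the cache keeps working after the upstream event, which is the whole point of putting the repository between developers and public registries.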
Yesterday I gave a talk comparing Maven, Bazel and Gradle. Bazel does not support transitive dependencies, and Gradle in its very early days didn’t support them either. In Maven there is no simple way to disable them; you can fail the build with the Maven Enforcer plugin if you have transitive dependencies and that kind of thing, but the behavior is baked in, and you don’t have the flexibility to turn it off. We had a very interesting discussion with the audience about whether transitive dependencies are a good thing or not. I think the golden path here is having transitive dependencies with very strict dependency management, where you can actually nail down the versions without dealing with the dependencies themselves: if you happen to have this package, make sure it is this version. I don’t actually care whether you need this package or not, but if you do need it, use that version.
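That “golden path” of transitive resolution plus strict version pins can be sketched as follows. The graph decides *whether* a package is needed; the pin decides *which version* you get if it is. The data structures and `resolve` function are made up for illustration:

```python
# Sketch: transitive dependency resolution where pinned versions override
# whatever version the graph requests, without disabling transitivity.

def resolve(graph, root, pins):
    """graph maps (name, version) -> list of (name, requested_version);
    pins maps name -> forced version. Returns {name: resolved_version}."""
    resolved = {}
    stack = [root]
    while stack:
        name, version = stack.pop()
        version = pins.get(name, version)  # the pin wins over the request
        if resolved.get(name) == version:
            continue
        resolved[name] = version
        stack.extend(graph.get((name, version), []))
    return resolved

graph = {
    ("app", "1.0"): [("lib", "2.0")],
    ("lib", "2.0"): [("util", "1.1")],
    ("lib", "2.5"): [("util", "1.3")],
}
pins = {"lib": "2.5"}  # "if you happen to need lib, it must be 2.5"
print(resolve(graph, ("app", "1.0"), pins))
# {'app': '1.0', 'lib': '2.5', 'util': '1.3'}
```

Notice that pinning `lib` also changes its transitive requirement (`util` 1.3 instead of 1.1): you constrain versions without enumerating the dependency tree yourself.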
Challenges with enterprise repos: distributed teams/apps/infrastructure
Locality is a huge deal when we’re talking about binaries; we spoke about how tolerant the tooling is of binary size, and we mentioned that it’s only tolerant up to a point. It’s extremely important for development teams to bring the artifact repository to the team, because there is latency, and a slow artifact repository is as annoying as any other slow tool: you need something, it takes forever to download, and of course productivity takes a dive. It’s pretty much the standard now that a good, productive working environment has an artifact repository nearby, and the real question is how you synchronize between all of them. After the team in Europe has produced an artifact and gone to sleep, how does the team in the United States, just starting its day, get that artifact as soon as possible? That’s why it’s important for your artifact repository to be installed locally, but also to be able to synchronize in any possible topology and through all the network obstacles, like firewalls.
There’s another kind of locality we can talk about. The locality I’ve mentioned is locality to the development team, but locality to your CI pipeline is another one. If your CI server is in the cloud, it’s very important to have your artifact repository where your CI server is: if you use Amazon, you need your artifact repository on Amazon; if you’re on Google Cloud, you need it on Google Cloud. That’s a whole different set of problems, which we solved by providing Artifactory as a service on each of those clouds and in each of those regions, so if you happen to have your CI server on Amazon in some region, we will make sure your Artifactory server is there too. That’s a new location, and it should synchronize with the developer locations we spoke about earlier. Another layer on top of that is managing all of this. Now you need a consistent configuration across your 25 servers: 10 of them on one continent, 10 on another, and 5 in the cloud. Now you’re adding a new Docker production repository; how can you make sure all the configurations stay the same, and how long will it take to log in to each and every server and actually make the change? This is something you need to keep in mind when you’re rolling out an enterprise setup of artifact repositories across the world.
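One way to reason about that configuration problem is drift detection: compare each server’s reported configuration against the desired one and list what differs, instead of logging into 25 machines by hand. The config shape, server names and `drift` helper below are invented for illustration:

```python
# Sketch: detect configuration drift across a fleet of repository servers.

desired = {"repos": ["docker-prod", "maven-local"], "gc_hours": 24}

fleet = {
    "eu-1":    {"repos": ["docker-prod", "maven-local"], "gc_hours": 24},
    "us-1":    {"repos": ["maven-local"], "gc_hours": 24},  # missing new repo
    "cloud-1": {"repos": ["docker-prod", "maven-local"], "gc_hours": 12},
}

def drift(desired, fleet):
    """Return {server: [config keys that differ from the desired state]}."""
    out = {}
    for server, cfg in fleet.items():
        bad = [k for k, v in desired.items() if cfg.get(k) != v]
        if bad:
            out[server] = bad
    return out

print(drift(desired, fleet))  # {'us-1': ['repos'], 'cloud-1': ['gc_hours']}
```

In practice this is the job of centralized management tooling, but the principle is the same: one declared desired state, and an automated check of every location against it.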
Good news, everybody: the latest version of Artifactory can clean up unused layers automatically, and moreover, it reuses identical layers thanks to checksum-based storage. If you have a lot of Docker images that rely on the same base image, all the common layers of that base image are stored only once. There are things we can try to fix in the package managers we build upon, and there are problems we can’t touch; those are in the hands of the dependency managers, and whatever they screwed up, we can’t fix. But when it comes to storing and serving the binaries, we do what we can to improve things. Docker is a good example: an amazing technology company with brilliant people, absolutely brilliant when it comes to all the Linux kernel isolation processes and that whole area, but what we feel is not their strongest part is dependency management and artifact management.
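The idea behind checksum-based storage can be shown in a few lines: blobs are keyed by their SHA-256 digest, so identical layers shared by many images occupy physical space only once. This is a toy sketch, not Artifactory’s actual storage engine:

```python
# Sketch of checksum-based (content-addressable) storage with layer dedup.
import hashlib

class ChecksumStore:
    def __init__(self):
        self.blobs = {}  # sha256 digest -> bytes (physical storage)
        self.refs = {}   # image name -> ordered list of layer digests

    def put_image(self, name, layers):
        digests = []
        for layer in layers:
            digest = hashlib.sha256(layer).hexdigest()
            self.blobs.setdefault(digest, layer)  # dedup: store each blob once
            digests.append(digest)
        self.refs[name] = digests

    def physical_size(self):
        return sum(len(b) for b in self.blobs.values())

base = b"ubuntu-base-layer" * 100   # 1700 bytes, shared by both images
store = ChecksumStore()
store.put_image("app-a", [base, b"layer-a"])
store.put_image("app-b", [base, b"layer-b"])
# The base layer is stored once: 1700 + 7 + 7 bytes, not 3400 + 14.
print(store.physical_size())  # 1714
```

Garbage collection falls out of the same model: a blob whose digest is no longer referenced by any image can be cleaned up automatically.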
Challenges with enterprise repos: security and governance
We have two types of security concerns: the first is access control and the second is content control. In access control, the need to separate permissions comes from different angles. It can really be about who is allowed to see what, but when we’re talking about in-house environments, that’s less of an issue, because usually people are allowed to see and use packages or artifacts from different repositories and different groups. We use this control to make sure that whatever is needed gets to the right places, for example so we won’t end up with untested artifacts in the production environment. Not because we’re mean people who want to limit everything to a “need-only basis,” but because when we put up those fences, when we build that wall, we can guarantee that whatever needs to get from one place to another won’t get mixed up and find itself somewhere else. Of course, if we have a publicly facing service, then authenticating people is a big deal as well.
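Those “fences” amount to promotion gates: a move into the production repository is refused unless the artifact carries the required properties. The repository names, properties and `promote` function below are illustrative, not a real API:

```python
# Sketch: a promotion gate that blocks untested artifacts from production.

class PromotionError(Exception):
    pass

REQUIRED = {"prod": {"qa": "passed"}}  # gate properties per target repo

def promote(artifact, target, repos):
    """Append artifact to the target repo only if it satisfies the gate."""
    gate = REQUIRED.get(target, {})
    for key, value in gate.items():
        if artifact.get(key) != value:
            raise PromotionError(f"{artifact['name']}: {key} != {value!r}")
    repos.setdefault(target, []).append(artifact["name"])

repos = {}
promote({"name": "app-1.2.war", "qa": "passed"}, "prod", repos)
try:
    promote({"name": "app-1.3.war", "qa": "pending"}, "prod", repos)
except PromotionError as err:
    print("blocked:", err)
print(repos["prod"])  # ['app-1.2.war']
```

The guarantee comes from the wall itself: the only way into production is through the gate, so nothing untested can "find itself somewhere else."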
The other aspect is content trust: how can we know that what is inside our packages and artifacts is actually secure? I have mixed feelings here, because I know there are companies that build their whole business strategy on the threat that open source is dangerous and that there are huge threats lurking in packages. Sometimes they get lucky and catch a Heartbleed or something, but usually I feel it’s an industry of scaring people into buying a product. On the other side, digging into artifacts and discovering things in them is an interesting topic in itself, and not only for security. Security can definitely be a part of it, and if we can discover a vulnerability, that’s nice, though I wouldn’t recommend building your strategy on top of it. But being able to do impact analysis and simply know about your artifacts is important in its own right, and it has a security aspect as well.
Knowing what’s going on inside your artifacts matters not only for security but also for license compliance and for performance. I want to know if I have some slow component in my architecture; is that less important or more important than knowing I have some vulnerability? It’s probably a big deal as well. It’s mainly about quality rather than purely about security, but it helps build both.