This article provides basic guidance to help teams make decisions on how and when to incorporate continuous integration (CI) and continuous delivery (CD) practices into their work.
Investing in a streamlined CI/CD process is a critical part of any modern software project. From a business perspective, it’s about amplifying developer productivity, shortening feature-delivery timelines, and reducing mean time to recovery in the event of failures.
Underinvestment in CI/CD leads to software releases that are late, brittle, and high in risk. For this reason, it’s important to establish an appropriate CI/CD process and discipline as early in a project’s life as possible—ideally at the very start, for the very first delivery.
However, reaching full CI/CD maturity can be a very deep rabbit hole, and it’s not advisable to aim for a 100% ideal CI/CD process at the very start of a project. Just as with the product the team is building, it’s best to take an incremental approach to delivering value. In this case, the development team are the users, and the value delivered is the relief of their pain points, which ultimately contributes value to the actual product in the form of improved stability and rate of delivery.
As such, it’s important for the management team to listen to the pain points of their team members, to prioritize incremental improvements to CI/CD alongside product features, and to be smart about when to accrue technical debt and when to pay it back.
In short, when it comes to managing a high-functioning continuous integration and continuous delivery pipeline, the right choice is continuous investment—starting early and continuing throughout the life of the project.
Infrastructure as Code
Because embracing CI/CD is about delivering changes to the product in a fully automated, repeatable, and transparent way, it’s important to give the same treatment to the CI/CD pipeline configuration itself.
The team ensures that their CI/CD pipeline is configured via infrastructure as code, so that changes to the pipeline’s behavior are themselves delivered as automatically as possible.
Every major CI/CD system supports this kind of workflow, but teams sometimes use them in more manual ways. For example, some users of Jenkins still configure their pipelines in the graphical user interface, even though “Jenkinsfile”-based configuration is available. This is an important baseline to set for the team’s solution.
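As a sketch of what pipeline-as-code looks like in practice, here is a minimal declarative Jenkinsfile. The stage names and the scripts they invoke are illustrative assumptions, not a real project’s build:

```groovy
// Minimal declarative Jenkinsfile sketch. The stages and the shell
// commands are placeholders; a real project would substitute its own
// test and build steps.
pipeline {
    agent any
    stages {
        stage('Test') {
            steps {
                sh './run-tests.sh'        // hypothetical test script
            }
        }
        stage('Build') {
            steps {
                sh './build-artifact.sh'   // hypothetical build script
            }
        }
    }
}
```

Because this file lives in the repository, changes to the pipeline’s behavior go through the same review and version history as any other change.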
Leveraging Prior Art vs Forging a New Path
When the team starts a new project, the first step is to understand what other software teams within the company are doing with respect to CI/CD.
Can the team be successful by just adopting the same services and patterns others are already using? If there are risks or inefficiencies in the client’s current practices, can the team offer to help them improve their existing systems instead of building new ones? How long would it take for such changes to fully materialize, and can the team wait that long?
In terms of collaboration, what are the risks associated with depending on the existing teams that own these CI/CD practices? For example, will the team face long lead times for basic administrative requests? On the other hand, what are the risks to the client of the team setting up entirely new CI/CD practices that will have to be maintained for the life of the product?
SaaS vs Self-Hosted
If the team decides to forge a new path for CI/CD, the next step is to consider the tradeoffs of self-hosting the CI/CD capability compared to paying for a service that will maintain it on the team’s behalf.
This decision has major effects on both the time cost (and timeline) to build out the CI/CD capability initially and the time cost to maintain it over time. Choosing a self-hosted solution means accepting that some team members will spend a significant share of their time and attention building and maintaining the CI/CD capability. However, it’s important to remember that choosing a SaaS doesn’t make these costs drop to zero: even if the team is not building the infrastructure that runs the CI/CD pipelines, they will still invest time in building the pipelines themselves.
Cost Centers for Self-Hosted CI/CD Systems
The most notable time costs for a self-hosted CI/CD system are up front: establishing the initial capability. If one team member says “I can stand up a Jenkins cluster in less than an hour—what’s the big deal?”, it’s important to consider some of the necessities that may be initially overlooked for a robust system.
The team gives consideration to questions like:
How Will We Expose The CI/CD System Securely?
It’s one thing to ask developers to SSH-tunnel into the servers to look at builds (and even that is a bit onerous), but if the CI/CD system needs to integrate with a public VCS service like GitHub, it usually needs to be exposed to that service over the public internet.
With public exposure come the immediate and unavoidable concerns of secure authentication and TLS certificate-based encryption. The need for TLS certificates brings the need for DNS registration, which is required for any certificate signed by a public CA. And registering a DNS name and TLS certificate for an internet-exposed port immediately “paints a target on one’s back”: services like Censys allow attackers to easily search for attack surfaces, so the team had better trust the security of their authentication setup.
On the Topic of Authentication, How Will Users Be Managed?
Ideally the team would use some kind of SSO (like “Login with GitHub” or Okta SAML) if one is available. Otherwise, the team will need to establish new processes for granting access to new team members and revoking access from people who have left. The team will also have to make sure that whatever authentication scheme they set up won’t interfere with any service-based webhook integrations with their VCS provider. Beyond that, the VCS integration often needs to be bi-directional (so that the CI/CD system can pull the codebase), so the team often needs service authentication in the other direction as well.
All of these concerns (along with the concerns of actually setting up a job-running cluster of servers) will require a fair bit of infrastructure setup, which should all arguably be implemented as infrastructure-as-code to avoid relying on brittle human-centric processes for the team’s setup. But how will changes to this infrastructure be rolled out over time? Will the team practice automated CI/CD for their CI/CD system infrastructure, or treat it as a special/exempt case and initiate those deployments by hand?
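As one illustrative sketch of treating this infrastructure as code (assuming AWS and Terraform; the hostname, zone, and variables are placeholder assumptions), even the DNS record for the CI endpoint can live in version control rather than being clicked together in a console:

```hcl
# Sketch only: a DNS record for a hypothetical CI endpoint, kept in
# version-controlled infrastructure-as-code. The zone, hostname, and
# IP variable are illustrative assumptions.
resource "aws_route53_record" "ci" {
  zone_id = var.zone_id          # hosted zone for the team's domain
  name    = "ci.example.com"     # placeholder hostname
  type    = "A"
  ttl     = 300
  records = [var.ci_server_ip]   # address of the CI server
}
```

Rolling changes like this out through their own reviewed pipeline (or, at minimum, a reviewed plan-and-apply process) is one answer to the question of how the CI/CD system’s own infrastructure evolves over time.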
These are all solvable problems, but the question at hand is whether the team is prepared to dedicate the time needed to solve them correctly and in a maintainable way.
Cost Centers for SaaS CI/CD Systems
By contrast, the most notable costs for a SaaS-based CI/CD system usually come later, as the team and project grow in size and maturity. That is, the team can get up and running quite quickly but they may face growing pains later down the line.
Firstly, in terms of service costs, SaaS CI/CD platforms typically have very cheap (usually free) offerings at the bottom tier for teams that don’t have high-bandwidth needs, but the costs scale quickly as their bandwidth needs grow, so it is often not cost-effective for very large teams (or teams with very long-running test suites and/or artifact builds). From a business perspective, this is basically a loss leader on their part, hoping the team will get too hooked on their service to migrate when their needs grow.
As a result, if the project grows to the point where CI/CD bandwidth becomes a bottleneck, the first step is to have team members dedicate time to optimizing the pipeline, so that test suites and image builds take less pipeline time, or are skipped entirely for some kinds of changes (like commits that only update internal documentation). This time cost can add up, and it often faces diminishing returns: a significant investment in research and optimization might yield a comparatively insignificant amount of pipeline time saved. Pipeline optimization can also be a concern in self-hosted CI/CD systems, but there it is often easy and comparatively cheap to simply add more job-running servers, in contrast to the more steeply increasing costs of a SaaS.
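As a sketch of the “skipped entirely for some kinds of changes” idea, a small shell helper can decide whether a change set needs the full pipeline. The `docs/` prefix and `.md` suffix are assumptions about the project layout, not a universal convention:

```shell
#!/bin/sh
# Sketch: decide whether a change set needs the full CI pipeline.
# Returns success (0) if any changed file falls outside the
# documentation paths, meaning the expensive stages should run.
should_run_ci() {
  for f in "$@"; do
    case "$f" in
      docs/*|*.md) ;;   # documentation-only paths: skippable
      *) return 0 ;;    # anything else triggers the full pipeline
    esac
  done
  return 1              # every changed file was documentation
}
```

A pipeline step might feed this the output of `git diff --name-only` and skip the test and build stages when it returns non-zero.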
Another important consideration with SaaS CI/CD is the stability of the service. Even “soft” outages, like a temporary reduction in build bandwidth, can wreak havoc on a large team that is firmly dependent on its CI/CD system. Outages can of course happen in a self-hosted system as well, and can last much longer than a SaaS outage if the team isn’t well-prepared to deal with them. But in that situation the team at least has full control over the response and the time to recovery, so a large and well-equipped reliability team managing a self-hosted solution can be the best choice for mitigating this risk.
One more commonly overlooked point with SaaS CI/CD systems is that the team is directly tied to the vendor’s product choices, with little recourse if the product is missing a feature the team needs, or if the vendor takes the product in a direction that isn’t helpful for the team. In such situations, an open-source self-hosted system would at least give the team the last-resort option of building or modifying the behavior themselves and deploying a patched version to their cluster.
Avoiding Vendor Lock-In
For the reasons discussed above and others, it may be necessary to change vendors for components of the CI/CD system over the life of the project.
To ease such a transition, the team takes care throughout their continuous investment in the CI/CD pipeline to avoid, where possible, investing in vendor-specific infrastructure-as-code. For example, when the team writes new logic for their CI/CD pipeline, they consider keeping that code in separate shell scripts that the pipeline merely invokes, rather than writing it into the pipeline itself. This may not always make sense, but it’s a baseline practice to keep in mind.
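As a minimal sketch of this separation (the script name and stages are hypothetical), the build logic lives in an ordinary shell script, and each vendor’s pipeline file reduces to a one-line invocation of it:

```shell
#!/bin/sh
# Sketch of vendor-neutral pipeline logic (hypothetical ci/run.sh).
# A Jenkinsfile, a GitHub Actions workflow, or any other vendor's
# config would simply run this script, so switching vendors does not
# mean rewriting the build steps themselves.
set -eu

run_stage() {
  # Print the stage name, then run the command given for that stage.
  stage_name="$1"; shift
  echo ">> $stage_name"
  "$@"
}

run_stage "test" true    # placeholder for the real test command
run_stage "build" true   # placeholder for the real build command
```

The vendor-specific pipeline file then contains only glue (for example, a single `sh 'ci/run.sh'` step in a Jenkinsfile), which is the part that gets rewritten if the vendor changes.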