CI/CD Pipeline Monitoring: An Introduction

Application performance monitoring has traditionally focused on just two things: applications and the infrastructure that hosts them.

In today’s DevOps-centric world, however, where new application releases and updates are delivered continuously using CI/CD pipelines, monitoring CI/CD operations has become a third key pillar for optimizing overall application performance. Even the best-written code or the most flawless application will result in a poor user experience if problems in the CI/CD pipeline prevent smooth and continuous deployment.

Likewise, if CI/CD problems make it difficult to assess the performance impact of code or configuration changes, you’ll be shooting in the dark and struggling to optimize performance.

Here’s a primer on how to monitor the CI/CD delivery pipeline and how to correlate that data with other metrics in order to achieve optimal overall performance of your applications.

Why CI/CD Monitoring Matters

The CI/CD pipeline is distinct from the software environment that hosts your application, but it’s nonetheless linked inextricably to it. A healthy pipeline is one that allows your team to write, build, test, and deploy code and configuration changes into the production environment on a continuous basis.

An unhealthy CI/CD pipeline can hamper your ability to achieve optimal application performance in a variety of ways. For example:

Slow deployments

If your CI/CD operations are slow and you are unable to push out new releases quickly, you may not be able to deploy fixes to performance bugs before they become critical problems for your end-users.

Testing completeness

Inefficient CI/CD operations (such as slow builds, or messy handoffs of new code from developers to the software testing team) hamper your ability to test software completely before you deploy. They force you to choose between deploying releases that haven’t been fully tested or delaying deployments while you wait for tests to complete. Neither outcome is good for end-users.

Testing coverage

CI/CD operations issues may also make it difficult to test each release against a wide variety of configuration variables. If you don’t have as much time to test as you would like, you may have to test only some use cases or some environment configurations, which makes it more difficult to ensure adequate application performance for all users once code reaches production.

Technical debt

Lack of visibility into the CI/CD process can lead to technical debt. When you can’t systematically measure the performance of each part of your CI/CD pipeline, it’s much harder to identify which processes are causing technical debt.

Deployment agility

Total visibility into the CI/CD pipeline makes it easier to achieve deployment agility, such as the ability to deploy to a new kind of production environment (a different cloud, for instance) or to make major configuration changes to the environment. When you know exactly how each CI/CD process is going and what a successful CI/CD operation looks like, you can modify your operations with confidence, knowing that you’ll be able to assess rapidly whether the changes positively or negatively impact application health.

Adding CI/CD Monitoring to Application Performance Monitoring

To reduce the risk of problems or inefficiencies like those described above, teams should monitor CI/CD operations as closely as they monitor their applications and environment. CI/CD monitoring means collecting and analyzing metrics like the following:

Deployment frequency

How many deploys do you successfully push out each day or week?

Deployment time

How long does it take to execute each deployment? In other words, how long does it take to move a validated release from dev/test into production?

Lead time for changes

When your team decides to implement a code or configuration change in the application, how long does it take to implement and deploy that change?

Mean time to recover/repair

When a problem that is detected in production necessitates a new release that includes a fix, how quickly is your team able to push out the fix?

Change failure rate

How many attempted changes result in failures because the release in which they were implemented failed tests or otherwise was not deployed successfully?

Work-in-progress

How many in-progress code or configuration changes are in your pipeline at a given time?
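As a rough illustration, several of the metrics above can be derived from a simple log of deployment attempts. The sketch below is a minimal example under stated assumptions: the `Deployment` record, its field names, and the `summarize` function are hypothetical and are not part of any particular CI/CD or monitoring tool. Mean time to recover and work-in-progress would need additional data (incident timestamps and open change counts) that this record shape does not carry.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean


@dataclass
class Deployment:
    """One attempted deployment (hypothetical record shape)."""
    change_requested: datetime  # when the team decided to make the change
    deploy_started: datetime    # when the validated release began rolling out
    deploy_finished: datetime   # when the rollout ended (success or failure)
    succeeded: bool


def _hours(delta: timedelta) -> float:
    """Convert a timedelta to fractional hours."""
    return delta.total_seconds() / 3600


def summarize(deployments: list[Deployment], window_days: int) -> dict[str, float]:
    """Compute basic pipeline metrics over an observation window."""
    if not deployments or window_days <= 0:
        raise ValueError("need at least one deployment and a positive window")

    successes = [d for d in deployments if d.succeeded]
    summary = {
        # Deployment frequency: successful deploys per day in the window.
        "deploys_per_day": len(successes) / window_days,
        # Change failure rate: share of attempts that did not deploy successfully.
        "change_failure_rate": (len(deployments) - len(successes)) / len(deployments),
    }
    if successes:
        # Deployment time: how long each successful rollout took, in minutes.
        summary["mean_deploy_minutes"] = mean(
            _hours(d.deploy_finished - d.deploy_started) * 60 for d in successes
        )
        # Lead time for changes: from the decision to the finished deployment.
        summary["mean_lead_time_hours"] = mean(
            _hours(d.deploy_finished - d.change_requested) for d in successes
        )
    return summary
```

Calling `summarize(records, window_days=7)` on a week of records returns a dictionary that can be sent to whatever dashboard or metrics store you already use. Averages are used here for brevity; percentiles may describe a skewed pipeline better.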

To deliver the greatest level of visibility, these metrics should be correlated with other data, including log analytics and traces from your application environment. For example, if tracing shows a performance problem in production that requires a code change to fix, CI/CD pipeline metrics about work-in-progress and deployment time will help predict how long it will take to implement the fix. Likewise, if you compare deployment frequency to baseline application performance metrics and notice that application performance is decreasing over time, it may be a sign that you are deploying so frequently that you’re cutting corners on quality.
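One simple way to compare deployment frequency with an application performance metric is a Pearson correlation over matching time buckets. The sketch below assumes you have already aggregated both series per week; the function name `pearson` and the idea of pairing weekly deploy counts with weekly latency figures are illustrative assumptions, not a prescribed method. A strong correlation is only a prompt for further investigation, not proof that deployment pace is the cause.

```python
from math import sqrt


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length, at least 2")

    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)

    # Sum of products of deviations from the means (unnormalized covariance).
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))

    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / (spread_x * spread_y)
```

For example, passing a list of weekly deploy counts and a list of weekly latency measurements (both hypothetical inputs, in the same week order) yields a value between -1 and 1. A value near 1 would suggest latency tends to rise in weeks with more deploys, which is the pattern the paragraph above describes as a possible sign of cutting corners on quality.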

Conclusion

Gaining complete visibility into application performance requires monitoring not just application environments themselves, but also the CI/CD pipelines that power them. By correlating CI/CD data with other metrics, traces, and log analytics, you put yourself in the strongest position to optimize application performance and delight your users, even in fast-moving continuous delivery chains. Start by taking a real-time, NoSample™ full-fidelity approach to application performance monitoring with Splunk APM, which allows for unlimited cardinality exploration. Learn how faster troubleshooting, easier root cause analysis, and more efficient remediation lead to happier SRE and IT teams!

This is a guest blog post from Chris Tozzi, Senior Editor of content and a DevOps Analyst at Fixate IO. Chris Tozzi has worked as a journalist and Linux systems administrator. He has particular interests in open source, agile infrastructure, and networking. This posting does not necessarily represent Splunk's position, strategies, or opinion.

Posted by Stephen Watts

Stephen Watts works in growth marketing at Splunk. Stephen holds a degree in Philosophy from Auburn University and is an MSIS candidate at UC Denver. He contributes to a variety of publications including CIO.com, Search Engine Journal, ITSM.Tools, IT Chronicles, DZone, and CompTIA.