Adriel demonstrates Liatrio’s answer to GitLab’s telemetry gaps: an OpenTelemetry processor that extracts CI/CD component data and visualizes it in OpenObserve dashboards. The approach eliminates fragile pipeline scripts and enables version tracking, error monitoring, and trace analysis, while offering cost-effective, self-hosted telemetry storage.
Observability is crucial for modern software delivery, enabling teams to track performance, diagnose issues, and optimize workflows. However, CI/CD pipelines often pose challenges—especially when dealing with complex configurations and limited visibility into version usage. To address this, Liatrio developed a scalable, event-driven solution using CI/CD components, OpenTelemetry, and OpenObserve, allowing teams to unlock actionable telemetry and drive continuous improvement.
GitLab introduced CI/CD components to modularize workflows, much like GitHub Actions does with reusable actions and workflows. Components live in a structured templates folder and declare dedicated inputs, solving common problems like global variable clobbering. Additionally, the CI/CD catalog acts as a developer portal, helping teams quickly discover and adopt shared components.
Despite these advancements, challenges persist. The merging of configuration files into large, nested YAML structures complicates version tracking and usage monitoring. This complexity hinders teams from identifying which versions of components are active across projects.
To tackle these limitations, Liatrio developed a GitLab processor that integrates with OpenTelemetry, allowing teams to dynamically track and monitor version usage across pipelines. Here’s how it works:

1. GitLab sends pipeline events to a webhook receiver running on an OpenTelemetry collector.
2. The collector transforms the events, drops unwanted data, and maps fields to OpenTelemetry semantic conventions.
3. When an event carries the expected repository and revision attributes, the GitLab processor fetches the project’s CI configuration at that commit and parses it for component includes.
4. Detected component names and versions are attached to the event as attributes, and the enriched telemetry is exported to OpenObserve.
This event-driven approach ensures that telemetry data is continuously captured and updated without requiring manual interventions or fragile scripts.
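To make the lookup step concrete, here is a minimal Go sketch that fetches a project’s `.gitlab-ci.yml` at a specific commit through GitLab’s repository files API and extracts the `component` includes. It assumes the standard `include: component: <path>@<version>` syntax; the project ID, token, and helper names are illustrative placeholders, not part of Liatrio’s actual processor.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
	"strings"

	"gopkg.in/yaml.v3"
)

// component holds the pieces of an include such as
// "gitlab.com/my-org/components/lint@0.3.0".
type component struct {
	Name    string
	Version string
}

// ciConfig models only the part of .gitlab-ci.yml we care about.
type ciConfig struct {
	Include []struct {
		Component string `yaml:"component"`
	} `yaml:"include"`
}

// fetchRawCIFile downloads .gitlab-ci.yml at a given commit SHA using
// GitLab's repository files API.
func fetchRawCIFile(baseURL, projectID, sha, token string) ([]byte, error) {
	url := fmt.Sprintf("%s/api/v4/projects/%s/repository/files/.gitlab-ci.yml/raw?ref=%s",
		baseURL, projectID, sha)
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("PRIVATE-TOKEN", token)
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status: %s", resp.Status)
	}
	return io.ReadAll(resp.Body)
}

// extractComponents parses the raw YAML and splits each component
// reference into a name and a version.
func extractComponents(raw []byte) ([]component, error) {
	var cfg ciConfig
	if err := yaml.Unmarshal(raw, &cfg); err != nil {
		return nil, err
	}
	var out []component
	for _, inc := range cfg.Include {
		if inc.Component == "" {
			continue // a plain template include, not a component
		}
		ref, version, found := strings.Cut(inc.Component, "@")
		if !found {
			continue
		}
		name := ref[strings.LastIndex(ref, "/")+1:]
		out = append(out, component{Name: name, Version: version})
	}
	return out, nil
}

func main() {
	// Placeholder values; in the real pipeline these come from the webhook event.
	raw, err := fetchRawCIFile("https://gitlab.com", os.Getenv("GITLAB_PROJECT_ID"),
		os.Getenv("PIPELINE_SHA"), os.Getenv("GITLAB_TOKEN"))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	components, _ := extractComponents(raw)
	for _, c := range components {
		fmt.Printf("%s@%s\n", c.Name, c.Version)
	}
}
```

Because the lookup is keyed on the commit SHA from the event, the result always reflects the configuration that actually ran, not whatever is on the default branch today.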
Within minutes of deploying this solution, teams can visualize component usage across CI/CD pipelines using real-time dashboards. By tracking key metrics, they can quickly spot outdated components, problematic versions, or inconsistencies across projects.
For example, if a specific version introduces errors, teams can easily identify where it’s being used and either downgrade or update it as needed. By extending telemetry beyond logs to traces, teams gain deeper insights into failure points and bottlenecks, driving faster troubleshooting and proactive optimization.
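As a rough illustration of what extending this telemetry to traces could look like, the Go sketch below records a pipeline run as a span and attaches each resolved component version as a span event, using the standard OpenTelemetry Go SDK with a stdout exporter. The span name and attribute keys are made up for the example; they are not the conventions the demo actually emits.

```go
package main

import (
	"context"
	"log"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/exporters/stdout/stdouttrace"
	sdktrace "go.opentelemetry.io/otel/sdk/trace"
	"go.opentelemetry.io/otel/trace"
)

func main() {
	ctx := context.Background()

	// Export spans to stdout so the example is self-contained;
	// a real setup would export OTLP to a collector or backend.
	exporter, err := stdouttrace.New(stdouttrace.WithPrettyPrint())
	if err != nil {
		log.Fatal(err)
	}
	tp := sdktrace.NewTracerProvider(sdktrace.WithBatcher(exporter))
	defer func() { _ = tp.Shutdown(ctx) }()
	otel.SetTracerProvider(tp)

	tracer := otel.Tracer("gitlab-pipeline-demo")

	// One span per pipeline run; component versions become span events,
	// so a trace view shows exactly which versions were in play when a
	// stage failed or slowed down.
	_, span := tracer.Start(ctx, "pipeline.run")
	for name, version := range map[string]string{
		"lint":    "0.3.0",
		"test":    "0.3.0",
		"build":   "0.4.0",
		"release": "0.4.0",
	} {
		span.AddEvent("component.resolved", trace.WithAttributes(
			attribute.String("gitlab.component.name", name),
			attribute.String("gitlab.component.version", version),
		))
	}
	span.End()
}
```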
In our demo, OpenObserve served as a self-hosted backend for storing telemetry data. Unlike traditional observability platforms that often require multiple data stores, OpenObserve consolidates telemetry data efficiently into a single backend. It’s cost-effective, scalable, and ideal for long-term storage—making it a strong option for organizations with self-hosting requirements.
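For a quick sanity check that a local OpenObserve instance is accepting data, a few lines of Go can post a JSON record to its ingestion API; in the demo architecture the collector’s exporter handles this instead. The organization, stream name, port, and credentials below are assumptions based on a default local install.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
	"os"
)

func main() {
	// A single log-style record; OpenObserve's JSON ingestion endpoint
	// accepts an array of flat JSON objects.
	payload := []byte(`[{"level":"info","gitlab_component_name":"lint","gitlab_component_version":"0.3.0"}]`)

	// Assumed defaults for a local install: org "default", stream
	// "ci_components", listening on port 5080 with basic auth.
	req, err := http.NewRequest(http.MethodPost,
		"http://localhost:5080/api/default/ci_components/_json",
		bytes.NewReader(payload))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	req.SetBasicAuth(os.Getenv("OPENOBSERVE_USER"), os.Getenv("OPENOBSERVE_PASSWORD"))
	req.Header.Set("Content-Type", "application/json")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer resp.Body.Close()
	fmt.Println("OpenObserve responded:", resp.Status)
}
```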
With Liatrio’s telemetry solution, teams can overcome native CI/CD limitations and gain full visibility into their pipelines. By tracking component versions and pipeline health externally, they maintain flexibility, reduce downtime, and continuously optimize their delivery pipelines.
Ready to unlock meaningful telemetry in your CI/CD processes? Contact Liatrio today to explore how we can help transform your software delivery and observability strategy.
Hey, everybody. I'm going to swap between a couple of different screen shares, so hopefully it's not too jittery. No slides, but I do have web pages, so I don’t have to rely on them. This session is going to be pretty technical.

A lot of teams have started thinking about starter kits and how to accelerate their organizations by sharing common patterns. If you’ve ever interacted with GitLab, you might remember how complex it can get: everything includes one billion things, leading to nested and chaotic YAML files, variable clobbering, and difficulty tracking what’s actually happening. It’s hard to version and maintain. At the end of the day, it’s all one gigantic merged YAML file.

GitLab, along with Lululemon and others, has introduced a new feature called CI/CD components. While the result is still one merged YAML file, the framework is much better for structuring shared, reusable workflows. Ironically, it’s moving towards what GitHub already does with actions and shared workflows, but again, the underlying YAML merging issue remains.

Here’s what this looks like in practice. We have a component section within our Liatrio-to-org setup. A component looks more action-like compared to the typical GitLab CI YAML files filled with long variable definitions. The cool thing here is that you can run actions directly against your components. These components are stored in a templates folder, making them modular and manageable. One great improvement is that global environment variable clobbering is no longer an issue; components now have dedicated inputs, similar to GitHub Actions.

Another cool addition GitLab made is the CI/CD catalog. It acts like a mini developer portal where you can view and select the components your organization uses for CI/CD. For example, you can check the inputs they require, how to import them, and what versions are available. This makes it easy for developers to get started quickly, say, by building a Maven app using the appropriate component.

However, despite these improvements, it’s still one merged YAML file. This creates challenges for detecting usage and tracking who is using which versions of components. GitLab does not provide a built-in solution to this problem, and the event stream that GitLab emits doesn’t include this information because of the way the YAML is merged.

So how did we solve this? Before getting into that, let me show you a demo project that imports four different components at varying versions. These components handle various stages of the pipeline: linting, testing, building, and releasing the application.

What we did was build a processor that injects information into the event stream as it passes through. Conceptually, this works similarly to what an OpenTelemetry (OTel) collector does: it receives events, processes them, and exports them to a backend. Over a few days, I created a processor specifically for GitLab. The way it works is that GitLab emits pipeline events to a webhook receiver connected to the OpenTelemetry collector. The collector applies transformations, drops unwanted data, and ensures that the data conforms to OpenTelemetry’s semantic conventions, and the result is then forwarded on to OpenObserve.

The GitLab processor itself examines each event log. If a log matches certain semantic conventions and contains specific data, we query GitLab to parse the CI YAML file at a specific commit. If components are detected, we attach attributes like component name and version to the event and forward it to the next step in the pipeline.
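The attribute-attachment step described above translates into roughly the following Go sketch against the OpenTelemetry collector’s pdata API: walk the incoming log records, read the repository and revision attributes that earlier stages populated, and stamp each record with the detected component versions. The attribute keys and the lookup function are placeholders, and the real processor also needs the usual collector factory and configuration plumbing, which is omitted here.

```go
package gitlabprocessor

import (
	"go.opentelemetry.io/collector/pdata/pcommon"
	"go.opentelemetry.io/collector/pdata/plog"
)

// Component is a resolved CI/CD component include.
type Component struct {
	Name    string
	Version string
}

// lookupComponents stands in for the real work: querying GitLab for the
// CI YAML at the given revision and parsing its component includes.
type lookupComponents func(repo, revision string) []Component

// enrichLogs walks every log record and, when the repository and revision
// attributes are present, attaches one version attribute per detected
// component. The attribute keys here are illustrative placeholders.
func enrichLogs(ld plog.Logs, lookup lookupComponents) {
	rls := ld.ResourceLogs()
	for i := 0; i < rls.Len(); i++ {
		rl := rls.At(i)
		resAttrs := rl.Resource().Attributes()

		repo, okRepo := getStr(resAttrs, "vcs.repository.name")
		rev, okRev := getStr(resAttrs, "vcs.ref.head.revision")
		if !okRepo || !okRev {
			continue // not a pipeline event we can resolve
		}

		components := lookup(repo, rev)
		if len(components) == 0 {
			continue
		}

		sls := rl.ScopeLogs()
		for j := 0; j < sls.Len(); j++ {
			lrs := sls.At(j).LogRecords()
			for k := 0; k < lrs.Len(); k++ {
				attrs := lrs.At(k).Attributes()
				for _, c := range components {
					attrs.PutStr("gitlab.component."+c.Name+".version", c.Version)
				}
			}
		}
	}
}

// getStr reads a string attribute, reporting whether it was present.
func getStr(m pcommon.Map, key string) (string, bool) {
	v, ok := m.Get(key)
	if !ok {
		return "", false
	}
	return v.Str(), true
}
```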
Let me show you what that looks like in practice. I triggered a pipeline, and you can see logs coming in. After parsing and transformation, we get meaningful attributes such as component names and versions. This allows us to build dashboards showing version usage across pipelines in the last 15 minutes. For example, you can see the various versions being used within the system. I built this dashboard in five minutes; it was quick and easy because of the semantic data in place.

You can scale this approach, tracking things like errors or making data-driven decisions based on telemetry. For example, you could identify which versions are causing issues and which components are frequently used. You can even extend this concept by monitoring traces instead of just logs. You could attach log events to spans within a trace, providing additional context and enabling further analysis.

Let’s make a quick change in the demo pipeline to bump the component versions to 0.4. I’ve shipped the change to the main branch using trunk-based development. The pipeline runs quickly, and you can see that the changes were detected, processed, and reflected in our OpenObserve backend. The dashboard updates to show the new versions in use.

The advantage of this approach is that we don’t need to inject fragile scripts into the pipeline itself. Instead, we listen to events externally, making it easier to collect telemetry and build meaningful metrics. The telemetry is processed as log events, so you can perform various analyses without being constrained by traditional metrics limitations.

I also demonstrated a tracing solution to show what’s possible. If you attended KubeCon, you might have seen a talk by some folks from Clario who built a GitLab receiver for traces. We have a similar receiver focused on metrics, but it would be easy to modify the processor to handle traces as well. This would allow us to attach log events to spans and further enrich our telemetry. Regarding semantics, we rely on conventions like repository name and head revision to query GitLab and parse the correct YAML file. This is a quick and effective mechanism for injecting meaningful context into the event stream.

The last thing I want to show is OpenObserve, the backend I’m using locally. OpenObserve is an open-core solution that allows you to self-host telemetry data. What I like about it is that it’s not Grafana: it consolidates telemetry into a single backend (like S3) without the complexity of multiple data stores. It’s cost-effective, supports long-term storage when necessary, and avoids issues related to high cardinality. While I don’t typically recommend self-hosting observability platforms (I usually suggest using SaaS solutions like Honeycomb), OpenObserve is a strong option if you need to self-host for business reasons. In fact, we’re planning to use it in KPV3 to host DORA and other data that we don’t want to send to external SaaS providers.

That’s it. I just wanted to show how you can get meaningful telemetry out of a system that doesn’t provide it natively, and how you can fix that gap. Any questions?