Observability in Agile Teams: From Incidents to Feedback
A practical guide to debugging production systems using observability data.
Observability is not a dashboard. It's a diagnostic process.
This talk explores how to move from "something is wrong" to "here's the fix" with a systematic approach to production debugging.
Abstract
Every engineer has been there: alerts fire and dashboards show anomalies, but finding the root cause is like searching for a needle in a haystack. We collect terabytes of metrics, logs, and traces, yet debugging still feels like guesswork.
This talk presents a structured approach to production debugging that turns observability data into actionable insights. We'll explore:
- Why dashboards alone aren't enough
- The three questions every debugging session should answer
- How to correlate signals across metrics, logs, and traces
- Real-world examples of debugging complex distributed systems
What You'll Learn
By the end of this talk, you'll have a mental framework for approaching any production incident, along with practical techniques for using your observability stack more effectively.
Outline
- The Problem with Dashboards - Why visualization isn't investigation
- The Diagnostic Mindset - Thinking like a detective
- Signal Correlation - Connecting metrics, logs, and traces
- Case Studies - Real debugging sessions from production systems
- Building Better Alerts - From symptoms to causes
Target Audience
This talk is for anyone who has ever stared at a dashboard wondering "why is this happening?" Whether you're debugging your first production incident or your hundredth, you'll find practical techniques to add to your toolkit.