DOP208-R1: Amazon's approach to failing successfully

AWS re:Invent 2019 - A podcast by AWS

Welcome to the real world, where things don't always go your way. Systems can fail despite being designed to be highly available, scalable, and resilient. These failures, if used correctly, can be a powerful lever for gaining a deep understanding of how a system actually works, as well as a tool for learning how to avoid future failures. In this session, we cover Amazon's favorite techniques for defining and reviewing metrics (watching the systems before they fail), as well as how to do an effective postmortem that drives both learning and meaningful improvement.
