Thursday August 22, 2024 11:30am - 12:00pm PDT
As an industry, we’ve grown accustomed to a black-and-white definition of system status: it’s either “Up” or “Down.” But what about the vast domain of insidious gray failures in between that go undetected during “normal” operations? In this session, we share our experience building ‘PRECog,’ a tool that enables engineering teams to proactively identify previously overlooked high-severity gray failures before they result in degraded customer experiences. ‘PRECog’ uses an advanced form of chaos experimentation we developed called Latency Squeeze Injection (LSI). We explore why we built PRECog at Capital One and how we use it to verify Service Level Objectives (SLOs) continuously, proactively identify service degradation points, and improve key aspects of system resilience such as retries, fallbacks, timeouts, and circuit breakers.

In this session, Aaron Rinehart and Kyle Smith, Distinguished Engineers at Capital One, will share how they have used this new approach to confidently explore system safety boundaries and build more resilient and reliable distributed systems for their products and services.
Speakers
Aaron Rinehart

Global Cybersecurity Leader
Prior to Capital One, Aaron collected a diverse set of experiences solving complex, challenging engineering problems throughout his tech career at companies that span multiple consumer industries and range from startups to the Fortune 4. Aaron is most notably known...
Capri 8
