Site Reliability Engineering Essentials (Video Course)

Description

  • Copyright 2025
  • Edition: 1st
  • Online Video
  • ISBN-10: 0-13-541500-4
  • ISBN-13: 978-0-13-541500-9

4+ Hours of Video Instruction

Master the essentials of Site Reliability Engineering to effectively manage production systems with real-world insights and techniques.

Unlock the power of Site Reliability Engineering (SRE) with this comprehensive video course. SRE is a critical discipline that combines software engineering with IT operations to ensure high system reliability, scalability, and performance. This course provides a deep dive into the core principles and practices of SRE, equipping you with the tools to build reliable systems and improve operational efficiency.

The course covers key SRE concepts, including Service Level Objectives (SLOs), Service Level Indicators (SLIs), and error budgets, with practical examples that help you apply these principles to your own organization. You will learn how to build and optimize a robust monitoring and observability system using essential telemetry data, such as logs, metrics, and traces. Through an in-depth exploration of observability platforms, you will learn how to effectively monitor and maintain system health.

The course also addresses crucial aspects of incident management, such as managing on-call duties, running war rooms for critical incidents, and conducting blameless postmortems to learn from failures. Gain insights into reliable system architecture topics, including load balancing, auto-scaling, and the CAP theorem, to help your infrastructure remain resilient under high traffic.

Additionally, you will discover release management strategies that minimize user impact during deployments, learn how to monitor your CI/CD pipeline, and use progressive rollouts. The course also guides you through implementing SRE practices within your organization, including setting up a central SRE team and conducting production readiness reviews to ensure your systems are always production ready.

By the end of this course, you will have a solid understanding of SRE best practices and the knowledge to enhance the reliability and scalability of your systems while reducing downtime and improving overall operational efficiency.

Learn How To:

  • Set a strong foundation by implementing core Site Reliability Engineering (SRE) principles to ensure system reliability and performance.
  • Build and optimize a robust monitoring and observability system using essential telemetry data such as logs, metrics, and traces.
  • Monitor system health effectively through observability platforms to maintain optimal system performance.
  • Apply Service Level Objectives (SLOs), Service Level Indicators (SLIs), and error budgets to improve system reliability and performance.
  • Manage incidents effectively, run war rooms for critical situations, and conduct blameless postmortems to learn from failures.
  • Design reliable system architectures using load balancing and auto-scaling, and apply the CAP theorem to reason about resilience trade-offs.
  • Minimize user impact during software deployments by using release management strategies and ensuring progressive rollouts.
  • Monitor your CI/CD pipeline to detect issues early and ensure smooth, efficient deployments.
  • Implement SRE practices within your organization, including setting up a central SRE team and conducting Production Readiness Reviews to ensure systems are always production ready.

Who Should Take This Course: 

This course is designed for Site Reliability Engineers, DevOps engineers, application support engineers, software engineers and architects, as well as managers and directors of software engineering teams.

About Pearson Video Training:

Pearson publishes expert-led video tutorials covering a wide selection of technology topics designed to teach you the skills you need to succeed. These professional and personal technology videos feature world-leading author instructors published by your trusted technology brands: Addison-Wesley, Cisco Press, Pearson IT Certification, Prentice Hall, Sams, and Que. Topics include IT Certification, Network Security, Cisco Technology, Programming, Web Development, Mobile Development, and more. Learn more about Pearson Video Training at http://www.informit.com/video.

Video Lessons are available for download for offline viewing within the streaming format. Look for the green arrow in each lesson.

Sample Content

Table of Contents

Course Introduction

Lesson 1: Introduction to Site Reliability Engineering 

What is Site Reliability Engineering?

Core Tenets of SRE

Benefits of SRE

DevOps vs. SRE vs. Platform Engineering

A Typical Day of an SRE

Lesson 2: Observability

What to Monitor

Logs, Metrics, and Traces

The Four Golden Signals

Observability Platforms

Demo: Monitoring Using Splunk

Lesson 3: SLO, SLI and SLA

Service Level Objectives

Service Level Indicators and Service Level Agreements

Implementing SLOs: Real-world Examples

Using Error Budgets

Demo: SLO/SLI

Lesson 4: Incident Management, SRE style

Managed vs. Unmanaged Incidents

Running War Rooms

Conducting Blameless Postmortems

Using Postmortem Templates

Being On-call

Lesson 5: Reliable System Architectures

Load Balancing

Handling Failures

CAP Theorem and its Implementation

Auto Scaling

Lesson 6: Release Management

Progressive Rollout

Minimizing User Impact During Releases

Monitoring the CI/CD Pipeline

Rolling Back Changes

Lesson 7: Implementing SRE

Four Ways of Implementing in Your Organization

Benefits of a Central SRE Team

Production Readiness Review

Lesson 8: Course Conclusion and Next Steps

Course Summary

Next Steps

Summary

Next steps and fresh thinking
