Introduction to Site Reliability Engineering
- Thorough understanding of Site Reliability Engineering
- Understand the core principles of Site Reliability Engineering and how cloud computing enables them
- DevOps vs SRE
Public Cloud Overview and Linux Basics
- Public Cloud Overview: Compute, Containers, Storage, and Observability
- Characteristics of a good SRE and SRE Foundational Skillset
- Linux, Automation, IP Address Subnetting, vi Editor
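As an illustration of the IP address subnetting topic above, the following is a minimal sketch using Python's standard `ipaddress` module. The helper name `describe_subnet` and the example CIDR block are hypothetical, chosen only for this illustration, and the usable-host count assumes the traditional rule of reserving the network and broadcast addresses.

```python
import ipaddress


def describe_subnet(cidr: str) -> dict:
    """Summarize an IPv4 network given in CIDR notation (hypothetical helper)."""
    # strict=False accepts a host address such as 10.0.1.7/24 and
    # normalizes it to the enclosing network (10.0.1.0/24).
    net = ipaddress.ip_network(cidr, strict=False)
    return {
        "network": str(net.network_address),
        "netmask": str(net.netmask),
        "broadcast": str(net.broadcast_address),
        "total_addresses": net.num_addresses,
        # Assumption: the traditional rule that the network and broadcast
        # addresses are not assignable to hosts.
        "usable_hosts": max(net.num_addresses - 2, 0),
    }


def split_subnet(cidr: str, new_prefix: int) -> list:
    """Split a network into equal-sized subnets with the given prefix length."""
    net = ipaddress.ip_network(cidr, strict=False)
    return [str(s) for s in net.subnets(new_prefix=new_prefix)]


if __name__ == "__main__":
    print(describe_subnet("10.0.1.0/24"))
    print(split_subnet("10.0.1.0/24", 26))
```

Splitting a /24 into /26 blocks yields four subnets of 64 addresses each, which is the kind of arithmetic the subnetting topic covers.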
Application deployment
- Set up a CI/CD pipeline
- Infrastructure as Code using Terraform
- Build infrastructure, deploy the app, and implement observability
- Deploy a simple microservice application
Application Monitoring and performance tuning
- Install a monitoring solution to monitor cluster and application resources
- Check for vulnerabilities in Terraform code and the Kubernetes cluster
- Understand the concept of reliability and its significance in ensuring system stability and performance
Service Level Objectives (SLO), Service Level Indicators (SLI) and Error Budgeting
- Identify different types of Service Level Indicators (SLIs) and their role in measuring system performance
- Define Service Level Objectives (SLOs) and recognize various types along with best practices for setting them effectively
- Gain proficiency in managing Error Budgets and implementing Error Budget Policies to maintain service reliability within defined thresholds
- Differentiate between SLIs, SLOs, and Error Budget Policies, and articulate their importance in ensuring system resilience
- Explore non-functional requirements and their impact on system design and performance
- Discover the concept of observability and familiarize yourself with monitoring tools essential for maintaining system health
- Apply theoretical knowledge to practical scenarios by analyzing examples of SLIs and SLOs in real-world contexts
- Identify key roles that contribute significantly to ensuring system reliability and understand their responsibilities in fostering a culture of reliability
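To make the relationship between SLIs, SLOs, and error budgets concrete, here is a minimal sketch of the arithmetic for a request-based availability SLO. The function name `error_budget`, its parameters, and the sample figures are hypothetical, and the sketch assumes the SLI is simply the fraction of successful requests in the measurement window.

```python
def error_budget(slo_target: float, total_requests: int, failed_requests: int) -> dict:
    """Compute a request-based SLI and error budget (hypothetical helper).

    slo_target is a fraction, e.g. 0.999 for a 99.9% availability SLO.
    """
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")

    # SLI: the measured fraction of good (successful) requests.
    sli = (total_requests - failed_requests) / total_requests
    # Error budget: how many requests may fail before the SLO is missed.
    allowed_failures = (1.0 - slo_target) * total_requests
    remaining = allowed_failures - failed_requests
    return {
        "sli": sli,
        "slo_met": sli >= slo_target,
        "allowed_failures": allowed_failures,
        "budget_remaining": remaining,
        "budget_consumed_fraction": failed_requests / allowed_failures,
    }


if __name__ == "__main__":
    # Sample figures for illustration only: a 99.9% SLO over one million requests.
    print(error_budget(0.999, 1_000_000, 250))
```

An error budget policy would then attach actions to these numbers, for example pausing feature releases once the consumed fraction passes an agreed threshold.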
Prerequisites:
- Understanding of cloud computing
- Technical education
Suitable for:
- Anyone wanting to kick-start a career in SRE
- Software Engineers
- Platform Engineers
- System Admins
- DevOps Engineers