- Company Name
- Blue Shield of California
- Job Title
- Site Reliability Engineer, Consultant
- Job Description
**Job Title:** Site Reliability Engineer, Consultant
**Role Summary:**
Deliver continuous reliability, scalability, and performance for production infrastructure and applications. Design, automate, and monitor systems and resolve incidents across multi‑cloud environments, collaborating with development and operations teams to improve system architecture and delivery pipelines.
**Expectations:**
- Operate 24/7 observability and incident response.
- Maintain high availability and meet defined SLOs with proactive error budget management.
- Lead automation initiatives that reduce manual effort and increase deployment velocity.
- Provide thorough root‑cause analysis and post‑incident reviews.
**Key Responsibilities:**
- Design, build, and maintain scalable infrastructure on Azure, AWS, or GCP.
- Create and maintain CI/CD pipelines (Jenkins, GitHub Actions, GitLab CI, Argo CD, Spinnaker).
- Develop automation scripts (Python, Go, Bash, PowerShell, Ansible, Chef, Puppet).
- Configure container orchestration (Kubernetes, OpenShift, Docker, Helm).
- Implement monitoring and observability solutions (Prometheus, Grafana, Datadog, New Relic, ELK Stack, Dynatrace, Splunk, SolarWinds).
- Conduct root‑cause analysis and drive post‑mortems.
- Define and enforce SLOs, error budgets, and capacity planning.
- Collaborate with cross‑functional teams to improve system architecture and reliability.
**Required Skills:**
- Cloud platforms: Azure, AWS, GCP.
- Programming/scripting: Python, Go, Java, Bash, PowerShell.
- Containerization/orchestration: Kubernetes, OpenShift, Docker, Helm.
- Observability: Prometheus, Grafana, Datadog, New Relic, ELK Stack, Dynatrace, Splunk.
- CI/CD & configuration management: Jenkins, GitHub Actions, GitLab CI, Argo CD, Spinnaker, Ansible, Chef, Puppet.
- Experience with intelligent automation and agentic AI systems for incident resolution.
**Required Education & Certifications:**
- BS in Computer Science or equivalent (minimum); MS preferred.
- 7+ years of engineering/execution experience with production systems.
- Certifications in cloud (e.g., AWS Certified Solutions Architect, Azure Solutions Architect, GCP Professional Cloud Architect) or relevant SIAMP, Kubernetes, or DevOps certifications are a plus.
California, United States
Hybrid
Senior
19-10-2025