- Company Name
- Agile Fuel | World-class Dedicated Engineering Teams
- Job Title
- Site Reliability Engineer / Platform Engineer
- Job Description
**Job Title:**
Site Reliability Engineer / Platform Engineer
**Role Summary:**
Design, build, and operate a scalable Azure-based cloud infrastructure and internal developer platform for an AI-focused technology company. Drive automation, observability, and reliability across CI/CD pipelines, deployment workflows, and monitoring systems to enable high-speed, reliable software delivery.
**Expectations:**
- Deliver production‑grade, fault‑tolerant platform components with a strong focus on automation and cost efficiency.
- Partner with backend, data engineering, and external service teams to align infrastructure with AI product growth.
- Foster an AI‑native culture, leveraging LLM‑powered tooling to accelerate engineering workflows.
**Key Responsibilities:**
1. Architect and maintain Azure Container Apps, ACR, serverless services, and managed databases.
2. Own CI/CD pipelines, deployment workflows, and the full observability stack (monitoring, logging, alerting, SLOs/SLAs).
3. Develop Python‑based tooling and automation for platform reliability and AI workloads.
4. Design secure, fault‑tolerant integrations with GitHub, Jira, Azure services, Redis, Sentry, etc.
5. Continuously optimize cost, reliability, and deployment velocity.
6. Scale AI infrastructure to support the organization's transition to an AI‑native engineering model.
7. Drive automation initiatives and implement LLM‑powered pipelines where applicable.
**Required Skills:**
- 5+ years of experience in DevOps / SRE / Platform Engineering.
- Expert knowledge of Azure cloud (container services, serverless, networking, identity, PaaS).
- Strong Python programming skills for tooling and backend services.
- Proficiency in Docker, container orchestration, and cloud‑native patterns.
- Experience with CI/CD, monitoring, alerting, and observability tools.
- Knowledge of resilience engineering (retries, backoff, idempotency).
- Automation mindset; ability to identify and implement automated solutions.
- Startup mentality with ownership, speed, and multi‑skill flexibility.
- Upper‑intermediate English proficiency.
**Bonus Skills:**
- IaC (Bicep, Terraform).
- Python/Django or data pipeline background.
- Celery, distributed queues, event‑driven systems.
- SOC2/enterprise‑grade environment experience.
- Building internal developer platforms (IDPs) or self‑service infra.
**Required Education & Certifications:**
- Bachelor’s degree in Computer Science, Engineering, or related field (or equivalent practical experience).
- Certifications in Azure (e.g., Azure Solutions Architect, Azure DevOps Engineer) are advantageous but not mandatory.
Mountain View, United States
On-site
Mid-level
27-11-2025