- Company Name
- Nulogy
- Job Title
- Head of Infrastructure
- Job Description
Job Title: Head of Infrastructure
Role Summary:
Lead the design, implementation, and operation of cloud infrastructure for a global manufacturing technology platform. Ensure high availability, scalability, security, and cost efficiency of Kubernetes‑based services while driving automation, reliability engineering, and continuous improvement across production environments.
Expectations:
- Define and execute a strategic infrastructure roadmap aligned with business goals.
- Maintain mission‑critical uptime, recoverability, and performance for supply‑chain clients.
- Optimize costs, especially within AWS, while preserving scalability and security.
- Lead a Site Reliability Engineering team in incident management, DR drills, and architectural decisions.
- Foster collaboration with product, DevOps, and security teams to embed reliability into the software lifecycle.
Key Responsibilities:
- Oversee highly available Kubernetes clusters (EKS), relational databases (PostgreSQL/MySQL), and supporting services (RDS, S3, Kafka, GuardDuty).
- Manage incident response, root‑cause analysis, and disaster‑recovery readiness; conduct regular DR drills.
- Develop and maintain CI/CD pipelines (Buildkite, Helm) and infrastructure‑as‑code (Terraform, CloudFormation) for rapid, reliable deployments.
- Identify and recommend cost‑reduction opportunities and performance optimizations across all infrastructure layers.
- Implement and maintain monitoring, alerting, and logging solutions to support visibility and rapid issue resolution.
- Author and maintain architectural documentation, reference guides, and operational playbooks.
- Participate in on‑call rotations and mentor team members on reliability practices.
- Promote a culture of automation, testing, and continuous improvement within both infrastructure and development teams.
Required Skills:
- Advanced proficiency in AWS and related services (EKS, RDS, S3, Kafka, GuardDuty) and Kubernetes cluster management.
- Expertise in relational database performance tuning (PostgreSQL/MySQL).
- Strong knowledge of security best practices (OWASP, IAM, network segmentation).
- Experience building and managing CI/CD pipelines (e.g., Buildkite) and developing infrastructure as code (Terraform, CloudFormation, Helm).
- Familiarity with audited environments (SOC 2, ISO 27001) and related compliance requirements.
- Proficiency in shell scripting, developer tooling, and automation.
- Analytical, data‑driven decision‑making, with meticulous attention to detail and reliability.
Required Education & Certifications:
- Bachelor’s degree in Computer Science, Engineering, or an equivalent discipline, or comparable professional experience.
- Demonstrated knowledge of SOC 2 and ISO 27001 audit processes.
- Preferred: AWS Certified Solutions Architect, Certified Kubernetes Administrator, or similar credentials.