woxa group

Site Reliability Engineer

  • Posted 12 hours ago

Job Description

  • We are seeking a deeply technical and resilient SRE & Cloud Infrastructure Lead to architect, build, and defend the reliability of our complex hybrid/multi-cloud infrastructure.
  • Act as both the Builder and the Uptime Guardian. Your team will own the entire lifecycle of our infrastructure.
  • Manage infrastructure with IaC tools such as Terraform/OpenTofu.
  • Define Service Level Objectives (SLOs) and Service Level Indicators (SLIs).
  • Manage workloads across on-premises data centers, AWS, GCP, and DigitalOcean, running heavily on Kubernetes.
  • Play a critical role in our ISO 27001 compliance journey.
  • Own Capacity Management, Business Continuity and Disaster Recovery (BCDR), and production Incident Management.
  • Cultivate a culture where automation replaces toil and production failures drive blameless, continuous improvement.

Required Qualifications & Skills:

  • Experience: 5+ years in Site Reliability Engineering, Cloud Infrastructure, or DevOps, including experience leading and scaling technical teams in a 24/7 production environment.
  • Infrastructure as Code (IaC): Expert-level proficiency in writing modular, scalable, and secure infrastructure using Terraform or OpenTofu.
  • Kubernetes & Container Mastery: Expertise in building and managing Kubernetes (K8s) clusters in production (kubeadm, EKS, GKE, etc.). Strong understanding of service meshes and ingress controllers.
  • Hybrid/Multi-Cloud Architecture: Extensive experience architecting VPCs, IAM roles, databases, and networking across bare-metal/on-premise servers and major cloud providers.
  • Observability Tooling: Expert-level knowledge of modern monitoring, logging, and tracing tools.
  • Software Engineering Fundamentals: Strong programming skills in Go, Python, or Bash. You approach operational and infrastructure challenges as software engineering problems.
  • Mindset & Leadership: Analytical leader with the ability to negotiate with stakeholders when feature velocity must be sacrificed for system stability.

Job ID: 149044313
