DoHome Public Company Limited

Site Reliability Engineer Manager

This job is no longer accepting applications

  • Posted 2 months ago

Job Description

Job Responsibilities

  • Collect and analyze operating system and application metrics to support performance tuning and incident troubleshooting
  • Collaborate with development teams to improve service quality through rigorous testing and release processes
  • Participate in system design consulting, platform operations, and capacity planning
  • Develop sustainable systems and services through automation and continuous improvement
  • Balance feature development velocity with system reliability using well-defined service-level objectives (SLOs)
  • Design and build software and systems to manage platform infrastructure and applications
  • Improve the reliability, quality, and time-to-market of the organization's software solutions
  • Measure, analyze, and optimize system performance to anticipate customer needs and drive continuous innovation
  • Provide primary operational support and engineering ownership for multiple large-scale distributed software applications

Qualifications

  • Bachelor's degree or equivalent experience in Computer Science or a related field
  • Proficiency in structured and object-oriented programming using one or more high-level languages such as Go, Python, Java, C/C++, Ruby, or JavaScript
  • Experience with distributed storage technologies including NFS, HDFS, Ceph, and cloud-based object storage
  • Hands-on experience with dynamic resource management and orchestration frameworks such as Apache Mesos, Kubernetes, or YARN
  • Proactive mindset with the ability to identify system issues, performance bottlenecks, and opportunities for improvement

More Info

Job ID: 140440771