Site Reliability Engineer

Unison Consulting

  • Kuala Lumpur
  • Permanent
  • Full-time
  • Posted 1 month ago
Responsibilities
  • Design and implement resilient system architectures that support high availability and scalability.
  • Develop automation tools and scripts to enhance operational efficiency and reduce manual effort.
  • Define, track, and analyze SLOs and SLIs to ensure reliability and performance meet business needs.
  • Conduct thorough post-mortem analyses following incidents, driving continuous improvement through root cause identification and solution implementation.
  • Collaborate with development and operations teams to establish best practices in system reliability and incident management.
  • Troubleshoot and resolve issues related to database performance, network connectivity, and deployment failures, including diagnosing problems at the underlying platform level (e.g., Kubernetes, virtual machines).
  • Ensure that issues are resolved within the stipulated Service Level Agreements (SLAs), maintaining high standards of service delivery.
  • Identify and troubleshoot performance bottlenecks across systems, providing actionable recommendations for enhancements.
  • Maintain detailed documentation of processes and incident responses to support knowledge sharing and compliance.
Requirements
  • Proficiency in a programming language such as Python, Golang, or Java, applied to improving operational efficiency.
  • Demonstrated experience in system architecture and design, prioritizing reliability and scalability.
  • Strong understanding of SRE principles, including SLOs, SLIs, toil reduction, and incident post-mortems.
  • Experience with cloud environments (e.g., AWS, Azure, Google Cloud) and their operational management.
  • Strong expertise in Linux system administration.
  • Proven experience in troubleshooting application support issues with a focus on performance and connectivity.
  • Familiarity with networking concepts and effective troubleshooting techniques.
  • Excellent problem-solving abilities and a proactive approach to operational challenges.
  • Ability to work independently while effectively collaborating within a team environment.
Preferred Skills
  • Familiarity with monitoring tools and performance optimization techniques.
  • Experience in scripting or automation for system administration tasks.
  • Knowledge of networking concepts and troubleshooting methodologies.
  • Hands-on knowledge of cloud platforms (e.g., AWS, Azure, Google Cloud) and their services.
  • Familiarity with DevOps practices and frameworks, including CI/CD, infrastructure as code, and containerization.
