Senior Site Reliability Engineer

4-6 Years
  • Posted a month ago

Job Description


Role Description

  • Apply SRE principles to ensure the reliability, availability, scalability, and performance of production systems
  • Design, implement, and maintain automation and Infrastructure as Code to reduce operational toil and manual intervention
  • Operate and optimize services in AWS and containerized environments (EKS/ECS)
  • Ensure platform aligns with compliance requirements
  • Build and operate CI/CD pipelines using GitLab
  • Define and implement Service Level Objectives (SLOs) and error budgets
  • Implement and maintain observability solutions including metrics, logs, and traces to proactively detect and diagnose system issues
  • Contribute to incident response, including triage, mitigation, root cause analysis (RCA), and post-incident reviews
  • Identify systemic reliability risks, performance bottlenecks, and capacity constraints; collaborate with the team to address them
  • Work closely with developers to ensure systems are designed for operability, resilience, and maintainability
  • Conduct performance testing, capacity planning, and availability analysis to support system growth and scaling
  • Continuously evaluate and improve tooling related to reliability, monitoring, alerting, and cost efficiency
  • Document operational knowledge, runbooks, and best practices to improve operational readiness

Qualifications

  • 4+ years of experience
  • Must have experience with AWS, Terraform, CI/CD, Jenkins/Helm, and Python
  • Hybrid setup (3x/week) in Mandaluyong; day shift schedule
  • Can start ASAP

Job ID: 140871525