
People & Team Leadership
Lead, coach, and mentor IT engineers to build strong technical and leadership capabilities.
Set clear performance goals aligned with our Beliefs, Vision, Mission, and Methods (BVMM).
Conduct 1:1s, performance reviews, and career growth discussions.
Foster a culture of ownership, collaboration, and continuous learning.
Maintain balanced workloads, shift coverage, and clear succession plans to sustain healthy 24×7 operations.
Service Operations & Reliability
Oversee daily service health, capacity, and reliability across all supported environments.
Ensure compliance with operational KPIs through proactive planning and improvement.
Balance demand vs. capacity and manage shift coverage to prevent burnout.
Partner with engineering teams to maintain runbooks, knowledge bases, and escalation paths.
Drive automation and workflow optimization to reduce manual overhead.
Use data insights to guide decisions and improvements.
Incident & Problem Management
Lead end-to-end incident response in real time, from triage and communication through resolution.
Act as Incident Commander for high-impact events across a global environment.
Track and improve metrics such as MTTD, MTTM, and MTTR.
Champion blameless Post-Incident Reviews (PIRs) and translate learnings into long-term system and process improvements.
Qualifications
• Bachelor's degree in Computer Science, Information Technology, Engineering, or a related discipline
• 3+ years of experience in Service Delivery, Incident Response, or Operations Leadership within enterprise-scale, 24×7 environments
• Proven experience managing technical teams, driving performance, and leading through critical situations
• Strong grounding in ITSM / ITIL principles (Incident & Problem Management)
• Familiarity with cloud, distributed systems, or enterprise infrastructure
• Skilled in monitoring, alerting, and ticketing tools (e.g., PagerDuty, Datadog, Grafana, Splunk, ServiceNow)
Job ID: 145021885