Atos International

Monitoring Tools Administrator

6-8 Years
  • Posted 11 days ago

Job Description

Job Role: Zabbix Administrator

Location: 100% Remote - Philippines

Full-time with Atos

Key Responsibilities:

  • Zabbix Platform Management: Install, configure, and maintain Zabbix servers (latest versions), proxies, and agents across multiple data center and cloud environments. Ensure high availability, performance tuning, and regular upgrades/patching of the Zabbix monitoring platform for 24x7 operational reliability.
  • Enterprise Monitoring Design: Architect and implement enterprise-wide monitoring strategies using Zabbix for diverse IT assets – including servers, networks, databases, applications, and cloud services – to provide comprehensive visibility into system health and performance. Develop and manage advanced monitoring templates, Low-Level Discovery (LLD) rules, and custom triggers to detect key events and performance anomalies proactively.
  • Alerting & Incident Response: Configure intelligent alerting and notification policies, integrating Zabbix with IT Service Management (ITSM) tools (e.g., ServiceNow) to ensure automated ticket creation and timely incident escalation. Rapidly respond to monitoring alerts and high-severity incidents, performing deep-dive root cause analysis and remediation for complex issues across infrastructure and applications.
  • Performance Optimization: Continuously analyze monitoring data and performance metrics to identify trends, recurring problems, or capacity issues. Implement improvements to reduce false positives, optimize thresholds, and enhance monitoring accuracy, aiming for proactive issue prevention and improved system stability.
  • Automation & Integration: Leverage scripting (e.g., Bash, Python) and automation tools (e.g., Ansible, Chef) to automate routine monitoring tasks and integrate Zabbix with other enterprise tools and DevOps/CI-CD pipelines where applicable. Ensure monitoring covers hybrid cloud architectures by linking Zabbix with cloud-native observability services (AWS CloudWatch, Azure Monitor, etc.) for a unified monitoring approach.
  • Collaboration & Stakeholder Communication: Work closely with infrastructure, application, cloud, and network teams to define monitoring requirements and onboard new systems or updates into the Zabbix environment. Interface with client stakeholders and service managers to report on monitoring health, service levels, and improvement initiatives. Coordinate with L1/L2 support teams to provide guidance, transfer knowledge, and ensure efficient resolution of monitoring-related incidents, acting as the final escalation point for complex problems.
  • Documentation & Continuous Improvement: Maintain detailed technical documentation, including monitoring configurations, procedures, and knowledge base articles for use by L1/L2 teams. Contribute to continuous service improvement initiatives by identifying opportunities to enhance monitoring coverage, reduce incident frequency, and refine processes in line with ITIL best practices.

Required Skills & Qualifications:

  • Proven Monitoring Expertise: 5+ years of experience in IT systems/infrastructure support, including at least 3 years of hands-on Zabbix administration in large-scale environments. In-depth knowledge of Zabbix architecture, configuration, and key features (e.g., templates, triggers, actions, macros, API) for enterprise monitoring solutions.
  • Technical Proficiency: Strong background in Linux system administration (installation, shell scripting) to support Zabbix on Linux servers. Familiarity with network protocols (SNMP, TCP/IP) and performance metrics used in monitoring network devices, servers, and applications. Working knowledge of database systems (MySQL/PostgreSQL) for Zabbix backend management and performance tuning.
  • Advanced Troubleshooting: Demonstrated ability to troubleshoot and resolve complex monitoring issues, including analyzing Zabbix logs, performance data, and system metrics to identify root causes. Able to optimize Zabbix configurations (e.g., proxies, pollers, escalators) for high availability and efficient resource usage in mission-critical setups.
  • Scripting & Automation Skills: Proficiency in scripting languages (Bash, Python, or similar) to automate monitoring tasks, create custom scripts for data collection, and integrate Zabbix with external systems or APIs.
  • ITSM & Process Knowledge: Experience working within an ITIL-aligned environment (especially Incident, Problem, and Change Management processes) and using IT Service Management tools (e.g., ServiceNow) to manage alerts, incidents, and change requests in a structured AMS/outsourcing context.
  • Education: Bachelor's degree in Computer Science, Information Technology, Engineering, or equivalent practical experience in a related field.

Preferred Skills (Nice to Have):

  • Cloud & DevOps Monitoring: Experience monitoring cloud platforms (AWS, Azure, GCP) and containerized environments (Docker, Kubernetes) using Zabbix or complementary tools. Familiarity with DevOps/CI-CD practices and how monitoring fits into automated deployment and delivery pipelines.
  • Additional Monitoring Tools: Exposure to other monitoring and observability tools (e.g., Prometheus, Grafana, ELK/Elastic Stack, Splunk) and experience integrating them with Zabbix for comprehensive dashboards and analytics.
  • Advanced Zabbix Features: Experience implementing Zabbix high-availability clusters or distributed monitoring (multiple proxies, redundant servers) and using the Zabbix API for custom integrations or advanced configurations.
  • Domain Knowledge: Hospitality or similar industry experience supporting mission-critical systems, understanding the need for high uptime and rapid issue response in a 24x7 customer-facing environment.

Experience Level:

  • Senior (L3) Level: This is a senior individual contributor position requiring a seasoned professional who can act as a subject matter expert (SME) in enterprise monitoring. Requires 6-8 years of overall IT operations experience, including significant tenure in monitoring/observability roles. Prior experience as an L3 support engineer or tool administrator in a managed services or large enterprise NOC/SOC environment is highly desirable.

Certifications:

  • Zabbix Certifications: Zabbix Certified Specialist (ZCS) or Zabbix Certified Professional (ZCP) certification (or higher) is strongly preferred, demonstrating advanced knowledge of Zabbix deployment and management.
  • ITIL Certification: ITIL v3/v4 Foundation or higher is a plus, indicating understanding of IT service management best practices in an AMS context.
  • Relevant IT Certifications: Any additional relevant technical certifications are advantageous (e.g., Linux administration, cloud certifications like AWS/Azure, DevOps tools, or related system monitoring credentials).

Working Conditions:

  • 24x7 Operations & Shifts: Willingness to work in a rotational shift schedule (including night shifts and weekends) to support continuous 24x7 monitoring operations for a global hospitality client. This includes participation in an on-call rotation for after-hours emergency support or critical escalations.
  • Offshore Delivery Model: Based at Atos Global Delivery Center (GDC) Philippines, providing remote support to global client sites. Must collaborate effectively with both local team members and international colleagues across different time zones.
  • AMS/ITIL Environment: Work within a structured Application Management Services framework, adhering to Atos and client policies and procedures. Follow ITIL-aligned processes for incident management, change management, and problem management, ensuring SLAs and quality standards are consistently met.
  • Tools & Infrastructure: Operate in a fully equipped professional IT environment with access to enterprise-grade tools (e.g., ServiceNow for ITSM, collaboration platforms, secure remote access infrastructure). Ensure compliance with Atos and client security and data protection policies while performing all duties.

Job ID: 149117501