
Role Overview
We are seeking a highly skilled Observability SME/Lead to lead the Observability tower. The SME will design, implement, and optimize observability frameworks that provide monitoring and visibility across applications, infrastructure, and networks.
Required Qualifications
10+ years of deep expertise in observability and monitoring platforms (Prometheus, Grafana, Splunk, Datadog, Dynatrace, ELK, AppDynamics, etc.).
Prior experience leading enterprise-wide Observability transformations.
Key Responsibilities
1. Design and implement end-to-end observability strategies and frameworks.
2. Develop and maintain architecture standards, best practices, and governance models for observability solutions.
3. Integrate observability with other tools (e.g., ITSM, AIOps, and logging platforms).
4. Ensure scalability, high availability, and performance of monitoring solutions.
5. Collaborate with DevOps, IT, and business teams to align observability strategies with organizational goals.
6. Conduct training sessions to empower teams with observability best practices.
7. Analyze metrics, logs, and traces to detect anomalies and performance bottlenecks.
8. Generate and distribute performance reports to stakeholders.
9. Fine-tune alerting thresholds and configurations.
10. Collaborate with incident response teams to troubleshoot and resolve issues.
Required Skills
Expertise in implementing and administering observability platforms.
Strong knowledge of OpenTelemetry.
Solid experience in data enrichment and data standardization.
Proficiency in cloud platforms (AWS, Azure, GCP) and their monitoring capabilities.
Experience in containerized environments (Kubernetes, Docker) and related monitoring.
Knowledge of scripting (Python, Bash, PowerShell) for automation.
Understanding of AIOps and integration with observability platforms.
Familiarity with protocols like SNMP, REST API, and log forwarding.
Proficiency in creating dashboards, custom queries, and alerts in Dynatrace and Zabbix.
Understanding of monitoring key performance indicators (KPIs) for applications and infrastructure.
Soft Skills
Leadership in cross-functional team environments.
Excellent problem-solving and analytical skills.
Ability to convey complex observability concepts to stakeholders.
Job ID: 148959349