Job Description: Tooling SME
Position Overview
The SME is responsible for providing technical expertise in the implementation, configuration, integration and optimization of IT operations and monitoring tools across enterprise environments. The role involves enabling end-to-end observability, ensuring tool availability, managing upgrades, maintaining tool performance and aligning tool capabilities with business service objectives and operational processes.
Key Responsibilities
- Deploy, configure, and optimize enterprise monitoring tools
- Lead onboarding of applications and business services into the monitoring ecosystem, ensuring complete telemetry coverage
- Create, maintain and optimize dashboards, alerts and service level reports for various stakeholders
- Integrate observability tools with ITSM, CMDB and automation platforms to enable event correlation and incident automation
- Ensure tool health, performance and scalability through proactive maintenance and upgrades
- Support resilience and availability reporting by aligning tool outputs with business service KPIs and SLIs
- Participate in POCs and evaluate new tools to enhance observability
Required Skillset & Experience
- 5+ years of experience in IT monitoring, operations or observability management
- Technical expertise in at least two of the areas below:
- APM tools: AppDynamics (APPD), ELK, ScienceLogic
- Infra monitoring: SolarWinds, Grafana
- Event management: ServiceNow, Moogsoft
- Log management: ELK, Splunk
- Integrations using REST APIs, CMDB synchronization and data enrichment
- Automation (Python, Ansible)
- Familiarity with incident, problem and change management processes
- Strong analytical, reporting and stakeholder management skills