Application Consultant-Site Reliability

Job Description

Introduction

We are hiring for a newly established Production Operations Observability Pod responsible for real-time monitoring, alert triage, and first-level support across critical production systems. Analysts will actively monitor dashboards, alerts, and logs using Splunk, Dynatrace, and Grafana; execute operational runbooks; troubleshoot issues; create incidents; and engage the appropriate engineering teams. Candidates must be eager to learn new technologies, work effectively in a fast-paced environment, and be ready to provide 24/7 rotational support.

Required Technical And Professional Expertise

  • 2–5 years of experience in Production Support / NOC / L1 Ops roles.
  • Strong hands-on experience with Splunk, Dynatrace, and Grafana.
  • Ability to analyze logs, metrics, and traces to identify first-level issues.
  • Experience executing operational runbooks.
  • Knowledge of ITSM tools such as ServiceNow.
  • Strong communication and documentation skills.
  • Ability to work in fast-paced production environments.
  • Willingness to work 24/7 rotational shifts.
  • Eagerness to learn new tools and technologies.

Preferred Technical And Professional Experience

  • Basic understanding of AWS.
  • Familiarity with microservices, APIs, Linux basics, and networking.
  • Exposure to Prometheus and CloudWatch.
  • Understanding of incident management and SRE/DevOps culture.

Job ID: 148970651