
Qualifications
Educational Background
BS degree in any related field
Work experience required
3-5 years of experience
Roles and Responsibilities
The Rafay Kubernetes Platform Administrator will be responsible for managing and
supporting Kubernetes clusters on the Rafay platform during Day 2 operations. This
role focuses on ongoing maintenance, monitoring, troubleshooting, and
optimization to ensure high availability and performance of the Kubernetes platform.
Key Responsibilities
Cluster Health Monitoring: Continuously monitor Kubernetes clusters for
performance, resource utilization, and availability.
Patch & Upgrade Management: Apply patches and perform version
upgrades for Kubernetes clusters and associated components.
Backup & Restore Operations: Execute scheduled backups and validate
restore processes for disaster recovery readiness.
Policy Enforcement: Maintain and update governance/security policies
(OPA) as per compliance requirements.
Access Management: Manage user roles, permissions, and zero-trust
access configurations.
Observability & Dashboards: Maintain dashboards for cluster health,
application performance, and cost visibility.
Incident Management: Troubleshoot and resolve cluster-level issues,
including RCA (Root Cause Analysis).
Service Mesh Maintenance: Manage mTLS certificates and traffic policies
for secure service-to-service communication.
Network Policy Updates: Review and update network policies for workload
isolation and security posture.
Periodic Reviews: Conduct health checks, compliance audits, and
performance reviews.
Customer Support & Advisory: Provide guidance on best practices, new
features, and optimization strategies.
Required Skills & Experience
Strong hands-on experience with Kubernetes administration and the Rafay
Kubernetes Management Platform.
Experience with monitoring tools (Prometheus, Grafana) and backup
solutions for Kubernetes.
Knowledge of network policies, RBAC, and zero-trust security principles.
Ability to perform root cause analysis and resolve complex cluster issues.
Excellent communication and documentation skills.
Job ID: 134864097